Riechmann, J L; Heard, J; Martin, G; Reuber, L; Jiang, C; Keddie, J; Adam, L; Pineda, O; Ratcliffe, O J; Samaha, R R; Creelman, R; Pilgrim, M; Broun, P; Zhang, J Z; Ghandehari, D; Sherman, B K; Yu, G
The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.
Joyce Christopher J
Background: Subtypes of the GABAA receptor subunit exhibit diverse temporal and spatial expression patterns. In silico comparative analysis was used to predict transcriptional regulatory features in individual mammalian GABAA receptor subunit genes, and to identify potential transcriptional regulatory components involved in the coordinate regulation of the GABAA receptor gene clusters. Results: Previously unreported putative promoters were identified for the β2, γ1, γ3, ε, θ and π subunit genes. Putative core elements and proximal transcriptional factors were identified within these predicted promoters, and within the experimentally determined promoters of other subunit genes. Conserved intergenic regions of sequence in the mammalian GABAA receptor gene cluster comprising the α1, β2, γ2 and α6 subunits were identified as potential long range transcriptional regulatory components involved in the coordinate regulation of these genes. A region of predicted DNase I hypersensitive sites within the cluster may contain transcriptional regulatory features coordinating gene expression. A novel model is proposed for the coordinate control of the gene cluster and parallel expression of the α1 and β2 subunits, based upon the selective action of putative Scaffold/Matrix Attachment Regions (S/MARs). Conclusion: The putative regulatory features identified by genomic analysis of GABAA receptor genes were substantiated by cross-species comparative analysis and now require experimental verification. The proposed model for the coordinate regulation of genes in the cluster accounts for the head-to-head orientation and parallel expression of the α1 and β2 subunit genes, and for the disruption of transcription caused by insertion of a neomycin gene in the close vicinity of the α6 gene, which is proximal to a putative critical S/MAR.
Jolly Emmitt R
Background: A major challenge in computational genomics is the development of methodologies that allow accurate genome-wide prediction of the regulatory targets of a transcription factor. We present a method for target identification that combines experimental characterization of binding requirements with computational genomic analysis. Results: Our method identified potential target genes of the transcription factor Ndt80, a key transcriptional regulator involved in yeast sporulation, using the combined information of binding affinity, positional distribution, and conservation of the binding sites across multiple species. We have also developed a mathematical approach to compute the false positive rate and the total number of targets in the genome based on the multiple selection criteria. Conclusion: We have shown that combining biochemical characterization and computational genomic analysis leads to accurate identification of the genome-wide targets of a transcription factor. The method can be extended to other transcription factors and can complement other genomic approaches to transcriptional regulation.
Norton, Gareth J.; Lou-Hing, Daniel E.; Meharg, Andrew A.; Price, Adam H.
Rice (Oryza sativa) varieties that are arsenate-tolerant (Bala) and -sensitive (Azucena) were used to conduct a transcriptome analysis of the response of rice seedlings to sodium arsenate (AsV) in hydroponic solution. RNA extracted from the roots of three replicate experiments of plants grown for 1 week in phosphate-free nutrient with or without 13.3 μM AsV was used to challenge the Affymetrix (52K) GeneChip Rice Genome array. A total of 576 probe sets were significantly up-regulated at least 2-fold in both varieties, whereas 622 were down-regulated. Ontological classification is presented. As expected, a large number of transcription factors, stress proteins, and transporters demonstrated differential expression. Striking is the lack of response of classic oxidative stress-responsive genes or phytochelatin synthases/synthetases. However, the large number of responses from genes involved in glutathione synthesis, metabolism, and transport suggests that glutathione conjugation and arsenate methylation may be important biochemical responses to arsenate challenge. In this report, no attempt is made to dissect differences in the response of the tolerant and sensitive variety, but analysis in a companion article will link gene expression to the known tolerance loci available in the Bala×Azucena mapping population. PMID:18453530
Knippenberg, van I.C.
Tomato spotted wilt virus (TSWV) is the type species of the genus Tospovirus within the Bunyaviridae, a family of segmented negative strand RNA viruses. Although much ground has been covered in the past two decades, many questions concerning the mechanism of replication and transcription of this
Wen, Ji; Toomer, Kevin H.
Transcript variants play a critical role in diversifying gene expression. Alternative splicing is a major mechanism for generating transcript variants. A number of genes have been implicated in breast cancer pathogenesis with their aberrant expression of alternative transcripts. In this study, we performed genome-wide analyses of transcript variant expression in breast cancer. With RNA-Seq data from 105 patients, we characterized the transcriptome of breast tumors, by pairwise comparison of gene expression in the breast tumor versus matched healthy tissue from each patient. We identified 2839 genes, ~10 % of protein-coding genes in the human genome, that had differential expression of transcript variants between tumors and healthy tissues. The validity of the computational analysis was confirmed by quantitative RT-PCR assessment of transcript variant expression from four top candidate genes. The alternative transcript profiling led to classification of breast cancer into two subgroups and yielded a novel molecular signature that could be prognostic of patients’ tumor burden and survival. We uncovered nine splicing factors (FOX2, MBNL1, QKI, PTBP1, ELAVL1, HNRNPC, KHDRBS1, SFRS2, and TIAR) that were involved in aberrant splicing in breast cancer. Network analyses for the coordinative patterns of transcript variant expression identified twelve “hub” genes that differentiated the cancerous and normal transcriptomes. Dysregulated expression of alternative transcripts may reveal novel biomarkers for tumor development. It may also suggest new therapeutic targets, such as the “hub” genes identified through the network analyses of transcript variant expression, or splicing factors implicated in the formation of the tumor transcriptome. PMID:25913416
Malaria transmission in sub-Saharan Africa varies seasonally in intensity. Outbreaks of malaria occur after the beginning of the rainy season, whereas, during the dry season, reports of the disease are less frequent. Anopheles gambiae mosquitoes, the main malaria vector, are observed all year long but their densities are low during the dry season that generally lasts several months. Aestivation, seasonal migration, and local adaptation have been suggested as mechanisms that enable mosquito populations to persist through the dry season. Studies of chromosomal inversions have shown that inversions 2La, 2Rb, 2Rc, 2Rd, and 2Ru are associated with various physiological changes that confer aridity resistance. However, little is known about how phenotypic plasticity responds to seasonally dry conditions. This study examined the effects of desiccation stress on transcriptional regulation in An. gambiae. We exposed female An. gambiae G3 mosquitoes to acute desiccation and conducted a genome-wide analysis of their transcriptomes using the Affymetrix Plasmodium/Anopheles Genome Array. The transcription of 248 genes (1.7% of all transcripts) was significantly affected in all experimental conditions, including 96 with increased expression and 152 with decreased expression. In general, the data indicate a reduction in the metabolic rate of mosquitoes exposed to desiccation. Transcripts accumulated at higher levels during desiccation are associated with oxygen radical detoxification, DNA repair and stress responses. The proportion of transcripts within 2La and 2Rs (2Rb, 2Rc, 2Rd, and 2Ru) (67/248, or 27%) is similar to the percentage of transcripts located within these inversions (31%). These data may be useful in efforts to elucidate the role of chromosomal inversions in aridity tolerance. The scope of application of the anopheline genome demonstrates that examining transcriptional activity in relation to genotypic adaptations greatly expands the number of
Xin, Haiping; Zhu, Wei; Wang, Lina; Xiang, Yue; Fang, Linchuan; Li, Jitao; Sun, Xiaoming; Wang, Nian; Londo, Jason P; Li, Shaohua
Grape is one of the most important fruit crops worldwide. The suitable geographical locations and productivity of grapes are largely limited by temperature. Vitis amurensis is a wild grapevine species with remarkable cold-tolerance, exceeding that of Vitis vinifera, the dominant cultivated species of grapevine. However, the molecular mechanisms that contribute to the enhanced freezing tolerance of V. amurensis remain unknown. Here we used deep sequencing data from restriction endonuclease-generated cDNA fragments to evaluate the genome-wide modification of the transcriptome of V. amurensis under cold treatment. Vitis vinifera cv. Muscat of Hamburg was used as a control to help investigate the distinctive features of V. amurensis in responding to cold stress. Approximately 9 million tags were sequenced from non-cold treatment (NCT) and cold treatment (CT) cDNA libraries in each species of grapevine sampled from shoot apices. Alignment of tags to the V. vinifera cv. Pinot noir (PN40024) annotated genome identified over 15,000 transcripts in each library in V. amurensis and more than 16,000 in Muscat of Hamburg. Comparative analysis between NCT and CT libraries indicates that V. amurensis has fewer differentially expressed genes (DEGs, 1314 transcripts) than Muscat of Hamburg (2307 transcripts) when exposed to cold stress. Common DEGs (408 transcripts) suggest that some genes play fundamental roles during cold stress in grapes. The most robust DEGs (more than 20-fold change) also demonstrated significant differences between the two kinds of grapevine, indicating that cold stress may trigger species-specific pathways in V. amurensis. Functional categories of DEGs indicated that up-regulated transcripts related to metabolism, transport, signal transduction and transcription were proportionally more abundant in V. amurensis. Several highly expressed transcripts that were found uniquely accumulated in V. amurensis are discussed in detail. This subset of unique candidate
Eunsook Chung; Kyoung-Mi Kim; Jai-Heon Lee
Heat shock transcription factors (Hsfs) play an essential role in increasing tolerance to heat stress by regulating the expression of heat-responsive genes. In this study, a genome-wide analysis was performed to identify all of the soybean (Glycine max) GmHsf genes based on the latest soybean genome sequence. Chromosomal location, protein domain, motif organization, and phylogenetic relationships of 26 non-redundant GmHsf genes were analyzed and compared with AtHsfs (Arabidopsis thaliana Hsfs). According to their structural features, the predicted members were divided into the previously defined classes A-C, as described for AtHsfs. Transcript levels and subcellular localization of five GmHsfs responsive to abiotic stresses were analyzed by real-time RT-PCR. These results provide a fundamental clue for understanding the complexity of the soybean GmHsf gene family and for cloning the functional genes in future studies.
WRKY transcription factors are a class of DNA-binding proteins that bind a specific sequence, (C/T)TGAC(T/C), known as the W-box, found in the promoters of genes regulated by these WRKYs. In previous studies, 43 different stress-responsive WRKY transcription factors in Arabidopsis thaliana were identified and then categorized into three groups, viz., those responsive to abiotic stress, to biotic stress, and to both. A comprehensive genome-wide analysis including chromosomal localization, gene structure analysis, multiple sequence alignment, phylogenetic analysis and promoter analysis of these WRKY genes was carried out in this study to determine the functional homology in Arabidopsis. This analysis led to the classification of these WRKY family members into 3 major groups and subgroups and showed the evolutionary relationship among these groups on the basis of their functional WRKY domain, chromosomal localization and intron/exon structure. The proposed groups of these stress-responsive WRKY genes and annotation based on their position on chromosomes can also be explored to determine their functional homology in other plant species in relation to different stresses. The results of the present study provide indispensable genomic information for the stress-responsive WRKY transcription factors in Arabidopsis and will pave the way to explain the precise role of various AtWRKYs in plant growth and development under stressed conditions.
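The promoter analysis mentioned in this abstract depends on locating W-box sites. A minimal sketch of such a scan follows, assuming the consensus (C/T)TGAC(T/C) as quoted in the abstract; the function names and the example sequence are hypothetical illustrations, not the authors' pipeline.

```python
import re

# W-box consensus as quoted in the abstract: (C/T)TGAC(T/C).
# The lookahead lets the scan report overlapping sites as well.
W_BOX = re.compile(r"(?=([CT]TGAC[CT]))")
SITE_LENGTH = 6

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_w_boxes(promoter: str):
    """Return sorted (start, strand, site) tuples.

    `start` is 0-based and always given in forward-strand coordinates;
    `site` is the matched sequence read 5'->3' on its own strand.
    """
    seq = promoter.upper()
    hits = [(m.start(), "+", m.group(1)) for m in W_BOX.finditer(seq)]
    n = len(seq)
    for m in W_BOX.finditer(reverse_complement(seq)):
        # Convert the reverse-strand offset back to forward coordinates.
        hits.append((n - (m.start() + SITE_LENGTH), "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical promoter fragment with one site on each strand.
    for hit in find_w_boxes("AATTGACCGGGGTCAAAT"):
        print(hit)
```

Scanning both strands matters because a cis-element can be functional in either orientation; a real analysis would additionally restrict the search to annotated upstream regions.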
Pérez-Rueda, Ernesto; Janga, Sarath Chandra
Archaea, which represent a large fraction of the phylogenetic diversity of organisms, are prokaryotes with eukaryote-like basal transcriptional machinery. This organization makes the study of their DNA-binding transcription factors (TFs) and their transcriptional regulatory networks particularly interesting. In addition, there are limited experimental data regarding their TFs. In this work, 3,918 TFs were identified and exhaustively analyzed in 52 archaeal genomes. TFs represented less than 5% of the gene products in all the studied species, comparable with the number of TFs identified in parasites or intracellular pathogenic bacteria, suggesting a deficit in this class of proteins. A total of 75 families were identified, of which the HTH_3, AsnC, TrmB, and ArsR families were universally and abundantly identified in all the archaeal genomes. We found that archaeal TFs are significantly smaller than other protein-coding genes in archaea as well as bacterial TFs, suggesting that a large fraction of these small-sized TFs could compensate for the probable deficit of TFs in archaea, possibly by forming different combinations of monomers similar to those observed in eukaryotic transcriptional machinery. Our results show that although the DNA-binding domains of archaeal TFs are similar to those of bacteria, there is an underrepresentation of ligand-binding domains in smaller TFs, which suggests that protein-protein interactions may act as mediators of regulatory feedback, indicating a chimera of bacterial and eukaryotic TFs' functionality. The analysis presented here contributes to the understanding of the details of the transcriptional apparatus in archaea and provides a framework for the analysis of regulatory networks in these organisms.
Westergaard, Steen Lund; Bro, Christoffer; Olsson, Lisbeth
The role of Grr1p in glucose sensing in Saccharomyces cerevisiae was elucidated through genome-wide transcription analysis. From triplicate analysis of a strain with deletion of the GRR1-gene from the genome and an isogenic reference strain, 68 genes were identified to have significantly altered...
Adherent-invasive Escherichia coli (AIEC) strains are detected more frequently within mucosal lesions of patients with Crohn's disease (CD). The AIEC phenotype consists of adherence and invasion of intestinal epithelial cells and survival within macrophages of these bacteria in vitro. Our aim was to identify candidate transcripts that distinguish AIEC from non-invasive E. coli (NIEC) strains and might be useful for rapid and accurate identification of AIEC by culture-independent technology. We performed comparative RNA-sequencing (RNA-Seq) analysis using AIEC strain LF82 and NIEC strain HS during exponential and stationary growth. Differential expression analysis of coding sequences (CDS) homologous to both strains demonstrated 224 and 241 genes with increased and decreased expression, respectively, in LF82 relative to HS. Transition metal transport and siderophore metabolism related pathway genes were up-regulated, while glycogen metabolic and oxidation-reduction related pathway genes were down-regulated, in LF82. Chemotaxis related transcripts were up-regulated in LF82 during the exponential phase, but flagellum-dependent motility pathway genes were down-regulated in LF82 during the stationary phase. CDS that mapped only to the LF82 genome accounted for 747 genes. We applied an in silico subtractive genomics approach to identify CDS specific to AIEC by incorporating the genomes of 10 other previously phenotyped NIEC. From this analysis, 166 CDS mapped to the LF82 genome and lacked homology to any of the 11 human NIEC strains. We compared these CDS across 13 AIEC, but none were homologous in each. Four LF82 gene loci belonging to the clustered regularly interspaced short palindromic repeats region (CRISPR)-CRISPR-associated (Cas) genes were identified in 4 to 6 AIEC and absent from all non-pathogenic bacteria. As previously reported, AIEC strains were enriched for pdu operon genes. One CDS, encoding an excisionase, was shared by 9 AIEC strains. Reverse
de Santiago, Ines; Liu, Wei; Yuan, Ke; O'Reilly, Martin; Chilamakuri, Chandra Sekhar Reddy; Ponder, Bruce A J; Meyer, Kerstin B; Markowetz, Florian
Allele-specific measurements of transcription factor binding from ChIP-seq data are key to dissecting the allelic effects of non-coding variants and their contribution to phenotypic diversity. However, most methods of detecting an allelic imbalance assume diploid genomes. This assumption severely limits their applicability to cancer samples with frequent DNA copy-number changes. Here we present a Bayesian statistical approach called BaalChIP to correct for the effect of background allele frequency on the observed ChIP-seq read counts. BaalChIP allows the joint analysis of multiple ChIP-seq samples across a single variant and outperforms competing approaches in simulations. Using 548 ENCODE ChIP-seq and six targeted FAIRE-seq samples, we show that BaalChIP effectively corrects allele-specific analysis for copy-number variation and increases the power to detect putative cis-acting regulatory variants in cancer genomes.
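To illustrate why the background allele frequency matters, the sketch below uses a deliberately simplified frequentist stand-in: an exact two-sided binomial test whose null proportion is the background reference-allele frequency rather than 0.5. This is not the Bayesian model implemented in BaalChIP; the function names, read counts, and the 2:1 copy-number scenario are assumptions made for illustration.

```python
from math import comb


def binom_two_sided_p(k: int, n: int, p: float) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under Binomial(n, p).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


def allelic_imbalance_p(ref_reads: int, alt_reads: int,
                        background_ref_freq: float = 0.5) -> float:
    """Test ChIP-seq allele counts against a background allele frequency.

    background_ref_freq is the expected reference-allele fraction in the
    genomic DNA (0.5 for a balanced diploid site; e.g. 2/3 when the
    reference allele is present in two of three copies).
    """
    n = ref_reads + alt_reads
    return binom_two_sided_p(ref_reads, n, background_ref_freq)


if __name__ == "__main__":
    # Hypothetical site: 20 reference reads, 10 alternate reads.
    print(allelic_imbalance_p(20, 10))          # diploid assumption
    print(allelic_imbalance_p(20, 10, 2 / 3))   # reference allele amplified 2:1
```

With these hypothetical counts, the diploid assumption gives a p-value of roughly 0.099, whereas accounting for a 2:1 copy-number ratio makes the same counts exactly what the background predicts, so the apparent imbalance disappears.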
Background: Heat shock response in eukaryotes is transcriptionally regulated by conserved heat shock transcription factors (Hsfs). Hsf genes are represented by a large multigene family in plants, and investigation of the Hsf gene family will serve to elucidate the mechanisms by which plants respond to stress. In recent years, reports of genome-wide structural and evolutionary analysis of the entire Hsf gene family have been generated in two model plant systems, Arabidopsis and rice. Maize, an important cereal crop, has represented a model plant for genetics and evolutionary research. Although some Hsf genes have been characterized in maize, analysis of the entire Hsf gene family was not completed following the Maize (B73) Genome Sequencing Project. Results: A genome-wide analysis was carried out in the present study to identify all maize Hsf genes. Owing to the availability of complete maize genome sequences, 25 nonredundant Hsf genes, named ZmHsfs, were identified. Chromosomal location, protein domain and motif organization of ZmHsfs were analyzed in the maize genome. The phylogenetic relationships, gene duplications and expression profiles of ZmHsf genes are also presented in this study. The twenty-five ZmHsfs were classified into three major classes (class A, B, and C) according to their structural characteristics and phylogenetic comparisons, and class A was further subdivided into 10 subclasses. Moreover, phylogenetic analysis indicated that the orthologs from the three species (maize, Arabidopsis and rice) were distributed in all three classes; it also revealed diverse Hsf gene family expression patterns in classes and subclasses. Investigation of gene duplication events indicated that chromosomal/segmental duplications played a key role in Hsf gene family expansion in maize. Furthermore, transcripts of the 25 ZmHsf genes were detected in leaves under heat shock using quantitative real-time PCR. The result demonstrated that ZmHsf genes exhibit different
Wexler, Eric M; Rosen, Ezra; Lu, Daning; Osborn, Gregory E; Martin, Elizabeth; Raybould, Helen; Geschwind, Daniel H
Wnt proteins are critical to mammalian brain development and function. The canonical Wnt signaling pathway involves the stabilization and nuclear translocation of β-catenin; however, Wnt also signals through alternative, noncanonical pathways. To gain a systems-level, genome-wide view of Wnt signaling, we analyzed Wnt1-stimulated changes in gene expression by transcriptional microarray analysis in cultured human neural progenitor (hNP) cells at multiple time points over a 72-hour time course. We observed a widespread oscillatory-like pattern of changes in gene expression, involving components of both the canonical and the noncanonical Wnt signaling pathways. A higher-order, systems-level analysis that combined independent component analysis, waveform analysis, and mutual information-based network construction revealed effects on pathways related to cell death and neurodegenerative disease. Wnt effectors were tightly clustered with presenilin1 (PSEN1) and granulin (GRN), which cause dominantly inherited forms of Alzheimer's disease and frontotemporal dementia (FTD), respectively. We further explored a potential link between Wnt1 and GRN and found that Wnt1 decreased GRN expression by hNPs. Conversely, GRN knockdown increased WNT1 expression, demonstrating that Wnt and GRN reciprocally regulate each other. Finally, we provided in vivo validation of the in vitro findings by analyzing gene expression data from individuals with FTD. These unbiased and genome-wide analyses provide evidence for a connection between Wnt signaling and the transcriptional regulation of neurodegenerative disease genes.
Background: The MYB superfamily constitutes one of the most abundant groups of transcription factors described in plants. Nevertheless, their functions appear to be highly diverse and remain rather unclear. To date, no genome-wide characterization of this gene family has been conducted in a legume species. Here we report the first genome-wide analysis of the whole MYB superfamily in a legume species, soybean (Glycine max), including the gene structures, phylogeny, chromosome locations, conserved motifs, and expression patterns, as well as a comparative genomic analysis with Arabidopsis. Results: A total of 244 R2R3-MYB genes were identified and further classified into 48 subfamilies based on a phylogenetic comparative analysis with their putative orthologs, which showed both gene loss and duplication events. The phylogenetic analysis showed that most characterized MYB genes with similar functions are clustered in the same subfamily; together with the identification of orthologs by synteny analysis, this strongly indicated functional conservation among subgroups of MYB genes. The phylogenetic relationships of each subgroup of MYB genes were well supported by the highly conserved intron/exon structures and motifs outside the MYB domain. Nucleotide substitution (dN/dS) analysis showed that the soybean MYB DNA-binding domain is under strong negative selection. The chromosome distribution pattern strongly indicated that genome-wide segmental and tandem duplication contribute to the expansion of soybean MYB genes. In addition, we found that ~4% of soybean R2R3-MYB genes had undergone alternative splicing events, producing a variety of transcripts from a single gene, which illustrated the extremely high complexity of transcriptome regulation. Comparative expression profile analysis of R2R3-MYB genes in soybean and Arabidopsis revealed that MYB genes play conserved and various roles in plants, which is indicative of a divergence in
Xiaofeng Cai; Yuyang Zhang; Chanjuan Zhang; Tingyan Zhang; Tixu Hu; Jie Ye; Junhong Zhang
The Dof (DNA binding with One Finger) family encoding single zinc finger proteins has been known as a family of plant-specific transcription factors. These transcription factors are involved in a variety of functions of importance for different biological processes in plants. In the current study, we identified 34 Dof family genes in tomato (Solanum lycopersicum L.), distributed on 11 chromosomes. A complete overview of SlDof genes in tomato is presented, including the gene structures, chromosome locations, phylogeny, protein motifs and evolution pattern. Phylogenetic analysis of 34 SlDof proteins resulted in four classes constituting six clusters. In addition, a comparative analysis between these genes in tomato, Arabidopsis (Arabidopsis thaliana L.) and rice (Oryza sativa L.) was also performed. The tomato Dof family expansion has been dated to recent duplication events, and segmental duplication is predominant for the SlDof genes. Furthermore, the SlDof genes displayed differential expression either in their transcript abundance or in their expression patterns under normal growth conditions. This is the first step towards genome-wide analyses of the Dof genes in tomato. Our study provides a very useful reference for cloning and functional analysis of the members of this gene family in tomato and other species.
Non-coding RNAs, including small interfering RNAs (siRNAs), are important components of gene expression in eukaryotes, forming a regulatory network. miRNAs are expressed through nucleolytic maturation of hairpin precursors transcribed by RNA Polymerase II or III. Such transcripts are involved in post-transcriptional gene regulation in plants, fungi and animals. miRNAs bind to target RNA transcripts and guide their cleavage (mostly in plants) or act to prevent translation. siRNAs act via a similar mechanism of cleavage of their target genes, but they can also direct genomic DNA methylation and chromatin remodeling. It is estimated that a large fraction, up to 30%, of all human genes may also be post-transcriptionally regulated by miRNAs. For plant genomes the numbers could be higher, depending on the quality of sequencing and genome annotation. Owing to the availability of genome and mRNA sequences, genome-wide searches for sense-antisense transcripts have been reported, but few plant sense-antisense transcript pairs have been studied. Integration of these data in specialized databases is a challenging problem in computational genomics. We have developed a set of computer programs to define antisense transcripts and miRNA genes based on available sequencing data. We have analyzed data from PlantNATsDB (Plant Natural Antisense Transcripts DataBase), which is a platform for annotating and discovering Natural Antisense Transcripts (NATs) by integrating various data sources. NATs can be grouped into two categories, cis-NATs and trans-NATs. Cis-NAT pairs are transcribed from opposing DNA strands at the same genomic locus and have a variety of orientations and differing lengths of overlap between the perfect sequence complementary regions, whereas trans-NAT pairs are transcribed from different loci and form partial complementarity. The database currently contains 69 plant species. The database provides an integrative, interactive and information-rich web graphical interface to
Ma, Jun; Wang, Qinglian; Sun, Runrun; Xie, Fuliang; Jones, Don C; Zhang, Baohong
Plant-specific TEOSINTE-BRANCHED1/CYCLOIDEA/PCF (TCP) transcription factors play versatile roles in multiple aspects of plant growth and development. However, no systematic study has been performed in cotton. In this study, we performed for the first time the genome-wide identification and expression analysis of the TCP transcription factor family in Gossypium raimondii. A total of 38 non-redundant cotton TCP encoding genes were identified. The TCP transcription factors were divided into eleven subgroups based on phylogenetic analysis. Most TCP genes within the same subfamily demonstrated similar exon and intron organization, and the motif structures were highly conserved among the subfamilies. Additionally, the chromosomal distribution pattern revealed that TCP genes were unevenly distributed across 11 out of the 13 chromosomes; segmental duplication is a predominant duplication event for TCP genes and the major contributor to the expansion of the TCP gene family in G. raimondii. Moreover, the expression profiles of TCP genes shed light on their functional divergence.
Tavenet, Arounie; Suleau, Audrey; Dubreuil, Géraldine; Ferrari, Roberto; Ducrot, Cécile; Michaut, Magali; Aude, Jean-Christophe; Dieci, Giorgio; Lefebvre, Olivier; Conesa, Christine; Acker, Joël
Human PC4 and the yeast ortholog Sub1 have multiple functions in RNA polymerase II transcription. Genome-wide mapping revealed that Sub1 is present on Pol III-transcribed genes. Sub1 was found to interact with components of the Pol III transcription system and to stimulate the initiation and reinitiation steps in a system reconstituted with all recombinant factors. Sub1 was required for optimal Pol III gene transcription in exponentially growing cells. PMID:19706510
Background: A complete understanding of the regulatory mechanisms of gene expression is the next important issue of genomics. Many bioinformaticians have developed methods and algorithms for predicting transcriptional regulatory mechanisms from sequence, gene expression, and binding data. However, most of these studies involved the use of yeast, which has much simpler regulatory networks than human and has many genome-wide binding data and gene expression data under diverse conditions. Studies of genome-wide transcriptional networks of human genomes currently lag behind those of yeast. Results: We report herein a new method that combines gene expression data analysis with promoter analysis to infer transcriptional regulatory elements of human genes. The Z scores from the application of gene set analysis with gene sets of transcription factor binding sites (TFBSs) were successfully used to represent the activity of TFBSs in a given microarray data set. A significant correlation between the Z scores of gene sets of TFBSs and individual genes across multiple conditions permitted successful identification of many known human transcriptional regulatory elements of genes as well as the prediction of numerous putative TFBSs of many genes, which will constitute a good starting point for further experiments. Using Z scores of gene sets of TFBSs produced better predictions than the use of mRNA levels of a transcription factor itself, suggesting that the Z scores of gene sets of TFBSs better represent diverse mechanisms for changing the activity of transcription factors in the cell. In addition, cis-regulatory modules, combinations of co-acting TFBSs, were readily identified by our analysis. Conclusion: By a strategic combination of gene set level analysis of gene expression data sets and promoter analysis, we were able to identify and predict many transcriptional regulatory elements of human genes. We conclude that this approach will aid in decoding
Background: The MYB gene family comprises one of the richest groups of transcription factors in plants. Plant MYB proteins are characterized by a highly conserved MYB DNA-binding domain. MYB proteins are classified into four major groups, namely 1R-MYB, 2R-MYB, 3R-MYB and 4R-MYB, based on the number and position of MYB repeats. MYB transcription factors are involved in plant development, secondary metabolism, hormone signal transduction, disease resistance and abiotic stress tolerance. A comparative analysis of MYB family genes in rice and Arabidopsis will help reveal the evolution and function of MYB genes in plants. Results: A genome-wide analysis identified at least 155 and 197 MYB genes in rice and Arabidopsis, respectively. Gene structure analysis revealed that MYB family genes possess relatively more introns in the middle than in the C- and N-terminal regions of the predicted genes. Intronless MYB genes are highly conserved in both rice and Arabidopsis. MYB genes encoding R2R3 repeat MYB proteins retained a conserved gene structure with three exons and two introns, whereas genes encoding R1R2R3 repeat-containing proteins consist of six exons and five introns. The splicing pattern is similar among R1R2R3 MYB genes in Arabidopsis. In contrast, variation in splicing pattern was observed among R1R2R3 MYB members of rice. Consensus motif analysis of the 1 kb upstream region (5′ to translation initiation codon) of MYB gene ORFs led to the identification of conserved and over-represented cis-motifs in both rice and Arabidopsis. Real-time quantitative RT-PCR analysis showed that several MYB members are up-regulated by various abiotic stresses in both rice and Arabidopsis. Conclusion: A comprehensive genome-wide analysis of chromosomal distribution, tandem repeats and phylogenetic relationship of MYB family genes in rice and Arabidopsis suggested their evolution via duplication. Genome-wide comparative analysis of MYB genes and
Nikolaichik, Yevgeny; Damienikan, Aliaksandr U
The majority of bacterial genome annotations are currently automated and based on a 'gene by gene' approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn't fit with regulatory information allowed us to correct product and gene names for over 300 loci.
Settles Matthew L
Background: Natural antisense transcripts (NATs) are transcripts of the opposite DNA strand to the sense strand, either at the same locus (cis-encoded) or at a different locus (trans-encoded). They can affect gene expression at multiple stages including transcription, RNA processing and transport, and translation. NATs give rise to sense-antisense transcript pairs, and the number of these identified has escalated greatly with the availability of DNA sequencing resources and public databases. Traditionally, NATs were identified by the alignment of full-length cDNAs or expressed sequence tags to genome sequences, but an alternative method for large-scale detection of sense-antisense transcript pairs involves the use of microarrays. In this study we developed a novel protocol to assay sense- and antisense-strand transcription on the 55 K Affymetrix GeneChip Wheat Genome Array, which is a 3' in vitro transcription (3'IVT) expression array. We selected five different tissue types for assay to enable maximum discovery, and used the 'Chinese Spring' wheat genotype because most of the wheat GeneChip probe sequences were based on its genomic sequence. This study is the first report of using a 3'IVT expression array to discover the expression of natural sense-antisense transcript pairs, and may be considered as proof-of-concept. Results: By using alternative target preparation schemes, both the sense- and antisense-strand derived transcripts were labeled and hybridized to the Wheat GeneChip. Quality assurance verified that successful hybridization did occur in the antisense-strand assay. A stringent threshold for positive hybridization was applied, which resulted in the identification of 110 sense-antisense transcript pairs, as well as 80 potentially antisense-specific transcripts. Strand-specific RT-PCR validated the microarray observations, and showed that antisense transcription is likely to be tissue specific. For the annotated sense
Sharov, Alexei A.; Dudekula, Dawood B.; Ko, Minoru S.H.
To build a mouse gene index with the most comprehensive coverage of alternative transcription/splicing (ATS), we developed an algorithm and a fully automated computational pipeline for transcript assembly from expressed sequences aligned to the genome. We identified 191,946 genomic loci, which included 27,497 protein-coding genes and 11,906 additional gene candidates (e.g., nonprotein-coding, but multiexon). Comparison of the resulting gene index with TIGR, UniGene, DoTS, and ESTGenes databases revealed that it had a greater number of transcripts, a greater average number of exons and introns with proper splicing sites per gene, and longer ORFs. The 27,497 protein-coding genes had 77,138 transcripts, i.e., 2.8 transcripts per gene on average. Close examination of transcripts led to a combinatorial table of 23 types of ATS units, only nine of which were previously described, i.e., 14 types of alternative splicing, seven types of alternative starts, and two types of alternative termination. Of the 20,323 multiexon protein-coding genes with proper splice sites, 47%, 18%, and 14% had alternative splicings, alternative starts, and alternative terminations, respectively. The gene index with the comprehensive ATS will provide a useful platform for analyzing the nature and mechanism of ATS, as well as for designing accurate exon-based DNA microarrays. PMID:15867436
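The summary figures in the record above can be cross-checked with a few lines of arithmetic. This is only a sanity check on the numbers quoted in the abstract; the variable names are ours, and the per-category gene counts are approximations because the abstract reports rounded percentages.

```python
# Figures quoted in the abstract.
protein_coding_genes = 27_497
transcripts = 77_138
multiexon_genes = 20_323

# Average number of transcripts per protein-coding gene (abstract: 2.8).
mean_transcripts_per_gene = transcripts / protein_coding_genes

# The 23 ATS unit types break down into three classes.
ats_unit_types = {
    "alternative splicing": 14,
    "alternative starts": 7,
    "alternative termination": 2,
}
total_ats_types = sum(ats_unit_types.values())

# Approximate gene counts implied by the rounded percentages.
ats_fractions = {
    "alternative splicing": 0.47,
    "alternative starts": 0.18,
    "alternative termination": 0.14,
}
approx_gene_counts = {
    kind: round(multiexon_genes * fraction)
    for kind, fraction in ats_fractions.items()
}

print(f"{mean_transcripts_per_gene:.2f} transcripts per gene")
print(total_ats_types, "ATS unit types")
print(approx_gene_counts)
```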
McCabe, Colleen D; Spyropoulos, Demetri D; Martin, David; Moreno, Carlos S
Homeobox transcription factors are developmentally regulated genes that play crucial roles in tissue patterning. Homeobox C6 (HOXC6) is overexpressed in prostate cancers and correlated with cancer progression, but the downstream targets of HOXC6 are largely unknown. We have performed genome-wide localization analysis to identify promoters bound by HOXC6 in prostate cancer cells. This analysis identified 468 reproducibly bound promoters whose associated genes are involved in functions such as cell proliferation and apoptosis. We have complemented these data with expression profiling of prostates from mice with homozygous disruption of the Hoxc6 gene to identify 31 direct regulatory target genes of HOXC6. We show that HOXC6 directly regulates expression of bone morphogenic protein 7, fibroblast growth factor receptor 2, insulin-like growth factor binding protein 3, and platelet-derived growth factor receptor alpha (PDGFRA) in prostate cells and indirectly influences the Notch and Wnt signaling pathways in vivo. We further show that inhibition of PDGFRA reduces proliferation of prostate cancer cells, and that overexpression of HOXC6 can overcome the effects of PDGFRA inhibition. HOXC6 regulates genes with both oncogenic and tumor suppressor activities as well as several genes such as CD44 that are important for prostate branching morphogenesis and metastasis to the bone microenvironment.
Burt, D W; Dey, B R; Paton, I R; Morrice, D R; Law, A S
In this paper, we report the isolation, characterization, and mapping of the chicken transforming growth factor-beta 3 (TGF-beta 3) gene. The gene contains seven exons and six introns spanning 16-kb of the chicken genome. A comparison of the 5'-flanking regions of human and chicken TGF-beta 3 genes reveals two regions of sequence conservation. The first contains ATF/CRE and TBP/TATA sequence motifs within an 87-bp region. The second is a 162-bp region with no known sequence motifs. Identification of transcription start sites using chicken RNA isolated from various embryonic and adult tissues reveals two sites of initiation, P1 and P2, which map to these two conserved regions. Comparison of 3'-flanking regions of chicken and mammalian TGF-beta 3 genes also revealed conserved sequences. The most significant homologies were found in the 3'-most end of the transcribed region. DNA sequence analysis of chicken TGF-beta 3 cDNAs isolated by 3'-RACE revealed multiple polyadenylation sites unusually distant from a poly(A) signal motif. A Msc I restriction fragment length polymorphism (RFLP) marker was used to map the TGFB3 locus to linkage group E7 on the East Lansing reference backcross. Linkage to the TH locus showed that the TGFB3 locus was physically located on chicken chromosome 5.
Wilson, Nicola K; Schoenfelder, Stefan; Hannah, Rebecca; Sánchez Castillo, Manuel; Schütte, Judith; Ladopoulos, Vasileios; Mitchelmore, Joanna; Goode, Debbie K; Calero-Nieto, Fernando J; Moignard, Victoria; Wilkinson, Adam C; Jimenez-Madrid, Isabel; Kinston, Sarah; Spivakov, Mikhail; Fraser, Peter; Göttgens, Berthold
Comprehensive study of transcriptional control processes will be required to enhance our understanding of both normal and malignant hematopoiesis. Modern sequencing technologies have revolutionized our ability to generate genome-scale expression and histone modification profiles, transcription factor (TF)-binding maps, and also comprehensive chromatin-looping information. Many of these technologies, however, require large numbers of cells, and therefore cannot be applied to rare hematopoietic stem/progenitor cell (HSPC) populations. The stem cell factor-dependent multipotent progenitor cell line HPC-7 represents a well-recognized cell line model for HSPCs. Here we report genome-wide maps for 17 TFs, 3 histone modifications, DNase I hypersensitive sites, and high-resolution promoter-enhancer interactomes in HPC-7 cells. Integrated analysis of these complementary data sets revealed TF occupancy patterns of genomic regions involved in promoter-anchored loops. Moreover, preferential associations between pairs of TFs bound at either ends of chromatin loops led to the identification of 4 previously unrecognized protein-protein interactions between key blood stem cell regulators. All HPC-7 data sets are freely available both through standard repositories and a user-friendly Web interface. Together with previously generated genome-wide data sets, this study integrates HPC-7 data into a genomic resource on par with ENCODE tier 1 cell lines and, importantly, is the only current model with comprehensive genome-scale data that is relevant to HSPC biology. © 2016 by The American Society of Hematology.
Alexander M van der Linden
Most organisms have an endogenous circadian clock that is synchronized to environmental signals such as light and temperature. Although circadian rhythms have been described in the nematode Caenorhabditis elegans at the behavioral level, these rhythms appear to be relatively non-robust. Moreover, in contrast to other animal models, no circadian transcriptional rhythms have been identified. Thus, whether this organism contains a bona fide circadian clock remains an open question. Here we use genome-wide expression profiling experiments to identify light- and temperature-entrained oscillating transcripts in C. elegans. These transcripts exhibit rhythmic expression with temperature-compensated 24-h periods. In addition, their expression is sustained under constant conditions, suggesting that they are under circadian regulation. Light and temperature cycles strongly drive gene expression and appear to entrain largely nonoverlapping gene sets. We show that mutations in a cyclic nucleotide-gated channel required for sensory transduction abolish both light- and temperature-entrained gene expression, implying that environmental cues act cell nonautonomously to entrain circadian rhythms. Together, these findings demonstrate circadian-regulated transcriptional rhythms in C. elegans and suggest that further analyses in this organism will provide new information about the evolution and function of this biological clock.
Wang, Zhipeng; Zhang, Qin
The ETS proteins are a family of transcription factors (TFs) that regulate a variety of biological processes. We performed genome-wide analyses to explore the classification of the ETS gene family. We identified 207 ETS genes, which encode 321 ETS TFs, from ten animal species. Of the 321 ETS TFs, 155 contain only an ETS domain, about 50% contain an ETS_PEA3_N or a SAM_PNT domain in addition to an ETS domain, the rest (only four) contain a second ETS domain or a second ETS_PEA3_N domain or an another ...
Hu, Ping; Brodie, Eoin L.; Suzuki, Yohey; McAdams, Harley H.; Andersen, Gary L.
The bacterium Caulobacter crescentus and related stalk bacterial species are known for their distinctive ability to live in low nutrient environments, a characteristic of most heavy metal contaminated sites. Caulobacter crescentus is a model organism for studying cell cycle regulation with well developed genetics. We have identified the pathways responding to heavy metal toxicity in C. crescentus to provide insights for possible application of Caulobacter to environmental restoration. We exposed C. crescentus cells to four heavy metals (chromium, cadmium, selenium and uranium) and analyzed genome wide transcriptional activities post exposure using an Affymetrix GeneChip microarray. C. crescentus showed surprisingly high tolerance to uranium, a possible mechanism for which may be formation of extracellular calcium-uranium-phosphate precipitates. The principal response to these metals was protection against oxidative stress (up-regulation of manganese-dependent superoxide dismutase, sodA). Glutathione S-transferase, thioredoxin, glutaredoxins and DNA repair enzymes responded most strongly to cadmium and chromate. The cadmium and chromium stress response also focused on reducing the intracellular metal concentration, with multiple efflux pumps employed to remove cadmium while a sulfate transporter was down-regulated to reduce non-specific uptake of chromium. Membrane proteins were also up-regulated in response to most of the metals tested. A two-component signal transduction system involved in the uranium response was identified. Several differentially regulated transcripts from regions previously not known to encode proteins were identified, demonstrating the advantage of evaluating the transcriptome using whole genome microarrays.
Pérez-Lluch, Sílvia; Blanco, Enrique; Carbonell, Albert; Raha, Debasish; Snyder, Michael; Serras, Florenci; Corominas, Montserrat
An important mechanism for gene regulation involves chromatin changes via histone modification. One such modification is histone H3 lysine 4 trimethylation (H3K4me3), which requires histone methyltransferase complexes (HMT) containing the trithorax-group (trxG) protein ASH2. Mutations in ash2 cause a variety of pattern formation defects in the Drosophila wing. We have identified genome-wide binding of ASH2 in wing imaginal discs using chromatin immunoprecipitation combined with sequencing (ChIP-Seq). Our results show that genes with functions in development and transcriptional regulation are activated by ASH2 via H3K4 trimethylation in nearby nucleosomes. We have characterized the occupancy of phosphorylated forms of RNA Polymerase II and histone marks associated with activation and repression of transcription. ASH2 occupancy correlates with phosphorylated forms of RNA Polymerase II and histone activating marks in expressed genes. Additionally, RNA Polymerase II phosphorylation on serine 5 and H3K4me3 are reduced in ash2 mutants in comparison to wild-type flies. Finally, we have identified specific motifs associated with ASH2 binding in genes that are differentially expressed in ash2 mutants. Our data suggest that recruitment of the ASH2-containing HMT complexes is context specific and point to a function of ASH2 and H3K4me3 in transcriptional pausing control.
Li, Meng-Yao; Xu, Zhi-Sheng; Tian, Chang; Huang, Ying; Wang, Feng; Xiong, Ai-Sheng
WRKY transcription factors belong to one of the largest transcription factor families. These factors function in plant growth and development, signal transduction, and stress response. Here, we identified 95 DcWRKY genes in carrot based on the carrot genomic and transcriptomic data, and divided them into three groups. Phylogenetic analysis of WRKY proteins from carrot and Arabidopsis divided these proteins into seven subgroups. To elucidate the evolution and distribution of WRKY transcription factors in different species, we constructed a schematic of the phylogenetic tree and compared the WRKY family factors among 22 species, including plants, slime mold and protozoan. An in-depth study was performed to clarify the homologous factor groups of nine divergent taxa in lower and higher plants. Based on the orthologous factors between carrot and Arabidopsis, 38 DcWRKY proteins were calculated to interact with other proteins in the carrot genome. Yeast two-hybrid assay showed that DcWRKY20 can interact with DcMAPK1 and DcMAPK4. The expression patterns of the selected DcWRKY genes based on transcriptome data and qRT-PCR suggested that those selected DcWRKY genes are involved in root development and in biotic and abiotic stress response. This comprehensive analysis provides a basis for investigating the evolution and function of WRKY genes.
Feldmann, Radmila; Fischer, Cornelius; Kodelja, Vitam; Behrens, Sarah; Haas, Stefan; Vingron, Martin; Timmermann, Bernd; Geikowski, Anne; Sauer, Sascha
Increased physiological levels of oxysterols are major risk factors for developing atherosclerosis and cardiovascular disease. Lipid-loaded macrophages, termed foam cells, are important during the early development of atherosclerotic plaques. To pursue the hypothesis that ligand-based modulation of the nuclear receptor LXRα is crucial for cell homeostasis during atherosclerotic processes, we analysed genome-wide the action of LXRα in foam cells and macrophages. By integrating chromatin immunoprecipitation-sequencing (ChIP-seq) and gene expression profile analyses, we generated a highly stringent set of 186 LXRα target genes. Treatment with the nanomolar-binding ligand T0901317 and subsequent auto-regulatory LXRα activation resulted in sequence-dependent sharpening of the genome-binding patterns of LXRα. LXRα-binding loci that correlated with differential gene expression revealed 32 novel target genes with potential beneficial effects, which in part explained the implications of disease-associated genetic variation data. These observations identified highly integrated LXRα ligand-dependent transcriptional networks, including the APOE/C1/C4/C2-gene cluster, which contribute to the reversal of cholesterol efflux and the dampening of inflammation processes in foam cells to prevent atherogenesis.
Liu, Chaoyang; Wang, Xia; Xu, Yuantao; Deng, Xiuxin; Xu, Qiang
MYB transcription factors represent one of the largest gene families in plant genomes. Sweet orange (Citrus sinensis) is one of the most important fruit crops worldwide, and its genome has recently been sequenced. This provides an opportunity to investigate the organization and evolutionary characteristics of sweet orange MYB genes from a whole-genome view. In the present study, we identified 100 R2R3-MYB genes in the sweet orange genome. A comprehensive analysis of this gene family was performed, including the phylogeny, gene structure, chromosomal localization and expression pattern analyses. The 100 genes were divided into 29 subfamilies based on sequence similarity and phylogeny, and the classification was also well supported by the highly conserved exon/intron structures and motif composition. The phylogenomic comparison of the MYB gene family among sweet orange and the related plant species Arabidopsis, cacao and papaya suggested the existence of functional divergence during evolution. Expression profiling indicated that sweet orange R2R3-MYB genes exhibited distinct temporal and spatial expression patterns. Our analysis suggested that the sweet orange MYB genes may play important roles in different plant biological processes, some of which may be potentially involved in citrus fruit quality. These results will be useful for future functional analysis of the MYB gene family in sweet orange.
DNA methylation plays a central role in regulating many aspects of growth and development in mammals through regulating gene expression. The development of next generation sequencing technologies has paved the way for genome-wide, high resolution analysis of DNA methylation landscapes using methodology known as reduced representation bisulfite sequencing (RRBS). While RRBS has proven to be effective in understanding DNA methylation landscapes in humans, mice, and rats, to date, few studies have utilised this powerful method for investigating DNA methylation in agricultural animals. Here we describe the utilisation of RRBS to investigate DNA methylation in sheep Longissimus dorsi muscles. RRBS analysis of ∼1% of the genome from Longissimus dorsi muscles provided data of suitably high precision and accuracy for DNA methylation analysis, at all levels of resolution from genome-wide to individual nucleotides. Combining RRBS data with mRNAseq data allowed the sheep Longissimus dorsi muscle methylome to be compared with methylomes from other species. While some species differences were identified, many similarities were observed between DNA methylation patterns in sheep and other more commonly studied species. The RRBS data presented here highlight the complexity of epigenetic regulation of genes. However, the similarities observed across species are promising, in that knowledge gained from epigenetic studies in human and mice may be applied, with caution, to agricultural species. The ability to accurately measure DNA methylation in agricultural animals will contribute an additional layer of information to the genetic analyses currently being used to maximise production gains in these species.
Martí-Arbona, Ricardo; Mu, Fangping; Nowak-Lovato, Kristy L; Wren, Melinda S; Unkefer, Clifford J; Unkefer, Pat J
The clustering of genes in a pathway and the co-location of functionally related genes are widely recognized in prokaryotes. We used these characteristics to predict the metabolic involvement of a Transcriptional Regulator (TR) of unknown function, and then identified and confirmed its biological activity. A software tool that identifies the genes encoded within a defined genomic neighborhood for the subject TR and its homologs was developed. The output lists the genes in the genetic neighborhoods, their annotated functions, and the reactants/products, and identifies the metabolic pathway in which the encoded proteins function. When a set of TRs of known function was analyzed, we observed that their homologs frequently had conserved genomic neighborhoods that co-located the metabolically related genes regulated by the subject TR. We postulate that TR effectors are metabolites in the identified pathways; indeed the known effectors were present. We analyzed Bxe_B3018 from Burkholderia xenovorans, a TR of unknown function, and predicted that this TR was related to glycine, threonine and serine degradation. We tested the binding of metabolites in these pathways and, for those that bound, their ability to modulate TR binding to its specific DNA operator sequence. Using rtPCR, we confirmed that methylglyoxal was an effector of Bxe_B3018. These studies provide the proof of concept and validation of a systematic approach to the discovery of the biological activity for proteins of unknown function, in this case a TR. Bxe_B3018 is a methylglyoxal-responsive TR that controls the expression of an operon composed of a putative efflux system.
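The core step of the tool described above, collecting the genes that fall within a defined genomic neighborhood of a regulator, can be sketched as follows. This is not the authors' software: the data structure, function name, window size, locus tags, and coordinates are all hypothetical and chosen only to illustrate the idea on a single replicon.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    locus_tag: str
    start: int       # 1-based start coordinate on the replicon
    end: int         # inclusive end coordinate
    annotation: str  # annotated function


def neighborhood(genes, target_tag, window_bp):
    """Return genes overlapping a window of +/- window_bp around the target gene."""
    target = next(g for g in genes if g.locus_tag == target_tag)
    lo, hi = target.start - window_bp, target.end + window_bp
    return [
        g for g in genes
        if g.locus_tag != target_tag and g.end >= lo and g.start <= hi
    ]


# Toy annotation table (invented loci, not real Burkholderia data).
genes = [
    Gene("geneA", 100, 900, "aldehyde dehydrogenase"),
    Gene("geneB", 1000, 1800, "efflux transporter"),
    Gene("TR1", 2000, 2900, "transcriptional regulator"),
    Gene("geneC", 3000, 3900, "glyoxalase"),
    Gene("geneD", 9000, 9900, "ribosomal protein"),
]

neighbors = neighborhood(genes, "TR1", window_bp=1500)
for g in neighbors:
    print(g.locus_tag, g.annotation)
```

Running the same query for each homolog of the regulator and comparing the annotated functions of the returned neighbors would approximate the conserved-neighborhood comparison the abstract describes.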
Nefedova, L N; Kuz'min, I V; Burmistrova, D A; Rezazadekh, S; Kim, A I
In the present work, we studied the Grp gene (CG4680, Gag related protein) expression at the transcriptional level. It was found that at the embryonic and larval stages of D. melanogaster development the Grp expression proceeds at a low level, but it significantly increases at the adult stage. Adult individuals display a tissue-specific expression: an elevated level of transcription is observed in the gut tissues, but not in the chitin carcass, head, and gonads. Since the gut may potentially be a primary barrier for the penetration of a viral infection, we conducted a comparative analysis of Grp gene transcription in D. melanogaster strains differing in the presence of active copies of the gypsy errantivirus and in the status of the flamenco gene controlling sensitivity to errantiviral infections. No noticeable differences in the level of Grp gene transcription were revealed. Thus, the Grp gene is not a pseudogene, but it is a functional gene of the D. melanogaster genome whose role remains to be elucidated.
Background: Differential expression of genes can be regulated on many different levels. Most global studies of gene regulation concentrate on transcript level regulation, and very few global analyses of differential translational efficiencies exist. The studies have revealed that in Saccharomyces cerevisiae, Arabidopsis thaliana, and human cell lines translational regulation plays a significant role. Additional species have not been investigated yet. Particularly, until now no global study of translational control with any prokaryotic species was available. Results: A global analysis of translational control was performed with two haloarchaeal model species, Halobacterium salinarum and Haloferax volcanii. To identify differentially regulated genes, exponentially growing and stationary phase cells were compared. More than 20% of H. salinarum transcripts are translated with non-average efficiencies. By far the largest group is comprised of genes that are translated with above-average efficiency specifically in exponential phase, including genes for many ribosomal proteins, RNA polymerase subunits, enzymes, and chemotaxis proteins. Translation of 1% of all genes is specifically repressed in either of the two growth phases. For comparison, DNA microarrays were also used to identify differential transcriptional regulation in H. salinarum, and 17% of all genes were found to have non-average transcript levels in exponential versus stationary phase. In H. volcanii, 12% of all genes are translated with non-average efficiencies. The overlap with H. salinarum is negligible. In contrast to H. salinarum, 4.6% of genes have non-average translational efficiency in both growth phases, and thus they might be regulated by other stimuli than growth phase. Conclusion: For the first time in any prokaryotic species it was shown that a significant fraction of genes is under differential translational control. Groups of genes with different regulatory patterns
JI Qian; ZHANG Liang-sheng; WANG Yi-fei; WANG Jian
The basic leucine zipper (bZIP) transcription factors form a large gene family that is important in pathogen defense, light and stress signaling, etc. The completed whole genome sequences of the model plants Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa) and poplar (Populus trichocarpa) constitute a valuable resource for genome-wide analysis and comparative genomic analysis, as they are representatives of the two major evolutionary lineages within the angiosperms: the monocotyledons and the dicotyledons. In this study, bioinformatics analysis identified 74, 89 and 88 bZIP genes in Arabidopsis, rice and poplar, respectively. Moreover, a comprehensive overview of this gene family is presented, including the gene structure, phylogeny, chromosome distribution and conserved motifs. As a result, the plant bZIPs were organized into 10 subfamilies on the basis of phylogenetic relationship. Gene duplication events during the evolutionary history of the family were also investigated, and it was further concluded that chromosomal/segmental duplication might have played a key role in the expansion of the bZIP gene family.
Mercier, Eloi; Droit, Arnaud; Li, Leping; Robertson, Gordon; Zhang, Xuekui; Gottardo, Raphael
ChIP-Seq has become the standard method for genome-wide profiling DNA association of transcription factors. To simplify analyzing and interpreting ChIP-Seq data, which typically involves using multiple applications, we describe an integrated, open source, R-based analysis pipeline. The pipeline addresses data input, peak detection, sequence and motif analysis, visualization, and data export, and can readily be extended via other R and Bioconductor packages. Using a standard multicore computer, it can be used with datasets consisting of tens of thousands of enriched regions. We demonstrate its effectiveness on published human ChIP-Seq datasets for FOXA1, ER, CTCF and STAT1, where it detected co-occurring motifs that were consistent with the literature but not detected by other methods. Our pipeline provides the first complete set of Bioconductor tools for sequence and motif analysis of ChIP-Seq and ChIP-chip data.
Ecker, Simone; Chen, Lu; Pancaldi, Vera
Background: A healthy immune system requires immune cells that adapt rapidly to environmental challenges. This phenotypic plasticity can be mediated by transcriptional and epigenetic variability. Results: We apply a novel analytical approach to measure and compare transcriptional and epigenetic v...
Dong, Chen; Hu, Huigang; Xie, Jianghui
DNA-binding with one finger (Dof) domain proteins are a multigene family of plant-specific transcription factors involved in numerous aspects of plant growth and development. In this study, we report a genome-wide search for Musa acuminata Dof (MaDof) genes and their expression profiles at different developmental stages and in response to various abiotic stresses. In addition, a complete overview of the Dof gene family in bananas is presented, including the gene structures, chromosomal locations, cis-regulatory elements, conserved protein domains, and phylogenetic inferences. Based on the genome-wide analysis, we identified 74 full-length protein-coding MaDof genes unevenly distributed on 11 chromosomes. Phylogenetic analysis with Dof members from diverse plant species showed that MaDof genes can be classified into four subgroups (StDof I, II, III, and IV). The detailed genomic information of the MaDof gene homologs in the present study provides opportunities for functional analyses to unravel the exact role of the genes in plant growth and development.
Patrícia Aline Gröhs Ferrareze
Cryptococcus gattii is a human and animal pathogen that infects healthy hosts and caused the Pacific Northwest outbreak of cryptococcosis. The inhalation of infectious propagules can lead to internalization of cryptococcal cells by alveolar macrophages, a niche in which C. gattii cells can survive and proliferate. Although the nutrient composition of macrophages is relatively unknown, the high induction of amino acid transporter genes inside the phagosome indicates a preference for amino acid uptake instead of synthesis. However, the presence of countable errors in the R265 genome annotation indicates significant inhibition of transcriptomic analysis in this hypervirulent strain. Thus, we analyzed RNA-Seq data from in vivo and in vitro cultures of C. gattii R265 to perform the reannotation of the genome. In addition, based on in vivo transcriptomic data, we identified highly expressed genes and pathways of amino acid metabolism that would enable C. gattii to survive and proliferate in vivo. Importantly, we identified high expression in three APC amino acid transporters as well as the GABA permease. The use of amino acids as carbon and nitrogen sources, releasing ammonium and generating carbohydrate metabolism intermediaries, also explains the high expression of components of several degradative pathways, since glucose starvation is an important host defense mechanism.
Teixeira, Miguel Cacho; Monteiro, Pedro Tiago; Guerreiro, Joana Fernandes; Gonçalves, Joana Pinho; Mira, Nuno Pereira; dos Santos, Sandra Costa; Cabrito, Tânia Rodrigues; Palma, Margarida; Costa, Catarina; Francisco, Alexandre Paulo; Madeira, Sara Cordeiro; Oliveira, Arlindo Limede; Freitas, Ana Teresa; Sá-Correia, Isabel
The YEASTRACT (http://www.yeastract.com) information system is a tool for the analysis and prediction of transcription regulatory associations in Saccharomyces cerevisiae. Last updated in June 2013, this database contains over 200,000 regulatory associations between transcription factors (TFs) and target genes, including 326 DNA binding sites for 113 TFs. All regulatory associations stored in YEASTRACT were revisited and new information was added on the experimental conditions in which those associations take place and on whether the TF is acting on its target genes as activator or repressor. Based on this information, new queries were developed allowing the selection of specific environmental conditions, experimental evidence or positive/negative regulatory effect. This release further offers tools to rank the TFs controlling a gene or genome-wide response by their relative importance, based on (i) the percentage of target genes in the data set; (ii) the enrichment of the TF regulon in the data set when compared with the genome; or (iii) the score computed using the TFRank system, which selects and prioritizes the relevant TFs by walking through the yeast regulatory network. We expect that with the new data and services made available, the system will continue to be instrumental for yeast biologists and systems biology researchers.
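The first two ranking criteria described above, (i) the percentage of data-set genes targeted by a TF and (ii) enrichment of the TF regulon relative to the genome, can be illustrated with a minimal sketch. It assumes that enrichment is scored with a hypergeometric tail probability, which is a common choice but is not confirmed here to be the statistic YEASTRACT uses. The function names, TF names, gene identifiers and genome size are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K are targets of the TF."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def rank_tfs(regulons, dataset, genome_size):
    """Rank TFs by enrichment p-value, breaking ties by percentage of targets.

    regulons    -- dict mapping a TF name to the set of its documented targets
    dataset     -- set of genes in the user's data set
    genome_size -- total number of genes considered (assumed background)
    """
    rows = []
    for tf, targets in regulons.items():
        hits = len(targets & dataset)
        pct = 100.0 * hits / len(dataset)  # criterion (i)
        p = hypergeom_sf(hits, genome_size, len(targets), len(dataset))  # criterion (ii)
        rows.append((tf, pct, p))
    return sorted(rows, key=lambda r: (r[2], -r[1]))


# Toy, made-up input: two hypothetical TFs and a four-gene data set.
regulons = {
    "TF_A": {"g1", "g2", "g3"},
    "TF_B": {"g1", "g9", "g10", "g11"},
}
dataset = {"g1", "g2", "g3", "g4"}
ranked = rank_tfs(regulons, dataset, genome_size=20)
for tf, pct, p in ranked:
    print(f"{tf}: {pct:.0f}% of data set, p = {p:.4f}")
```

The third criterion, the TFRank score based on walking through the regulatory network, is not reproduced in this sketch.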
Vangala, Rajani Kanth; Ravindran, Vandana; Ghatge, Madan; Shanker, Jayashree; Arvind, Prathima; Bindu, Hima; Shekar, Meghala; Rao, Veena S
Patients with cardiovascular disease show a panel of differentially regulated serum biomarkers indicative of modulation of several pathways from disease onset to progression. Few of these biomarkers have been proposed for multimarker risk prediction methods. However, the underlying mechanism of the expression changes and modulation of the pathways is not yet addressed in entirety. Our present work focuses on understanding the regulatory mechanisms at transcriptional level by identifying the core and specific transcription factors that regulate the coronary artery disease associated pathways. Using the principles of systems biology we integrated the genomics and proteomics data with computational tools. We selected biomarkers from 7 different pathways based on their association with the disease and assayed 24 biomarkers along with gene expression studies and built network modules which are highly regulated by 5 core regulators PPARG, EGR1, ETV1, KLF7 and ESRRA. These network modules in turn comprise of biomarkers from different pathways showing that the core regulatory transcription factors may work together in differential regulation of several pathways potentially leading to the disease. This kind of analysis can enhance the elucidation of mechanisms in the disease and give better strategies of developing multimarker module based risk predictions.
Charfeddine, Mariam; Saïdi, Mohamed Najib; Charfeddine, Safa; Hammami, Asma; Gargouri Bouzid, Radhia
The ERF transcription factors belong to the AP2/ERF superfamily, one of the largest transcription factor families in plants. They play important roles in plant development processes, as well as in the response to biotic, abiotic, and hormone signaling. In the present study, 155 putative ERF transcription factor genes were identified from the potato (Solanum tuberosum) genome database, and compared with those from Arabidopsis thaliana. The StERF proteins are divided into ten phylogenetic groups. Expression analyses of five StERFs were carried out by semi-quantitative RT-PCR and compared with published RNA-seq data. These latter analyses were used to distinguish tissue-specific, biotic, and abiotic stress genes as well as hormone-responsive StERF genes. The results are of interest to better understand the role of the AP2/ERF genes in response to diverse types of stress in potatoes. A comprehensive analysis of the physiological functions and biological roles of the ERF family genes in S. tuberosum is required to understand crop stress tolerance mechanisms.
The Dof domain protein family is a classic plant-specific zinc-finger transcription factor family involved in a variety of biological processes. There is great diversity in the number of Dof genes in different plants. However, there are only very limited reports on the characterization of Dof transcription factors in soybean (Glycine max). In the present study, 78 putative Dof genes were identified from the whole-genome sequence of soybean. The predicted GmDof genes were non-randomly distributed within and across 19 out of 20 chromosomes and 97.4% (38 pairs) were preferentially retained duplicate paralogous genes located in duplicated regions of the genome. Soybean-specific segmental duplications contributed significantly to the expansion of the soybean Dof gene family. These Dof proteins were phylogenetically clustered into nine distinct subgroups among which the gene structure and motif compositions were considerably conserved. Comparative phylogenetic analysis of these Dof proteins revealed four major groups, similar to those reported for Arabidopsis and rice. Most of the GmDofs showed specific expression patterns based on RNA-seq data analyses. The expression patterns of some duplicate genes were partially redundant while others showed functional diversity, suggesting the occurrence of sub-functionalization during subsequent evolution. Comprehensive expression profile analysis also provided insights into the soybean-specific functional divergence among members of the Dof gene family. Cis-regulatory element analysis of these GmDof genes suggested diverse functions associated with different processes. Taken together, our results provide useful information for the functional characterization of soybean Dof genes by combining phylogenetic analysis with global gene-expression profiling.
The high level of physiological fruitlet abscission in litchi (Litchi chinensis Sonn.) causes severe yield loss. Cell separation occurs at the fruit abscission zone (FAZ) and can be triggered by ethylene. However, detailed knowledge of the molecular events occurring in the FAZ is still lacking. Here, genome-wide digital transcript abundance (DTA) analysis of putative fruit abscission related genes regulated by ethephon in litchi was performed. More than 81 million high quality reads from seven ethephon treated and untreated control libraries were obtained by high-throughput sequencing. Through DTA profile analysis in combination with Gene Ontology and KEGG pathway enrichment analyses, a total of 2,730 statistically significant candidate genes were found to be involved in the ethephon-promoted litchi fruitlet abscission. Of these, there were 1,867 early-responsive genes whose expression was up- or down-regulated from 0 to 1 d after treatment. The most affected genes included those related to ethylene biosynthesis and signaling, auxin transport and signaling, transcription factors, protein ubiquitination, ROS response, calcium signal transduction and cell wall modification. These genes could be clustered into 4 groups and 13 subgroups according to their similar expression patterns. qRT-PCR confirmed the expression patterns of 41 selected candidate genes, supporting the accuracy of our DTA data. Ethephon treatment significantly increased fruit abscission and ethylene production of fruitlets. Possible molecular events controlling the ethephon-promoted litchi fruitlet abscission are proposed. The increased ethylene evolution in fruitlets would suppress the synthesis and polar transport of auxin and trigger abscission signaling. To the best of our knowledge, this is the first study to monitor, at the genome-wide level, the gene expression profile in the FAZ-enriched pedicel during litchi fruit abscission induced by ethephon. This study will contribute to
Yongchun Song, Chengxue Dang, Yebo Fu, Yi Lian, Jenny Hottel, Xuelan Li, Tim McCaffrey, Sidney W. Fu
Homeobox genes are known to be critically important in tumor development and progression. The BP1 (Beta Protein 1) gene, an isoform of DLX4, belongs to the Distal-less (DLX) subfamily of homeobox genes and encodes a homeodomain-containing transcription factor. Our studies have shown that the BP1 gene was overexpressed in 81% of primary breast cancer and its expression was closely correlated with the progression of breast cancer. However, the exact role of BP1 in breast has yet to be elucidated. Therefore, it is important to explore the potential transcriptional targets of BP1 via whole genome-scale screening. In this study, we used the chromatin immunoprecipitation on chip (ChIP-on-chip) and gene expression microarray assays to identify candidate target genes and gene networks, which are directly regulated by BP1 in ER negative (ER-) breast cancer cells. After rigorous bioinformatic and statistical analysis for both ChIP-on-chip and expression microarray gene lists, 18 overlapping genes were noted and verified. Those potential target genes are involved in a variety of tumorigenic pathways, which sheds light on the functional mechanisms of BP1 in breast cancer development and progression.
Huang, X Y; Tao, P; Li, B Y; Wang, W H; Yue, Z C; Lei, J L; Zhong, X M
Chinese cabbage (Brassica rapa ssp. pekinensis) is one of the most important vegetable crops grown worldwide, and various methods exist for selection, propagation, and cultivation. The entire Chinese cabbage genome has been sequenced, and the heat shock transcription factor family (Hsfs) has been found to play a central role in plant growth and development and in the response to biotic and abiotic stress conditions, particularly in acquired thermotolerance. We analyzed heat tolerance mechanisms in Chinese cabbage. In this study, 30 Hsfs were identified from the Chinese cabbage genome database. The classification, phylogenetic reconstruction, chromosome distribution, conserved motifs, expression analysis, and interaction networks of the Hsfs were predicted and analyzed. The 30 BrHsfs were classified into 3 major classes (class A, B, and C) according to their structural characteristics and phylogenetic comparisons, and class A was further subdivided into 8 subclasses. Distribution mapping results showed that Hsf genes were located on 10 Chinese cabbage chromosomes. The expression profile indicated that Hsfs play differential roles in 5 organs in Chinese cabbage, and likely participate in the development of underground parts and regulation of reproductive growth. An orthologous gene interaction network was constructed; it included 5 genes (MBF1C, ROF1, TBP2, CDC2, and HSP70) that are closely related to heat stress. Our results contribute to the understanding of the complexity of Hsfs in Chinese cabbage and provide a basis for further functional gene research.
Differential transcription in Ascaris suum was investigated using a genomic-bioinformatic approach. A cDNA archive enriched for molecules in the infective third-stage larva (L3) of A. suum was constructed by suppressive-subtractive hybridization (SSH), and a subset of cDNAs from 3075 clones was subjected to microarray analysis using cDNA probes derived from RNA from different developmental stages of A. suum. The cDNAs (n = 498) shown by microarray analysis to be enriched in the L3 were sequenced and subjected to bioinformatic analyses using a semi-automated pipeline (ESTExplorer). Using gene ontology (GO), 235 of these molecules were assigned to 'biological process' (n = 68), 'cellular component' (n = 50), or 'molecular function' (n = 117). Of the 91 clusters assembled, 56 molecules (61.5%) had homologues/orthologues in the free-living nematodes Caenorhabditis elegans and C. briggsae and/or other organisms, whereas 35 (38.5%) had no significant similarity to any sequences available in current gene databases. Transcripts encoding protein kinases, protein phosphatases (and their precursors), and enolases were abundantly represented in the L3 of A. suum, as were molecules involved in cellular processes, such as ubiquitination and proteasome function, gene transcription, protein-protein interactions, and function. In silico analyses inferred the C. elegans orthologues/homologues (n = 50) to be involved in apoptosis and insulin signaling (2%), ATP synthesis (2%), carbon metabolism (6%), fatty acid biosynthesis (2%), gap junction (2%), glucose metabolism (6%), or porphyrin metabolism (2%), although 34 (68%) of them could not be mapped to a specific metabolic pathway. Small numbers of these 50 molecules were predicted to be secreted (10%), anchored (2%), and/or transmembrane (12%) proteins. Functionally, 17 (34%) of them were predicted to be associated with (non-wild-type) RNAi phenotypes in C. elegans, the majority being embryonic lethality (Emb) (13 types; 58.8%), larval arrest
Background: Anaplasma phagocytophilum (Ap) is an obligate intracellular bacterium and the agent of human granulocytic anaplasmosis, an emerging tick-borne disease. Ap alternately infects ticks and mammals and a variety of cell types within each. Understanding the biology behind such versatile cellular parasitism may be derived through the use of tiling microarrays to establish high resolution, genome-wide transcription profiles of the organism as it infects cell lines representative of its life cycle (tick; ISE6) and pathogenesis (human; HL-60 and HMEC-1). Results: Detailed, host cell specific transcriptional behavior was revealed. There was extensive differential Ap gene transcription between the tick (ISE6) and the human (HL-60 and HMEC-1) cell lines, with far fewer differentially transcribed genes between the human cell lines, and all disproportionately represented by membrane or surface proteins. There were Ap genes exclusively transcribed in each cell line, apparent human- and tick-specific operons and paralogs, and anti-sense transcripts that suggest novel expression regulation processes. Seven virB2 paralogs (of the bacterial type IV secretion system) showed human or tick cell dependent transcription. Previously unrecognized genes and coding sequences were identified, as were the expressed p44/msp2 (major surface proteins) paralogs (of 114 total), through elevated signal produced to the unique hypervariable region of each: 2/114 in HL-60, 3/114 in HMEC-1, and none in ISE6. Conclusion: Using these methods, whole genome transcription profiles can likely be generated for Ap, as well as other obligate intracellular organisms, in any host cells and for all stages of the cell infection process. Visual representation of comprehensive transcription data alongside an annotated map of the genome renders complex transcription into discernible patterns.
Shi, Pibiao; Guy, Kateta Malangisha; Wu, Weifang; Fang, Bingsheng; Yang, Jinghua; Zhang, Mingfang; Hu, Zhongyuan
The plant-specific TCP transcription factor family, which is involved in the regulation of cell growth and proliferation, performs diverse functions in multiple aspects of plant growth and development. However, no comprehensive analysis of the TCP family in watermelon (Citrullus lanatus) has been undertaken previously. A total of 27 watermelon TCP encoding genes distributed on nine chromosomes were identified. Phylogenetic analysis clustered the genes into 11 distinct subgroups. Furthermore, phylogenetic and structural analyses distinguished two homology classes within the ClTCP family, designated Class I and Class II. The Class II genes were differentiated into two subclasses, the CIN subclass and the CYC/TB1 subclass. The expression patterns of all members were determined by semi-quantitative PCR. The functions of two ClTCP genes, ClTCP14a and ClTCP15, in regulating plant height were confirmed by ectopic expression in Arabidopsis wild-type and ortholog mutants. This study represents the first genome-wide analysis of the watermelon TCP gene family, which provides valuable information for understanding the classification and functions of the TCP genes in watermelon.
Beller, H R; Letain, T E; Chakicherla, A; Kane, S R; Legler, T C; Coleman, M A
Thiobacillus denitrificans is one of the few known obligate chemolithoautotrophic bacteria capable of energetically coupling thiosulfate oxidation to denitrification as well as aerobic respiration. As very little is known about the differential expression of genes associated with key chemolithoautotrophic functions (such as sulfur-compound oxidation and CO2 fixation) under aerobic versus denitrifying conditions, we conducted whole-genome, cDNA microarray studies to explore this topic systematically. The microarrays identified 277 genes (approximately ten percent of the genome) as differentially expressed using Robust Multi-array Average statistical analysis and a 2-fold cutoff. Genes upregulated (ca. 6- to 150-fold) under aerobic conditions included a cluster of genes associated with iron acquisition (e.g., siderophore-related genes), a cluster of cytochrome cbb3 oxidase genes, cbbL and cbbS (encoding the large and small subunits of form I ribulose 1,5-bisphosphate carboxylase/oxygenase, or RubisCO), and multiple molecular chaperone genes. Genes upregulated (ca. 4- to 95-fold) under denitrifying conditions included nar, nir, and nor genes (associated respectively with nitrate reductase, nitrite reductase, and nitric oxide reductase, which catalyze successive steps of denitrification), cbbM (encoding form II RubisCO), and genes involved with sulfur-compound oxidation (including two physically separated but highly similar copies of sulfide:quinone oxidoreductase and of dsrC, associated with dissimilatory sulfite reductase). Among genes associated with denitrification, relative expression levels (i.e., degree of upregulation with nitrate) tended to decrease in the order nar > nir > nor > nos. Reverse transcription, quantitative PCR analysis was used to validate these trends.
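The 2-fold cutoff mentioned above can be sketched as a simple filter on log2-scale expression summaries. This is only an illustration of the cutoff step and makes two assumptions: that each condition has already been summarized to one log2 value per gene (RMA itself, with its background correction and normalization, is not implemented here), and that the function name and all numbers are made up for the example.

```python
from math import log2


def differentially_expressed(log2_aerobic, log2_denitrifying, fold=2.0):
    """Return genes whose expression differs by at least `fold` between conditions.

    Both inputs map gene name -> log2 expression summary. The returned value for
    each gene is the log2 ratio (aerobic minus denitrifying), so positive values
    mean higher expression under aerobic conditions.
    """
    threshold = log2(fold)  # a 2-fold change is 1.0 on the log2 scale
    selected = {}
    for gene in log2_aerobic.keys() & log2_denitrifying.keys():
        log_ratio = log2_aerobic[gene] - log2_denitrifying[gene]
        if abs(log_ratio) >= threshold:
            selected[gene] = log_ratio
    return selected


# Made-up log2 values for three hypothetical genes.
aerobic = {"gene_a": 10.0, "gene_b": 6.0, "gene_c": 8.0}
denitrifying = {"gene_a": 7.0, "gene_b": 9.5, "gene_c": 8.5}
result = differentially_expressed(aerobic, denitrifying)
```

Working on the log2 scale makes up- and down-regulation symmetric around zero, which is why a single absolute-value threshold suffices.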
Manjunath, Siddappa; Kumar, Gandham Ravi; Mishra, Bishnu Prasad; Mishra, Bina; Sahoo, Aditya Prasad; Joshi, Chaitanya G; Tiwari, Ashok K; Rajak, Kaushal Kishore; Janga, Sarath Chandra
Peste des petits ruminants (PPR) is an acute transboundary viral disease of economic importance, affecting goats and sheep. Mass vaccination programs around the world have resulted in a decline of PPR outbreaks. Sungri 96 is a live attenuated vaccine, widely used in Northern India against PPR. This vaccine virus, isolated from goat, works efficiently in both sheep and goat. Global gene expression changes under PPR vaccine virus infection are not yet well defined. Therefore, in this study we investigated the host-vaccine virus interactions by infecting the peripheral blood mononuclear cells isolated from goat with PPRV (Sungri 96 vaccine virus), to quantify the global changes in the transcriptomic signature by RNA-sequencing. The viral genome of the Sungri 96 vaccine virus was assembled from the PPRV infected transcriptome, confirming the infection and demonstrating the feasibility of building a complete non-host genome from the blood transcriptome. Comparison of the infected transcriptome with the control transcriptome revealed 985 differentially expressed genes. Functional analysis showed enrichment of immune regulatory pathways under PPRV infection. Key genes involved in immune system regulation, spliceosomal and apoptotic pathways were found to be dysregulated. Network analysis revealed that the protein-protein interaction network among differentially expressed genes is significantly disrupted in the infected state. Several genes encoding TFs that govern immune regulatory pathways were found to co-regulate the differentially expressed genes. These data provide insights into the host-PPRV vaccine virus interactome for the first time. Our findings suggest dysregulation of immune regulatory pathways and genes encoding transcription factors (TFs) that govern these pathways in response to viral infection.
Yanovsky Marcelo J
Background: Plants use different light signals to adjust their growth and development to the prevailing environmental conditions. Studies in the model species Arabidopsis thaliana and rice indicate that these adjustments are mediated by large changes in the transcriptome. Here we compared transcriptional responses to light in different species of the Solanaceae to investigate common as well as species-specific changes in gene expression. Results: cDNA microarrays were used to identify genes regulated by a transition from long days (LD) to short days (SD) in the leaves of potato and tobacco plants, and by phytochrome B (phyB), the photoreceptor that represses tuberization under LD in potato. We also compared transcriptional responses to photoperiod in Nicotiana tabacum Maryland Mammoth (MM), which flowers only under SD, with those of Nicotiana sylvestris, which flowers only under LD conditions. Finally, we identified genes regulated by red compared to far-red light treatments that promote germination in tomato. Conclusion: Most of the genes up-regulated in LD were associated with photosynthesis, the synthesis of protective pigments and the maintenance of redox homeostasis, probably contributing to the acclimatization to seasonal changes in irradiance. Some of the photoperiodically regulated genes were the same in potato and tobacco. Others were different but belonged to similar functional categories, suggesting that conserved as well as convergent evolutionary processes are responsible for physiological adjustments to seasonal changes in the Solanaceae. A β-ZIP transcription factor whose expression correlated with the floral transition in Nicotiana species with contrasting photoperiodic responses was also regulated by photoperiod and phyB in potato, and is a candidate gene to act as a general regulator of photoperiodic responses. Finally, GIGANTEA, a gene that controls flowering time in Arabidopsis thaliana and rice, was regulated by
Kelly, Scott A; Nehrenberg, Derrick L; Hua, Kunjie; Garland, Theodore; Pomp, Daniel
Motivation and ability both underlie voluntary exercise, each with a potentially unique genetic architecture. Muscle structure and function are one of many morphological and physiological systems acting to simultaneously determine exercise ability. We generated a large (n = 815) advanced intercross line of mice (G4) derived from a line selectively bred for increased wheel running (high runner) and the C57BL/6J inbred strain. We previously mapped quantitative trait loci (QTL) contributing to voluntary exercise, body composition, and changes in body composition as a result of exercise. Using brain tissue in a subset of the G4 (n = 244), we have also previously reported expression QTL (eQTL) colocalizing with the QTL for the higher-level phenotypes. Here, we examined the transcriptional landscape of hind limb muscle tissue via global mRNA expression profiles. Correlations revealed an ∼1,168% increase in significant relationships between muscle transcript expression levels and the same exercise and body composition phenotypes examined previously in the brain. The exercise trait most often significantly correlated with gene expression in the brain was running duration while in the muscle it was maximum running speed. This difference may indicate that time spent engaging in exercise behavior may be more influenced by central (neurobiological) mechanisms, while intensity of exercise may be largely controlled by peripheral mechanisms. Additionally, we used subsets of cis-acting eQTL, colocalizing with QTL, to identify candidate genes based on both positional and functional evidence. We discuss three plausible candidate genes (Insig2, Prcp, Sparc) and their potential regulatory role.
Kotova E. S.
The CTCF transcription factor is thought to be one of the main participants in various gene regulatory networks including transcription activation and repression, formation of independently functioning chromatin domains, regulation of imprinting etc. Sequencing of human and other genomes opened up a possibility to ascertain the genomic distribution of CTCF binding sites and to identify CTCF-dependent cis-regulatory elements, including insulators. In the review, we summarized recent data on CTCF functioning within a framework of the chromatin loop domain hypothesis of large-scale regulation of the genome activity. Its fundamental properties allow CTCF to serve as a transcription factor, an insulator protein and a dispersed genome-wide demarcation tool able to recruit various factors that emerge in response to diverse external and internal signals, and thus to exert its signal-specific function(s).
Yen, Judy Y; Garamszegi, Sara; Geisbert, Joan B; Rubins, Kathleen H; Geisbert, Thomas W; Honko, Anna; Xia, Yu; Connor, John H; Hensley, Lisa E
The mechanisms of Ebola (EBOV) pathogenesis are only partially understood, but the dysregulation of normal host immune responses (including destruction of lymphocytes, increases in circulating cytokine levels, and development of coagulation abnormalities) is thought to play a major role. Accumulating evidence suggests that much of the observed pathology is not the direct result of virus-induced structural damage but rather is due to the release of soluble immune mediators from EBOV-infected cells. It is therefore essential to understand how the candidate therapeutic may be interrupting the disease process and/or targeting the infectious agent. To identify genetic signatures that are correlates of protection, we used a DNA microarray-based approach to compare the host genome-wide responses of EBOV-infected nonhuman primates (NHPs) responding to candidate therapeutics. We observed that, although the overall circulating immune response was similar in the presence and absence of coagulation inhibitors, surviving NHPs clustered together. Noticeable differences in coagulation-associated genes appeared to correlate with survival, which revealed a subset of distinctly differentially expressed genes, including chemokine ligand 8 (CCL8/MCP-2), that may provide possible targets for early-stage diagnostics or future therapeutics. These analyses will assist us in understanding the pathogenic mechanisms of EBOV infection and in identifying improved therapeutic strategies.
Bochkis, Irina M; Schug, Jonathan; Ye, Diana Z; Kurinna, Svitlana; Stratton, Sabrina A; Barton, Michelle C; Kaestner, Klaus H
Gene duplication is a powerful driver of evolution. Newly duplicated genes acquire new roles that are relevant to fitness, or they will be lost over time. A potential path to functional relevance is mutation of the coding sequence leading to the acquisition of novel biochemical properties, as analyzed here for the highly homologous paralogs Foxa1 and Foxa2 transcriptional regulators. We determine by genome-wide location analysis (ChIP-Seq) that, although Foxa1 and Foxa2 share a large fraction of binding sites in the liver, each protein also occupies distinct regulatory elements in vivo. Foxa1-only sites are enriched for p53 binding sites and are frequently found near genes important to cell cycle regulation, while Foxa2-restricted sites show only a limited match to the forkhead consensus and are found in genes involved in steroid and lipid metabolism. Thus, Foxa1 and Foxa2, while redundant during development, have evolved divergent roles in the adult liver, ensuring the maintenance of both genes during evolution.
Zhang, Xiuqing; Yang, Huanming; Yu, Jun
complementary to RNAs of the Nup155 orthologs from Fugu and mouse. Comparative analysis of the Nup155 orthologs in many species, including H. sapiens, Mus musculus, Rattus norvegicus, F. rubripes, Arabidopsis thaliana, Drosophila melanogaster, and Saccharomyces cerevisiae, has revealed two paralogs in S...
Washio, T; Sasayama, J; Tomita, M
Free energy values of mRNA tertiary structures around stop codons were systematically calculated to surmise the hairpin-forming potential for all genes in each of the 16 complete prokaryote genomes. Instead of trying to detect each individual hairpin, we averaged the free energy values around the stop codons over the entire genome to predict how extensively the organism relies on hairpin formation in the process of transcription termination. The free energy values of Escherichia coli K-12 show a sharp drop, as expected, at 30 bp downstream of the stop codon, presumably due to hairpin-forming sequences. Similar drops are observed for Haemophilus influenzae Rd, Bacillus subtilis and Chlamydia trachomatis, suggesting that these organisms also form hairpins at their transcription termination sites. On the other hand, 12 other prokaryotes (Mycoplasma genitalium, Mycoplasma pneumoniae, Synechocystis PCC6803, Helicobacter pylori, Borrelia burgdorferi, Methanococcus jannaschii, Archaeoglobus fulgidus, Methanobacterium thermoautotrophicum, Aquifex aeolicus, Pyrococcus horikoshii, Mycobacterium tuberculosis and Treponema pallidum) show no apparent decrease in free energy value at the corresponding regions. This result suggests that these prokaryotes, or at least some of them, may never form hairpins at their transcription termination sites.
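The genome-wide averaging step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the per-gene free energy profiles have already been computed by some folding tool and aligned to the stop codon, and the function names `average_profile` and `sharpest_drop` are hypothetical.

```python
from statistics import mean


def average_profile(profiles):
    """Position-wise mean of per-gene free energy profiles.

    `profiles` is a list of equal-length lists; element i of each list is the
    free energy (kcal/mol) of the window at the i-th offset from the stop codon.
    How those values were computed is outside this sketch (assumption).
    """
    if not profiles:
        raise ValueError("at least one profile is required")
    length = len(profiles[0])
    if any(len(p) != length for p in profiles):
        raise ValueError("all profiles must have the same length")
    # zip(*profiles) groups the values of all genes at the same offset.
    return [mean(column) for column in zip(*profiles)]


def sharpest_drop(averaged, offsets):
    """Return (offset, value) where the averaged free energy is lowest."""
    index = min(range(len(averaged)), key=averaged.__getitem__)
    return offsets[index], averaged[index]


# Hypothetical toy data: two genes, windows at 0, 10, 20 and 30 bp downstream.
toy_profiles = [[-1.0, -2.0, -6.0, -1.0], [-1.0, -2.0, -4.0, -3.0]]
toy_average = average_profile(toy_profiles)
toy_minimum = sharpest_drop(toy_average, [0, 10, 20, 30])
```

Averaging over all genes trades per-gene resolution for a genome-level signal, which matches the abstract's stated goal of estimating how extensively an organism relies on hairpins rather than locating each one.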
Cenci, Albero; Guignon, Valentin; Roux, Nicolas; Rouard, Mathieu
Identifying the molecular mechanisms underlying tolerance to abiotic stresses is important in crop breeding. A comprehensive understanding of the gene families associated with drought tolerance is therefore highly relevant. NAC transcription factors form a large plant-specific gene family involved in the regulation of tissue development and responses to biotic and abiotic stresses. The main goal of this study was to set up a framework of orthologous groups determined by an expert sequence comparison of NAC genes from both monocots and dicots. In order to clarify the orthologous relationships among NAC genes of different species, we performed an in-depth comparative study of four divergent taxa, in dicots and monocots, whose genomes have already been completely sequenced: Arabidopsis thaliana, Vitis vinifera, Musa acuminata and Oryza sativa. Due to independent evolution, NAC copy number is highly variable in these plant genomes. Based on an expert NAC sequence comparison, we propose forty orthologous groups of NAC sequences that were probably derived from an ancestor gene present in the most recent common ancestor of dicots and monocots. These orthologous groups provide a curated resource for large-scale protein sequence annotation of NAC transcription factors. The established orthology relationships also provide a useful reference for NAC function studies in newly sequenced genomes such as M. acuminata and other plant species.
Li, Lei; Wang, Xiangfeng; Stolc, Viktor;
Sequencing and computational annotation revealed several features, including high gene numbers, unusual composition of the predicted genes and a large number of genes lacking homology to known genes, that distinguish the rice (Oryza sativa) genome from that of other fully sequenced model species... We report here a full-genome transcription analysis of the indica rice subspecies using high-density oligonucleotide tiling microarrays. Our results provided expression data support for the existence of 35,970 (81.9%) annotated gene models and identified 5,464 unique transcribed intergenic regions... activity between duplicated segments of the genome. Collectively, our results provide the first whole-genome transcription map useful for further understanding the rice genome. Publication date: 2006-Jan...
Tanaka, Yuya; Takemoto, Norihiko; Ito, Terukazu; Teramoto, Haruhiko; Yukawa, Hideaki; Inui, Masayuki
The transcriptional regulator GntR1 downregulates the genes for gluconate catabolism and pentose phosphate pathway in Corynebacterium glutamicum. Gluconate lowers the DNA binding affinity of GntR1, which is probably the mechanism of gluconate-dependent induction of these genes. In addition, GntR1 positively regulates ptsG, a gene encoding a major glucose transporter, and pck, a gene encoding phosphoenolpyruvate carboxykinase. Here, we searched for the new target of GntR1 on a genome-wide scal...
Lakhina, Vanisha; Arey, Rachel N.; Kaletsky, Rachel; Kauffman, Amanda; Stein, Geneva; Keyes, William; Xu, Daniel; Murphy, Coleen T.
Induced CREB activity is a hallmark of long-term memory, but the full repertoire of CREB transcriptional targets required specifically for memory is not known in any system. To obtain a more complete picture of the mechanisms involved in memory, we combined memory training with genome-wide transcriptional analysis of C. elegans CREB mutants. This approach identified 757 significant CREB/memory-induced targets and confirmed the involvement of known memory genes from other organisms, but also suggested new mechanisms and novel components that may be conserved through mammals. CREB mediates distinct basal and memory transcriptional programs at least partially through spatial restriction of CREB activity: basal targets are regulated primarily in nonneuronal tissues, while memory targets are enriched for neuronal expression, emanating from CREB activity in AIM neurons. This suite of novel memory-associated genes will provide a platform for the discovery of orthologous mammalian long-term memory components. PMID:25611510
Quach, Bryan; Furey, Terrence S
Identifying the locations of transcription factor binding sites is critical for understanding how gene transcription is regulated across different cell types and conditions. Chromatin accessibility experiments such as DNaseI sequencing (DNase-seq) and Assay for Transposase Accessible Chromatin sequencing (ATAC-seq) produce genome-wide data that include distinct 'footprint' patterns at binding sites. Nearly all existing computational methods to detect footprints from these data assume that footprint signals are highly homogeneous across footprint sites. Additionally, a comprehensive and systematic comparison of footprinting methods for specifically identifying which motif sites for a specific factor are bound has not been performed. Using DNase-seq data from the ENCODE project, we show that a large degree of previously uncharacterized site-to-site variability exists in footprint signal across motif sites for a transcription factor. To model this heterogeneity in the data, we introduce a novel, supervised learning footprinter called Detecting Footprints Containing Motifs (DeFCoM). We compare DeFCoM to nine existing methods using evaluation sets from four human cell lines and eighteen transcription factors and show that DeFCoM outperforms current methods in determining bound and unbound motif sites. We also analyze the impact of several biological and technical factors on the quality of footprint predictions to highlight important considerations when conducting footprint analyses and assessing the performance of footprint prediction methods. Finally, we show that DeFCoM can detect footprints using ATAC-seq data with similar accuracy as when using DNase-seq data. Python code available at https://bitbucket.org/bryancquach/defcom. Supplementary data are available at Bioinformatics online.
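To make the notion of a 'footprint' pattern concrete, the sketch below scores a single motif site by how much the cleavage signal dips inside the motif relative to its flanks. This is only an illustrative heuristic under stated assumptions: it is not DeFCoM's supervised model, and the function name `footprint_depth`, the flank width and the toy counts are all hypothetical.

```python
from statistics import mean


def footprint_depth(cuts, motif_start, motif_end, flank=3):
    """Mean flank signal minus mean signal inside the motif.

    `cuts` holds per-base cleavage counts for a window around one motif site;
    the motif occupies cuts[motif_start:motif_end]. A positive value means the
    motif is protected relative to its flanks (a footprint-like dip).
    """
    left = cuts[max(0, motif_start - flank):motif_start]
    right = cuts[motif_end:motif_end + flank]
    center = cuts[motif_start:motif_end]
    flanks = left + right
    if not flanks or not center:
        raise ValueError("window must contain both motif and flank positions")
    return mean(flanks) - mean(center)


# Hypothetical toy windows: one with a clear dip, one flat.
protected_site = [5, 5, 5, 1, 1, 5, 5, 5]
flat_site = [3, 3, 3, 3, 3, 3, 3, 3]
depth_protected = footprint_depth(protected_site, 3, 5)
depth_flat = footprint_depth(flat_site, 3, 5)
```

A single fixed score like this assumes every bound site looks alike, which is exactly the homogeneity assumption the abstract argues against; it is shown only to clarify what is being measured.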
Sánchez, Yolanda; Segura, Victor; Marín-Béjar, Oskar; Athie, Alejandro; Marchese, Francesco P; González, Jovanna; Bujanda, Luis; Guo, Shuling; Matheu, Ander; Huarte, Maite
Despite the inarguable relevance of p53 in cancer, genome-wide studies relating endogenous p53 activity to the expression of lncRNAs in human cells are still missing. Here, by integrating RNA-seq with p53 ChIP-seq analyses of a human cancer cell line under DNA damage, we define a high-confidence set of 18 lncRNAs that are p53 transcriptional targets. We demonstrate that two of the p53-regulated lncRNAs are required for the efficient binding of p53 to some of its target genes, modulating the p53 transcriptional network and contributing to apoptosis induction by DNA damage. We also show that the expression of p53-lncRNAs is lowered in colorectal cancer samples, constituting a tumour suppressor signature with high diagnostic power. Thus, p53-regulated lncRNAs establish a positive regulatory feedback loop that enhances p53 tumour suppressor activity. Furthermore, the signature defined by p53-regulated lncRNAs supports their potential use in the clinic as biomarkers and therapeutic targets.
Walia, Harkamal; Wilson, Clyde; Zeng, Linghe; Ismail, Abdelbagi M; Condamine, Pascal; Close, Timothy J
Rice yield is most sensitive to salinity stress imposed during the panicle initiation (PI) stage. In this study, we have focused on physiological and transcriptional responses of four rice genotypes exposed to salinity stress during PI. The genotypes selected included a pair of indica (IR63731 and IR29) and a pair of japonica (Agami and M103) rice genotypes with contrasting salt tolerance. Physiological characterization showed that tolerant genotypes maintained a much lower shoot Na+ concentration relative to sensitive genotypes under salinity stress. Global gene expression analysis revealed a strikingly large number of genes induced by salinity stress in the sensitive genotypes, IR29 and M103, relative to the tolerant lines. We found 19 probe sets to be commonly induced in all four genotypes. We found several salinity-modulated, ion homeostasis-related genes in our analysis. We also studied the expression of SKC1, a cation transporter reported by others as a major source of variation in salt tolerance in rice. The transcript abundance of SKC1 did not change in response to salinity stress at PI stage in the shoot tissue of all four genotypes. However, we found the transcript abundance of SKC1 to be significantly higher in tolerant japonica Agami relative to sensitive japonica M103 under control and stressed conditions during PI stage.
Francisco Castaneda, Sigrid Rosin-Steiner, Klaus Jung
We previously found that ethanol at millimolar level (1 mM) activates the expression of transcription factors with subsequent regulation of apoptotic genes in human hepatocellular carcinoma (HCC) HepG2 cells. However, the role of ethanol in the expression of genes implicated in transcriptional and translational processes remains unknown. Therefore, the aim of this study was to characterize the effect of a low concentration of ethanol on gene expression profiling in HepG2 cells using cDNA microarrays, with special interest in genes with transcriptional and translational function. The gene expression pattern observed in the ethanol-treated HepG2 cells was relatively similar to that found in the untreated control cells. The pairwise comparison analysis demonstrated four significantly up-regulated genes (COBRA1, ITGB4, STAU2, and HMGN3) and one down-regulated gene (ANK3). All these genes exert their function on transcriptional and translational processes, and until now none of them has been associated with ethanol. This functional genomic analysis demonstrates the reported interaction between ethanol and ethanol-regulated genes. Moreover, it confirms the relationship between ethanol-regulated genes and various signaling pathways associated with ethanol-induced apoptosis. The data presented in this study represent an important contribution toward the understanding of the molecular mechanisms of ethanol at low concentration in HepG2 cells, an HCC-derived cell line.
Janet L Smith
DnaA, the replication initiation protein in bacteria, is an AAA+ ATPase that binds and hydrolyzes ATP and exists in a heterogeneous population of ATP-DnaA and ADP-DnaA. DnaA binds cooperatively to the origin of replication and several other chromosomal regions, and functions as a transcription factor at some of these regions. We determined the binding properties of Bacillus subtilis DnaA to genomic DNA in vitro at single nucleotide resolution using in vitro DNA affinity purification and deep sequencing (IDAP-Seq). We used these data to identify 269 binding regions, refine the consensus sequence of the DnaA binding site, and compare the relative affinity of binding regions for ATP-DnaA and ADP-DnaA. Most sites had a slightly higher affinity for ATP-DnaA than ADP-DnaA, but a few had a strong preference for binding ATP-DnaA. Of the 269 sites, only the eight strongest binding ones have been observed to bind DnaA in vivo, suggesting that other cellular factors or the amount of available DnaA in vivo restricts DnaA binding to these additional sites. Conversely, we found several chromosomal regions that were bound by DnaA in vivo but not in vitro, and that the nucleoid-associated protein Rok was required for binding in vivo. Our in vitro characterization of the inherent ability of DnaA to bind the genome at single nucleotide resolution provides a backdrop for interpreting data on in vivo binding and regulation of DnaA, and is an approach that should be adaptable to many other DNA binding proteins.
Pandey, Ashutosh; Alok, Anshu; Lakhwani, Deepika; Singh, Jagdeep; Asif, Mehar H.; Trivedi, Prabodh K.
Flavonoid biosynthesis is largely regulated at the transcriptional level due to the modulated expression of genes related to the phenylpropanoid pathway in plants. Although accumulation of different flavonoids has been reported in banana, a staple fruit crop, no detailed information is available on regulation of the biosynthesis in this important plant. We carried out genome-wide analysis of banana (Musa acuminata, AAA genome) and identified 28 genes belonging to 9 gene families associated with flavonoid biosynthesis. Expression analysis suggested spatial and temporal regulation of the identified genes in different tissues of banana. Analysis revealed enhanced expression of genes related to flavonol and proanthocyanidin (PA) biosynthesis in peel and pulp at the early developmental stages of fruit. Genes involved in anthocyanin biosynthesis were highly expressed during banana fruit ripening. In general, higher accumulation of metabolites was observed in the peel as compared to pulp tissue. A correlation between expression of genes and metabolite content was observed at the early stage of fruit development. Furthermore, this study also suggests regulation of flavonoid biosynthesis, at transcriptional level, under light and dark exposures as well as methyl jasmonate (MJ) treatment in banana. PMID:27539368
Valen, Eivind; Sandelin, Albin
A central question in cellular biology is how the cell regulates transcription and discerns when and where to initiate it. Locating transcription start sites (TSSs), the signals that specify them, and ultimately elucidating the mechanisms of regulated initiation has therefore been a recurrent theme. In recent years substantial progress has been made towards this goal, spurred by the possibility of applying genome-wide, sequencing-based analysis. We now have a large collection of high-resolution datasets identifying locations of TSSs, protein-DNA interactions, and chromatin features over whole genomes; the field is now faced with the daunting challenge of translating these descriptive maps into quantitative and predictive models describing the underlying biology. We review here the genomic and chromatin features that underlie TSS selection and usage, focusing on the differences between the major classes of core promoters. Copyright © 2011 Elsevier Ltd. All rights reserved.
Wei, Yunxie; Hu, Wei; Xia, Feiyu; Zeng, Hongqiu; Li, Xiaolin; Yan, Yu; He, Chaozu; Shi, Haitao
Banana (Musa acuminata) is one of the most popular fresh fruits. However, the rapid spread of the fungal pathogen Fusarium oxysporum f. sp. cubense (Foc) in tropical areas has severely affected banana growth and production. Thus, it is very important to identify candidate genes involved in the banana response to abiotic stress and pathogen infection, as well as their molecular mechanisms and possible utilization for genetic breeding. Heat stress transcription factors (Hsfs) are widely known for their common involvement in various abiotic stresses and plant-pathogen interactions. However, no MaHsf has been identified in banana, and the possible roles of these factors remain unknown. In this study, genome-wide identification of MaHsfs and further analyses of their evolution, gene structure and conserved motifs showed close relationships among members within each subgroup. The comprehensive expression profiles of MaHsfs revealed tissue-specific, developmental stage-specific or stage-dependent expression, as well as expression responsive to abiotic and biotic stress. The common regulation of several MaHsfs by abiotic and biotic stress indicated their possible roles in plant stress responses. Taken together, this study extended our understanding of the MaHsf gene family and identified some MaHsfs with specific expression profiles, which may be used as potential candidates for genetic breeding in banana. PMID:27857174
Wang, Jinyan; Hu, Zhongze; Zhao, Tongmin; Yang, Yuwen; Chen, Tianzi; Yang, Mali; Yu, Wengui; Zhang, Baolong
The basic helix-loop-helix (bHLH) proteins are a superfamily of transcription factors that can bind to specific DNA target sites. They have been well characterized in model plants such as Arabidopsis and rice and have been shown to be important regulatory components in many different biological processes. However, no systematic analysis of the bHLH transcription factor family has yet been reported in tomato. Tomato yellow leaf curl virus (TYLCV) threatens tomato production worldwide by causing leaf yellowing, leaf curling, plant stunting and flower abscission. A total of 152 bHLH transcription factors were identified from the entire tomato genome. Phylogenetic analysis of bHLH domain sequences from Arabidopsis and tomato facilitated classification of these genes into 26 subfamilies. The evolutionary and possible functional relationships revealed during this analysis are supported by other criteria, including the chromosomal distribution of these genes, the conservation of motifs and exon/intron structural patterns, and the predicted DNA binding activities within subfamilies. Distribution mapping results showed that bHLH genes were localized on the 12 tomato chromosomes. Among the 152 bHLH genes from the tomato genome, 96 bHLH genes were detected in the TYLCV-susceptible and resistant tomato breeding lines before (0 dpi) and after (357 dpi) TYLCV infection. As anticipated, gene ontology (GO) analysis indicated that most bHLH genes are related to the regulation of macromolecule metabolic processes and gene expression. Only four bHLH genes were differentially expressed between 0 and 357 dpi. Virus-induced gene silencing (VIGS) of one bHLH gene, SlybHLH131, in resistant lines can lead to cell death. In the present study, 152 bHLH transcription factor genes were identified. One of these bHLH genes, SlybHLH131, was found to be involved in TYLCV infection through qRT-PCR expression analysis and VIGS validation. The isolation and identification of these bHLH transcription
Background: The basic helix-loop-helix (bHLH) transcription factors and their homologs form a superfamily that plays essential roles in the transcriptional networks of multiple developmental processes. bHLH family members have been identified in over 20 organisms, including fruit fly, zebrafish, human and mouse. Results: In this study, we conducted a genome-wide survey for bHLH sequences and identified 57 bHLH sequences encoded in the complete genome sequence of the ponerine ant, Harpegnathos saltator. Phylogenetic analysis of the bHLH domain sequences classified these genes into 38 bHLH families with 23, 14, 10, 1, 8 and 1 members in groups A, B, C, D, E and F, respectively. The number of PabHLHs (ponerine ant bHLHs) with introns is higher than in many other insect species, and their introns have average lengths second only to those of the pea aphid. In addition, two H. saltator bHLHs, named PaCrp1 and PaSide, are located on two separate contigs in the genome. Conclusions: The putative full set of PabHLH genes is comparable with those of other insect species, and genes encoding Oligo, MyoRb and Figα were not found in the genomes of any insect species in which bHLH family members have been identified. Moreover, in-family phylogenetic analyses indicate that the PabHLH genes are more closely related to those of Apis mellifera than to those of other species. The present study will serve as a solid foundation for further investigations into the structure and function of bHLH proteins in the regulation of H. saltator development.
Prync, A E Sterin; Yankilevich, P; Barrero, P R; Bello, R; Marangunich, L; Vidal, A; Criscuolo, M; Benasayag, L; Famulari, A L; Domínguez, R O; Kauffman, M A; Diez, R A
Recombinant human interferon-beta (IFN-β) is a well-established treatment for multiple sclerosis (MS). The regulatory process for marketing authorization of biosimilars is currently under debate in certain countries. In the EU, the EMEA has clearly defined the process, including overarching and product-specific guidelines, which include clinical testing. Biosimilarity needs to be based on comparability criteria, including at least molecular characterization, biological activity relevant for the therapeutic effect and relative bioavailability ("bioequivalence"). In the case of such complex diseases as MS, where the effect of treatment is not so directly measurable, in vitro tools can provide additional data to support comparability. Genomic microarray assays might be useful to compare multisource biopharmaceuticals. The aim of the present study was to compare the pharmacodynamic genomic effects (in terms of transcriptional regulation) of two recombinant human IFN-β1a preparations on lymphocytes of multiple sclerosis patients using a whole genome microarray assay. We performed an ex vivo whole genome expression profiling of the effect of two preparations of IFN-β1a on non-adherent mononuclear cells from five relapsing-remitting MS patients by analyzing microarrays (CodeLink Human Whole Genome). Patients' blood was drawn, and PBMCs were isolated and cultured in three different conditions: culture medium (control), 1,000 U/ml of IFN-β1a (BLASTOFERON, Bio Sidus) and 1,000 U/ml of IFN-β1a (REBIF, Serono). RNA was purified from non-adherent cells (mostly lymphocytes), amplified and hybridized. Raw data were generated by CodeLink proprietary software. Data normalization, quality control and analysis of differential gene expression between treatments were done using linear models for microarray data. Functional annotation analysis of IFN-β1a MS treatment transcription was done using DAVID. Out of the approximately 45,000 human sequences examined, no evidence of differential
Duan, Wei; Xu, Hongguo; Liu, Guotian; Fan, Peige; Liang, Zhenchang; Li, Shaohua
Prunus persica fruits were removed from 1-year-old shoots to analyze changes in photosynthesis, chlorophyll fluorescence and gene expression in leaves in response to low sink demand caused by fruit removal (-fruit) during the final stage of rapid fruit growth. A decline in net photosynthesis rate was observed, accompanied by a decrease in stomatal conductance. The intercellular CO2 concentration and leaf temperature increased as compared with a normal fruit load (+fruit). Moreover, low sink demand significantly inhibited the donor side and the reaction center of photosystem II. In leaves, 382 genes showed an absolute fold change ≥1 in expression level, comprising 116 up-regulated and 266 down-regulated genes, excluding unknown transcripts. Among these, 25 photosynthesis genes were down-regulated, and 69 stress-related and 19 redox-related genes were up-regulated under low sink demand. These studies revealed that high leaf temperature may result in a decline of net photosynthesis rate through down-regulation of photosynthesis-related genes and up-regulation of redox- and stress-related genes, especially heat shock protein genes. The complex changes in genes at the transcriptional level under low sink demand provide useful starting points for in-depth analyses of the source-sink relationship in P. persica.
Shen, Yaou; Zhang, Yongzhong; Chen, Jie; Lin, Haijian; Zhao, Maojun; Peng, Huanwei; Liu, Li; Yuan, Guangsheng; Zhang, Suzhi; Zhang, Zhiming; Pan, Guangtang
Lead (Pb) has become one of the most abundant heavy metal pollutants of the environment. With its large biomass, maize could be an important object for studying the phytoremediation of Pb-contaminated soil. In our previous research, we screened 19 inbred lines of maize for Pb concentration, and line 178 was identified as a hyperaccumulator of Pb in both the roots and aboveground parts. To identify important genes and metabolic pathways related to Pb accumulation and tolerance, line 178 underwent genome-wide expression profiling under Pb stress and a control (CK). A total of approximately 11 million cDNA tags were sequenced, and 4 665 539 and 4 936 038 clean tags were obtained from the libraries of the test and CK, respectively. In comparison to CK, 2379 and 1832 genes were identified as up- or downregulated, respectively, more than fivefold under Pb stress. Interestingly, all the genes were related to cellular processes and signaling, information storage and processing or metabolism functions. In particular, the genes involved in posttranslational modification, protein turnover and chaperones; signal transduction, carbohydrate transport and metabolism; and lipid transport and metabolism changed significantly under the treatment. In addition, seven pathways including ribosome, photosynthesis, and carbon fixation were affected significantly, with 118, 12, 34, 21, 18, 72 and 43 differentially expressed genes involved. The significant upregulation of the ribosome pathway may reveal an important secret for Pb tolerance of line 178. The sharp increases in laccase transcripts and metal ion transporters are suggested to account in part for Pb hyperaccumulation in the line. Copyright © Physiologia Plantarum 2012.
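The fivefold cutoff on tag-based expression can be illustrated with a short sketch. Only the two library sizes and the fivefold threshold come from the abstract; the normalization to tags per million, the pseudocount, the function names and the example counts are assumptions made for illustration and may differ from what the authors did.

```python
TEST_LIBRARY_SIZE = 4_665_539  # clean tags, Pb-stress library (from the abstract)
CK_LIBRARY_SIZE = 4_936_038    # clean tags, control library (from the abstract)


def tags_per_million(count, library_size):
    """Scale a raw tag count by library size so libraries are comparable."""
    return count * 1_000_000 / library_size


def classify(test_count, ck_count, threshold=5.0, pseudocount=0.01):
    """Label a gene 'up', 'down' or 'unchanged' by normalized fold change.

    The pseudocount (an assumption) avoids division by zero when a gene has
    no tags in one library.
    """
    test_tpm = tags_per_million(test_count, TEST_LIBRARY_SIZE) + pseudocount
    ck_tpm = tags_per_million(ck_count, CK_LIBRARY_SIZE) + pseudocount
    ratio = test_tpm / ck_tpm
    if ratio > threshold:
        return "up"
    if ratio < 1 / threshold:
        return "down"
    return "unchanged"


# Hypothetical raw tag counts (test, control) for three made-up genes.
example_calls = {
    "geneA": classify(600, 100),
    "geneB": classify(100, 600),
    "geneC": classify(100, 100),
}
```

Normalizing before taking the ratio matters here because the two libraries differ in size by roughly 6%, which would otherwise bias every fold change in the same direction.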
Hu, Xiao-Mei; Shi, Cai-Yun; Liu, Xiao; Jin, Long-Fei; Liu, Yong-Zhong; Peng, Shu-Ang
ATP-citrate lyase (ACL, EC220.127.116.11) catalyzes the cleavage of citrate to oxaloacetate and acetyl-CoA in the cell cytosol, and has important roles in normal plant growth and in the biosynthesis of some secondary metabolites. We identified three ACL genes, CitACLα1, CitACLα2, and CitACLβ1, in the citrus genome database. Both CitACLα1 and CitACLα2 encode putative ACL α subunits with 82.5% amino acid identity, whereas CitACLβ1 encodes a putative ACL β subunit. Gene structure analysis showed that CitACLα1 and CitACLα2 had 12 exons and 11 introns, and CitACLβ1 had 16 exons and 15 introns. CitACLα1 and CitACLβ1 were predominantly expressed in flower, and CitACLα2 was predominantly expressed in stem and fibrous roots. As fruits ripened, the transcript levels of CitACLα1, CitACLβ1, and/or CitACLα2 in cultivars 'Niuher' and 'Owari' increased, accompanied by significant decreases in citrate content, while their transcript levels decreased significantly in 'Egan No. 1' and 'Iyokan', although citrate content also decreased. In 'HB pummelo', in which acid content increased as fruit ripened, and in acid-free pummelo, transcript levels of CitACLα2, CitACLβ1, and/or CitACLα1 increased. Moreover, mild drought stress and ABA treatment significantly increased citrate contents in fruits. Transcript levels of the three genes were significantly reduced by mild drought stress, and the transcript level of only CitACLβ1 was significantly reduced by ABA treatment. Taken together, these data indicate that the effects of ACL on citrate use during fruit ripening depend on the cultivar, and the reduction in ACL gene expression may be attributed to citrate increases under mild drought stress or ABA treatment.
Pillai, Smitha; Chellappan, Srikumar P
Deregulation of transcriptional activity of many genes has been causatively linked to human diseases including cancer. Altered patterns of gene expression in normal and cancer cells are the result of inappropriate expression of transcription factors and chromatin modifying proteins. Chromatin immunoprecipitation assay is a well-established tool for investigating the interactions between regulatory proteins and DNA at distinct stages of gene activation. ChIP coupled with DNA microarrays, known as ChIP on chip, or sequencing of DNA associated with the factors (ChIP-Seq) allow us to determine the entire spectrum of in vivo DNA binding sites for a given protein. This has been of immense value because ChIP on chip assays and ChIP-Seq experiments can provide a snapshot of the transcriptional regulatory mechanisms on a genome-wide scale. This chapter outlines the general strategies used to carry out ChIP-chip assays to study the differential recruitment of regulatory molecules based on the studies conducted in our lab as well as other published protocols; these can be easily modified to a ChIP-Seq analysis.
Harrison, Melissa M; Eisen, Michael B
During the first stages of metazoan development, the genomes of the highly specified sperm and egg must unite and be reprogrammed to allow for the generation of a new organism. This process is controlled by maternally deposited products. Initially, the zygotic genome is largely transcriptionally quiescent, and it is not until hours later that the zygotic genome takes control of development. The transcriptional activation of the zygotic genome is tightly coordinated with the degradation of the maternal products. Here, we review the current understanding of the processes that mediate the reprogramming of the early embryonic genome and facilitate transcriptional activation during the early stages of Drosophila development.
Kang, Won-Hee; Kim, Seungill; Lee, Hyun-Ah; Choi, Doil; Yeom, Seon-In
The DNA-binding with one zinc finger proteins (Dofs) are a plant-specific family of transcription factors. The Dofs are involved in a variety of biological processes such as phytohormone production, seed development, and environmental adaptation. Dofs have been previously identified in several plants, but not in pepper. We identified 33 putative Dof genes in pepper (CaDofs). To gain an overview of the CaDofs, we analyzed phylogenetic relationships, protein motifs, and evolutionary history. We divided the 33 CaDofs, containing 25 motifs, into four major groups distributed on eight chromosomes. We discovered an expansion of the CaDofs dated to a recent duplication event. Segmental duplication that occurred before the speciation of the Solanaceae lineages was predominant among the CaDofs. The global gene-expression profiling of the CaDofs by RNA-seq analysis showed distinct temporal and pathogen-specific variation during development and response to biotic stresses (two TMV strains, PepMoV, and Phytophthora capsici), suggesting functional diversity among the CaDofs. These results provide useful clues to the responses of Dofs to biotic stresses and promote a better understanding of their multiple functions in pepper and other species. PMID:27653666
Wu, Huili; Lv, Hao; Li, Long; Liu, Jun; Mu, Shaohua; Li, Xueping; Gao, Jian
The AP2/ERF transcription factor family, one of the largest families unique to plants, performs a significant role in terms of regulation of growth and development, and responses to biotic and abiotic stresses. Moso bamboo (Phyllostachys edulis) is a fast-growing non-timber forest species with the highest ecological, economic and social values of all bamboos in Asia. The draft genome of moso bamboo and the available genomes of other plants provide great opportunities to research global information on the AP2/ERF family in moso bamboo. In total, 116 AP2/ERF transcription factors were identified in moso bamboo. The phylogeny analyses indicated that the 116 AP2/ERF genes could be divided into three subfamilies: AP2, RAV and ERF; and the ERF subfamily genes were divided into 11 groups. The gene structures, exons/introns and conserved motifs of the PeAP2/ERF genes were analyzed. Analysis of the evolutionary patterns and divergence showed the PeAP2/ERF genes underwent a large-scale event around 15 million years ago (MYA) and the division time of AP2/ERF family genes between rice and moso bamboo was 15-23 MYA. We surveyed the putative promoter regions of the PeDREBs and showed that largely stress-related cis-elements existed in these genes. Further analysis of expression patterns of PeDREBs revealed that the most were strongly induced by drought, low-temperature and/or high salinity stresses in roots and, in contrast, most PeDREB genes had negative functions in leaves under the same respective stresses. In this study there were two main interesting points: there were fewer members of the PeDREB subfamily in moso bamboo than in other plants and there were differences in DREB gene expression profiles between leaves and roots triggered in response to abiotic stress. The information produced from this study may be valuable in overcoming challenges in cultivating moso bamboo.
Identifying transcription factors (TFs) involved in producing a genome-wide transcriptional profile is an essential step in building a mechanistic model that can explain observed gene expression data. We developed a statistical framework for constructing genome-wide signatures of TF activity, and for using such signatures in the analysis of gene expression data produced by complex transcriptional regulatory programs. Our framework integrates ChIP-seq data and appropriately matched gene expression profiles to identify True REGulatory (TREG) TF-gene interactions. It provides genome-wide quantification of the likelihood of regulatory TF-gene interaction that can be used either to identify regulated genes or as a genome-wide signature of TF activity. To effectively use ChIP-seq data, we introduce a novel statistical model that integrates information from all binding "peaks" within a 2 Mb window around a gene's transcription start site (TSS), and provides gene-level binding scores and probabilities of regulatory interaction. In the second step we integrate these binding scores and regulatory probabilities with gene expression data to assess the likelihood of True REGulatory (TREG) TF-gene interactions. We demonstrate the advantages of the TREG framework in identifying genes regulated by two TFs with widely different distributions of functional binding events (ERα and E2f1). We also show that TREG signatures of TF activity vastly improve our ability to detect involvement of ERα in producing complex disease-related transcriptional profiles. Through a large study of disease-related transcriptional signatures and transcriptional signatures of drug activity, we demonstrate that the increase in statistical power associated with the use of TREG signatures makes the crucial difference in identifying key targets for treatment, and drugs to use for treatment. All methods are implemented in an open-source R package, treg. The package also contains all data used in the analysis.
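As a rough illustration of what a gene-level binding score built from all peaks in a window might look like, here is a minimal Python sketch. The abstract does not state the model, so the exponential distance-decay weighting, the `decay_bp` parameter, the reading of "2 Mb window" as 1 Mb on each side of the TSS, and the function name `gene_binding_score` are all assumptions made for illustration; none of this is the treg package's API.

```python
import math

WINDOW_BP = 2_000_000  # total window width around the TSS, as stated in the abstract


def gene_binding_score(tss, peaks, window_bp=WINDOW_BP, decay_bp=5_000):
    """Sum ChIP-seq peak scores near a TSS, down-weighted by distance.

    tss: TSS coordinate (bp) of the gene.
    peaks: iterable of (position, score) pairs on the same chromosome.
    The exponential decay is a hypothetical weighting, not the published model.
    """
    half_window = window_bp / 2  # assumption: window is centred on the TSS
    total = 0.0
    for position, score in peaks:
        distance = abs(position - tss)
        if distance <= half_window:
            total += score * math.exp(-distance / decay_bp)
    return total
```

Under these assumptions a peak sitting on the TSS contributes its full score, a distal peak inside the window contributes a small fraction, and peaks outside the window are ignored.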
Washio, T; Sasayama, J; Tomita, M
Free energy values of mRNA tertiary structures around stop codons were systematically calculated to surmise the hairpin-forming potential for all genes in each of the 16 complete prokaryote genomes...
Arefin, Badrul; Kucerova, Lucie; Dobes, Pavel; Markus, Robert; Strnad, Hynek; Wang, Zhi; Hyrsl, Pavel; Zurovec, Michal; Theopold, Ulrich
Heterorhabditis bacteriophora is an entomopathogenic nematode (EPN) which infects its host by accessing the hemolymph where it releases endosymbiotic bacteria of the species Photorhabdus luminescens. We performed a genome-wide transcriptional analysis of the Drosophila response to EPN infection at the time point at which the nematodes reached the hemolymph either via the cuticle or the gut and the bacteria had started to multiply. Many of the most strongly induced genes have been implicated in immune responses in other infection models. Mapping of the complete set of differentially regulated genes showed the hallmarks of a wound response, but also identified a large fraction of EPN-specific transcripts. Several genes identified by transcriptome profiling or their homologues play protective roles during nematode infections. Genes that positively contribute to controlling nematobacterial infections encode: a homolog of thioester-containing complement protein 3, a basement membrane component (glutactin), a recognition protein (GNBP-like 3) and possibly several small peptides. Of note is that several of these genes have not previously been implicated in immune responses.
StuA, first discovered in Aspergillus nidulans and a member of the APSES class of transcription factors, regulates several essential developmental stages in fungi such as virulence, sporulation and toxin production in phytopathogenic fungi. Fusarium verticillioides (Fv), a maize phytopathogen, produ...
Castanera, Raúl; López-Varas, Leticia; Borgognone, Alessandra; LaButti, Kurt; Lapidus, Alla; Schmutz, Jeremy; Grimwood, Jane; Pisabarro, Antonio G.; Grigoriev, Igor V.; Ramírez, Lucía
Transposable elements (TEs) are exceptional contributors to eukaryotic genome diversity. Their ubiquitous presence impacts the genomes of nearly all species and mediates genome evolution by causing mutations and chromosomal rearrangements and by modulating gene expression. We performed an exhaustive analysis of the TE content in 18 fungal genomes, including strains of the same species and species of the same genera. Our results depicted a scenario of exceptional variability, with species having 0.02 to 29.8% of their genome consisting of transposable elements. A detailed analysis performed on two strains of Pleurotus ostreatus uncovered a genome that is populated mainly by Class I elements, especially LTR-retrotransposons amplified in recent bursts from 0 to 2 million years (My) ago. The preferential accumulation of TEs in clusters led to the presence of genomic regions that lacked intra- and inter-specific conservation. In addition, we investigated the effect of TE insertions on the expression of their nearby upstream and downstream genes. Our results showed that an important number of genes under TE influence are significantly repressed, with stronger repression when genes are localized within transposon clusters. Our transcriptional analysis performed in four additional fungal models revealed that this TE-mediated silencing was present only in species with active cytosine methylation machinery. We hypothesize that this phenomenon is related to epigenetic defense mechanisms that are aimed to suppress TE expression and control their proliferation. PMID:27294409
Heregulin beta-1 (HRG) is an extracellular ligand that activates mitogen-activated protein kinase (MAPK) and phosphatidylinositol-3-OH kinase (PI3K)/Akt signaling pathways through ErbB receptors. MAPK and Akt have been shown to phosphorylate the estrogen receptor (ER) at Ser-118 and Ser-167, respectively, thereby mimicking the effects of estrogenic activity such as estrogen responsive element (ERE)-dependent transcription. In the current study, integrative analysis was performed using two tiling array platforms, comprising histone H3 lysine 9 (H3K9) acetylation and RNA mapping, together with array comparative genomic hybridization (CGH) analysis in an effort to identify HRG-regulated genes in ER-positive MCF-7 breast cancer cells. Through application of various threshold settings, 333 HRG-regulated genes (326 up-regulated and 7 down-regulated) were detected. Prediction of upstream transcription factors (TFs) and pathway analysis indicated that 21% of HRG-induced gene regulation may be controlled by the MAPK cascade, while only 0.6% of the gene expression is controlled by ERE. A comparison with previously reported estrogen (E2)-regulated gene expression data revealed that only 12 common genes were identified between the 333 HRG-regulated (3.6%) and 239 E2-regulated (5.0%) gene groups. However, with respect to enriched upstream TFs, 4 common TFs were identified in the 14 HRG-regulated (28.6%) and 13 E2-regulated (30.8%) gene groups. These results indicated that while E2 and HRG may induce common TFs, the regulatory mechanisms that govern HRG- and E2-induced gene expression differ.
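The overlap percentages quoted in this abstract follow directly from the reported counts: each is the number of shared items divided by the size of the group. A small Python check, where the helper name `overlap_percent` is hypothetical and only the counts come from the text:

```python
def overlap_percent(shared, group_size):
    """Percentage of a group made up by the shared items, to one decimal place."""
    return round(100 * shared / group_size, 1)


# Shared genes: 12 of 333 HRG-regulated and 12 of 239 E2-regulated genes.
gene_overlap = (overlap_percent(12, 333), overlap_percent(12, 239))

# Shared upstream TFs: 4 of 14 HRG-associated and 4 of 13 E2-associated TFs.
tf_overlap = (overlap_percent(4, 14), overlap_percent(4, 13))
```

The gene-level overlap stays at or below 5% of either group, while the TF-level overlap is close to 30%, which is the contrast the abstract draws.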
Zhou, Y; Wu, X X; Zhang, Z; Gao, Z H
Flower color is an important trait of the ornamental peach (Prunus persica L.). However, the mechanism responsible for the different colors that appear in the same genotype remains unclear. In this study, red samples showed a higher anthocyanin content (0.122 ± 0.009 mg/g), which was significantly different from that in white samples (0.066 ± 0.010 mg/g). Similarly, the carotenoid content of the red extract (0.058 ± 0.004 mg/L) was significantly higher than that of the white extract (0.015 ± 0.004 mg/L). We estimated gene expression using Illumina sequencing technology in libraries from white and red flower buds. A total of 3,599,960 and 3,464,141 tags were sequenced from the 2 libraries, respectively. Moreover, we identified 106 significantly differentially expressed genes between the 2 libraries. Among these, 78 and 28 represented transcripts with a higher or lower abundance of more than 2-fold than in the white flower library, respectively. GO annotation indicated that highly ranked genes were involved in the pigment biosynthetic process. Expression patterns of 11 genes were verified using quantitative reverse transcription-polymerase chain reaction assays. The results suggest that hydroxycinnamoyl-coenzyme A shikimate/quinate hydroxycinnamoyltransferase, 2-oxoglutarate-dependent dioxygenase, isoflavone reductase, riboflavin kinase, zeta-carotene desaturase, and ATP binding cassette transporter may be associated with flower color formation. Our results may be useful for scientists focusing on Prunus persica floral development and biotechnology.
Li, Long; Liu, Jun; Mu, Shaohua; Li, Xueping; Gao, Jian
The AP2/ERF transcription factor family, one of the largest families unique to plants, performs a significant role in terms of regulation of growth and development, and responses to biotic and abiotic stresses. Moso bamboo (Phyllostachys edulis) is a fast-growing non-timber forest species with the highest ecological, economic and social values of all bamboos in Asia. The draft genome of moso bamboo and the available genomes of other plants provide great opportunities to research global information on the AP2/ERF family in moso bamboo. In total, 116 AP2/ERF transcription factors were identified in moso bamboo. The phylogeny analyses indicated that the 116 AP2/ERF genes could be divided into three subfamilies: AP2, RAV and ERF; and the ERF subfamily genes were divided into 11 groups. The gene structures, exons/introns and conserved motifs of the PeAP2/ERF genes were analyzed. Analysis of the evolutionary patterns and divergence showed the PeAP2/ERF genes underwent a large-scale event around 15 million years ago (MYA) and the division time of AP2/ERF family genes between rice and moso bamboo was 15–23 MYA. We surveyed the putative promoter regions of the PeDREBs and showed that largely stress-related cis-elements existed in these genes. Further analysis of expression patterns of PeDREBs revealed that the most were strongly induced by drought, low-temperature and/or high salinity stresses in roots and, in contrast, most PeDREB genes had negative functions in leaves under the same respective stresses. In this study there were two main interesting points: there were fewer members of the PeDREB subfamily in moso bamboo than in other plants and there were differences in DREB gene expression profiles between leaves and roots triggered in response to abiotic stress. The information produced from this study may be valuable in overcoming challenges in cultivating moso bamboo. PMID:25985202
Lu, Chengrong; Xiong, Min; Luo, Yuan; Li, Jing; Zhang, Yanjun; Dong, Yaqiong; Zhu, Yanjun; Niu, Tianhui; Wang, Zhe; Duan, Lianning
Histone H2AX is a novel tumor suppressor protein and plays an important role in apoptosis of cancer cells. However, the role of H2AX in lung cancer cells is unclear. The detailed mechanism and epigenetic regulation by H2AX remain elusive in cancer cells. We showed that H2AX was involved in apoptosis of lung cancer A549 cells as in other tumor cells. Knockdown of H2AX strongly suppressed apoptosis of A549 cells. We clarified the molecular mechanisms of apoptosis regulated by H2AX based on genome-wide transcriptional analysis. Microarray data analysis demonstrated that H2AX knockdown in A549 cells affected expression of 3,461 genes, including upregulation of 1,435 and downregulation of 2,026. These differentially expressed genes were subjected to bioinformatic analysis for exploring biological processes regulated by H2AX in lung cancer cells. Gene ontology analysis showed that H2AX affected expression of many genes, through which, many important functions including response to stimuli, gene expression, and apoptosis were involved in apoptotic regulation of lung cancer cells. Pathway analysis identified the mitogen-activated protein kinase signaling pathway and apoptosis as the most important pathways targeted by H2AX. Signal transduction pathway networks analysis and chromatin immunoprecipitation assay showed that two core genes, NFKB1 and JUN, were involved in apoptosis regulated by H2AX in lung cancer cells. Taken together, these data provide compelling clues for further exploration of H2AX function in cancer cells.
Hu, Jinchuan; Adar, Sheera; Selby, Christopher P; Lieb, Jason D; Sancar, Aziz
We developed a method for genome-wide mapping of DNA excision repair named XR-seq (excision repair sequencing). Human nucleotide excision repair generates two incisions surrounding the site of damage, creating an ∼30-mer. In XR-seq, this fragment is isolated and subjected to high-throughput sequencing. We used XR-seq to produce stranded, nucleotide-resolution maps of repair of two UV-induced DNA damages in human cells: cyclobutane pyrimidine dimers (CPDs) and (6-4) pyrimidine-pyrimidone photoproducts [(6-4)PPs]. In wild-type cells, CPD repair was highly associated with transcription, specifically with the template strand. Experiments in cells defective in either transcription-coupled excision repair or general excision repair isolated the contribution of each pathway to the overall repair pattern and showed that transcription-coupled repair of both photoproducts occurs exclusively on the template strand. XR-seq maps capture transcription-coupled repair at sites of divergent gene promoters and bidirectional enhancer RNA (eRNA) production at enhancers. XR-seq data also uncovered the repair characteristics and novel sequence preferences of CPDs and (6-4)PPs. XR-seq and the resulting repair maps will facilitate studies of the effects of genomic location, chromatin context, transcription, and replication on DNA repair in human cells.
Ding, Haiping; Qin, Cheng; Luo, Xirong; Li, Lujiang; Chen, Zhe; Liu, Hongjun; Gao, Jian; Lin, Haijian; Shen, Yaou; Zhao, Maojun; Lübberstedt, Thomas; Zhang, Zhiming; Pan, Guangtang
Heterosis, or hybrid vigor, contributes to superior agronomic performance of hybrids compared to their inbred parents. Despite its importance, little is known about the genetic and molecular basis of heterosis. Early maize ear inflorescence formation affects grain yield, and is thus an excellent model for molecular mechanisms involved in heterosis. To determine the parental contributions and their regulation during maize ear-development-genesis, we analyzed genome-wide digital gene expression profiles in two maize elite inbred lines (B73 and Mo17) and their F1 hybrid using deep sequencing technology. Our analysis revealed 17,128 genes expressed in these three genotypes and 22,789 genes expressed collectively in the present study. Approximately 38% of the genes were differentially expressed in early maize ear inflorescences from the heterotic cross, including many transcription factor genes and some presence/absence variation (PAV) genes, and exhibited multiple modes of gene action. These differentially expressed genes were mainly enriched in five cellular component categories (organelle, cell, cell part, organelle part and macromolecular complex), five molecular function categories (structural molecule activity, binding, transporter activity, nucleic acid binding transcription factor activity and catalytic activity), and eight biological process categories (cellular process, metabolic process, biological regulation, regulation of biological process, establishment of localization, cellular component organization or biogenesis, response to stimulus and localization). Additionally, a significant number of genes were expressed in only one inbred line or absent in both inbred lines. Comparison of the modes of gene action between previous studies and the present study revealed that only a small number of genes had the same modes of gene action in both maize seedlings and ear inflorescences. This might be an indication that in
Su, Hongyan; Zhang, Shizhong; Yuan, Xiaowei; Chen, Changtian; Wang, Xiao-Fei; Hao, Yu-Jin
NAC (NAM, ATAF1,2, and CUC2) proteins constitute one of the largest families of plant-specific transcription factors. To date, little is known about the NAC genes in the apple (Malus domestica). In this study, a total of 180 NAC genes were identified in the apple genome and were phylogenetically clustered into six groups (I-VI) with the NAC genes from Arabidopsis and rice. The predicted apple NAC genes were distributed across all of 17 chromosomes at various densities. Additionally, the gene structure and motif compositions of the apple NAC genes were analyzed. Moreover, the expression of 29 selected apple NAC genes was analyzed in different tissues and under different abiotic stress conditions. All of the selected genes, with the exception of four genes, were expressed in at least one of the tissues tested, which indicates that the NAC genes are involved in various aspects of the physiological and developmental processes of the apple. Encouragingly, 17 of the selected genes were found to respond to one or more of the abiotic stress treatments, and these 17 genes included not only the expected 7 genes that were clustered with the well-known stress-related marker genes in group IV but also 10 genes located in other subgroups, none of which contains members that have been reported to be stress-related. To the best of our knowledge, this report describes the first genome-wide analysis of the apple NAC gene family, and the results should provide valuable information for understanding the classification and putative functions of this family.
Zhao, Mei-Wei; Duan, Cheng-Li; Liu, Jiang
Systematic reverse-engineering of functional genome architecture requires precise modifications of gene sequences and transcription levels. The development and application of transcription activator-like effectors (TALEs) has created a wealth of genome engineering possibilities. TALEs are a class of naturally occurring DNA-binding proteins found in plant-pathogenic Xanthomonas species. The DNA-binding domain of each TALE typically consists of tandem 34-amino acid repeat modules that can be rearranged according to a simple cipher to target new DNA sequences. Customized TALEs can be used for a wide variety of genome engineering applications, including transcriptional modulation and genome editing. Such "genome engineering" has now been established in human cells and a number of model organisms, thus opening the door to better understanding gene function in model organisms, improving traits in crop plants and treating human genetic disorders.
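The abstract does not spell out the "simple cipher". A minimal Python sketch, assuming the commonly cited repeat-variable diresidue (RVD) code in which the two variable residues of each repeat specify one nucleotide, could look like this. The function names are hypothetical, and the mapping is the simplified one-to-one version (NN, for example, is also reported to bind A).

```python
# Commonly cited RVD-to-nucleotide code (simplified to one base per RVD).
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}
BASE_TO_RVD = {base: rvd for rvd, base in RVD_TO_BASE.items()}


def rvds_for_target(dna):
    """Return the RVD of each repeat module needed to read a DNA target."""
    return [BASE_TO_RVD[base] for base in dna.upper()]


def target_for_rvds(rvds):
    """Return the DNA sequence that a given series of RVDs is expected to bind."""
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)
```

For example, `rvds_for_target("ACGT")` lists one repeat per base of the target, in order, which is what makes the repeat array straightforward to redesign for a new sequence.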
Okazaki, Masayuki; Kazama, Tomohiko; Murata, Hayato; Motomura, Keiji; Toriyama, Kinya
Cytoplasmic male sterility (CMS) is a maternally inherited trait in which plants fail to produce functional pollen and is associated with the expression of a novel open reading frame (orf) gene encoded by the mitochondrial genome. An RT102A CMS line and an RT102C fertility restorer line were obtained by successive backcrossing between Oryza rufipogon W1125 and O. sativa Taichung 65. Using next-generation pyrosequencing, we determined whole-genome sequences of the mitochondria in RT102-CMS cytoplasm. To identify candidates for the CMS-associated gene in RT102 mitochondria, we screened the mitochondrial genome for the presence of specific orf genes that were chimeric or whose products carried predicted transmembrane domains. One of these orf genes, orf352, which showed different transcript sizes depending on whether the restorer of fertility (Rf) gene was present or not, was identified. The orf352 gene was co-transcribed with the ribosomal protein gene rpl5, and the 2.8 kb rpl5-orf352 transcripts were processed into 2.6 kb transcripts with a cleavage at the inside of the orf352 coding region in the presence of the Rf gene. The orf352 gene is an excellent candidate for the CMS-associated gene for RT102-CMS.
J.H. Brandsma (Johan)
Nearly all cells of an individual organism contain the same genome. However, each cell type transcribes a different set of genes due to the presence of different sets of cell type-specific transcription factors. Such transcription factors bind to regulatory regions such as promoters
Rasmussen, Simon; Nielsen, Henrik Bjørn; Jarmer, Hanne Østergaard
The majority of all genes have so far been identified and annotated systematically through in silico gene finding. Here we report the finding of 3662 strand-specific transcriptionally active regions (TARs) in the genome of Bacillus subtilis by the use of tiling arrays. We have measured the genome...
Background Human natural killer (NK) cells are the key contributors of innate immune response and the effector functions of these cells are enhanced by cytokines such as interleukin 2 (IL2). We utilized genome-wide transcriptional profiling to identify gene expression signatures and pathways in resting and IL2-activated NK cells isolated from peripheral blood of healthy donors. Results Gene expression profiling of resting NK cells showed high expression of a number of cytotoxic factors, cytokines, chemokines and inhibitory and activating surface NK receptors. Resting NK cells expressed many genes associated with cellular quiescence and also appeared to have an active TGFβ (TGFB1) signaling pathway. IL2 stimulation induced rapid downregulation of quiescence associated genes and upregulation of genes associated with cell cycle progression and proliferation. Numerous genes that may enhance immune function and responsiveness including activating receptors (DNAM1, KLRC1 and KLRC3), death receptor ligands (TNFSF6 (FASL) and TRAIL), chemokine receptors (CX3CR1, CCR5 and CCR7), interleukin receptors (IL2RG, IL18RAB and IL27RA) and members of secretory pathways (DEGS1, FKBP11, SSR3, SEC61G and SLC3A2) were upregulated. The expression profile suggested PI3K/AKT activation and NF-κB activation through multiple pathways (TLR/IL1R, TNF receptor induced and TCR-like possibly involving BCL10). Activation of NFAT signaling was supported by increased expression of many pathway members and downstream target genes. The transcription factor GATA3 was expressed in resting cells while T-BET was upregulated on activation concurrent with the change in cytokine expression profile. The importance of NK cells in innate immune response was also reflected by late increased expression of inflammatory chemotactic factors and receptors and molecules involved in adhesion and lymphocyte trafficking or migration. Conclusion This analysis allowed us to identify genes implicated in
Hu, Qing; Guo, Wei; Li, Dapeng
Background The natural sex reversal severely affects the sex ratio and thus decreases the productivity of the rice field eel (Monopterus albus). How to understand and manipulate this process is one of the major issues for the rice field eel stocking. So far the genomics and transcriptomics data available for this species are still scarce. Here we provide a comprehensive study of transcriptomes of brain and gonad tissue in three sex stages (female, intersex and male) from the rice field eel to investigate changes in transcriptional level during the sex reversal process. Results Approximately 195 thousand unigenes were generated and over 44.4 thousand were functionally annotated. Comparative study between stages provided multiple differentially expressed genes in brain and gonad tissue. Overall 4668 genes were found to be of unequal abundance between gonad tissues, far more than that of the brain tissues (59 genes). These genes were enriched in several different signaling pathways. A number of 231 genes were found with different levels in gonad in each stage, with several reproduction-related genes included. A total of 19 candidate genes that could be most related to sex reversal were screened out, part of these genes’ expression patterns were validated by RT-qPCR. The expression of spef2, maats1, spag6 and dmc1 were abundant in testis, but was barely detected in females, while the 17β-hsd12, zpsbp3, gal3 and foxn5 were only expressed in ovary. Conclusion This study investigated the complexity of brain and gonad transcriptomes in three sex stages of the rice field eel. Integrated analysis of different gene expression and changes in signaling pathways, such as PI3K-Akt pathway, provided crucial data for further study of sex transformation mechanisms. PMID:28319194
Abstract Background Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in bacteria is one of the critical tasks of modern genomics. The Shewanella genus is comprised of metabolically versatile gamma-proteobacteria, whose lifestyles and natural environments are substantially different from Escherichia coli and other model bacterial species. The comparative genomics approaches and computational identification of regulatory sites are useful for the in silico reconstruction of transcriptional regulatory networks in bacteria. Results To explore conservation and variations in the Shewanella transcriptional networks we analyzed the repertoire of transcription factors and performed genomics-based reconstruction and comparative analysis of regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors and their DNA binding sites, 8 riboswitches and 6 translational attenuators. Forty-five regulons were newly inferred from the genome context analysis, whereas others were propagated from previously characterized regulons in the Enterobacteria and Pseudomonas spp. Multiple variations in regulatory strategies between the Shewanella spp. and E. coli include regulon contraction and expansion (as in the case of PdhR, HexR, FadR), numerous cases of recruiting non-orthologous regulators to control equivalent pathways (e.g. PsrA for fatty acid degradation) and, conversely, orthologous regulators to control distinct pathways (e.g. TyrR, ArgR, Crp). Conclusions We tentatively defined the first reference collection of ~100 transcriptional regulons in 16 Shewanella genomes. The resulting regulatory network contains ~600 regulated genes per genome that are mostly involved in metabolism of carbohydrates, amino acids, fatty acids, vitamins, metals, and stress responses. Several reconstructed regulons including NagR for N-acetylglucosamine catabolism were experimentally validated in S
Liu, Guodong; Marras, Antonio; Nielsen, Jens
Metabolism is regulated at multiple levels in response to the changes of internal or external conditions. Transcriptional regulation plays an important role in regulating many metabolic reactions by altering the concentrations of metabolic enzymes. Thus, integration of the transcriptional regulatory information is necessary to improve the accuracy and predictive ability of metabolic models. Here we review the strategies for the reconstruction of a transcriptional regulatory network (TRN) for yeast and the integration of such a reconstruction into a flux balance analysis-based metabolic model...... transcriptional regulatory interactions to genome-scale metabolic models in a quantitative manner....
DNA damage contributes to cancer development and ageing. Congenital syndromes that affect DNA repair processes are characterized by cancer susceptibility, developmental defects, and accelerated ageing (Schumacher et al., 2008). DNA damage interferes with DNA metabolism by blocking replication and transcription. DNA polymerase blockage leads to replication arrest and can give rise to genome instability. Transcription, on the other hand, is an essential process for utilizing the information encoded in the genome. DNA damage that interferes with transcription can lead to apoptosis and cellular senescence. Both processes are powerful tumor suppressors (Bartek and Lukas, 2007). Cellular response mechanisms to stalled RNA polymerase (RNAP) II complexes have only recently started to be uncovered. Transcription-coupled DNA damage responses might thus play important roles in the adjustments to DNA damage accumulation in the ageing organism (Garinis et al., 2009). Here we review human disorders that are caused by defects in genome stability to explore the role of DNA damage in ageing and disease. We discuss how the nucleotide excision repair (NER) system functions at the interface of transcription and repair and conclude with concepts of how therapeutic targeting of transcription might be utilized in the treatment of cancer.
Li, Ling-Hui; Li, Jian-Chiuan; Lin, Yung-Feng; Lin, Chung-Yen; Chen, Chung-Yung; Tsai, Shih-Feng
To facilitate transcript mapping and to investigate alterations in genomic structure and gene expression in a defined genomic target, we developed a novel microarray-based method to detect transcriptional activity of the human chromosome 4q22-24 region. Loss of heterozygosity of human 4q22-24 is frequently observed in hepatocellular carcinoma (HCC). One hundred and eighteen well-characterized genes have been identified from this region. We took previously sequenced shotgun subclones as templates to amplify overlapping sequences for the genomic segment and constructed a chromosome-region-specific microarray. Using genomic DNA fragments as probes, we detected transcriptional activity from within this region among five different tissues. The hybridization results indicate that there are new transcripts that have not yet been identified by other methods. The existence of new transcripts encoded by genes in this region was confirmed by PCR cloning or cDNA library screening. The procedure reported here allows coupling of shotgun sequencing with transcript mapping and, potentially, detailed analysis of gene expression and chromosomal copy of the genomic sequence for the putative HCC tumor suppressor gene(s) in the 4q candidate region.
O'Grady, Tina; Wang, Xia; Höner Zu Bentrup, Kerstin; Baddoo, Melody; Concha, Monica; Flemington, Erik K
Annotation of herpesvirus genomes has traditionally been undertaken through the detection of open reading frames and other genomic motifs, supplemented with sequencing of individual cDNAs. Second generation sequencing and high-density microarray studies have revealed vastly greater herpesvirus transcriptome complexity than is captured by existing annotation. The pervasive nature of overlapping transcription throughout herpesvirus genomes, however, poses substantial problems in resolving transcript structures using these methods alone. We present an approach that combines the unique attributes of Pacific Biosciences Iso-Seq long-read, Illumina short-read and deepCAGE (Cap Analysis of Gene Expression) sequencing to globally resolve polyadenylated isoform structures in replicating Epstein-Barr virus (EBV). Our method, Transcriptome Resolution through Integration of Multi-platform Data (TRIMD), identifies nearly 300 novel EBV transcripts, quadrupling the size of the annotated viral transcriptome. These findings illustrate an array of mechanisms through which EBV achieves functional diversity in its relatively small, compact genome including programmed alternative splicing (e.g. across the IR1 repeats), alternative promoter usage by LMP2 and other latency-associated transcripts, intergenic splicing at the BZLF2 locus, and antisense transcription and pervasive readthrough transcription throughout the genome.
Thomas Esquerré; Marie Bouvier; Catherine Turlan; Carpousis, Agamemnon J.; Laurence Girbal; Muriel Cocaign-Bousquet
Bacterial adaptation requires large-scale regulation of gene expression. We have performed a genome-wide analysis of the Csr system, which regulates many important cellular functions. The Csr system is involved in post-transcriptional regulation, but a role in transcriptional regulation has also been suggested. Two proteins, an RNA-binding protein CsrA and an atypical signaling protein CsrD, participate in the Csr system. Genome-wide transcript stabilities and levels were compared in wildtype...
Jensen Paul A
Abstract Background Several methods have been developed for analyzing genome-scale models of metabolism and transcriptional regulation. Many of these methods, such as Flux Balance Analysis, use constrained optimization to predict relationships between metabolic flux and the genes that encode and regulate enzyme activity. Recently, mixed integer programming has been used to encode these gene-protein-reaction (GPR) relationships into a single optimization problem, but these techniques are often of limited generality and lack a tool for automating the conversion of rules to a coupled regulatory/metabolic model. Results We present TIGER, a Toolbox for Integrating Genome-scale Metabolism, Expression, and Regulation. TIGER converts a series of generalized, Boolean or multilevel rules into a set of mixed integer inequalities. The package also includes implementations of existing algorithms to integrate high-throughput expression data with genome-scale models of metabolism and transcriptional regulation. We demonstrate how TIGER automates the coupling of a genome-scale metabolic model with GPR logic and models of transcriptional regulation, thereby serving as a platform for algorithm development and large-scale metabolic analysis. Additionally, we demonstrate how TIGER's algorithms can be used to identify inconsistencies and improve existing models of transcriptional regulation with examples from the reconstructed transcriptional regulatory network of Saccharomyces cerevisiae. Conclusion The TIGER package provides a consistent platform for algorithm development and for extending existing genome-scale metabolic models with regulatory networks and high-throughput data.
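The idea of turning Boolean rules into mixed integer inequalities can be illustrated with the textbook linearization of AND and OR over 0/1 variables. The sketch below is not TIGER's code or API, which the abstract does not show; the function names, the auxiliary variable `c`, and the example rule `rxn = (g1 AND g2) OR g3` are all hypothetical and chosen only for illustration. Each inequality is stored as a pair (coefficients, bound) meaning sum(coefficient * variable) <= bound.

```python
from itertools import product

def and_rule(y, xs):
    """Inequalities forcing binary y to equal AND of the binary variables xs."""
    # y <= x_i for every input
    cons = [({y: 1, x: -1}, 0) for x in xs]
    # y >= sum(x) - (n - 1), written as sum(x) - y <= n - 1
    cons.append(({**{x: 1 for x in xs}, y: -1}, len(xs) - 1))
    return cons

def or_rule(y, xs):
    """Inequalities forcing binary y to equal OR of the binary variables xs."""
    # y >= x_i for every input, written as x_i - y <= 0
    cons = [({x: 1, y: -1}, 0) for x in xs]
    # y <= sum(x), written as y - sum(x) <= 0
    cons.append(({y: 1, **{x: -1 for x in xs}}, 0))
    return cons

def is_feasible(cons, assignment):
    """Check every inequality against one 0/1 assignment."""
    return all(
        sum(coef * assignment[var] for var, coef in coeffs.items()) <= bound
        for coeffs, bound in cons
    )

def feasible_points(cons, names):
    """Enumerate all 0/1 assignments that satisfy the inequalities."""
    points = []
    for bits in product((0, 1), repeat=len(names)):
        assignment = dict(zip(names, bits))
        if is_feasible(cons, assignment):
            points.append(assignment)
    return points

# Hypothetical GPR rule: rxn = (g1 AND g2) OR g3, with auxiliary variable c.
constraints = and_rule("c", ["g1", "g2"]) + or_rule("rxn", ["c", "g3"])
points = feasible_points(constraints, ["g1", "g2", "g3", "c", "rxn"])
```

Brute-force enumeration is used here only to confirm that the inequalities admit exactly the assignments allowed by the Boolean rule; in a real model the same inequalities would be handed to a mixed integer solver alongside the flux constraints.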
Galán-Vásquez, Edgardo; Sánchez-Osorio, Ismael; Martínez-Antonio, Agustino
The description of transcriptional regulatory networks has been pivotal in the understanding of operating principles under which organisms respond and adapt to varying conditions. While the study of the topology and dynamics of these networks has been the subject of considerable work, the investigation of the evolution of their topology, as a result of the adaptation of organisms to different environmental conditions, has received little attention. In this work, we study the evolution of transcriptional regulatory networks in bacteria from a genome reduction perspective, which manifests itself as the loss of genes at different degrees. We used the transcriptional regulatory network of Escherichia coli as a reference to compare 113 smaller, phylogenetically-related γ-proteobacteria, including 19 genomes of symbionts. We found that the type of regulatory action exerted by transcription factors, as genomes get progressively smaller, correlates well with their degree of conservation, with dual regulators being more conserved than repressors and activators in conditions of extreme reduction. In addition, we found that the preponderant conservation of dual regulators might be due to their role as both global regulators and nucleoid-associated proteins. We summarize our results in a conceptual model of how each TF type is gradually lost as genomes become smaller and give a rationale for the order in which this phenomenon occurs.
Menouni, Rachid; Champ, Stéphanie; Espinosa, Leon; Boudvillain, Marc; Ansaldi, Mireille
Prophages represent a large fraction of prokaryotic genomes and often provide new functions to their hosts, in particular virulence and fitness. How prokaryotic cells maintain such gene providers is central for understanding bacterial genome evolution by horizontal transfer. Prophage excision occurs through site-specific recombination mediated by a prophage-encoded integrase. In addition, a recombination directionality factor (or excisionase) directs the reaction toward excision and prevents the phage genome from being reintegrated. In this work, we describe the role of the transcription termination factor Rho in prophage maintenance through control of the synthesis of transcripts that mediate recombination directionality factor expression and, thus, excisive recombination. We show that Rho inhibition by bicyclomycin allows for the expression of prophage genes that lead to excisive recombination. Thus, besides its role in the silencing of horizontally acquired genes, Rho also maintains lysogeny of defective and functional prophages.
BACKGROUND: Several lines of evidence suggest that transcription factors are involved in the pathogenesis of Multiple Sclerosis (MS), but complete mapping of the whole network has been elusive. One of the reasons is that there are several clinical subtypes of MS, and transcription factors that may be involved in one subtype may not be involved in others. We investigate the possibility that this network could be mapped using microarray technologies and contemporary bioinformatics methods on a dataset derived from whole blood in 99 untreated MS patients (36 Relapse Remitting MS, 43 Primary Progressive MS, and 20 Secondary Progressive MS) and 45 age-matched healthy controls. METHODOLOGY/PRINCIPAL FINDINGS: We have used two different analytical methodologies, a non-standard differential expression analysis and a differential co-expression analysis, which have converged on a significant number of regulatory motifs that are statistically overrepresented in genes that are either differentially expressed (or differentially co-expressed) in cases and controls (e.g., V$KROX_Q6, p-value <3.31E-6; V$CREBP1_Q2, p-value <9.93E-6; V$YY1_02, p-value <1.65E-5). CONCLUSIONS/SIGNIFICANCE: Our analysis uncovered a network of transcription factors that potentially dysregulate several genes in MS or one or more of its disease subtypes. The most significant transcription factor motifs were for the Early Growth Response EGR/KROX family, ATF2, YY1 (Yin and Yang 1), E2F-1/DP-1 and E2F-4/DP-2 heterodimers, SOX5, and CREB and ATF families. These transcription factors are involved in early T-lymphocyte specification and commitment as well as in oligodendrocyte dedifferentiation and development, both pathways that have significant biological plausibility in MS causation.
Grotkjær, Thomas; Nielsen, Jens
DNA microarray technology enables the simultaneous measurement of the transcript level of thousands of genes. Primary analysis can be done with basic statistical tools and cluster analysis, but effective and in depth analysis of the vast amount of transcription data requires integration with data...... of Saccharomyces cerevisiae whole genome transcription data. A special focus is on the quantitative aspects of normalisation and mathematical modelling approaches, since they are expected to play an increasing role in future DNA microarray analysis studies. Data analysis is exemplified with cluster analysis...
Lu, Zefu; Yu, Hong; Xiong, Guosheng; Wang, Jing; Jiao, Yongqing; Liu, Guifu; Jing, Yanhui; Meng, Xiangbing; Hu, Xingming; Qian, Qian; Fu, Xiangdong; Wang, Yonghong; Li, Jiayang
IDEAL PLANT ARCHITECTURE1 (IPA1) is critical in regulating rice (Oryza sativa) plant architecture and substantially enhances grain yield. To elucidate its molecular basis, we first confirmed IPA1 as a functional transcription activator and then identified 1067 and 2185 genes associated with IPA1 binding sites in shoot apices and young panicles, respectively, through chromatin immunoprecipitation sequencing assays. The SQUAMOSA PROMOTER BINDING PROTEIN-box direct binding core motif GTAC was highly enriched in IPA1 binding peaks; interestingly, a previously uncharacterized indirect binding motif TGGGCC/T was found to be significantly enriched through the interaction of IPA1 with proliferating cell nuclear antigen PROMOTER BINDING FACTOR1 or PROMOTER BINDING FACTOR2. Genome-wide expression profiling by RNA sequencing revealed IPA1 roles in diverse pathways. Moreover, our results demonstrated that IPA1 could directly bind to the promoter of rice TEOSINTE BRANCHED1, a negative regulator of tiller bud outgrowth, to suppress rice tillering, and directly and positively regulate DENSE AND ERECT PANICLE1, an important gene regulating panicle architecture, to influence plant height and panicle length. The elucidation of target genes of IPA1 genome-wide will contribute to understanding the molecular mechanisms underlying plant architecture and to facilitating the breeding of elite varieties with ideal plant architecture. PMID:24170127
Cerveau, Nicolas; Gilbert, Clément; Liu, Chao; Garrett, Roger A; Grève, Pierre; Bouchon, Didier; Cordaux, Richard
Transposable elements (TEs) are DNA pieces that are present in almost all the living world at variable genomic density. Due to their mobility and density, TEs are involved in a large array of genomic modifications. In eukaryotes, TE expression has been studied in detail in several species. In prokaryotes, studies of IS expression are generally linked to particular copies that induce a modification of neighboring gene expression. Here we investigated global patterns of IS transcription in the Alphaproteobacterial endosymbiont Wolbachia wVulC, using both RT-PCR and bioinformatic analyses. We detected several transcriptional promoters in all IS groups. Nevertheless, only one of the potentially functional IS groups possesses a promoter located upstream of the transposase gene, that could lead up to the production of a functional protein. We found that the majority of IS groups are expressed whatever their functional status. RT-PCR analyses indicate that the transcription of two IS groups lacking internal promoters upstream of the transposase start codon may be driven by the genomic environment. We confirmed this observation with the transcription analysis of individual copies of one IS group. These results suggest that the genomic environment is important for IS expression and it could explain, at least partly, copy number variability of the various IS groups present in the wVulC genome and, more generally, in bacterial genomes. Copyright © 2015 Elsevier B.V. All rights reserved.
Eskiw, C H; Cope, N F; Clay, I; Schoenfelder, S; Nagano, T; Fraser, P
The dynamic compartmental organization of the transcriptional machinery in mammalian nuclei places particular constraints on the spatial organization of the genome. The clustering of active RNA polymerase I transcription units from several chromosomes at nucleoli is probably the best-characterized and universally accepted example. RNA polymerase II localization in mammalian nuclei occurs in distinct concentrated foci that are several-fold fewer in number compared to the number of active genes and transcription units. Individual transcribed genes cluster at these shared transcription factories in a nonrandom manner, preferentially associating with heterologous, coregulated genes. We suggest that the three-dimensional (3D) conformation and relative arrangement of chromosomes in the nucleus has a major role in delivering tissue-specific gene-expression programs.
Rodionov, Dmitry A.; Novichkov, Pavel; Stavrovskaya, Elena D.; Rodionova, Irina A.; Li, Xiaoqing; Kazanov, Marat D.; Ravcheev, Dmitry A.; Gerasimova, Anna V.; Kazakov, Alexey E.; Kovaleva, Galina Y.; Permina, Elizabeth A.; Laikova, Olga N.; Overbeek, Ross; Romine, Margaret F.; Fredrickson, Jim K.; Arkin, Adam P.; Dubchak, Inna; Osterman, Andrei L.; Gelfand, Mikhail S.
Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in bacteria is one of the critical tasks of modern genomics. Despite the growing number of genome-scale gene expression studies, our abilities to convert the results of these studies into accurate regulatory annotations and to project them from model to other organisms are extremely limited. The comparative genomics approaches and computational identification of regulatory sites are useful for the in silico reconstruction of transcriptional regulatory networks in bacteria. The Shewanella genus is comprised of metabolically versatile gamma-proteobacteria, whose lifestyles and natural environments are substantially different from Escherichia coli and other model bacterial species. To explore conservation and variations in the Shewanella transcriptional networks we analyzed the repertoire of transcription factors and performed genomics-based reconstruction and comparative analysis of regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors and their DNA binding sites, 8 riboswitches and 6 translational attenuators. Forty-five regulons were newly inferred from the genome context analysis, whereas others were propagated from previously characterized regulons in the Enterobacteria and Pseudomonas spp. However, even orthologous regulators with conserved DNA-binding motifs may control substantially different gene sets, revealing striking differences in regulatory strategies between the Shewanella spp. and E. coli. Multiple examples of regulatory network rewiring include regulon contraction and expansion (as in the case of PdhR, HexR, FadR), and numerous cases of recruiting non-orthologous regulators to control equivalent pathways (e.g. NagR for N-acetylglucosamine catabolism and PsrA for fatty acid degradation) and, conversely, orthologous regulators to control distinct pathways (e.g. TyrR, ArgR, Crp).
Piatek, Agnieszka Anna
Regulation of gene transcription controls cellular functions and coordinates responses to developmental, physiological and environmental cues. Precise and efficient molecular tools are needed to characterize the functions of single and multiple genes in linear and interacting pathways in a native context. Modular DNA-binding domains from zinc fingers (ZFs) and transcriptional activator-like proteins (TALE) are amenable to bioengineering to bind DNA target sequences of interest. As a result, ZF and TALE proteins were used to develop synthetic programmable transcription factors. However, these systems are limited by the requirement to re-engineer proteins for each new target sequence. The clustered regularly interspaced palindromic repeats (CRISPR)/CRISPR associated 9 (Cas9) genome editing tool was recently repurposed for targeted transcriptional regulation by inactivation of the nuclease activity of Cas9. Due to the facile engineering, simplicity, precision and amenability to library construction, the CRISPR/Cas9 system is poised to revolutionize the functional genomics field across diverse eukaryotic species. In this review, we discuss the development of synthetic customizable transcriptional regulators and provide insights into their current and potential applications, with special emphasis on plant systems, in characterization of gene functions, elucidation of molecular mechanisms and their biotechnological applications. © 2016 Informa UK Limited, trading as Taylor & Francis Group
Valen, Eivind; Sandelin, Albin Gustav
A central question in cellular biology is how the cell regulates transcription and discerns when and where to initiate it. Locating transcription start sites (TSSs), the signals that specify them, and ultimately elucidating the mechanisms of regulated initiation has therefore been a recurrent theme......; the field is now faced with the daunting challenge of translating these descriptive maps into quantitative and predictive models describing the underlying biology. We review here the genomic and chromatin features that underlie TSS selection and usage, focusing on the differences between the major classes...
Lolium perenne, which is a major component of pastures, lawns, and grass strips, can be exposed to xenobiotic stresses due to diffuse and residual contaminations of soil. L. perenne was recently shown to undergo metabolic adjustments in response to sub-toxic levels of xenobiotics. To gain insight into such chemical stress responses, a de novo transcriptome analysis was carried out on leaves from plants subjected at the root level to low levels of xenobiotics (glyphosate, tebuconazole, and a combination of the two) that led to no adverse physiological effect. Chemical treatments significantly influenced the relative proportions of functional categories and of transcripts related to carbohydrate processes, to signalling, to protein-kinase cascades, such as Serine/Threonine-protein kinases, to transcriptional regulation, to responses to abiotic or biotic stimuli and to responses to phytohormones. Transcriptomics-based expression levels of genes encoding different types of SNF1 (sucrose non-fermenting 1)-related kinases involved in sugar and stress signalling, or encoding key metabolic enzymes, were in line with specific qRT-PCR analysis or with the important metabolic and regulatory changes revealed by metabolomic analysis. The effects of pesticide treatments on metabolites and gene expression strongly suggest that pesticides at low levels, as single molecules or as a mixture, affect cell signalling and functioning even in the absence of major physiological impact. This global analysis of L. perenne therefore highlighted the interactions between molecular regulation of responses to xenobiotics and carbohydrate dynamics, energy dysfunction, phytohormones and calcium signalling.
Joseph M Gonzales
The determinants of transcriptional regulation in malaria parasites remain elusive. The presence of a well-characterized gene expression cascade shared by different Plasmodium falciparum strains could imply that transcriptional regulation and its natural variation do not contribute significantly to the evolution of parasite drug resistance. To clarify the role of transcriptional variation as a source of strain-specific diversity in the most deadly malaria species and to find genetic loci that dictate variations in gene expression, we examined genome-wide expression level polymorphisms (ELPs) in a genetic cross between phenotypically distinct parasite clones. Significant variation in gene expression is observed through direct co-hybridizations of RNA from different P. falciparum clones. Nearly 18% of genes were regulated by a significant expression quantitative trait locus. The genetic determinants of most of these ELPs resided in hotspots that are physically distant from their targets. The most prominent regulatory locus, influencing 269 transcripts, coincided with a Chromosome 5 amplification event carrying the drug resistance gene, pfmdr1, and 13 other genes. Drug selection pressure in the Dd2 parental clone lineage led not only to a copy number change in the pfmdr1 gene but also to an increased copy number of putative neighboring regulatory factors that, in turn, broadly influence the transcriptional network. Previously unrecognized transcriptional variation, controlled by polymorphic regulatory genes and possibly master regulators within large copy number variants, contributes to sweeping phenotypic evolution in drug-resistant malaria parasites.
Smyth Gordon K
Abstract Background Signal transducer and activator of transcription (STAT) proteins are key regulators of gene expression in response to the interferon (IFN) family of anti-viral and anti-microbial cytokines. We have examined the genomic relationship between STAT1 binding and regulated transcription using multiple tiling microarray and chromatin immunoprecipitation microarray (ChIP-chip) experiments from public repositories. Results In response to IFN-γ, STAT1 bound proximally to regions of the genome that exhibit regulated transcriptional activity. This finding was consistent between different tiling microarray platforms, and between different measures of transcriptional activity, including differential binding of RNA polymerase II, and differential mRNA transcription. Re-analysis of tiling microarray data from a recent study of IFN-γ-induced STAT1 ChIP-chip and mRNA expression revealed that STAT1 binding is tightly associated with localized mRNA transcription in response to IFN-γ. Close relationships were also apparent between STAT1 binding, STAT2 binding, and mRNA transcription in response to IFN-α. Furthermore, we found that sites of STAT1 binding within the Encyclopedia of DNA Elements (ENCODE) region are precisely correlated with sites of either enhanced or diminished binding by the RNA polymerase II complex. Conclusion Together, our results indicate that STAT1 binds proximally to regions of the genome that exhibit regulated transcriptional activity. This finding establishes a generalized basis for the positioning of STAT1 binding sites within the genome, and supports a role for STAT1 in the direct recruitment of the RNA polymerase II complex to the promoters of IFN-γ-responsive genes.
Goldstein, Ido; Hager, Gordon L.
An elaborate metabolic response to fasting is orchestrated by the liver and is heavily reliant upon transcriptional regulation. In response to hormones (glucagon, glucocorticoids), many transcription factors (TFs) are activated and regulate various genes involved in metabolic pathways aimed at restoring homeostasis: gluconeogenesis, fatty acid oxidation, ketogenesis and amino acid shuttling. We summarize the recent discoveries regarding fasting-related TFs with an emphasis on genome-wide binding patterns. Collectively, the summarized findings reveal a large degree of co-operation between TFs during fasting, which occurs at motif-rich DNA sites bound by a combination of TFs. These new findings implicate transcriptional and chromatin regulation as major determinants of the response to fasting and unravel the complex, multi-TF nature of this response. PMID:26520657
Sanjana, Neville E; Cong, Le; Zhou, Yang; Cunniff, Margaret M; Feng, Guoping; Zhang, Feng
Transcription activator-like effectors (TALEs) are a class of naturally occurring DNA-binding proteins found in the plant pathogen Xanthomonas sp. The DNA-binding domain of each TALE consists of tandem 34-amino acid repeat modules that can be rearranged according to a simple cipher to target new DNA sequences. Customized TALEs can be used for a wide variety of genome engineering applications, including transcriptional modulation and genome editing. Here we describe a toolbox for rapid construction of custom TALE transcription factors (TALE-TFs) and nucleases (TALENs) using a hierarchical ligation procedure. This toolbox facilitates affordable and rapid construction of custom TALE-TFs and TALENs within 1 week and can be easily scaled up to construct TALEs for multiple targets in parallel. We also provide details for testing the activity in mammalian cells of custom TALE-TFs and TALENs using quantitative reverse-transcription PCR and Surveyor nuclease, respectively. The TALE toolbox described here will enable a broad range of biological applications.
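The "simple cipher" mentioned above can be illustrated with a short sketch. It assumes the commonly cited mapping between repeat-variable diresidues (RVDs) and bases (NI for A, HD for C, NG for T, NN for G); the abstract itself does not list the code, and NN is also known to tolerate A. The function names are hypothetical and are not part of the published toolbox.

```python
# Assumed RVD code (not spelled out in the abstract): one RVD per target base.
# NN is used for G here, although NN repeats can also recognise A.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def design_tale_repeats(target):
    """Return the list of RVDs, one per base, for a 5'->3' DNA target."""
    target = target.upper()
    unknown = set(target) - set(RVD_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [RVD_FOR_BASE[base] for base in target]


def decode_rvds(rvds):
    """Translate a list of RVDs back into the DNA sequence it should bind."""
    base_for_rvd = {rvd: base for base, rvd in RVD_FOR_BASE.items()}
    return "".join(base_for_rvd[rvd] for rvd in rvds)


if __name__ == "__main__":
    repeats = design_tale_repeats("TACG")
    print(repeats)               # one RVD per base of the target
    print(decode_rvds(repeats))  # round-trips to the original target
```

Keeping the mapping in a single dictionary makes it easy to swap in a different RVD for guanine if higher specificity is wanted.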
Guo, Yuchun; Mahony, Shaun; Gifford, David K
An essential component of genome function is the syntax of genomic regulatory elements that determine how diverse transcription factors interact to orchestrate a program of regulatory control. A precise characterization of in vivo spacing constraints between key transcription factors would reveal key aspects of this genomic regulatory language. To discover novel transcription factor spatial binding constraints in vivo, we developed a new integrative computational method, genome wide event finding and motif discovery (GEM). GEM resolves ChIP data into explanatory motifs and binding events at high spatial resolution by linking binding event discovery and motif discovery with positional priors in the context of a generative probabilistic model of ChIP data and genome sequence. GEM analysis of 63 transcription factors in 214 ENCODE human ChIP-Seq experiments recovers more known factor motifs than other contemporary methods, and discovers six new motifs for factors with unknown binding specificity. GEM's adaptive learning of binding-event read distributions allows it to further improve upon previous methods for processing ChIP-Seq and ChIP-exo data to yield unsurpassed spatial resolution and discovery of closely spaced binding events of the same factor. In a systematic analysis of in vivo sequence-specific transcription factor binding using GEM, we have found hundreds of spatial binding constraints between factors. GEM found 37 examples of factor binding constraints in mouse ES cells, including strong distance-specific constraints between Klf4 and other key regulatory factors. In human ENCODE data, GEM found 390 examples of spatially constrained pair-wise binding, including such novel pairs as c-Fos:c-Jun/USF1, CTCF/Egr1, and HNF4A/FOXA1. The discovery of new factor-factor spatial constraints in ChIP data is significant because it proposes testable models for regulatory factor interactions that will help elucidate genome function and the implementation of combinatorial
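A spatial binding constraint of the kind described above shows up as a preferred offset between the binding positions of two factors. The sketch below is only a naive illustration of that idea, not GEM's generative model: it tallies signed offsets between two lists of binding positions on one chromosome. The coordinates and the `max_dist` window are made up for the example.

```python
from collections import Counter


def spacing_histogram(sites_a, sites_b, max_dist=100):
    """Count signed offsets (b - a) for all site pairs within max_dist bp.

    sites_a and sites_b are binding positions (in bp) of two factors on the
    same chromosome. A sharp peak at one offset suggests a spacing constraint.
    """
    counts = Counter()
    for a in sites_a:
        for b in sites_b:
            offset = b - a
            if abs(offset) <= max_dist:
                counts[offset] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical coordinates: factor B tends to bind 12 bp downstream of A.
    factor_a = [100, 500, 900]
    factor_b = [112, 512, 912, 2000]
    hist = spacing_histogram(factor_a, factor_b)
    print(hist.most_common(1))
```

A real analysis would also compare the observed offset counts with a background expectation before calling a constraint significant.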
Semen A Leyn
Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ∼200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific and genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria.
Cornish, Joseph P; Matthews, Fialelei; Thomas, Julien R; Erill, Ivan
The assumption of basic properties, like self-regulation, in simple transcriptional regulatory networks can be exploited to infer regulatory motifs from the growing amounts of genomic and meta-genomic data. These motifs can in principle be used to elucidate the nature and scope of transcriptional networks through comparative genomics. Here we assess the feasibility of this approach using the SOS regulatory network of Gram-positive bacteria as a test case. Using experimentally validated data, we show that the known regulatory motif can be inferred through the assumption of self-regulation. Furthermore, the inferred motif provides a more robust search pattern for comparative genomics than the experimental motifs defined in reference organisms. We take advantage of this robustness to generate a functional map of the SOS response in Gram-positive bacteria. Our results reveal definite differences in the composition of the LexA regulon between Firmicutes and Actinobacteria, and confirm that regulation of cell-division inhibition is a widespread characteristic of this network among Gram-positive bacteria.
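Using an inferred motif as a search pattern typically means converting aligned sites into a position weight matrix (PWM) and scanning candidate promoter regions with it. The following is a minimal sketch under stated assumptions: the example sites and sequence are invented (they are not the LexA motif from the study), a uniform background and a pseudocount of 1 are assumed, and only one strand is scanned.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM (list of per-position dicts) from aligned sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def best_hit(pwm, sequence):
    """Return (score, position) of the highest-scoring window in sequence."""
    width = len(pwm)
    scores = [
        (sum(pwm[i][sequence[pos + i]] for i in range(width)), pos)
        for pos in range(len(sequence) - width + 1)
    ]
    return max(scores)


if __name__ == "__main__":
    # Hypothetical aligned binding sites, for illustration only.
    sites = ["GAACAT", "GAACGT", "GAACAT"]
    pwm = build_pwm(sites)
    score, position = best_hit(pwm, "TTTTGAACATTTTT")
    print(position, round(score, 2))
```

In a comparative genomics setting the same scan would be run over upstream regions of orthologous genes in each genome, with a score threshold chosen from the known sites.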
Marc P Hoeppner
The domestic dog, Canis familiaris, is a well-established model system for mapping trait and disease loci. While the original draft sequence was of good quality, gaps were abundant particularly in promoter regions of the genome, negatively impacting the annotation and study of candidate genes. Here, we present an improved genome build, canFam3.1, which includes 85 MB of novel sequence and now covers 99.8% of the euchromatic portion of the genome. We also present multiple RNA-Sequencing data sets from 10 different canine tissues to catalog ∼175,000 expressed loci. While about 90% of the coding genes previously annotated by EnsEMBL have measurable expression in at least one sample, the number of transcript isoforms detected by our data expands the EnsEMBL annotations by a factor of four. Syntenic comparison with the human genome revealed an additional ∼3,000 loci that are characterized as protein coding in human and were also expressed in the dog, suggesting that those were previously not annotated in the EnsEMBL canine gene set. In addition to ∼20,700 high-confidence protein coding loci, we found ∼4,600 antisense transcripts overlapping exons of protein coding genes, ∼7,200 intergenic multi-exon transcripts without coding potential, likely candidates for long intergenic non-coding RNAs (lincRNAs), and ∼11,000 transcripts that were reported by two different library construction methods but did not fit any of the above categories. Of the lincRNAs, about 6,000 have no annotated orthologs in human or mouse. Functional analysis of two novel transcripts with shRNA in a mouse kidney cell line altered cell morphology and motility. All in all, we provide a much-improved annotation of the canine genome and suggest regulatory functions for several of the novel non-coding transcripts.
In this issue of Molecular Cell, Vvedenskaya et al. (2015) describe a high-throughput sequencing-based methodology for the massively parallel analysis of transcription from a high-complexity barcoded template library both in vitro and in vivo, providing a powerful new tool for the study of transcription.
Rosel, J L; Earl, P L; Weir, J P; Moss, B
The sequence of the 8,600-base-pair HindIII H fragment, located at the center of the vaccinia virus genome, was determined to analyze several late genes. Seven major complete open reading frames (ORFs) and two that started from or continued into adjacent DNA segments were identified. ORFs were closely spaced and present on both DNA strands. Some adjacent ORFs had oppositely oriented overlapping termination codons or contiguous stop and start codons. Nucleotide compositional analysis indicated that the A-T frequency was consistently lowest in the first codon position. The sizes of the polypeptides predicted from the DNA sequence were compared with those determined by polyacrylamide gel electrophoresis of cell-free translation products of mRNAs selected by hybridization to cloned single-stranded DNA segments or synthesized in vitro by bacteriophage T7 RNA polymerase. Six transcripts that initiated within the HindIII H DNA fragment were detected, and of these, four were synthesized only at late times, one was synthesized only early, and one was synthesized early and late. The sites on the genome corresponding to the 5' ends of the transcripts were located by high-resolution nuclease S1 analysis. For late genes, the transcriptional and translational initiation sites mapped within a few nucleotides of each other, and in each case the sequence TAAATGG occurred at the start of the ORF. The extremely short leader and the absence of A or G in the -3 position, relative to the first nucleotide of the initiation codon, distinguish the majority of vaccinia virus late genes from eucaryotic and vaccinia virus early genes.
Vivek-Ananth, R P; Samal, Areejit
A major goal of systems biology is to build predictive computational models of cellular metabolism. Availability of complete genome sequences and wealth of legacy biochemical information has led to the reconstruction of genome-scale metabolic networks in the last 15 years for several organisms across the three domains of life. Due to paucity of information on kinetic parameters associated with metabolic reactions, the constraint-based modelling approach, flux balance analysis (FBA), has proved to be a vital alternative to investigate the capabilities of reconstructed metabolic networks. In parallel, advent of high-throughput technologies has led to the generation of massive amounts of omics data on transcriptional regulation comprising mRNA transcript levels and genome-wide binding profile of transcriptional regulators. A frontier area in metabolic systems biology has been the development of methods to integrate the available transcriptional regulatory information into constraint-based models of reconstructed metabolic networks in order to increase the predictive capabilities of computational models and understand the regulation of cellular metabolism. Here, we review the existing methods to integrate transcriptional regulatory information into constraint-based models of metabolic networks.
We performed deep sequencing of the nuclear and organellar genomes of three mungbean genotypes: Vigna radiata ssp. sublobata TC1966, V. radiata var. radiata NM92 and the recombinant inbred line RIL59 derived from a cross between TC1966 and NM92. Moreover, we performed deep sequencing of the RIL59 transcriptome to investigate transcript variability. The mungbean chloroplast genome has a quadripartite structure including a pair of inverted repeats separated by two single copy regions. A total of 213 simple sequence repeats were identified in the chloroplast genomes of NM92 and RIL59; 78 single nucleotide variants and nine indels were discovered in comparing the chloroplast genomes of TC1966 and NM92. Analysis of the mungbean chloroplast transcriptome revealed mRNAs that were affected by transcriptional slippage and RNA editing. Transcriptional slippage frequency was positively correlated with the length of simple sequence repeats of the mungbean chloroplast genome (R² = 0.9911). In total, 41 C-to-U editing sites were found in 23 chloroplast genes and in one intergenic spacer. No editing site that swapped U to C was found. A combination of bioinformatics and experimental methods revealed that the plastid-encoded RNA polymerase-transcribed genes psbF and ndhA are affected by transcriptional slippage in mungbean and in main lineages of land plants, including three dicots (Glycine max, Brassica rapa, and Nicotiana tabacum), two monocots (Oryza sativa and Zea mays), two gymnosperms (Pinus taeda and Ginkgo biloba) and one moss (Physcomitrella patens). Transcript analysis of the rps2 gene showed that transcriptional slippage could affect transcripts at single sequence repeat regions with poly-A runs. The analysis showed that transcriptional slippage together with incomplete RNA editing may cause sequence diversity of transcripts in chloroplasts of land plants.
Pezzella, Cinzia; Lettera, Vincenzo; Piscitelli, Alessandra; Giardina, Paola; Sannia, Giovanni
Fungal laccases (p-diphenol:oxygen oxidoreductase; EC 1.10.3.2) are multi-copper-containing oxidases that catalyse the oxidation of a great variety of phenolic compounds and aromatic amines through simultaneous reduction of molecular oxygen to water. Fungi generally produce several laccase isoenzymes encoded by complex multi-gene families. The Pleurotus ostreatus genome encodes 11 putative laccase coding genes, and only six different laccase isoenzymes have been isolated and characterised so far. Laccase expression was found to be regulated by culture conditions and developmental stages, although the redundancy of these genes still raises the question of their respective functions in vivo. In this context, laccase transcript profiling analysis has been used to unravel the physiological role played by the different isoforms produced by P. ostreatus. Although the reported results depict a complex picture of the transcriptional responses exhibited by the analysed laccase genes, they allow speculation on the roles of the isoforms in vivo. Among the produced laccases, LACC10 (POXC) seems to play a major role during vegetative growth, since its transcription is downregulated when the fungus starts the fructification process. Furthermore, a new tessera has been added to the puzzling mosaic of the heterodimeric laccase LACC2 (POXA3). The LACC2 small subunit seems to play an additional physiological role during fructification, besides that of LACC2 complex activation/stabilisation.
Soil water deficit is one of the major factors limiting plant productivity. Plants cope with this adverse environmental condition by coordinating the up- or downregulation of an array of stress responsive genes. Reprogramming the expression of these genes leads to rebalanced development and growth that are in concert with the reduced water availability and that ultimately confer enhanced stress tolerance. Currently, several techniques have been employed to monitor genome-wide transcriptional reprogramming under drought stress. The results from these high throughput studies indicate that drought stress-induced transcriptional reprogramming is dynamic, has temporal and spatial specificity, and is coupled with the circadian clock and phytohormone signaling pathways. © 2012 Springer-Verlag Berlin Heidelberg. All rights are reserved.
Wang, Guang-Zhong; Hickey, Stephanie L; Shi, Lei; Huang, Hung-Chung; Nakashe, Prachi; Koike, Nobuya; Tu, Benjamin P; Takahashi, Joseph S; Konopka, Genevieve
Genes expressing circadian RNA rhythms are enriched for metabolic pathways, but the adaptive significance of cyclic gene expression remains unclear. We estimated the genome-wide synthetic and degradative cost of transcription and translation in three organisms and found that the cost of cycling genes is strikingly higher than that of non-cycling genes. Cycling genes are expressed at high levels and constitute the most costly proteins to synthesize in the genome. We demonstrate that metabolic cycling is accelerated in yeast grown under higher nutrient flux and that the number of cycling genes increases by ∼40%; both changes are achieved by increasing the amplitude and not the mean level of gene expression. These results suggest that rhythmic gene expression optimizes the metabolic cost of global gene expression and that highly expressed genes have been selected to be downregulated in a cyclic manner for energy conservation.
BACKGROUND: Excessive exposure to dietary fats is an important factor in the initiation of obesity and metabolic syndrome associated pathologies. The cellular processes associated with the onset and progression of diet-induced metabolic syndrome are insufficiently understood. PRINCIPAL FINDINGS: To identify the mechanisms underlying the pathological changes associated with short and long-term exposure to excess dietary fat, hepatic gene expression of ApoE3Leiden mice fed chow and two types of high-fat (HF) diets was monitored using microarrays during a 16-week period. A functional characterization of 1663 HF-responsive genes reveals perturbations in lipid, cholesterol and oxidative metabolism, immune and inflammatory responses and stress-related pathways. The major changes in gene expression take place during the early (day 3) and late (week 12) phases of HF feeding. This is also associated with characteristic opposite regulation of many HF-affected pathways between these two phases. The most prominent switch occurs in the expression of inflammatory/immune pathways (early activation, late repression) and lipogenic/adipogenic pathways (early repression, late activation). Transcriptional network analysis identifies NF-kappaB, NEMO, Akt, PPARgamma and SREBP1 as the key controllers of these processes and suggests that direct regulatory interactions between these factors may govern the transition from early (stressed, inflammatory) to late (pathological, steatotic) hepatic adaptation to HF feeding. This transition observed by hepatic gene expression analysis is confirmed by expression of inflammatory proteins in plasma and the late increase in hepatic triglyceride content. In addition, the genes most predictive of fat accumulation in liver during the 16-week high-fat feeding period are uncovered by regression analysis of hepatic gene expression and triglyceride levels. CONCLUSIONS: The transition from an inflammatory to a steatotic transcriptional program
Olga V Tsoy
Biological nitrogen fixation plays a crucial role in the nitrogen cycle. The ability to fix atmospheric nitrogen, reducing it to ammonium, has been described for multiple species of Bacteria and Archaea. Being a complex and sensitive process, nitrogen fixation requires a complicated regulatory system, including at the level of transcription. The transcriptional regulatory network for nitrogen fixation has been extensively studied in several representatives of the class Alphaproteobacteria. This regulatory network includes the activator of nitrogen fixation NifA, working in tandem with the alternative sigma-factor RpoN, as well as oxygen-responsive regulatory systems: the one-component regulators FnrN/FixK and the two-component system FixLJ. Here we used comparative genomics analysis for an in silico study of the transcriptional regulatory network in 50 genomes of Alphaproteobacteria. We extended the known regulons and proposed a scenario for the evolution of the nitrogen fixation transcriptional network. The reconstructed network substantially expands the existing knowledge of transcriptional regulation in nitrogen-fixing microorganisms and can be used for genetic experiments, metabolic reconstruction, and evolutionary analysis.
Mokry, Michal; Hatzis, Pantelis; Schuijers, Jurian; Lansu, Nico; Ruzius, Frans-Paul; Clevers, Hans; Cuppen, Edwin
Routine methods for assaying steady-state mRNA levels such as RNA-seq and micro-arrays are commonly used as readouts to study the role of transcription factors (TFs) in gene expression regulation. However, cellular RNA levels do not solely depend on activity of TFs and subsequent transcription by RN
Camp, J. Gray; Weiser, Matthew; Cocchiaro, Jordan L.; Kingsley, David M.; Furey, Terrence S.; Sheikh, Shehzad Z.; Rawls, John F.
The intestinal epithelium serves critical physiologic functions that are shared among all vertebrates. However, it is unknown how the transcriptional regulatory mechanisms underlying these functions have changed over the course of vertebrate evolution. We generated genome-wide mRNA and accessible chromatin data from adult intestinal epithelial cells (IECs) in zebrafish, stickleback, mouse, and human species to determine if conserved IEC functions are achieved through common transcriptional regulation. We found evidence for substantial common regulation and conservation of gene expression regionally along the length of the intestine from fish to mammals and identified a core set of genes comprising a vertebrate IEC signature. We also identified transcriptional start sites and other putative regulatory regions that are differentially accessible in IECs in all 4 species. Although these sites rarely showed sequence conservation from fish to mammals, surprisingly, they drove highly conserved IEC expression in a zebrafish reporter assay. Common putative transcription factor binding sites (TFBS) found at these sites in multiple species indicate that sequence conservation alone is insufficient to identify much of the functionally conserved IEC regulatory information. Among the rare, highly sequence-conserved, IEC-specific regulatory regions, we discovered an ancient enhancer upstream from her6/HES1 that is active in a distinct population of Notch-positive cells in the intestinal epithelium. Together, these results show how combining accessible chromatin and mRNA datasets with TFBS prediction and in vivo reporter assays can reveal tissue-specific regulatory information conserved across 420 million years of vertebrate evolution. We define an IEC transcriptional regulatory network that is shared between fish and mammals and establish an experimental platform for studying how evolutionarily distilled regulatory information commonly controls IEC development and physiology. PMID
Ishihama, Akira; Shimada, Tomohiro; Yamazaki, Yukiko
Bacterial genomes are transcribed by DNA-dependent RNA polymerase (RNAP), which achieves gene selectivity through interaction with sigma factors that recognize promoters, and transcription factors (TFs) that control the activity and specificity of RNAP holoenzyme. To understand the molecular mechanisms of transcriptional regulation, the identification of regulatory targets is needed for all these factors. We then performed genomic SELEX screenings of targets under the control of each sigma factor and each TF. Here we describe the assembly of 156 SELEX patterns of a total of 116 TFs performed in the presence and absence of effector ligands. The results reveal several novel concepts: (i) each TF regulates more targets than hitherto recognized; (ii) each promoter is regulated by more TFs than hitherto recognized; and (iii) the binding sites of some TFs are located within operons and even inside open reading frames. The binding sites of a set of global regulators, including cAMP receptor protein, LeuO and Lrp, overlap with those of the silencer H-NS, suggesting that certain global regulators play an anti-silencing role. To facilitate sharing of these accumulated SELEX datasets with the research community, we compiled a database, 'Transcription Profile of Escherichia coli' (www.shigen.nig.ac.jp/ecoli/tec/). © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Gamma-retroviruses and lentiviruses integrate non-randomly in mammalian genomes, with specific preferences for active chromatin, promoters and regulatory regions. Gene transfer vectors derived from gamma-retroviruses target at high frequency genes involved in the control of growth, development and differentiation of the target cell, and may induce insertional tumors or pre-neoplastic clonal expansions in patients treated by gene therapy. The gene expression program of the target cell is apparently instrumental in directing gamma-retroviral integration, although the molecular basis of this phenomenon is poorly understood. We report a bioinformatic analysis of the distribution of transcription factor binding sites (TFBSs) flanking >4,000 integrated proviruses in human hematopoietic and non-hematopoietic cells. We show that gamma-retroviral, but not lentiviral vectors, integrate in genomic regions enriched in cell-type specific subsets of TFBSs, independently from their relative position with respect to genes and transcription start sites. Analysis of sequences flanking the integration sites of Moloney leukemia virus (MLV)- and human immunodeficiency virus (HIV)-derived vectors carrying mutations in their long terminal repeats (LTRs), and of HIV vectors packaged with an MLV integrase, indicates that the MLV integrase and LTR enhancer are the viral determinants of the selection of TFBS-rich regions in the genome. This study identifies TFBSs as differential genomic determinants of retroviral target site selection in the human genome, and suggests that transcription factors binding the LTR enhancer may synergize with the integrase in tethering retroviral pre-integration complexes to transcriptionally active regulatory regions. Our data indicate that gamma-retroviruses and lentiviruses have evolved dramatically different strategies to interact with the host cell chromatin, and predict a higher risk in using gamma-retroviral vs. lentiviral vectors for human
Cai, Caiping; Niu, Erli; Du, Hao; Zhao, Liang; Feng, Yue; Guo, Wangzhen
WRKY proteins are members of a family of transcription factors in higher plants that function in plant responses to various physiological processes. We identified 120 candidate WRKY genes from Gossypium raimondii with corresponding expressed sequence tags in at least one of four cotton species, Gossypium hirsutum, Gossypium barbadense, Gossypium arboreum, and G. raimondii. These WRKY members were anchored on 13 chromosomes in G. raimondii with uneven distribution. Phylogenetic analysis showed that WRKY candidate genes can be classified into three groups, with 20 members in group I, 88 in group II, and 12 in group III. The 88 genes in group II were further classified into five subgroups, groups IIa–e, containing 7, 16, 37, 15, and 13 members, respectively. We characterized diversity in amino acid residues in the WRKY domain and/or other zinc finger motif regions in the WRKY proteins. The expression patterns of WRKY genes revealed their important roles in diverse functions in cotton developmental stages of vegetative and reproductive growth and stress response. Structural and expression analyses show that WRKY proteins are a class of important regulators of growth and development and play key roles in response to stresses in cotton.
Lang, Fengchao; Li, Xin; Vladimirova, Olga; Hu, Benxia; Chen, Guijun; Xiao, Yu; Singh, Vikrant; Lu, Danfeng; Li, Lihong; Han, Hongbo; Wickramasinghe, J. M. A. S. P.; Smith, Sheryl T.; Zheng, Chunfu; Li, Qihan; Lieberman, Paul M.; Fraser, Nigel W.; Zhou, Jumin
CTCF is an essential chromatin regulator implicated in important nuclear processes including nuclear organization and transcription. Herpes Simplex Virus-1 (HSV-1) is a ubiquitous human pathogen, which enters productive infection in human epithelial and many other cell types. CTCF is known to bind several sites in the HSV-1 genome during latency and reactivation, but its function has not been defined. Here, we report, using ChIP-seq, that CTCF interacts extensively with the HSV-1 DNA during lytic infection, and that its knockdown results in the reduction of viral transcription, viral genome copy number and virus yield. CTCF knockdown led to increased H3K9me3 and H3K27me3, and a reduction of RNA Pol II occupancy on viral genes. Importantly, ChIP-seq analysis revealed that there is a higher level of CTD Ser2P modified RNA Pol II near CTCF peaks relative to the Ser5P form in the viral genome. Consistent with this, CTCF knockdown reduced the Ser2P but increased Ser5P modified forms of RNA Pol II on viral genes. These results suggest that CTCF promotes HSV-1 lytic transcription by facilitating the elongation of RNA Pol II and preventing silenced chromatin on the viral genome. PMID:28045091
Spanier, Katina I.; Jansen, Mieke; Decaestecker, Ellen; Hulselmans, Gert; Becker, Dörthe; Colbourne, John K.; Orsini, Luisa
Ecological genomics aims to understand the functional association between environmental gradients and the genes underlying adaptive traits. Many genes that are identified by genome-wide screening in ecologically relevant species lack functional annotations. Although gene functions can be inferred from sequence homology, such approaches have limited power. Here, we introduce ecological regulatory genomics by presenting an ontology-free gene prioritization method. Specifically, our method combines transcriptome profiling with high-throughput cis-regulatory sequence analysis in the water fleas Daphnia pulex and Daphnia magna. It screens coexpressed genes for overrepresented DNA motifs that serve as transcription factor binding sites, thereby providing insight into conserved transcription factors and gene regulatory networks shaping the expression profile. We first validated our method, called Daphnia-cisTarget, on a D. pulex heat shock data set, which revealed a network driven by the heat shock factor. Next, we performed RNA-Seq in D. magna exposed to the cyanobacterium Microcystis aeruginosa. Daphnia-cisTarget identified coregulated gene networks that associate with the moulting cycle and potentially regulate life history changes in growth rate and age at maturity. These networks are predicted to be regulated by evolutionary conserved transcription factors such as the homologues of Drosophila Shavenbaby and Grainyhead, nuclear receptors, and a GATA family member. In conclusion, our approach allows prioritising candidate genes in Daphnia without bias towards prior knowledge about functional gene annotation and represents an important step towards exploring the molecular mechanisms of ecological responses in organisms with poorly annotated genomes. PMID:28854641
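Screening a coexpressed gene set for overrepresented motifs amounts to asking whether a motif occurs in the set more often than expected from its genome-wide frequency. One generic way to quantify this is a one-sided hypergeometric test, sketched below. This is a stand-in for illustration and is not claimed to be the scoring used by Daphnia-cisTarget, which the abstract does not detail; the function name and the gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(genome_size, genome_with_motif, set_size, set_with_motif):
    """One-sided hypergeometric p-value for motif overrepresentation.

    Returns P(X >= set_with_motif), where X is the number of motif-bearing
    genes when set_size genes are drawn without replacement from a genome of
    genome_size genes, genome_with_motif of which carry the motif.
    """
    upper = min(set_size, genome_with_motif)
    total = comb(genome_size, set_size)
    tail = sum(
        comb(genome_with_motif, k) * comb(genome_size - genome_with_motif, set_size - k)
        for k in range(set_with_motif, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 genes, 400 carry the motif genome-wide,
    # and 12 of 100 coexpressed genes carry it.
    p = hypergeom_enrichment_p(20000, 400, 100, 12)
    print(f"{p:.3e}")
```

When many motifs are tested against the same gene set, the resulting p-values would need a multiple-testing correction before ranking candidate regulators.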
Genomic analyses are commonly used to infer trends and broad rules underlying transcriptional control. The innovative approach by Tong et al. to interrogate genomic datasets allows extracting mechanistic information on the specific regulation of individual genes.
Yang, X H; Li, X G; Li, B L; Zhang, D Q
Wood formation occurs via cell division, primary cell wall and secondary wall formation, and programmed cell death in the vascular cambium. Transcriptional profiling of secondary xylem differentiation is essential for understanding the molecular mechanisms underlying wood formation. Differential gene expression in secondary xylem differentiation of Populus has been previously investigated using cDNA microarray analysis. However, little is known about the molecular mechanisms from a genome-wide perspective. In this study, the Affymetrix poplar genome chips containing 61,413 probes were used to investigate the changes in the transcriptome during secondary xylem differentiation in Chinese white poplar (Populus tomentosa). Two xylem tissues (newly formed and lignified) were sampled for genome-wide transcriptional profiling. In total, 6843 genes (~11%) were identified with differential expression in the two xylem tissues. Many genes involved in cell division, primary wall modification, and cellulose synthesis were preferentially expressed in the newly formed xylem. In contrast, many genes associated with lignin biosynthesis, including cinnamate-4-hydroxylase (C4H), 4-coumarate:CoA ligase (4CL), cinnamyl alcohol dehydrogenase (CAD), and caffeoyl CoA 3-O-methyltransferase (CCoAOMT), were more highly transcribed in the lignified xylem. The two xylem tissues also showed differential expression of genes related to various hormones; thus, the secondary xylem differentiation could be regulated by hormone signaling. Furthermore, many transcription factor genes were preferentially expressed in the lignified xylem, suggesting that wood lignification involves extensive transcription regulation. The genome-wide transcriptional profiling of secondary xylem differentiation could provide additional insights into the molecular basis of wood formation in poplar species.
Tian, Erming; Børset, Magne; Sawyer, Jeffrey R; Brede, Gaute; Våtsveen, Thea K; Hov, Håkon; Waage, Anders; Barlogie, Bart; Shaughnessy, John D; Epstein, Joshua; Sundan, Anders
The growth and survival factor hepatocyte growth factor (HGF) is expressed at high levels in multiple myeloma (MM) cells. We report here that elevated HGF transcription in MM was traced to DNA mutations in the promoter alleles of HGF. Sequence analysis revealed a previously undiscovered single-nucleotide polymorphism (SNP) and crucial single-nucleotide variants (SNVs) in the promoters of myeloma cells that produce large amounts of HGF. The allele-specific mutations functionally reassembled wild-type sequences into the motifs that affiliate with endogenous transcription factors NFKB (nuclear factor kappa-B), MZF1 (myeloid zinc finger 1), and NRF-2 (nuclear factor erythroid 2-related factor 2). In vitro, a mutant allele that gained novel NFKB-binding sites directly responded to transcriptional signaling induced by tumor necrosis factor alpha (TNFα) to promote high levels of luciferase reporter. Given the recent discovery by genome-wide sequencing (GWS) of numerous non-coding mutations in myeloma genomes, our data provide evidence that heterogeneous SNVs in the gene regulatory regions may frequently transform wild-type alleles into novel transcription factor binding properties to aberrantly interact with dysregulated transcriptional signals in MM and other cancer cells.
Nguyen-Duc, Trong; van Oeffelen, Liesbeth; Song, Ningning; Hassanzadeh-Ghassabeh, Gholamreza; Muyldermans, Serge; Charlier, Daniel; Peeters, Eveline
Gene regulatory processes result largely from the binding of transcription factors to specific genomic targets. Leucine-responsive Regulatory Protein (Lrp) is a prevalent transcription factor family in prokaryotes; however, little information is available on the biological functions of these proteins in archaea. Here, we study genome-wide binding of the Lrp-like transcription factor Ss-LrpB from Sulfolobus solfataricus. Chromatin immunoprecipitation in combination with DNA microarray analysis (ChIP-chip) has revealed that Ss-LrpB interacts with 36 additional loci besides the four previously identified local targets. Only a subset of the newly identified binding targets, concentrated in a highly variable IS-dense genomic region, is also bound in vitro by pure Ss-LrpB. There is no clear relationship between the in vitro measured DNA-binding specificity of Ss-LrpB and the in vivo association, suggesting a limited permissivity of the crenarchaeal chromatin for transcription factor binding. Of 37 identified binding regions, 29 are co-bound by LysM, another Lrp-like transcription factor in S. solfataricus. Comparative gene expression analysis in an Ss-lrpB mutant strain shows no significant Ss-LrpB-mediated regulation for most targeted genes, with the exception of the CRISPR B cluster, which is activated by Ss-LrpB through binding to a specific motif in the leader region. The genome-wide binding profile presented here implies that Ss-LrpB is associated at additional genomic binding sites besides the local gene targets, but acts as a specific transcription regulator in the tested growth conditions. Moreover, we have provided evidence that two Lrp-like transcription factors in S. solfataricus, Ss-LrpB and LysM, interact in vivo.
Planta, R J; Brown, A J; Cadahia, J L; Cerdan, M E; de Jonge, M; Gent, M E; Hayes, A; Kolen, C P; Lombardia, L J; Sefton, M; Oliver, S G; Thevelein, J; Tournu, H; van Delft, Y J; Verbart, D J; Winderickx, J
The European Functional Analysis Network (EUROFAN) is systematically analysing the function of novel Saccharomyces cerevisiae genes revealed by genome sequencing. As part of this effort our consortium has performed a detailed transcript analysis for 250 novel ORFs on chromosome XIV. All transcripts were quantified by Northern analysis under three quasi-steady-state conditions (exponential growth on rich fermentative, rich non-fermentative, and minimal fermentative media) and eight transient conditions (glucose derepression, glucose upshift, stationary phase, nitrogen starvation, osmo-stress, heat-shock, and two control conditions). Transcripts were detected for 82% of the 250 ORFs, and only one ORF did not yield a transcript of the expected length (YNL285w). Transcripts ranged from low (62%), moderate (16%) to high abundance (2%) relative to the ACT1 mRNA. The levels of 73% of the 206 chromosome XIV transcripts detected fluctuated in response to the transient states tested. However, only a small number responded strongly to the transients: eight ORFs were induced upon glucose upshift; five were repressed by glucose; six were induced in response to nitrogen starvation; three were induced in stationary phase; five were induced by osmo-stress; four were induced by heat-shock. These data provide useful clues about the general function of these ORFs and add to our understanding of gene regulation on a genome-wide basis.
Seo, Sang Woo; Gao, Ye; Kim, Donghyuk
A transcription factor (TF), OmpR, plays a critical role in transcriptional regulation of the osmotic stress response in bacteria. Here, we reveal a genome-scale OmpR regulon in Escherichia coli K-12 MG1655. Integrative data analysis reveals that a total of 37 genes in 24 transcription units (TUs...
Arshanovskii, Kirill; Gusev, Oleg; Sychev, Vladimir; Poddubko, Svetlana; Deviatiiarov, Ruslan
To gain new insights into changes in gene regulation under conditions of real spaceflight, we conducted a whole-genome analysis of the transcriptional dynamics of promoters and enhancers in zebrafish during prolonged exposure to real spaceflight. Within the framework of the Russia-Japan joint experiments "Aquatic Habitat"-"Aquarium", we performed a Cap Analysis of Gene Expression (CAGE) assay of zebrafish over a range of 7 to 40 days of real spaceflight onboard the ISS. The analysis showed that both gene expression patterns and the architecture (shapes and types) of the promoters are affected by the spaceflight environment.
Chen, Chieh-Chun; Xiao, Shu; Xie, Dan; Cao, Xiaoyi; Song, Chun-Xiao; Wang, Ting; He, Chuan; Zhong, Sheng
Despite explosive growth in genomic datasets, the methods for studying epigenomic mechanisms of gene regulation remain primitive. Here we present a model-based approach to systematically analyze the epigenomic functions in modulating transcription factor-DNA binding. Based on the first principles of statistical mechanics, this model considers the interactions between epigenomic modifications and a cis-regulatory module, which contains multiple binding sites arranged in any configurations. We compiled a comprehensive epigenomic dataset in mouse embryonic stem (mES) cells, including DNA methylation (MeDIP-seq and MRE-seq), DNA hydroxymethylation (5-hmC-seq), and histone modifications (ChIP-seq). We discovered correlations of transcription factors (TFs) for specific combinations of epigenomic modifications, which we term epigenomic motifs. Epigenomic motifs explained why some TFs appeared to have different DNA binding motifs derived from in vivo (ChIP-seq) and in vitro experiments. Theoretical analyses suggested that the epigenome can modulate transcriptional noise and boost the cooperativity of weak TF binding sites. ChIP-seq data suggested that epigenomic boost of binding affinities in weak TF binding sites can function in mES cells. We showed in theory that the epigenome should suppress the TF binding differences on SNP-containing binding sites in two people. Using personal data, we identified strong associations between H3K4me2/H3K9ac and the degree of personal differences in NFκB binding in SNP-containing binding sites, which may explain why some SNPs introduce much smaller personal variations on TF binding than other SNPs. In summary, this model presents a powerful approach to analyze the functions of epigenomic modifications. This model was implemented into an open source program APEG (Affinity Prediction by Epigenome and Genome, http://systemsbio.ucsd.edu/apeg).
Down Thomas A
Background DNA methylation can regulate gene expression by modulating the interaction between DNA and proteins or protein complexes. Conserved consensus motifs exist across the human genome ("predicted transcription factor binding sites": "predicted TFBS"), but the large majority of these are proven by chromatin immunoprecipitation and high throughput sequencing (ChIP-seq) not to be biological transcription factor binding sites ("empirical TFBS"). We hypothesize that DNA methylation at conserved consensus motifs prevents promiscuous or disorderly transcription factor binding. Results Using genome-wide methylation maps of the human heart and sperm, we found that all conserved consensus motifs as well as the subset of those that reside outside CpG islands have an aggregate profile of hyper-methylation. In contrast, empirical TFBS with conserved consensus motifs have a profile of hypo-methylation. 40% of empirical TFBS with conserved consensus motifs resided in CpG islands whereas only 7% of all conserved consensus motifs were in CpG islands. Finally, we identified a minority subset of TF whose profiles are either hypo-methylated or neutral at their respective conserved consensus motifs, implying that these TF may be responsible for establishing or maintaining an un-methylated DNA state, or that their binding is not regulated by DNA methylation. Conclusions Our analysis supports the hypothesis that at least for a subset of TF, empirical binding to conserved consensus motifs genome-wide may be controlled by DNA methylation.
Farjo, Q; Jackson, A; Pieke-Dahl, S; Scott, K; Kimberling, W J; Sieving, P A; Richards, J E; Swaroop, A
The NRL gene encodes an evolutionarily conserved basic motif-leucine zipper transcription factor that is implicated in regulating the expression of the photoreceptor-specific gene rhodopsin. NRL is expressed in postmitotic neuronal cells and in lens during embryonic development, but exhibits a retina-specific pattern of expression in the adult. To understand regulation of NRL expression and to investigate its possible involvement in retinopathies, we have determined the complete sequence of the human NRL gene, identified a polymorphic (CA)n repeat (identical to D14S64) within the NRL-containing cosmid, and refined its location by linkage analysis. Since a locus for autosomal recessive retinitis pigmentosa (arRP) has been linked to markers at 14q11 and since mutations in rhodopsin can lead to RP, we sequenced genomic PCR products of the NRL gene and of the rhodopsin-Nrl response element from a panel of patients representing independent families with inherited retinal degeneration. The analysis did not reveal any causative mutations in this group of patients. These investigations provide the basis for delineating the DNA sequence elements that regulate NRL expression in distinct neuronal cell types and should assist in the analysis of NRL as a candidate gene for inherited diseases/syndromes affecting visual function. Copyright 1997 Academic Press.
Kim, Sang Woo; Fishilevich, Elane; Arango-Argoty, Gustavo; Lin, Yuefeng; Liu, Guodong; Li, Zhihua; Monaghan, A Paula; Nichols, Mark; John, Bino
Non-coding RNAs (ncRNAs) play major roles in development and cancer progression. To identify novel ncRNAs that may identify key pathways in breast cancer development, we performed high-throughput transcript profiling of tumor and normal matched-pair tissue samples. Initial transcriptome profiling using high-density genome-wide tiling arrays revealed changes in over 200 novel candidate genomic regions that map to intronic regions. Sixteen genomic loci were identified that map to the long introns of five key protein-coding genes, CRIM1, EPAS1, ZEB2, RBMS1, and RFX2. Consistent with the known role of the tumor suppressor ZEB2 in the cancer-associated epithelial to mesenchymal transition (EMT), in situ hybridization reveals that the intronic regions deriving from ZEB2 as well as those from RFX2 and EPAS1 are down-regulated in cells of epithelial morphology, suggesting that these regions may be important for maintaining normal epithelial cell morphology. Paired-end deep sequencing analysis reveals a large number of distinct genomic clusters with no coding potential within the introns of these genes. These novel transcripts are only transcribed from the coding strand. A comprehensive search for breast cancer associated genes reveals enrichment for transcribed intronic regions from these loci, pointing to an underappreciated role of introns or mechanisms relating to their biology in EMT and breast cancer.
Kurilla, M.G.; Stone, H.O.; Keene, J.D.
The 3' end of the genomic RNA of Newcastle disease virus (NDV) has been sequenced and the leader RNA defined. Using hybridization to a 3'-end-labeled genome, leader RNA species from in vitro transcription reactions and from infected cell extracts were found to be 47 and 53 nucleotides long. In addition, the start site of the 3'-proximal mRNA was determined by sequence analysis of in vitro (beta-32P)GTP-labeled transcription products. The genomic sequence extending beyond the leader region demonstrated an open reading frame for at least 42 amino acids and probably represents the amino terminus of the nucleocapsid protein (NP). The terminal 8 nucleotides of the NDV genome were identical to those of measles virus and Sendai virus while the sequence of the distal half of the leader region was more similar to that of vesicular stomatitis virus. These data argue for strong evolutionary relatedness between the paramyxovirus and rhabdovirus groups.
Bro, Christoffer; Regenberg, Birgitte; Nielsen, Jens
The genome-wide transcriptional response of a Saccharomyces cerevisiae strain deleted in GDH1 that encodes a NADP(+)-dependent glutamate dehydrogenase was compared to a wild-type strain under anaerobic steady-state conditions. The GDH1-deleted strain has a significantly reduced NADPH requirement, and therefore, an altered redox metabolism. Identification of genes with significantly changed expression using a t-test and a Bonferroni correction yielded only 16 transcripts when accepting two false-positives, and 7 of these were Open Reading Frames (ORFs) with unknown function. Among the 16 transcripts the only one with a direct link to redox metabolism was GND1, encoding phosphogluconate dehydrogenase. To extract additional information we analyzed the transcription data for a gene subset consisting of all known genes encoding metabolic enzymes that use NAD(+) or NADP(+). The subset was analyzed for genes with significantly changed expression again with a t-test and correction for multiple testing. This approach was found to enrich the analysis since GND1, ZWF1 and ALD6, encoding the most important enzymes for regeneration of NADPH under anaerobic conditions, were down-regulated together with eight other genes encoding NADP(H)-dependent enzymes. This indicates a possible common redox-dependent regulation of these genes. Furthermore, we showed that it might be necessary to analyze the expression of a subset of genes to extract all available information from global transcription analysis.
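The effect of restricting the test to a gene subset can be illustrated with a minimal Python sketch. It assumes that "accepting two false-positives" means a per-test cutoff of (expected false positives) / (number of tests); the p-values and the test counts (6000 genome-wide, 200 in the subset) are hypothetical illustration values, not data from the study.

```python
def bonferroni_threshold(n_tests, expected_false_positives=1.0):
    """Per-test p-value cutoff under a Bonferroni-style correction.

    Assumption: the cutoff is expected_false_positives / n_tests, so that
    roughly `expected_false_positives` false calls are expected overall.
    """
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return expected_false_positives / n_tests


def significant_genes(p_values, n_tests, expected_false_positives=1.0):
    """Return the sorted gene names whose p-value passes the corrected cutoff."""
    cutoff = bonferroni_threshold(n_tests, expected_false_positives)
    return sorted(gene for gene, p in p_values.items() if p <= cutoff)


# Hypothetical p-values for three genes named in the abstract.
EXAMPLE_P_VALUES = {"GND1": 1e-6, "ZWF1": 4e-4, "ALD6": 6e-4}

if __name__ == "__main__":
    # Genome-wide test (hypothetical 6000 transcripts): strict cutoff.
    print(significant_genes(EXAMPLE_P_VALUES, n_tests=6000, expected_false_positives=2))
    # Subset test (hypothetical 200 redox-enzyme genes): looser cutoff.
    print(significant_genes(EXAMPLE_P_VALUES, n_tests=200, expected_false_positives=2))
```

With fewer tests the corrected cutoff is less stringent, so the same p-values can pass in the subset analysis while failing genome-wide.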
Yagi, Yusuke; Shiina, Takashi
Chloroplasts in land plants have a small genome consisting of only 100 genes encoding partial sets of proteins for photosynthesis, transcription and translation. Although it has been thought that chloroplast transcription is mediated by a basically cyanobacterium-derived system, due to the endosymbiotic origin of plastids, recent studies suggest the existence of a hybrid transcription machinery containing non-bacterial proteins that have been newly acquired during plant evolution. Here, we highlight chloroplast-specific non-bacterial transcription mechanisms by which land plant chloroplasts have gained novel functions.
We performed frequency-domain analysis in the genomes of various organisms using tricolor spectrograms, identifying several types of distinct visual patterns characterizing specific DNA regions. We relate patterns and their frequency characteristics to the sequence characteristics of the DNA. At times, the spectrogram patterns could be related to the structure of the corresponding protein region by using various public databases such as GenBank. Some patterns are explained from the biological nature of the corresponding regions, which relate to chromosome structure and protein coding, and some patterns have yet unknown biological significance. We found biologically meaningful patterns, on the scale of millions of base pairs, to a few hundred base pairs. Chromosome-wide patterns include periodicities ranging from 2 to 300. The color of the spectrogram depends on the nucleotide content at specific frequencies, and therefore can be used as a local indicator of CG content and other measures of relative base content. Several smaller-scale patterns were found to represent different types of domains made up of various tandem repeats.
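The idea of reading periodicities from a DNA sequence in the frequency domain can be sketched as follows. This is a minimal illustration, assuming a common approach in which each base is converted to a 0/1 indicator sequence and a discrete Fourier transform is taken; it does not reproduce the tricolor color mapping of the abstract, and the function names are hypothetical.

```python
import cmath


def base_power_spectrum(sequence, base):
    """Power |X[k]|^2 of the 0/1 indicator sequence for one base (plain DFT)."""
    seq = sequence.upper()
    n = len(seq)
    indicator = [1.0 if s == base else 0.0 for s in seq]
    spectrum = []
    for k in range(n):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(indicator)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def dominant_period(sequence):
    """Period (in bases) of the strongest non-zero frequency, summed over A, C, G, T."""
    n = len(sequence)
    spectra = [base_power_spectrum(sequence, b) for b in "ACGT"]
    total = [sum(s[k] for s in spectra) for k in range(n)]
    # Skip k = 0 (base composition only) and look up to the Nyquist index.
    best_k = max(range(1, n // 2 + 1), key=lambda k: total[k])
    return n / best_k


if __name__ == "__main__":
    # A perfect tandem repeat of a 3-base unit shows a period of 3.
    print(dominant_period("ATG" * 20))
```

A windowed version of the same computation, slid along a chromosome, would give one spectrum per window, which is the raw material for a spectrogram.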
Esquerré, Thomas; Bouvier, Marie; Turlan, Catherine; Carpousis, Agamemnon J; Girbal, Laurence; Cocaign-Bousquet, Muriel
Bacterial adaptation requires large-scale regulation of gene expression. We have performed a genome-wide analysis of the Csr system, which regulates many important cellular functions. The Csr system is involved in post-transcriptional regulation, but a role in transcriptional regulation has also been suggested. Two proteins, an RNA-binding protein CsrA and an atypical signaling protein CsrD, participate in the Csr system. Genome-wide transcript stabilities and levels were compared in wildtype E. coli (MG1655) and isogenic mutant strains deficient in CsrA or CsrD activity demonstrating for the first time that CsrA and CsrD are global negative and positive regulators of transcription, respectively. The role of CsrA in transcription regulation may be indirect due to the 4.6-fold increase in csrD mRNA concentration in the CsrA deficient strain. Transcriptional action of CsrA and CsrD on a few genes was validated by transcriptional fusions. In addition to an effect on transcription, CsrA stabilizes thousands of mRNAs. This is the first demonstration that CsrA is a global positive regulator of mRNA stability. For one hundred genes, we predict that direct control of mRNA stability by CsrA might contribute to metabolic adaptation by regulating expression of genes involved in carbon metabolism and transport independently of transcriptional regulation.
Zhang, Ling; Nemzow, Leah; Chen, Hua; Hu, Jennifer J; Gong, Feng
UV irradiation is known to cause cyclobutane pyrimidine dimers (CPDs) and pyrimidine (6-4) pyrimidone photoproducts (6-4PPs), and plays a large role in the development of cancer. Tumor suppression, through DNA repair and proper cell cycle regulation, is an integral factor in maintaining healthy cells and preventing development of cancer. Transcriptional regulation of the genes involved in the various tumor suppression pathways is essential for them to be expressed when needed and to function properly. BRG1, an ATPase catalytic subunit of the SWI/SNF chromatin remodeling complex, has been identified as a tumor suppressor protein, as it has been shown to play a role in Nucleotide Excision Repair (NER) of CPDs, suppress apoptosis, and restore checkpoint deficiency, in response to UV exposure. Although BRG1 has been shown to regulate transcription of some genes that are instrumental in proper DNA damage repair and cell cycle maintenance in response to UV, its role in transcriptional regulation of the whole genome in response to UV has not yet been elucidated. With whole genome expression profiling in SW13 cells, we show that upon UV induction, BRG1 regulates transcriptional expression of many genes involved in cell stress response. Additionally, our results also highlight BRG1's general role as a master regulator of the genome, as it transcriptionally regulates approximately 4.8% of the human genome, including expression of genes involved in many pathways. RT-PCR and ChIP were used to validate our genome expression analysis. Importantly, our study identifies several novel transcriptional targets of BRG1, such as ATF3. Thus, BRG1 has a larger impact on human genome expression than previously thought, and our studies will provide inroads for future analysis of BRG1's role in gene regulation.
Sprung, Carl N; Yang, Yuqing; Forrester, Helen B; Li, Jason; Zaitseva, Marina; Cann, Leonie; Restall, Tina; Anderson, Robin L; Crosbie, Jeffrey C; Rogers, Peter A W
The majority of cancer patients benefit from radiotherapy. A significant limitation of radiotherapy is its relatively low therapeutic index, defined as the ratio of the maximum radiation dose that causes acceptable normal tissue damage to the minimum dose required to achieve tumor control. Recently, a new radiotherapy modality using synchrotron-generated X-ray microbeam radiotherapy has been demonstrated in animal models to ablate tumors with concurrent sparing of normal tissue. Very little work has been undertaken into the cellular and molecular mechanisms that differentiate microbeam radiotherapy from broad beam. The purpose of this study was to investigate and compare the whole genome transcriptional response of in vivo microbeam radiotherapy versus broad beam irradiated tumors. We hypothesized that gene expression changes after microbeam radiotherapy are different from those seen after broad beam. We found that in EMT6.5 tumors at 4-48 h postirradiation, microbeam radiotherapy differentially regulates a number of genes, including major histocompatibility complex (MHC) class II antigen gene family members, and other immunity-related genes including Ciita, Ifng, Cxcl1, Cxcl9, Indo and Ubd when compared to broad beam. Our findings demonstrate molecular differences in the tumor response to microbeam versus broad beam irradiation and these differences provide insight into the underlying mechanisms of microbeam radiotherapy and broad beam.
Yildiz, Gokhan; Arslan-Ergul, Ayca; Bagislar, Sevgi; Konu, Ozlen; Yuzugullu, Haluk; Gursoy-Yuzugullu, Ozge; Ozturk, Nuri; Ozen, Cigdem; Ozdag, Hilal; Erdal, Esra; Karademir, Sedat; Sagol, Ozgul; Mizrak, Dilsa; Bozkaya, Hakan; Ilk, Hakki Gokhan; Ilk, Ozlem; Bilen, Biter; Cetin-Atalay, Rengul; Akar, Nejat; Ozturk, Mehmet
Senescence is a permanent proliferation arrest in response to cell stress such as DNA damage. It contributes strongly to tissue aging and serves as a major barrier against tumor development. Most tumor cells are believed to bypass the senescence barrier (become "immortal") by inactivating growth control genes such as TP53 and CDKN2A. They also reactivate telomerase reverse transcriptase. Senescence-to-immortality transition is accompanied by major phenotypic and biochemical changes mediated by genome-wide transcriptional modifications. This appears to happen during hepatocellular carcinoma (HCC) development in patients with liver cirrhosis, however, the accompanying transcriptional changes are virtually unknown. We investigated genome-wide transcriptional changes related to the senescence-to-immortality switch during hepatocellular carcinogenesis. Initially, we performed transcriptome analysis of senescent and immortal clones of Huh7 HCC cell line, and identified genes with significant differential expression to establish a senescence-related gene list. Through the analysis of senescence-related gene expression in different liver tissues we showed that cirrhosis and HCC display expression patterns compatible with senescent and immortal phenotypes, respectively; dysplasia being a transitional state. Gene set enrichment analysis revealed that cirrhosis/senescence-associated genes were preferentially expressed in non-tumor tissues, less malignant tumors, and differentiated or senescent cells. In contrast, HCC/immortality genes were up-regulated in tumor tissues, or more malignant tumors and progenitor cells. In HCC tumors and immortal cells genes involved in DNA repair, cell cycle, telomere extension and branched chain amino acid metabolism were up-regulated, whereas genes involved in cell signaling, as well as in drug, lipid, retinoid and glycolytic metabolism were down-regulated. Based on these distinctive gene expression features we developed a 15-gene
Kayal, Ehsan; Bentlage, Bastian; Cartwright, Paulyn; Yanagihara, Angel A; Lindsay, Dhugal J; Hopcroft, Russell R; Collins, Allen G
Hydrozoans display the most morphological diversity within the phylum Cnidaria. While recent molecular studies have provided some insights into their evolutionary history, sister group relationships remain mostly unresolved, particularly at mid-taxonomic levels. Specifically, within Hydroidolina, the most speciose hydrozoan subclass, the relationships and sometimes integrity of orders are highly unsettled. Here we obtained the near complete mitochondrial sequence of twenty-six hydroidolinan hydrozoan species from a range of sources (DNA and RNA-seq data, long-range PCR). Our analyses confirm previous inference of the evolution of mtDNA in Hydrozoa while introducing a novel genome organization. Using RNA-seq data, we propose a mechanism for the expression of mitochondrial mRNA in Hydroidolina that can be extrapolated to the other medusozoan taxa. Phylogenetic analyses using the full set of mitochondrial gene sequences provide some insights into the order-level relationships within Hydroidolina, including siphonophores as the first diverging clade, a well-supported clade comprised of Leptothecata-Filifera III-IV, and a second clade comprised of Aplanulata-Capitata s.s.-Filifera I-II. Finally, we describe our relatively inexpensive and accessible multiplexing strategy to sequence long-range PCR amplicons that can be adapted to most high-throughput sequencing platforms.
Jesse R Raab
Raab, Jesse R.; Resnick, Samuel; Magnuson, Terry
Multiple positions within the SWI/SNF chromatin remodeling complex can be filled by mutually exclusive subunits. Inclusion or exclusion of these proteins defines many unique forms of SWI/SNF and has profound functional consequences. Often this complex is studied as a single entity within a particular cell type and we understand little about the functional relationship between these biochemically distinct forms of the remodeling complex. Here we examine the functional relationships among three complex-specific ARID (AT-Rich Interacting Domain) subunits using genome-wide chromatin immunoprecipitation, transcriptome analysis, and transcription factor binding maps. We find widespread overlap in transcriptional regulation and the genomic binding of distinct SWI/SNF complexes. ARID1B and ARID2 participate in widespread cooperation to repress hundreds of genes. Additionally, we find numerous examples of competition between ARID1A and another ARID, and validate that gene expression changes following loss of one ARID are dependent on the function of an alternative ARID. These distinct regulatory modalities are correlated with differential occupancy by transcription factors. Together, these data suggest that distinct SWI/SNF complexes dictate gene-specific transcription through functional interactions between the different forms of the SWI/SNF complex and associated co-factors. Most genes regulated by SWI/SNF are controlled by multiple biochemically distinct forms of the complex, and the overall expression of a gene is the product of the interaction between these different SWI/SNF complexes. The three mutually exclusive ARID family members are among the most frequently mutated chromatin regulators in cancer, and understanding the functional interactions and their role in transcriptional regulation provides an important foundation to understand their role in cancer. PMID:26716708
Wenger, Yvan; Galliot, Brigitte
Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48'909 unique sequences including splice variants, representing approximately 24'450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10'597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encode short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11'270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events.
Tang, Zhonghui; Luo, Oscar Junhong; Li, Xingwang; Zheng, Meizhen; Zhu, Jacqueline Jufen; Szalaj, Przemyslaw; Trzaskoma, Pawel; Magalska, Adriana; Wlodarczyk, Jakub; Ruszczycki, Blazej; Michalski, Paul; Piecuch, Emaly; Wang, Ping; Wang, Danjuan; Tian, Simon Zhongyuan; Penrad-Mobayed, May; Sachs, Laurent M; Ruan, Xiaoan; Wei, Chia-Lin; Liu, Edison T; Wilczynski, Grzegorz M; Plewczynski, Dariusz; Li, Guoliang; Ruan, Yijun
Spatial genome organization and its effect on transcription remain a fundamental question. We applied an advanced chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CCCTC-binding factor (CTCF) and RNA polymerase II (RNAPII) with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes toward CTCF foci for coordinated transcription. Furthermore, we show that haplotype variants and allelic interactions have differential effects on chromosome configuration, influencing gene expression, and may provide mechanistic insights into functions associated with disease susceptibility. 3D genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D genome strategy thus provides unique insights into the topological mechanism of human variations and diseases.
Müller, Gerd A; Wintsche, Axel; Stangner, Konstanze; Prohaska, Sonja J; Stadler, Peter F; Engeland, Kurt
The cell cycle genes homology region (CHR) has been identified as a DNA element with an important role in transcriptional regulation of late cell cycle genes. It has been shown that such genes are controlled by DREAM, MMB and FOXM1-MuvB and that these protein complexes can contact DNA via CHR sites. However, it has not been elucidated which sequence variations of the canonical CHR are functional and how frequently CHR-based regulation is utilized in mammalian genomes. Here, we define the spectrum of functional CHR elements. As the basis for a computational meta-analysis, we identify new CHR sequences and compile phylogenetic motif conservation as well as genome-wide protein-DNA binding and gene expression data. We identify CHR elements in most late cell cycle genes binding DREAM, MMB, or FOXM1-MuvB. In contrast, Myb- and forkhead-binding sites are underrepresented in both early and late cell cycle genes. Our findings support a general mechanism: sequential binding of DREAM, MMB and FOXM1-MuvB complexes to late cell cycle genes requires CHR elements. Taken together, we define the group of CHR-regulated genes in mammalian genomes and provide evidence that the CHR is the central promoter element in transcriptional regulation of late cell cycle genes by DREAM, MMB and FOXM1-MuvB. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Tegnér, Jesper N.
Mapping out cellular networks in general and transcriptional networks in particular has proved to be a bottleneck hampering our understanding of biological processes. Integrative approaches fusing computational and experimental technologies for decoding transcriptional networks at a high level of resolution are therefore of the utmost importance. Yet, this is challenging since the control of gene expression in eukaryotes is a complex multi-level process influenced by several epigenetic factors and the fine interplay between regulatory proteins and the promoter structure governing the combinatorial regulation of gene expression. In this chapter we review how the CAGE data can be integrated with other measurements such as expression, physical interactions and computational prediction of regulatory motifs, which together can provide a genome-wide picture of eukaryotic transcriptional regulatory networks at a new level of resolution. © 2010 by Pan Stanford Publishing Pte. Ltd. All rights reserved.
Antonio L C Gomes
ChIP-seq enables genome-scale identification of regulatory regions that govern gene expression. However, the biological insights generated from ChIP-seq analysis have been limited to predictions of binding sites and cooperative interactions. Furthermore, ChIP-seq data often poorly correlate with in vitro measurements or predicted motifs, highlighting that binding affinity alone is insufficient to explain transcription factor (TF) binding in vivo. One possibility is that binding sites are not equally accessible across the genome. A more comprehensive biophysical representation of TF-binding is required to improve our ability to understand, predict, and alter gene expression. Here, we show that genome accessibility is a key parameter that impacts TF-binding in bacteria. We developed a thermodynamic model that parameterizes ChIP-seq coverage in terms of genome accessibility and binding affinity. The role of genome accessibility is validated using a large-scale ChIP-seq dataset of the M. tuberculosis regulatory network. We find that accounting for genome accessibility led to a model that explains 63% of the ChIP-seq profile variance, while a model based on motif score alone explains only 35% of the variance. Moreover, our framework enables de novo ChIP-seq peak prediction and is useful for inferring TF-binding peaks in new experimental conditions by reducing the need for additional experiments. We observe that the genome is more accessible in intergenic regions, and that increased accessibility is positively correlated with gene expression and anti-correlated with distance to the origin of replication. Our biophysically motivated model provides a more comprehensive description of TF-binding in vivo from first principles towards a better representation of gene regulation in silico, with promising applications in systems biology.
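One way to picture how accessibility and binding affinity might combine is a simple two-state occupancy calculation. The sketch below is not the authors' model: the function name `occupancy`, its parameters, and the functional form (a Boltzmann weight scaled by an accessibility factor) are all assumptions chosen for illustration.

```python
import math


def occupancy(delta_g, accessibility, tf_conc=1.0, kT=0.592):
    """Hypothetical two-state occupancy of a single binding site.

    delta_g       -- binding free energy in kcal/mol (more negative = stronger)
    accessibility -- factor in [0, 1] scaling how available the site is (assumed)
    tf_conc       -- TF concentration in arbitrary units (assumed)
    kT            -- thermal energy, roughly 0.592 kcal/mol near 25 degrees C
    """
    if not 0.0 <= accessibility <= 1.0:
        raise ValueError("accessibility must lie in [0, 1]")
    # Statistical weight of the bound state relative to the unbound state.
    weight = accessibility * tf_conc * math.exp(-delta_g / kT)
    return weight / (1.0 + weight)
```

Under these assumptions, two sites with the same motif strength (`delta_g`) can show different predicted occupancy purely because their accessibility differs, which is the qualitative point the abstract makes.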
Novichkov, Pavel S; Kazakov, Alexey E; Ravcheev, Dmitry A; Leyn, Semen A; Kovaleva, Galina Y; Sutormin, Roman A; Kazanov, Marat D; Riehl, William; Arkin, Adam P; Dubchak, Inna; Rodionov, Dmitry A
Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. The comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches). RegPrecise (http://regprecise.lbl.gov) is a web resource for collection, visualization and analysis of transcriptional regulons reconstructed by comparative genomics. We significantly expanded a reference collection of manually curated regulons we introduced earlier. RegPrecise 3.0 provides access to inferred regulatory interactions organized by phylogenetic, structural and functional properties. Taxonomy-specific collections include 781 TF regulogs inferred in more than 160 genomes representing 14 taxonomic groups of Bacteria. TF-specific collections include regulogs for a selected subset of 40 TFs reconstructed across more than 30 taxonomic lineages. Novel collections of regulons operated by RNA regulatory elements (riboswitches) include nearly 400 regulogs inferred in 24 bacterial lineages. RegPrecise 3.0 provides four classifications of the reference regulons implemented as controlled vocabularies: 55 TF protein families; 43 RNA motif families; ~150 biological processes or metabolic pathways; and ~200 effectors or environmental signals. Genome-wide visualization of regulatory networks and metabolic pathways covered by the reference regulons is available for all studied genomes. A separate section of RegPrecise 3.0 contains draft regulatory networks in 640 genomes obtained by a conservative propagation of the reference regulons to closely related genomes. RegPrecise 3.0 gives access to the transcriptional regulons reconstructed in
Yi-Miao TANG; You-Zhi MA; Lian-Cheng LI; Xing-Guo YE
To characterize the activation of wheat (Triticum aestivum L.) retrotransposons, transcriptionally active Ty1-copia retrotransposons were identified in wheat by using RT-PCR to amplify the RT domain. Sequence analysis of random RT-PCR clones reveals that Ty1-copia retrotransposons are highly heterogeneous and can be divided into at least four groups, which are tentatively named TaRT-1 to TaRT-4. Dot blot hybridization indicates that TaRT-1 exists in the wheat genome as multiple copies (at 30 000 copies per hexaploid genome (ABD)). Northern blot hybridization showed that TaRT-1 is only expressed at a low level under normal conditions in seedlings, but at a high level when induced by powdery mildew fungus, jasmonic acid (JA) and salicylic acid (SA). These results suggest that TaRT-1 expression is highly sensitive to biotic and abiotic stresses.
Many biological processes are controlled by intricate networks of transcriptional regulators. With the development of microarray technology, transcriptional changes can be examined at the whole-genome level. However, such analysis often lacks information on the hierarchical relationship between components of a given system. Systemic acquired resistance (SAR) is an inducible plant defense response involving a cascade of transcriptional events induced by salicylic acid through the transcription cofactor NPR1. To identify additional regulatory nodes in the SAR network, we performed microarray analysis on Arabidopsis plants expressing the NPR1-GR (glucocorticoid receptor) fusion protein. Since nuclear translocation of NPR1-GR requires dexamethasone, we were able to control NPR1-dependent transcription and identify direct transcriptional targets of NPR1. We show that NPR1 directly upregulates the expression of eight WRKY transcription factor genes. This large family of 74 transcription factors has been implicated in various defense responses, but no specific WRKY factor has been placed in the SAR network. Identification of NPR1-regulated WRKY factors allowed us to perform in-depth genetic analysis on a small number of WRKY factors and test well-defined phenotypes of single and double mutants associated with NPR1. Among these WRKY factors we found both positive and negative regulators of SAR. This genomics-directed approach unambiguously positioned five WRKY factors in the complex transcriptional regulatory network of SAR. Our work not only discovered new transcription regulatory components in the signaling network of SAR but also demonstrated that functional studies of large gene families have to take into consideration sequence similarity as well as the expression patterns of the candidates.
Pal, Arnab; Srivastava, Tapasya; Sharma, Manish K; Mehndiratta, Mohit; Das, Prerna; Sinha, Subrata; Chattopadhyay, Parthaprasad
Hypoxia is an integral part of tumorigenesis and contributes extensively to the neoplastic phenotype including drug resistance and genomic instability. It has also been reported that hypoxia results in global demethylation. Because a majority of the cytosine-phosphate-guanine (CpG) islands are found within the repeat elements of DNA, and are usually methylated under normoxic conditions, we suggested that retrotransposable Alu or short interspersed nuclear elements (SINEs), which show altered methylation and associated changes of gene expression during hypoxia, could be associated with genomic instability. U87MG glioblastoma cells were cultured in 0.1% O₂ for 6 weeks and compared with cells cultured in 21% O₂ for the same duration. Real-time PCR analysis showed a significant increase in SINE and reverse transcriptase coding long interspersed nuclear element (LINE) transcripts during hypoxia. Sequencing of bisulphite-treated DNA as well as the Combined Bisulfite Restriction Analysis (COBRA) assay showed that the SINE loci studied underwent significant hypomethylation, though there was patchy hypermethylation at a few sites. The inter-Alu PCR profile of DNA from cells cultured under 6-week hypoxia, its 4-week reversion to normoxia and 6-week normoxia showed several changes in the band pattern, indicating increased Alu-mediated genomic alteration. Our results show that aberrant methylation leading to increased transcription of SINE and reverse transcriptase associated LINE elements could lead to increased genomic instability in hypoxia. This might be a cause of genetic heterogeneity in tumours, especially in a variegated hypoxic environment, and lead to the development of foci of more aggressive tumour cells.
Matson, Eric G; Rosenthal, Adam Z; Zhang, Xinning; Leadbetter, Jared R
noncanonical amino acid selenocysteine is able to tune transcription of an important metabolic gene via translational coupling. Furthermore, a genome-wide analysis reveals that transcriptional decoupling produces a wide-ranging effect and that this effect is not uniform. These results exemplify how growth conditions that impact translational processivity can rapidly feed back on transcriptional productivity of prespecified groups of genes, providing bacteria with an efficient response to environmental changes.
Min, Kyunghun; Son, Hokyoung; Lim, Jae Yun; Choi, Gyung Ja; Kim, Jin-Cheol; Harris, Steven D; Lee, Yin-Won
The survival of cellular organisms depends on the faithful replication and transmission of DNA. Regulatory factor X (RFX) transcription factors are well conserved in animals and fungi, but their functions are diverse, ranging from the DNA damage response to ciliary gene regulation. We investigated the role of the sole RFX transcription factor, RFX1, in the plant-pathogenic fungus Fusarium graminearum. Deletion of rfx1 resulted in multiple defects in hyphal growth, conidiation, virulence, and sexual development. Deletion mutants of rfx1 were more sensitive to various types of DNA damage than the wild-type strain. Septum formation was inhibited and micronuclei were produced in the rfx1 deletion mutants. The results of the neutral comet assay demonstrated that disruption of rfx1 function caused spontaneous DNA double-strand breaks (DSBs). The transcript levels of genes involved in DNA DSB repair were upregulated in the rfx1 deletion mutants. DNA DSBs produced micronuclei and delayed septum formation in F. graminearum. Green fluorescent protein (GFP)-tagged RFX1 localized in nuclei and exhibited high expression levels in growing hyphae and conidiophores, where nuclear division was actively occurring. RNA-sequencing-based transcriptomic analysis revealed that RFX1 suppressed the expression of many genes, including those required for the repair of DNA damage. Taken together, these findings indicate that the transcriptional repressor rfx1 performs crucial roles during normal cell growth by maintaining genome integrity.
Mapping transcriptional-regulatory networks requires the identification of target genes, binding specificities and signalling pathways of transcription factors. However, the characterization of each transcription factor sufficiently for deciphering such networks remains laborious. The recent availability of overexpression and deletion strains for almost all of the transcription factor genes in the fission yeast Schizosaccharomyces pombe provides a valuable resource to better investigate transcription factors using systematic genetics. In the present paper, I review and discuss the utility of these strain collections combined with transcriptome profiling and genome-wide chromatin immunoprecipitation to identify the target genes of transcription factors.
Wang, Pengfei; Song, Hui; Li, Changsheng; Li, Pengcheng; Li, Aiqin; Guan, Hongshan; Hou, Lei; Wang, Xingjun
Heat shock transcription factors (Hsfs) are important transcription factors (TFs) in protecting plants from damage caused by various stresses. The released whole genome sequences of wild peanuts make genome-wide analysis of Hsfs in peanut possible. In this study, a total of 16 and 17 Hsf genes were identified from Arachis duranensis and A. ipaensis, respectively. We identified 16 orthologous Hsf gene pairs in both peanut species; however, HsfXs was only identified from A. ipaensis. Orthologous pairs between the two wild peanut species were highly syntenic. Based on phylogenetic relationship, peanut Hsfs were divided into groups A, B, and C. Selection pressure analysis showed that group B Hsf genes mainly underwent positive selection and group A Hsfs were affected by purifying selection. Small scale segmental and tandem duplication may play important roles in the evolution of these genes. Cis-elements, such as ABRE, DRE, and HSE, were found in the promoters of most Arachis Hsf genes. Five AdHsfs and two AiHsfs contained fungal elicitor responsive elements, suggesting their involvement in response to fungal infection. These genes were differentially expressed in cultivated peanut under abiotic stress and Aspergillus flavus infection. AhHsf2 and AhHsf14 were significantly up-regulated after inoculation with A. flavus, suggesting their possible role in fungal resistance.
James M. Slavicek
Genomic expression of the Lymantria dispar multinucleocapsid nuclear polyhedrosis virus (LdMNPV) was studied. Viral specific transcripts expressed in cell culture at various times from 2 through 72 h postinfection were identified and their genomic origins mapped through Northern analysis. Sixty-five distinct transcripts were identified in this...
Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, the transcriptional consequences of these genomic alterations in cancer genomes remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on the affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with or without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate that genomic alterations in cancer genomes have diverse transcriptomic effects, and that integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genomes.
Ptitsyn, Andrey; Temanni, Ramzi; Bouchard, Christelle; Anderson, Peter A V
Transcriptomes are one of the first sources of high-throughput genomic data that have benefitted from the introduction of Next-Gen Sequencing. As sequencing technology becomes more accessible, transcriptome sequencing is applicable to multiple organisms for which genome sequences are unavailable. Currently all methods for de novo assembly are based on the concept of matching the nucleotide context overlapping between short fragments-reads. However, even short reads may still contain biologically relevant information which can be used as hints in guiding the assembly process. We propose a computational workflow for the reconstruction and functional annotation of expressed gene transcripts that does not require a reference genome sequence and can be tolerant to low coverage, high error rates and other issues that often lead to poor results of de novo assembly in studies of non-model organisms. We start with either raw sequences or the output of a context-based de novo transcriptome assembly. Instead of mapping reads to a reference genome or creating a completely unsupervised clustering of reads, we assemble the unknown transcriptome using nearest homologs from a public database as seeds. We consider even distant relations, indirectly linking protein-coding fragments to entire gene families in multiple distantly related genomes. The intended application of the proposed method is an additional step of semantic (based on relations between protein-coding fragments) scaffolding following traditional (i.e. based on sequence overlap) de novo assembly. The method we developed was effective in analysis of the jellyfish Cyanea capillata transcriptome and may be applicable in other studies of gene expression in species lacking a high quality reference genome sequence. Our algorithms are implemented in C and designed for parallel computation using a high-performance computer. The software is available free of charge via an open source license.
This thesis described a collection of bioinformatic analyses on complete genome sequence data. We have studied the evolution of gene content and find that vertical inheritance dominates over horizontal gene transfer, even to the extent that we can use the gene content to make genome phylogenies. Usi
Kantún-Moreno, Nuvia; Vázquez-Euán, Roberto; Tzec-Simá, Miguel; Peraza-Echeverría, Leticia; Grijalva-Arango, Rosa; Rodríguez-García, Cecilia; James, Andrew C; Ramírez-Prado, Jorge; Islas-Flores, Ignacio; Canto-Canché, Blondy
The hemibiotrophic fungus Mycosphaerella fijiensis is the causal agent of black Sigatoka (BS), the most devastating foliar disease in banana (Musa spp.) worldwide. Little is known about genes that are important during M. fijiensis-Musa sp. interaction. The fungal cell wall is an attractive area of study because it is essential for maintenance of cellular homeostasis and it is the most external structure in the fungal cell and therefore mediates the interaction of the pathogen with the host. In this manuscript we describe the in silico identification of the glycosyl phosphatidylinositol-protein (GPI) family in M. fijiensis, and the analysis of two β-1,3-glucanosyltransferases (Gas), selected by homology with fungal pathogenicity factors. Potential roles in pathogenesis were evaluated by analyzing expression during different stages of black Sigatoka disease, comparing expression data with BS symptoms and fungal biomass inside leaves. Real-time quantitative RT-PCR showed nearly constant expression of MfGAS1 with slight increases (about threefold) in conidia and at the speck-necrotrophic stage during banana-pathogen interaction. Conversely, MfGAS2 expression was increased during biotrophy (about seven times) and reached a maximum at speck (about 23 times) followed by a progressive decrease in subsequent stages, suggesting an active role in M. fijiensis pathogenesis.
Mousavi, Kambiz; Zare, Hossein; Dell'orso, Stefania;
Transcription factors and DNA regulatory binding motifs are fundamental components of the gene regulatory network. Here, by using genome-wide binding profiling, we show extensive occupancy of transcription factors of myogenesis (MyoD and Myogenin) at extragenic enhancer regions coinciding with RN...
BACKGROUND: Eukaryotic genomes are scattered with retroelements that proliferate through retrotransposition. Although retroelements make up around 40 percent of the human genome, large regions are found to be completely devoid of retroelements. This has been hypothesised to be a result of genomic regions being intolerant to insertions of retroelements. The inadvertent transcriptional activity of retroelements may affect neighbouring genes, which in turn could be detrimental to an organism. We speculate that such retroelement transcription, or transcriptional interference, is a contributing factor in generating and maintaining retroelement-free regions in the human genome. METHODOLOGY/PRINCIPAL FINDINGS: Based on the known transcriptional properties of retroelements, we expect long interspersed elements (LINEs) to be able to display a high degree of transcriptional interference. In contrast, we expect short interspersed elements (SINEs) to display very low levels of transcriptional interference. We find that genomic regions devoid of LINEs are enriched for protein-coding genes, but that this is not the case for regions devoid of SINEs. This is expected if genes are subject to selection against transcriptional interference. We do not find microRNAs to be associated with genomic regions devoid of either SINEs or LINEs. We further observe an increased relative activity of genes overlapping LINE-free regions during early embryogenesis, where activity of LINEs has been identified previously. CONCLUSIONS/SIGNIFICANCE: Our observations are consistent with the notion that selection against transcriptional interference has contributed to the maintenance and/or generation of retroelement-free regions in the human genome.
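Locating regions devoid of a given retroelement class amounts to finding gaps between annotated intervals. The sketch below illustrates that step only, under stated assumptions: the function name `element_free_regions`, the half-open coordinate convention, and the `min_length` cutoff are hypothetical and not taken from the study.

```python
def element_free_regions(elements, chrom_length, min_length=10_000):
    """Return (start, end) gaps of at least min_length bp that contain no element.

    elements     -- iterable of (start, end) half-open intervals on one chromosome
    chrom_length -- chromosome length in bp
    min_length   -- hypothetical size cutoff; not taken from the study
    """
    gaps = []
    cursor = 0  # first position not yet covered by any element seen so far
    for start, end in sorted(elements):
        if start - cursor >= min_length:
            gaps.append((cursor, start))
        # Overlapping or nested elements must not move the cursor backwards.
        cursor = max(cursor, end)
    if chrom_length - cursor >= min_length:
        gaps.append((cursor, chrom_length))
    return gaps
```

Running this once on LINE annotations and once on SINE annotations would give the two sets of element-free regions that could then be intersected with gene coordinates.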
Here, we evaluate the contribution of two major biological processes, DNA replication and transcription, to mutation rate variation in human genomes. Based on analysis of public human tissue transcriptomics data, a high-resolution replication map of HeLa cells and dbSNP data, we present significant correlations between expression breadth, replication time in local regions and SNP density. SNP density of tissue-specific (TS) genes is significantly higher than that of housekeeping (HK) genes. TS genes tend to be located in late-replicating genomic regions, and genes in such regions have a higher SNP density compared to those in early-replicating regions. In addition, SNP density is found to be positively correlated with expression level among HK genes. We conclude that the process of DNA replication generates stronger mutational pressure than transcription-associated biological processes do, resulting in an increase of mutation rate in TS genes while having weaker effects on HK genes. In contrast, transcription-associated processes are mainly responsible for the accumulation of mutations in highly-expressed HK genes.
Blot, Nicolas; Mavathur, Ramesh; Geertz, Marcel; Travers, Andrew; Muskhelishvili, Georgi
Regulation of cellular growth implies spatiotemporally coordinated programmes of gene transcription. A central question, therefore, is how global transcription is coordinated in the genome. The growth of the unicellular organism Escherichia coli is associated with changes in both the global superhelicity modulated by cellular topoisomerase activity and the relative proportions of the abundant DNA-architectural chromatin proteins. Using a DNA-microarray-based approach that combines mutations in the genes of two important chromatin proteins with induced changes of DNA superhelicity, we demonstrate that genomic transcription is tightly associated with the spatial distribution of supercoiling sensitivity, which in turn depends on chromatin proteins. We further demonstrate that essential metabolic pathways involved in the maintenance of growth respond distinctly to changes of superhelicity. We infer that a homeostatic mechanism organizing the supercoiling sensitivity is coordinating the growth-phase-dependent transcription of the genome.
Frith, Martin C.; Valen, Eivind Dale; Krogh, Anders
that initiation events are clustered on the chromosomes at multiple scales - clusters within clusters - indicating multiple regulatory processes. Within the smallest of such clusters, which can be interpreted as core promoters, the local DNA sequence predicts the relative transcription start usage of each...... of large- and small-scale effects: the selection of transcription start sites is largely governed by the local DNA sequence, whereas the transcriptional activity of a locus is regulated at a different level; it is affected by distal features or events such as enhancers and chromatin remodeling....
Using high-throughput technologies, abundances and other features of genes and proteins have been measured on a genome-wide scale in Saccharomyces cerevisiae. In contrast, secondary structure in 5'-untranslated regions (UTRs) of mRNA has only been investigated for a limited number of genes. Here, the aim is to study genome-wide regulatory effects of mRNA 5'-UTR folding free energies. We performed computations of secondary structures in 5'-UTRs and their folding free energies for all verified genes in S. cerevisiae. We found significant correlations between folding free energies of 5'-UTRs and various transcript features measured in genome-wide studies of yeast. In particular, mRNAs with weakly folded 5'-UTRs have higher translation rates, higher abundances of the corresponding proteins, longer half-lives, and higher numbers of transcripts, and are upregulated after heat shock. Furthermore, 5'-UTRs have significantly higher folding free energies than other genomic regions and randomized sequences. We also found a positive correlation between transcript half-life and ribosome occupancy that is more pronounced for short-lived transcripts, which supports a picture of competition between translation and degradation. Among the genes with strongly folded 5'-UTRs, there is a huge overrepresentation of uncharacterized open reading frames. Based on our analysis, we conclude that (i) there is a widespread bias for 5'-UTRs to be weakly folded, (ii) folding free energies of 5'-UTRs are correlated with mRNA translation and turnover on a genomic scale, and (iii) transcripts with strongly folded 5'-UTRs are often rare and hard to find experimentally.
Maliszewska-Olejniczak, Kamila; Gruchota, Julita; Gromadka, Robert; Denby Wilkes, Cyril; Arnaiz, Olivier; Mathy, Nathalie; Duharcourt, Sandra; Bétermier, Mireille; Nowak, Jacek K
Because of their nuclear dimorphism, ciliates provide a unique opportunity to study the role of non-coding RNAs (ncRNAs) in the communication between germline and somatic lineages. In these unicellular eukaryotes, a new somatic nucleus develops at each sexual cycle from a copy of the zygotic (germline) nucleus, while the old somatic nucleus degenerates. In the ciliate Paramecium tetraurelia, the genome is massively rearranged during this process through the reproducible elimination of repeated sequences and the precise excision of over 45,000 short, single-copy Internal Eliminated Sequences (IESs). Different types of ncRNAs resulting from genome-wide transcription were shown to be involved in the epigenetic regulation of genome rearrangements. To understand how ncRNAs are produced from the entire genome, we have focused on a homolog of the TFIIS elongation factor, which regulates RNA polymerase II transcriptional pausing. Six TFIIS paralogs, representing four distinct families, can be found in the P. tetraurelia genome. Using RNA interference, we showed that TFIIS4, which encodes a development-specific TFIIS protein, is essential for the formation of a functional somatic genome. Molecular analyses and high-throughput DNA sequencing upon TFIIS4 RNAi demonstrated that TFIIS4 is involved in all kinds of genome rearrangements, including excision of ~48% of IESs. Localization of a GFP-TFIIS4 fusion revealed that TFIIS4 appears specifically in the new somatic nucleus at an early developmental stage, before IES excision. RT-PCR experiments showed that TFIIS4 is necessary for the synthesis of IES-containing non-coding transcripts. We propose that these IES+ transcripts originate from the developing somatic nucleus and serve as pairing substrates for germline-specific short RNAs that target elimination of their homologous sequences. Our study, therefore, connects the onset of zygotic non-coding transcription to the control of genome plasticity in Paramecium, and establishes for
Beurden, van S.J.; Peeters, B.P.H.; Rottier, P.J.M.; Davison, A.A.; Engelsma, M.Y.
Background Whereas temporal gene expression in mammalian herpesviruses has been studied extensively, little is known about gene expression in fish herpesviruses. Here we report a genome-wide transcription analysis of a fish herpesvirus, anguillid herpesvirus 1, in cell culture, studied during the
Kouzine, Fedor; Gupta, Ashutosh; Baranello, Laura; Wojtowicz, Damian; Benaissa, Khadija; Liu, Juhong; Przytycka, Teresa M.; Levens, David
Transcription has the capacity to mechanically modify DNA topology, DNA structure, and nucleosome arrangement. Resulting from ongoing transcription, these modifications, in turn, may provide instant feedback to the transcription machinery. To substantiate the connection between transcription and DNA dynamics, we charted an ENCODE map of transcription-dependent dynamic supercoiling in human Burkitt lymphoma cells using psoralen photobinding to probe DNA topology in vivo. Dynamic supercoils spread ~1.5 kb upstream of the start sites of active genes. Low and high output promoters handle this torsional stress differently, as shown using inhibitors of transcription and topoisomerases, and by chromatin immunoprecipitation of RNA polymerase and topoisomerases I and II. Whereas lower outputs are managed adequately by topoisomerase I, high output promoters additionally require topoisomerase II. The genome-wide coupling between transcription and DNA topology emphasizes the importance of dynamic supercoiling for gene regulation. PMID:23416947
Genome tiling microarray studies have consistently documented rich transcriptional activity beyond the annotated genes. However, systematic characterization and transcriptional profiling of the putative novel transcripts on the genome scale are still lacking. We report here the identification of 25,352 and 27,744 transcriptionally active regions (TARs) not encoded by annotated exons in the rice (Oryza sativa) subspecies japonica and indica, respectively. The non-exonic TARs account for approximately two thirds of the total TARs detected by tiling arrays and represent transcripts likely conserved between japonica and indica. Transcription of 21,018 (83%) japonica non-exonic TARs was verified through expression profiling in 10 tissue types using a re-array in which annotated genes and TARs were each represented by five independent probes. Subsequent analyses indicate that about 80% of the japonica TARs that were not assigned to annotated exons can be assigned to various putatively functional or structural elements of the rice genome, including splice variants, uncharacterized portions of incompletely annotated genes, antisense transcripts, duplicated gene fragments, and potential non-coding RNAs. These results provide a systematic characterization of non-exonic transcripts in rice and thus expand the current view of the complexity and dynamics of the rice transcriptome.
Hua, Yingpeng; Zhou, Ting; Ding, Guangda; Yang, Qingyong; Shi, Lei; Xu, Fangsen
Allotetraploid rapeseed (Brassica napus L. AnAnCnCn, 2n=4x=38) is highly susceptible to boron (B) deficiency, a widespread limiting factor that causes severe losses in seed yield. The genetic variation in the sensitivity to B deficiency found in rapeseed genotypes emphasizes the complex response architecture. In this research, a B-inefficient genotype, ‘Westar 10’ (‘W10’), responded to B deficiencies during vegetative and reproductive development with an over-accumulation of reactive oxygen species, severe lipid peroxidation, evident plasmolysis, abnormal floral organogenesis, and widespread sterility compared to a B-efficient genotype, ‘Qingyou 10’ (‘QY10’). Whole-genome re-sequencing (WGS) of ‘QY10’ and ‘W10’ revealed a total of 1 605 747 single nucleotide polymorphisms and 218 755 insertions/deletions unevenly distributed across the allotetraploid rapeseed genome (~1130Mb). Digital gene expression (DGE) profiling identified more genes related to B transporters, antioxidant enzymes, and the maintenance of cell walls and membranes with higher transcript levels in the roots of ‘QY10’ than in ‘W10’ under B deficiency. Furthermore, based on WGS and bulked segregant analysis of the doubled haploid (DH) line pools derived from ‘QY10’ and ‘W10’, two significant quantitative trait loci (QTLs) for B efficiency were characterized on chromosome C2, and DGE-assisted QTL-seq analyses then identified a nodulin 26-like intrinsic protein gene and an ATP-binding cassette (ABC) transporter gene as the corresponding candidates regulating B efficiency. This research facilitates a more comprehensive understanding of the differential physiological and transcriptional responses to B deficiency and abundant genetic diversity in rapeseed genotypes, and the DGE-assisted QTL-seq analyses provide novel insights regarding the rapid dissection of quantitative trait genes in plant species with complex genomes. PMID:27639094
Claire V Harper
In individual mammalian cells the expression of some genes such as prolactin is highly variable over time and has been suggested to occur in stochastic pulses. To investigate the origins of this behavior and to understand its functional relevance, we quantitatively analyzed this variability using new mathematical tools that allowed us to reconstruct dynamic transcription rates of different reporter genes controlled by identical promoters in the same living cell. Quantitative microscopic analysis of two reporter genes, firefly luciferase and destabilized EGFP, was used to analyze the dynamics of prolactin promoter-directed gene expression in living individual clonal and primary pituitary cells over periods of up to 25 h. We quantified the time-dependence and cyclicity of the transcription pulses and estimated the length and variation of active and inactive transcription phases. We showed an average cycle period of approximately 11 h and demonstrated that while the measured time distribution of active phases agreed with commonly accepted models of transcription, the inactive phases were differently distributed and showed strong memory, with a refractory period of transcriptional inactivation close to 3 h. Cycles in transcription occurred at two distinct prolactin-promoter controlled reporter genes in the same individual clonal or primary cells. However, the timing of the cycles was independent and out-of-phase. For the first time, we have analyzed transcription dynamics from two equivalent loci in real-time in single cells. In unstimulated conditions, cells showed independent transcription dynamics at each locus. A key result from these analyses was the evidence for a minimum refractory period in the inactive-phase of transcription. The response to acute signals and the result of manipulation of histone acetylation was consistent with the hypothesis that this refractory period corresponded to a phase of chromatin remodeling which significantly
Checcucci, Alice; Mengoni, Alessio
Integrated Microbial Genomes and Metagenomes (IMG) is a biocomputational system that provides information and support for the annotation and comparative analysis of microbial genomes and metagenomes. IMG has been developed by the US Department of Energy (DOE) Joint Genome Institute (JGI). The IMG platform contains both draft and complete genomes sequenced by the Joint Genome Institute, together with other publicly available genomes. Genomes of strains belonging to the Archaea, Bacteria, and Eukarya domains are present, as well as those of viruses and plasmids. Here, we describe some essential features of the IMG system and present a case study for pangenome analysis.
Gaykalova, Daria A; Kulaeva, Olga I; Volokh, Olesya; Shaytan, Alexey K; Hsieh, Fu-Kai; Kirpichnikov, Mikhail P; Sokolova, Olga S; Studitsky, Vasily M
Thousands of human and Drosophila genes are regulated at the level of transcript elongation and nucleosomes are likely targets for this regulation. However, the molecular mechanisms of formation of the nucleosomal barrier to transcribing RNA polymerase II (Pol II) and nucleosome survival during/after transcription remain unknown. Here we show that both DNA-histone interactions and Pol II backtracking contribute to formation of the barrier and that nucleosome survival during transcription likely occurs through allosterically stabilized histone-histone interactions. Structural analysis indicates that after Pol II encounters the barrier, the enzyme backtracks and nucleosomal DNA recoils on the octamer, locking Pol II in the arrested state. DNA is displaced from one of the H2A/H2B dimers that remains associated with the octamer. The data reveal the importance of intranucleosomal DNA-protein and protein-protein interactions during conformational changes in the nucleosome structure on transcription. Mechanisms of nucleosomal barrier formation and nucleosome survival during transcription are proposed.
Judith Guevarra Enriquez
Content analysis has dominated computer-mediated communication and educational technology studies for some time, and a review of its practices applied to online corpus of data or messages is overdue. We are confronted with complexity given the various foci, nuances and models for theorising learning and applying methods. One common suggestion to deal with the complexity in content analysis is a call for standardisation by replication or systematic research studies. This article presents its ‘discontent’ with content analysis, discussing the issues and concerns that surround the analysis of online transcripts. It does not attempt to resolve nor provide a definitive answer. Instead, it is an open inquiry into another way of looking at online content. It presents an alternative or perhaps an extension of what we have come to know as content analysis. It argues for the notion of genres as another way of conceptualising online transcripts. It proposes two things: first, that in performing transcript analysis, it is worthwhile to think how messages relate to a system of interactions that persists even beyond the online environment; secondly, that there is an emergent and recurring metastructuring at work in online environments that is worth exploring, instead of imposing structures (models and frameworks) that do not fit the emerging communicative practices of participants.
Genomic islands (GIs), frequently associated with the pathogenicity of bacteria and having a substantial influence on bacterial evolution, are groups of "alien" elements which probably undergo special temporal-spatial regulation in the host genome. Are there particular hallmark transcriptional signals for these "exotic" regions? We here explore the potential transcriptional signals that underlie the GIs beyond the conventional views on basic sequence composition, such as codon usage and GC property bias. We show that there is a significant enrichment of transcription start positions (TSPs) in the GI regions compared to the whole genome of Salmonella enterica and Escherichia coli. There was up to a four-fold increase for 70% of the GIs, implying that a high-density TSP profile can potentially differentiate the GI regions. Based on this feature, we developed a new sliding window method, GIST (Genomic-island Identification by Signals of Transcription), to identify these regions. Subsequently, we compared the known GI-associated features of the GIs detected by GIST and by the existing method IslandViewer to those of the whole genome. Our method demonstrates high sensitivity in detecting GIs harboring genes with biased GI-like function, preferred subcellular localization, skewed GC property, shorter gene length and biased "non-optimal" codon usage. The special transcriptional signals discovered here may contribute to the coordinate expression regulation of foreign genes. Finally, by using GIST, we detected many interesting GIs in the 2011 German E. coli O104:H4 outbreak strain TY-2482, including the microcin H47 system and the gene cluster ycgXEFZ-ymgABC that activates the production of biofilm matrix. The aforesaid findings highlight the power of GIST to predict GIs with distinct intrinsic features to the genome. The heterogeneity of cumulative TSPs profiles may not only be a better identity for "alien" regions, but also provide hints to the special
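The sliding-window idea behind the abstract above (flag windows whose density of transcription start positions is several-fold higher than the genome-wide density) can be sketched as follows. This is not the GIST implementation: the function name, the window and step sizes, and the four-fold default threshold are assumptions chosen only to mirror the "up to a four-fold increase" mentioned in the abstract.

```python
from bisect import bisect_left

def tsp_enriched_windows(tsp_positions, genome_length, window=10_000, step=5_000, min_fold=4.0):
    """Return (start, end, fold) for windows with TSP density >= min_fold x the genome-wide density.

    All parameter defaults are hypothetical and would need tuning on real data.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    positions = sorted(tsp_positions)
    genome_density = len(positions) / genome_length  # TSPs per base pair
    hits = []
    for start in range(0, max(genome_length - window, 0) + 1, step):
        end = start + window
        # Count TSPs with start <= position < end using binary search.
        count = bisect_left(positions, end) - bisect_left(positions, start)
        fold = (count / window) / genome_density if genome_density else 0.0
        if fold >= min_fold:
            hits.append((start, end, fold))
    return hits
```

Overlapping windows that pass the threshold would normally be merged into candidate regions afterwards; that step is left out to keep the sketch short.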
Berger, Michael; Farcas, Anca; Geertz, Marcel; Zhelyazkova, Petya; Brix, Klaudia; Travers, Andrew; Muskhelishvili, Georgi
The histone-like protein HU is a highly abundant DNA architectural protein that is involved in compacting the DNA of the bacterial nucleoid and in regulating the main DNA transactions, including gene transcription. However, the coordination of the genomic structure and function by HU is poorly understood. Here, we address this question by comparing transcript patterns and spatial distributions of RNA polymerase in Escherichia coli wild-type and hupA/B mutant cells. We demonstrate that, in mutant cells, upregulated genes are preferentially clustered in a large chromosomal domain comprising the ribosomal RNA operons organized on both sides of OriC. Furthermore, we show that, in parallel to this transcription asymmetry, mutant cells are also impaired in forming the transcription foci, spatially confined aggregations of RNA polymerase molecules transcribing strong ribosomal RNA operons. Our data thus implicate HU in coordinating the global genomic structure and function by regulating the spatial distribution of RNA polymerase in the nucleoid.
Grdzelishvili, Valery Z.; Garcia-Ruiz, Hernan; Watanabe, Tokiko; Ahlquist, Paul
Replication by many positive-strand RNA viruses includes genomic RNA amplification and subgenomic mRNA (sgRNA) transcription. For brome mosaic virus (BMV), both processes occur in virus-induced, membrane-associated compartments, require BMV replication factors 1a and 2a, and use negative-strand RNA3 as a template for genomic RNA3 and sgRNA syntheses. To begin elucidating their relations, we examined the interaction of RNA3 replication and sgRNA transcription in Saccharomyces cerevisiae expres...
Carter, Mark G; Sharov, Alexei A; VanBuren, Vincent; Dudekula, Dawood B; Carmack, Condie E; Nelson, Charlie; Ko, Minoru SH
The ability to quantitatively measure the expression of all genes in a given tissue or cell with a single assay is an exciting promise of gene-expression profiling technology. An in situ-synthesized 60-mer oligonucleotide microarray designed to detect transcripts from all mouse genes was validated, as well as a set of exogenous RNA controls derived from the yeast genome (made freely available without restriction), which allow quantitative estimation of absolute endogenous transcript abundance. PMID:15998450
Background: During the development of the Drosophila central nervous system the process of midline crossing is orchestrated by a number of guidance receptors and ligands. Many key axon guidance molecules have been identified in both invertebrates and vertebrates, but the transcriptional regulation of growth cone guidance remains largely unknown. It is established that translational regulation plays a role in midline crossing, and there are indications that transcriptional regulation is also involved. To investigate this issue, we conducted a genome-wide study of transcription in Drosophila embryos using wild type and a number of well-characterized Drosophila guidance mutants and transgenics. We also analyzed a previously published microarray time course of Drosophila embryonic development with an axon guidance focus. Results: Using hopach, a novel clustering method which is well suited to microarray data analysis, we identified groups of genes with similar expression patterns across guidance mutants and transgenics. We then systematically characterized the resulting clusters with respect to their relevance to axon guidance using two complementary controlled vocabularies: the Gene Ontology (GO) and anatomical annotations of the Atlas of Pattern of Gene Expression (APoGE) in situ hybridization database. The analysis indicates that regulation of gene expression does play a role in the process of axon guidance in Drosophila. We also find a strong link between axon guidance and hemocyte migration, a result that agrees with mounting evidence that axon guidance molecules are co-opted in vertebrate vascularization. Cell cyclin activity in the context of axon guidance is also suggested from our array data. RNA and protein expression patterns of cell cyclins in axon guidance mutants and transgenics support this possible link. Conclusion: This study provides important insights into the regulation of axon guidance in vivo.
Hsiung, Chris C-S; Bartman, Caroline R; Huang, Peng; Ginart, Paul; Stonestrom, Aaron J; Keller, Cheryl A; Face, Carolyne; Jahn, Kristen S; Evans, Perry; Sankaranarayanan, Laavanya; Giardine, Belinda; Hardison, Ross C; Raj, Arjun; Blobel, Gerd A
During mitosis, RNA polymerase II (Pol II) and many transcription factors dissociate from chromatin, and transcription ceases globally. Transcription is known to restart in bulk by telophase, but whether de novo transcription at the mitosis-G1 transition is in any way distinct from later in interphase remains unknown. We tracked Pol II occupancy genome-wide in mammalian cells progressing from mitosis through late G1. Unexpectedly, during the earliest rounds of transcription at the mitosis-G1 transition, ∼50% of active genes and distal enhancers exhibit a spike in transcription, exceeding levels observed later in G1 phase. Enhancer-promoter chromatin contacts are depleted during mitosis and restored rapidly upon G1 entry but do not spike. Of the chromatin-associated features examined, histone H3 Lys27 acetylation levels at individual loci in mitosis best predict the mitosis-G1 transcriptional spike. Single-molecule RNA imaging supports that the mitosis-G1 transcriptional spike can constitute the maximum transcriptional activity per DNA copy throughout the cell division cycle. The transcriptional spike occurs heterogeneously and propagates to cell-to-cell differences in mature mRNA expression. Our results raise the possibility that passage through the mitosis-G1 transition might predispose cells to diverge in gene expression states.
Gianchandani, Erwin P; Joyce, Andrew R; Palsson, Bernhard Ø; Papin, Jason A
A transcriptional regulatory network (TRN) constitutes the collection of regulatory rules that link environmental cues to the transcription state of a cell's genome. We recently proposed a matrix formalism that quantitatively represents a system of such rules (a transcriptional regulatory system [TRS]) and allows systemic characterization of TRS properties. The matrix formalism not only allows the computation of the transcription state of the genome but also the fundamental characterization of the input-output mapping that it represents. Furthermore, a key advantage of this "pseudo-stoichiometric" matrix formalism is its ability to easily integrate with existing stoichiometric matrix representations of signaling and metabolic networks. Here we demonstrate for the first time how this matrix formalism is extendable to large-scale systems by applying it to the genome-scale Escherichia coli TRS. We analyze the fundamental subspaces of the regulatory network matrix (R) to describe intrinsic properties of the TRS. We further use Monte Carlo sampling to evaluate the E. coli transcription state across a subset of all possible environments, comparing our results to published gene expression data as validation. Finally, we present novel in silico findings for the E. coli TRS, including (1) a gene expression correlation matrix delineating functional motifs; (2) sets of gene ontologies for which regulatory rules governing gene transcription are poorly understood and which may direct further experimental characterization; and (3) the appearance of a distributed TRN structure, which is in stark contrast to the more hierarchical organization of metabolic networks.
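The Monte Carlo step described above (sampling environments, computing the resulting transcription state, and deriving a gene expression correlation matrix) can be illustrated with a toy example. It does not reproduce the authors' pseudo-stoichiometric matrix R: the three genes, the two environmental cues and the Boolean rules below are invented purely for illustration, and the regulatory rules are written as plain Python functions rather than as a matrix.

```python
import random

# Hypothetical toy rules: each gene's on/off state is a Boolean function of the cues.
RULES = {
    "geneA": lambda env: env["glucose"],
    "geneB": lambda env: not env["glucose"],
    "geneC": lambda env: env["oxygen"] and not env["glucose"],
}
CUES = ["glucose", "oxygen"]

def sample_states(n_samples, seed=0):
    """Draw random environments and return the 0/1 transcription state for each."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        env = {cue: rng.random() < 0.5 for cue in CUES}
        samples.append({gene: int(rule(env)) for gene, rule in RULES.items()})
    return samples

def correlation(samples, gene_x, gene_y):
    """Pearson correlation between two genes' states across the sampled environments."""
    n = len(samples)
    x = [s[gene_x] for s in samples]
    y = [s[gene_y] for s in samples]
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a gene that never changes state has no defined correlation
    return cov / (var_x * var_y) ** 0.5
```

Evaluating `correlation` for every gene pair yields a correlation matrix; genes governed by opposite rules on the same cue, such as the hypothetical geneA and geneB here, come out perfectly anticorrelated.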
Ui, Ayako; Yasui, Akira
Polycomb group (PcG) proteins repress, whereas Trithorax group (TrxG) proteins activate, transcription for tissue development and cellular proliferation, and misregulation of these factors is often associated with cancer. ENL (MLLT1) and AF9 (MLLT3) are fusion partners of Mixed Lineage Leukemia (MLL), a TrxG protein, and are factors in the Super Elongation Complex (SEC). SEC controls transcriptional elongation to release RNA polymerase II paused around the transcription start site. In MLL-rearranged leukemia, several components of SEC have been found as MLL-fusion partners, and the control of transcriptional elongation is misregulated, leading to tumorigenesis in MLL-SEC fused leukemia. It has been suggested that an unexpected collaboration of ENL/AF9-MLL and PcG is involved in tumorigenesis in leukemia. Recently, we found that the collaboration of ENL/AF9 and PcG led to a novel mechanism of transcriptional switch from elongation to repression under ATM signaling for genome integrity. Activated ATM phosphorylates ENL/AF9 in SEC, and the phosphorylated ENL/AF9 binds BMI1 and RING1B, a heterodimeric E3 ubiquitin ligase complex in Polycomb Repressive Complex 1 (PRC1), and recruits PRC1 to transcriptional elongation sites to rapidly repress transcription. The ENL/AF9 in SEC- and PcG-mediated transcriptional repression promotes DSB repair near transcription sites. The implication of this is that the collaboration of ENL/AF9 in SEC and PcG ensures a rapid response of transcriptional switching from elongation to repression in response to neighboring genotoxic stresses for DSB repair. Therefore, these results suggest that the collaboration of ENL/AF9 and PcG in transcriptional control is required to maintain genome integrity and may be linked to MLL-ENL/AF9 leukemia.
Analysis of phage Mu DNA transposition by whole-genome Escherichia coli tiling arrays reveals a complex relationship to distribution of target selection protein B, transcription and chromosome architectural elements
Jun Ge; Zheng Lou; Hong Cui; Lei Shang; Rasika M Harshey
Of all known transposable elements, phage Mu exhibits the highest transposition efficiency and the lowest target specificity. In vitro, MuB protein is responsible for target choice. In this work, we provide a comprehensive assessment of the genome-wide distribution of MuB and its relationship to Mu target selection using high-resolution Escherichia coli tiling DNA arrays. We have also assessed how MuB binding and Mu transposition are influenced by chromosome-organizing elements such as AT-rich DNA signatures, or the binding of the nucleoid-associated protein Fis, or processes such as transcription. The results confirm and extend previous biochemical and lower resolution in vivo data. Despite the generally random nature of Mu transposition and MuB binding, there were hot and cold insertion sites and MuB binding sites in the genome, and differences between the hottest and coldest sites were large. The new data also suggest that MuB distribution and subsequent Mu integration is responsive to DNA sequences that contribute to the structural organization of the chromosome.
Saville Barry J
Background: Ustilago maydis is the basidiomycete fungus responsible for common smut of corn and is a model organism for the study of fungal phytopathogenesis. To aid in the annotation of the genome sequence of this organism, several expressed sequence tag (EST) libraries were generated from a variety of U. maydis cell types. In addition to utility in the context of gene identification and structure annotation, the ESTs were analyzed to identify differentially abundant transcripts and to detect evidence of alternative splicing and anti-sense transcription. Results: Four cDNA libraries were constructed using RNA isolated from U. maydis diploid teliospores (U. maydis strains 518 × 521) and haploid cells of strain 521 grown under nutrient rich, carbon starved, and nitrogen starved conditions. Using the genome sequence as a scaffold, the 15,901 ESTs were assembled into 6,101 contiguous expressed sequences (contigs); among these, 5,482 corresponded to predicted genes in the MUMDB (MIPS Ustilago maydis database), while 619 aligned to regions of the genome not yet designated as genes in MUMDB. A comparison of EST abundance identified numerous genes that may be regulated in a cell type or starvation-specific manner. The transcriptional response to nitrogen starvation was assessed using RT-qPCR. The results of this suggest that there may be cross-talk between the nitrogen and carbon signalling pathways in U. maydis. Bioinformatic analysis identified numerous examples of alternative splicing and anti-sense transcription. While intron retention was the predominant form of alternative splicing in U. maydis, other varieties were also evident (e.g. exon skipping). Selected instances of both alternative splicing and anti-sense transcription were independently confirmed using RT-PCR. Conclusion: Through this work: 1) substantial sequence information has been provided for U. maydis genome annotation; 2) new genes were identified through the discovery of 619
Cells respond to environmental and/or genetic perturbations in order to survive and proliferate. Characterizing the changes that follow various stimuli at different -omics levels is crucial for understanding how cells adapt to changing conditions. Genome-wide quantification and analysis of transcript levels, and of the genes affected by perturbations, extends our understanding of cellular metabolism by pointing to the mechanisms that play a role in sensing the stress caused by those perturbations and to the related signaling pathways. In this way it guides efforts such as the rational engineering of cells or the interpretation of disease mechanisms. Saccharomyces cerevisiae has been studied as a model system in response to different perturbations, and the corresponding transcriptional profiles have been followed statically and/or dynamically, over the short and long term. This review focuses on the response of yeast cells to diverse stress-inducing perturbations, including nutritional changes, ionic stress, salt stress, oxidative stress and osmotic shock, as well as to genetic interventions such as deletion and over-expression of genes. It aims to draw conclusions about common regulatory phenomena that allow yeast to organize its transcriptomic response after any perturbation under different external conditions.
Vesely, C; Frech, C; Eckert, C; Cario, G; Mecklenbräuker, A; zur Stadt, U; Nebral, K; Kraler, F; Fischer, S; Attarbaschi, A; Schuster, M; Bock, C; Cavé, H; von Stackelberg, A; Schrappe, M; Horstmann, M A; Mann, G; Haas, O A; Panzer-Grümayer, R
Children with P2RY8-CRLF2-positive acute lymphoblastic leukemia have an increased relapse risk. Their mutational and transcriptional landscape, as well as the respective patterns at relapse remain largely elusive. We, therefore, performed an integrated analysis of whole-exome and RNA sequencing in 41 major clone fusion-positive cases including 19 matched diagnosis/relapse pairs. We detected a variety of frequently subclonal and highly instable JAK/STAT but also RTK/Ras pathway-activating mutations in 76% of cases at diagnosis and virtually all relapses. Unlike P2RY8-CRLF2 that was lost in 32% of relapses, all other genomic alterations affecting lymphoid development (58%) and cell cycle (39%) remained stable. Only IKZF1 alterations predominated in relapsing cases (P=0.001) and increased from initially 36 to 58% in matched cases. IKZF1’s critical role is further corroborated by its specific transcriptional signature comprising stem cell features with signs of impaired lymphoid differentiation, enhanced focal adhesion, activated hypoxia pathway, deregulated cell cycle and increased drug resistance. Our findings support the notion that P2RY8-CRLF2 is dispensable for relapse development and instead highlight the prominent rank of IKZF1 for relapse development by mediating self-renewal and homing to the bone marrow niche. Consequently, reverting aberrant IKAROS signaling or its disparate programs emerges as an attractive potential treatment option in these leukemias. PMID:27899802
A central challenge in genetics is to understand when and why mutations alter the phenotype of an organism. The consequences of gene inhibition have been systematically studied and can be predicted reasonably well across a genome. However, many sequence variants important for disease and evolution may alter gene regulation rather than gene function. The consequences of altering a regulatory interaction (or "edge") rather than a gene (or "node") in a network have not been as extensively studied. Here we use an integrative analysis and evolutionary conservation to identify features that predict when the loss of a regulatory interaction is detrimental in the extensively mapped transcription network of budding yeast. Properties such as the strength of an interaction, location and context in a promoter, regulator and target gene importance, and the potential for compensation (redundancy) associate to some extent with interaction importance. Combined, however, these features predict quite well whether the loss of a regulatory interaction is detrimental across many promoters and for many different transcription factors. Thus, despite the potential for regulatory diversity, common principles can be used to understand and predict when changes in regulation are most harmful to an organism.
Kowalko, Johanna E; Ma, Li; Jeffery, William R
Identifying alleles of genes underlying evolutionary change is essential to understanding how and why evolution occurs. Towards this end, much recent work has focused on identifying candidate genes for the evolution of traits in a variety of species. However, until recently it has been challenging to functionally validate interesting candidate genes. Recently developed tools for genetic engineering make it possible to manipulate specific genes in a wide range of organisms. Application of this technology in evolutionarily relevant organisms will allow for unprecedented insight into the role of candidate genes in evolution. Astyanax mexicanus (A. mexicanus) is a species of fish with both surface-dwelling and cave-dwelling forms. Multiple independent lines of cave-dwelling forms have evolved from ancestral surface fish, which are interfertile with one another and with surface fish, allowing elucidation of the genetic basis of cave traits. A. mexicanus has been used for a number of evolutionary studies, including linkage analysis to identify candidate genes responsible for a number of traits. Thus, A. mexicanus is an ideal system for the application of genome editing to test the role of candidate genes. Here we report a method for using transcription activator-like effector nucleases (TALENs) to mutate genes in surface A. mexicanus. Genome editing using TALENs in A. mexicanus has been utilized to generate mutations in pigmentation genes. This technique can also be utilized to evaluate the role of candidate genes for a number of other traits that have evolved in cave forms of A. mexicanus.
Patrick J. Fahy
Computer conferencing is one of the more useful parts of computer-mediated communications (CMC), and is virtually ubiquitous in distance education. The temptation to analyze the resulting interaction has resulted in only partial success, however (Henri, 1992; Kanuka and Anderson, 1998; Rourke, Anderson, Garrison and Archer, 1999; Fahy, Crawford, Ally, Cookson, Keller and Prosser, 2000). Some suggest the problem is made more complex by failings of both technique and, more seriously, theory capable of guiding transcript analysis research (Gunawardena, Lowe and Anderson, 1997). We have previously described development and pilot-testing of an instrument and a process for transcript analysis, called the TAT (Transcript Analysis Tool), based on a model originally developed by Zhu (1996). We found that the instrument and coding procedures used provided acceptable, sometimes excellent, levels of interrater reliability (varying from 70 percent to 94 percent in pilot applications, depending upon user training and practice with the instrument), and that results of pilots indicated the TAT discriminated well among the various types of statements found in online conferences (Fahy, et al., 2000).
Jose L. Pruneda-Paz
Extensive transcriptional networks play major roles in cellular and organismal functions. Transcript levels are in part determined by the combinatorial and overlapping functions of multiple transcription factors (TFs) bound to gene promoters. Thus, TF-promoter interactions provide the basic molecular wiring of transcriptional regulatory networks. In plants, discovery of the functional roles of TFs is limited by an increased complexity of network circuitry due to a significant expansion of TF families. Here, we present the construction of a comprehensive collection of Arabidopsis TF clones created to provide a versatile resource for uncovering TF biological functions. We leveraged this collection by implementing a high-throughput DNA binding assay and identified direct regulators of a key clock gene (CCA1) that provide molecular links between different signaling modules and the circadian clock. The resources introduced in this work will significantly contribute to a better understanding of the transcriptional regulatory landscape of plant genomes.
Loots, G; Ovcharenko, I
Evolutionary conservation of DNA sequences provides a tool for the identification of functional elements in genomes. We have created a database of evolutionary conserved regions (ECRs) in vertebrate genomes entitled ECRbase that is constructed from a collection of pairwise vertebrate genome alignments produced by the ECR Browser database. ECRbase features a database of syntenic blocks that recapitulate the evolution of rearrangements in vertebrates and a collection of promoters in all vertebrate genomes presented in the database. The database also contains a collection of annotated transcription factor binding sites (TFBS) in all ECRs and promoter elements. ECRbase currently includes human, rhesus macaque, dog, opossum, rat, mouse, chicken, frog, zebrafish, and two pufferfish genomes. It is freely accessible at http://ECRbase.dcode.org.
Bro, Christoffer; Regenberg, Birgitte; Nielsen, Jens
The genome-wide transcriptional response of a Saccharomyces cerevisiae strain deleted in GDH1 that encodes a NADP(+)-dependent glutamate dehydrogenase was compared to a wild-type strain under anaerobic steady-state conditions. The GDH1-deleted strain has a significantly reduced NADPH requirement...
Christina L. Zheng
Somatic mutations in cancer are more frequent in heterochromatic and late-replicating regions of the genome. We report that regional disparities in mutation density are virtually abolished within transcriptionally silent genomic regions of cutaneous squamous cell carcinomas (cSCCs) arising in an XPC−/− background. XPC−/− cells lack global genome nucleotide excision repair (GG-NER), thus establishing differential access of DNA repair machinery within chromatin-rich regions of the genome as the primary cause for the regional disparity. Strikingly, we find that increasing levels of transcription reduce mutation prevalence on both strands of gene bodies embedded within H3K9me3-dense regions, and only to those levels observed in H3K9me3-sparse regions, also in an XPC-dependent manner. Therefore, transcription appears to reduce mutation prevalence specifically by relieving the constraints imposed by chromatin structure on DNA repair. We model this relationship among transcription, chromatin state, and DNA repair, revealing a new, personalized determinant of cancer risk.
Wang, Yun E.; Marinov, Georgi K.; Wold, Barbara J.; Chan, David C.
Mitochondria contain a 16.6 kb circular genome encoding 13 proteins as well as mitochondrial tRNAs and rRNAs. Copies of the genome are organized into nucleoids containing both DNA and proteins, including the machinery required for mtDNA replication and transcription. The transcription factor TFAM is critical for initiation of transcription and replication of the genome, and is also thought to perform a packaging function. Although specific binding sites required for initiation of transcriptio...
Because of their nuclear dimorphism, ciliates provide a unique opportunity to study the role of non-coding RNAs (ncRNAs) in the communication between germline and somatic lineages. In these unicellular eukaryotes, a new somatic nucleus develops at each sexual cycle from a copy of the zygotic (germline) nucleus, while the old somatic nucleus degenerates. In the ciliate Paramecium tetraurelia, the genome is massively rearranged during this process through the reproducible elimination of repeated sequences and the precise excision of over 45,000 short, single-copy Internal Eliminated Sequences (IESs). Different types of ncRNAs resulting from genome-wide transcription were shown to be involved in the epigenetic regulation of genome rearrangements. To understand how ncRNAs are produced from the entire genome, we have focused on a homolog of the TFIIS elongation factor, which regulates RNA polymerase II transcriptional pausing. Six TFIIS-paralogs, representing four distinct families, can be found in the P. tetraurelia genome. Using RNA interference, we showed that TFIIS4, which encodes a development-specific TFIIS protein, is essential for the formation of a functional somatic genome. Molecular analyses and high-throughput DNA sequencing upon TFIIS4 RNAi demonstrated that TFIIS4 is involved in all kinds of genome rearrangements, including excision of ~48% of IESs. Localization of a GFP-TFIIS4 fusion revealed that TFIIS4 appears specifically in the new somatic nucleus at an early developmental stage, before IES excision. RT-PCR experiments showed that TFIIS4 is necessary for the synthesis of IES-containing non-coding transcripts. We propose that these IES+ transcripts originate from the developing somatic nucleus and serve as pairing substrates for germline-specific short RNAs that target elimination of their homologous sequences. Our study, therefore, connects the onset of zygotic non-coding transcription to the control of genome plasticity in Paramecium.
Hamperl, Stephan; Cimprich, Karlene A
The complex machineries involved in replication and transcription translocate along the same DNA template, often in opposing directions and at different rates. These processes routinely interfere with each other in prokaryotes, and mounting evidence now suggests that RNA polymerase complexes also encounter replication forks in higher eukaryotes. Indeed, cells rely on numerous mechanisms to avoid, tolerate, and resolve such transcription-replication conflicts, and the absence of these mechanisms can lead to catastrophic effects on genome stability and cell viability. In this article, we review the cellular responses to transcription-replication conflicts and highlight how these inevitable encounters shape the genome and impact diverse cellular processes. Copyright © 2016 Elsevier Inc. All rights reserved.
Haakonsson, Anders Kristian; Madsen, Maria Stahl; Nielsen, Ronni
Peroxisome proliferator-activated receptor γ (PPARγ) is a master regulator of adipocyte differentiation, and genome-wide studies indicate that it is involved in the induction of most adipocyte genes. Here we report, for the first time, the acute effects of the synthetic PPARγ agonist rosiglitazone on the transcriptional network of PPARγ in adipocytes. Treatment with rosiglitazone for 1 hour leads to acute transcriptional activation as well as repression of a number of genes as determined by genome-wide RNA polymerase II occupancy. Unlike what has been shown for many other nuclear receptors, agonist treatment does not lead to major changes in the occurrence of PPARγ binding sites. However, rosiglitazone promotes PPARγ occupancy at many preexisting sites, and this is paralleled by increased occupancy of the mediator subunit MED1. The increase in PPARγ and MED1 binding is correlated with an increase in transcription...
Herington Adrian C
Background: Ghrelin is a multifunctional peptide hormone expressed in a range of normal tissues and pathologies. It has been reported that the human ghrelin gene consists of five exons which span 5 kb of genomic DNA on chromosome 3 and includes a 20 bp non-coding first exon (20 bp exon 0). The availability of bioinformatic tools enabling comparative analysis and the finalisation of the human genome prompted us to re-examine the genomic structure of the ghrelin locus. Results: We have demonstrated the presence of an additional novel exon (exon -1) and 5' extensions to exon 0 and 1 using comparative in silico analysis and have demonstrated their existence experimentally using RT-PCR and 5' RACE. A revised exon-intron structure demonstrates that the human ghrelin gene spans 7.2 kb and consists of six rather than five exons. Several ghrelin gene-derived splice forms were detected in a range of human tissues and cell lines. We have demonstrated ghrelin gene-derived mRNA transcripts that do not code for ghrelin, but instead may encode the C-terminal region of full-length preproghrelin (C-ghrelin), which contains the coding region for obestatin, and a transcript encoding obestatin-only. Splice variants that differed in their 5' untranslated regions were also found, suggesting a role of these regions in the post-transcriptional regulation of preproghrelin translation. Finally, several natural antisense transcripts, termed ghrelinOS (ghrelin opposite strand) transcripts, were demonstrated via orientation-specific RT-PCR, 5' RACE and in silico analysis of ESTs and cloned amplicons. Conclusion: The sense and antisense alternative transcripts demonstrated in this study may function as non-coding regulatory RNA, or code for novel protein isoforms. This is the first demonstration of putative obestatin and C-ghrelin specific transcripts and these findings suggest that these ghrelin gene-derived peptides may also be produced independently of preproghrelin.
The drastic increase in the number of coronaviruses discovered and coronavirus genomes being sequenced has given us an unprecedented opportunity to perform genomics and bioinformatics analysis on this family of viruses. Coronaviruses possess the largest genomes (26.4 to 31.7 kb) among all known RNA viruses, with G + C contents varying from 32% to 43%. Variable numbers of small ORFs are present between the various conserved genes (ORF1ab, spike, envelope, membrane and nucleocapsid) and downstream of the nucleocapsid gene in different coronavirus lineages. Phylogenetically, three genera, Alphacoronavirus, Betacoronavirus and Gammacoronavirus, with Betacoronavirus consisting of subgroups A, B, C and D, exist. A fourth genus, Deltacoronavirus, which includes bulbul coronavirus HKU11, thrush coronavirus HKU12 and munia coronavirus HKU13, is emerging. Molecular clock analysis using various gene loci placed the time of the most recent common ancestor of human/civet SARS-related coronavirus at 1999-2002, with an estimated substitution rate of 4 × 10^-4 to 2 × 10^-2 substitutions per site per year. Recombination in coronaviruses was most notable between different strains of murine hepatitis virus (MHV), between different strains of infectious bronchitis virus, between MHV and bovine coronavirus, between feline coronavirus (FCoV) type I and canine coronavirus generating FCoV type II, and between the three genotypes of human coronavirus HKU1 (HCoV-HKU1). Codon usage bias in coronaviruses was observed, with HCoV-HKU1 showing the most extreme bias, and cytosine deamination and selection of CpG suppressed clones are the two major independent biological forces that shape such codon usage bias in coronaviruses.
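As an illustration of the G + C content figure quoted in the abstract above, the following is a minimal sketch of how such a percentage can be computed from a nucleotide sequence. The function name `gc_content` and the decision to skip ambiguous bases such as N are assumptions made for this example, not details taken from the study.

```python
def gc_content(sequence: str) -> float:
    """Return the G + C percentage of a nucleotide sequence.

    Hypothetical helper for illustration: only unambiguous bases
    (A, C, G, T, U) are counted, so characters such as N or gaps
    do not affect the result.
    """
    bases = [b for b in sequence.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    # Toy sequence, not real coronavirus data.
    print(f"{gc_content('ATGCNNATAT'):.1f}% G + C")
```

Applied to a full genome sequence read from a FASTA file, the same calculation would give a value comparable to the 32% to 43% range reported above.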
Dal Ri Antonio
Background: Grapevine (Vitis species) is among the most important fruit crops in terms of cultivated area and economic impact. Despite this relevance, little is known about the transcriptional changes and the regulatory circuits underlying the biochemical and physical changes occurring during berry development. Results: Fruit ripening in the non-climacteric crop species Vitis vinifera L. has been investigated at the transcriptional level by the use of the Affymetrix Vitis GeneChip® which contains approximately 14,500 unigenes. Gene expression data obtained from berries sampled before and after véraison in three growing years were analyzed to identify genes specifically involved in fruit ripening and to investigate seasonal influences on the process. From these analyses a core set of 1477 genes was found which was similarly modulated in all seasons. We were able to separate ripening specific isoforms within gene families and to identify ripening related genes which appeared strongly regulated also by the seasonal weather conditions. Transcripts annotation by Gene Ontology vocabulary revealed five overrepresented functional categories of which cell wall organization and biogenesis, carbohydrate and secondary metabolisms and stress response were specifically induced during the ripening phase, while photosynthesis was strongly repressed. About 19% of the core gene set was characterized by genes involved in regulatory processes, such as transcription factors and transcripts related to hormonal metabolism and signal transduction. Auxin, ethylene and light emerged as the main stimuli influencing berry development. In addition, an oxidative burst, previously not detected in grapevine, characterized by rapid accumulation of H2O2 starting from véraison and by the modulation of many ROS scavenging enzymes, was observed. Conclusion: The time-course gene expression analysis of grapevine berry development has identified the occurrence of two well
The anti-cancer drug camptothecin inhibits replication and transcription by trapping DNA topoisomerase I (Top1) covalently to DNA in a "cleavable complex". To examine the effects of camptothecin on RNA synthesis genome-wide we used Bru-Seq and show that camptothecin treatment primarily affected transcription elongation. We also observed that camptothecin increased RNA reads past transcription termination sites as well as at enhancer elements. Following removal of camptothecin, transcription spread as a wave from the 5'-end of genes with no recovery of transcription apparent from RNA polymerases stalled in the body of genes. As a result, camptothecin preferentially inhibited the expression of large genes such as proto-oncogenes and anti-apoptotic genes, while smaller ribosomal protein genes, pro-apoptotic genes and p53 target genes showed relatively higher expression. Cockayne syndrome group B fibroblasts (CS-B), which are defective in transcription-coupled repair (TCR), showed an RNA synthesis recovery profile similar to normal fibroblasts, suggesting that TCR is not involved in the repair of or RNA synthesis recovery from transcription-blocking Top1 lesions. These findings of the effects of camptothecin on transcription have important implications for its anti-cancer activities and may aid in the design of improved combinatorial treatments involving Top1 poisons.
Matthew J. Bush
WhiB is the founding member of a family of proteins (the WhiB-like [Wbl] family) that carry a [4Fe-4S] iron-sulfur cluster and play key roles in diverse aspects of the biology of actinomycetes, including pathogenesis, antibiotic resistance, and the control of development. In Streptomyces, WhiB is essential for the process of developmentally controlled cell division that leads to sporulation. The biochemical function of Wbl proteins has been controversial; here, we set out to determine unambiguously if WhiB functions as a transcription factor using chromatin immunoprecipitation sequencing (ChIP-seq) in Streptomyces venezuelae. In the first demonstration of in vivo genome-wide Wbl binding, we showed that WhiB regulates the expression of key genes required for sporulation by binding upstream of ~240 transcription units. Strikingly, the WhiB regulon is identical to the previously characterized WhiA regulon, providing an explanation for the identical phenotypes of whiA and whiB mutants. Using ChIP-seq, we demonstrated that in vivo DNA binding by WhiA depends on WhiB and vice versa, showing that WhiA and WhiB function cooperatively to control expression of a common set of WhiAB target genes. Finally, we show that mutation of the cysteine residues that coordinate the [4Fe-4S] cluster in WhiB prevents DNA binding by both WhiB and WhiA in vivo.
Xiang Jin; Qin Li; Guanghui Xiao; Yu-Xian Zhu
We assembled a total of 297,239 Gossypium hirsutum (Gh, a tetraploid cotton, AADD) expressed sequence tag (EST) sequences that were available in the National Center for Biotechnology Information database, with reference to the recently published G. raimondii (Gr, a diploid cotton, DD) genome, and obtained 49,125 UniGenes. The average lengths of the UniGenes were increased from 804 and 791 bp in two previous EST assemblies to 1,019 bp in the current analysis. The number of putative cotton UniGenes with lengths of 3 kb or more increased from 25 or 34 to 1,223. As a result, thousands of originally independent G. hirsutum ESTs were aligned to produce large contigs encoding transcripts with very long open reading frames, indicating that the G. raimondii genome sequence provided remarkable advantages to assemble the tetraploid cotton transcriptome. Significantly different distribution patterns within several GO terms, including transcription factor activity, were observed between D- and A-derived assemblies. Transcriptome analysis showed that, in a tetraploid cotton cell, 29,547 UniGenes were possibly derived from the D subgenome while another 19,578 may come from the A subgenome. Finally, some of the in silico data were confirmed by reverse transcription polymerase chain reaction experiments to show the changes in transcript levels for several gene families known to play key roles in cotton fiber development. We believe that our work provides a useful platform for functional and evolutionary genomic studies in cotton.
Sato, Kengo; Kuroki, Yoko; Kumita, Wakako; Fujiyama, Asao; Toyoda, Atsushi; Kawai, Jun; Iriki, Atsushi; Sasaki, Erika; Okano, Hideyuki; Sakakibara, Yasubumi
The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium. The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacches-3.2.1, but there still exist 187,214 undetermined gap regions and supercontigs and relatively short contigs that are unmapped to chromosomes in the draft genome. We performed resequencing and assembly of the genome of common marmoset by deep sequencing with high-throughput sequencing technology. Several different sequence runs using Illumina sequencing platforms were executed, and 181 Gbp of high-quality bases including mate-pairs with long insert lengths of 3, 8, 20, and 40 Kbp were obtained, that is, approximately 60× coverage. The resequencing significantly improved the MGSAC draft genome sequence. The N50 of the contigs, which is a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs, and the improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled.
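The abstract above refers to the contig N50 without defining it. As a hedged illustration, the sketch below uses the common definition: the length L such that contigs of length L or longer together contain at least half of all assembled bases. The function name `n50` and the toy contig lengths are assumptions for this example and are not drawn from the marmoset assembly.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths.

    Contigs are sorted from longest to shortest and their lengths are
    accumulated; N50 is the length of the contig at which the running
    total first reaches half of the total assembly size.
    """
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty positive input")


if __name__ == "__main__":
    # Toy example: total = 54, so N50 is reached once the sum hits 27.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Under this definition, merging short contigs into longer ones raises the N50, which is why a doubled N50 is read as an improvement in assembly contiguity.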
To better understand genome regulation, it is important to uncover the role of transcription factors in the process of chromatin structure establishment and maintenance. Here we present a data-driven approach to systematically characterise transcription factors that are relevant for this process. Our method uses a linear mixed modelling approach to combine datasets of transcription factor binding motif enrichments in open chromatin and gene expression across the same set of cell lines. Applying this approach to the ENCODE dataset, we confirm already known and imply numerous novel transcription factors that play a role in the establishment or maintenance of open chromatin. In particular, our approach rediscovers many factors that have been annotated as pioneer factors. PMID:28118358
Maria E Figueroa
The molecular heterogeneity of acute leukemias and other tumors constitutes a major obstacle towards understanding disease pathogenesis and developing new targeted therapies. Aberrant gene regulation is a hallmark of cancer and plays a central role in determining tumor phenotype. We predicted that integration of different genome-wide epigenetic regulatory marks along with gene expression levels would provide greater power in capturing biological differences between leukemia subtypes. Gene expression, cytosine methylation and histone H3 lysine 9 (H3K9) acetylation were measured using high-density oligonucleotide microarrays in primary human acute myeloid leukemia (AML) and acute lymphocytic leukemia (ALL) specimens. We found that DNA methylation and H3K9 acetylation distinguished these leukemias of distinct cell lineage, as expected, but that an integrative analysis combining the information from each platform revealed hundreds of additional differentially expressed genes that were missed by gene expression arrays alone. This integrated analysis also enhanced the detection and statistical significance of biological pathways dysregulated in AML and ALL. Integrative epigenomic studies are thus feasible using clinical samples and provide superior detection of aberrant transcriptional programming than single-platform microarray studies.
Yousry Y Azmy
Understanding how a myriad of transcription regulators work to modulate mRNA output at thousands of genes remains a fundamental challenge in molecular biology. Here we develop a computational tool to aid in assessing the plausibility of gene regulatory models derived from genome-wide expression profiling of cells mutant for transcription regulators. mRNA output is modelled as fluid flow in a pipe lattice, with assembly of the transcription machinery represented by the effect of valves. Transcriptional regulators are represented as external pressure heads that determine flow rate. Modelling mutations in regulatory proteins is achieved by adjusting valves' on/off settings. The topology of the lattice is designed by the experimentalist to resemble the expected interconnection between the modelled agents and their influence on mRNA expression. Users can compare multiple lattice configurations so as to find the one that minimizes the error with experimental data. This computational model provides a means to test the plausibility of transcription regulation models derived from large genomic data sets.
Cogburn, L A; Wang, X; Carre, W; Rejto, L; Aggrey, S E; Duclos, M J; Simon, J; Porter, T E
The genetic networks that govern the differentiation and growth of major tissues of economic importance in the chicken are largely unknown. Under a functional genomics project, our consortium has generated 30 609 expressed sequence tags (ESTs) and developed several chicken DNA microarrays, which represent the Chicken Metabolic/Somatic (10 K) and Neuroendocrine/Reproductive (8 K) Systems (http://udgenome.ags.udel.edu/cogburn/). One of the major challenges facing functional genomics is the development of mathematical models to reconstruct functional gene networks and regulatory pathways from vast volumes of microarray data. In initial studies with liver-specific microarrays (3.1 K), we have examined gene expression profiles in liver during the peri-hatch transition and during a strong metabolic perturbation (fasting and re-feeding) in divergently selected broiler chickens (fast vs. slow-growth lines). The expression of many genes controlling metabolic pathways is dramatically altered by these perturbations. Our analysis has revealed a large number of clusters of functionally related genes (mainly metabolic enzymes and transcription factors) that control major metabolic pathways. Currently, we are conducting transcriptional profiling studies of multiple tissues during development of two sets of divergently selected broiler chickens (fast vs. slow growing and fat vs. lean lines). Transcriptional profiling across multiple tissues should permit construction of a detailed genetic blueprint that illustrates the developmental events and hierarchy of genes that govern growth and development of chickens. This review will briefly describe the recent acquisition of chicken genomic resources (ESTs and microarrays) and our consortium's efforts to help launch the new era of functional genomics in the chicken.
Physiological, biochemical, and genome-wide transcriptional analysis reveals that elevated CO2 mitigates the impact of combined heat wave and drought stress in Arabidopsis thaliana at multiple organizational levels.
Zinta, Gaurav; AbdElgawad, Hamada; Domagalska, Malgorzata A; Vergauwen, Lucia; Knapen, Dries; Nijs, Ivan; Janssens, Ivan A; Beemster, Gerrit T S; Asard, Han
Climate changes increasingly threaten plant growth and productivity. Such changes are complex and involve multiple environmental factors, including rising CO2 levels and climate extreme events. As the molecular and physiological mechanisms underlying plant responses to realistic future climate extreme conditions are still poorly understood, a multiple organizational level analysis (i.e. eco-physiological, biochemical, and transcriptional) was performed, using Arabidopsis exposed to incremental heat wave and water deficit under ambient and elevated CO2. The climate extreme resulted in biomass reduction, photosynthesis inhibition, and considerable increases in stress parameters. Photosynthesis was a major target as demonstrated at the physiological and transcriptional levels. In contrast, the climate extreme treatment induced a protective effect on oxidative membrane damage, most likely as a result of strongly increased lipophilic antioxidants and membrane-protecting enzymes. Elevated CO2 significantly mitigated the negative impact of a combined heat and drought, as apparent in biomass reduction, photosynthesis inhibition, chlorophyll fluorescence decline, H2O2 production, and protein oxidation. Analysis of enzymatic and molecular antioxidants revealed that the stress-mitigating CO2 effect operates through up-regulation of antioxidant defense metabolism, as well as by reduced photorespiration resulting in lowered oxidative pressure. Therefore, exposure to future climate extreme episodes will negatively impact plant growth and production, but elevated CO2 is likely to mitigate this effect.
Despite almost 40 years of molecular genetics research in Escherichia coli, a major fraction of its Transcription Start Sites (TSSs) are still unknown, thereby limiting our understanding of the regulatory circuits that control gene expression in this model organism. RegulonDB (http://regulondb.ccg.unam.mx/) aims to integrate the genetic regulatory network of E. coli K12 and has until now been an entirely bioinformatic project. In this work, we extended its aims by generating experimental data at a genome scale on TSSs, promoters and regulatory regions. We implemented a modified 5' RACE protocol and an unbiased High Throughput Pyrosequencing Strategy (HTPS) that allowed us to map more than 1700 TSSs with high precision. From this collection, about 230 corresponded to previously reported TSSs, which helped us to benchmark both our methodologies and the accuracy of the previous mapping experiments. The other ca 1500 TSSs mapped belong to about 1000 different genes, many of them with no assigned function. We identified promoter sequences and type of sigma factors that control the expression of about 80% of these genes. As expected, the housekeeping sigma(70) was the most common type of promoter, followed by sigma(38). The majority of the putative TSSs were located between 20 and 40 nucleotides from the translational start site. Putative regulatory binding sites for transcription factors were detected upstream of many TSSs. For a few transcripts, riboswitches and small RNAs were found. Several genes also had additional TSSs within the coding region. Unexpectedly, the HTPS experiments revealed extensive antisense transcription, probably for regulatory functions. The new information in RegulonDB, now with more than 2400 experimentally determined TSSs, strengthens the accuracy of promoter prediction, operon structure, and regulatory networks and provides valuable new information that will facilitate the understanding from a global perspective the complex and
Lin, Tian; Lashbrook, Coralie C; Cho, Sung Ki; Butler, Nathaniel M; Sharma, Pooja; Muppirala, Usha; Severin, Andrew J; Hannapel, David J
Numerous signal molecules, including proteins and mRNAs, are transported through the architecture of plants via the vascular system. As the connection between leaves and other organs, the petiole and stem are especially important in their transport function, which is carried out by the phloem and xylem, especially by the sieve elements in the phloem system. The phloem is an important conduit for transporting photosynthate and signal molecules like metabolites, proteins, small RNAs, and full-length mRNAs. Phloem sap has been used as an unadulterated source to profile phloem proteins and RNAs, but unfortunately, pure phloem sap cannot be obtained in most plant species. Here we make use of laser capture microdissection (LCM) and RNA-seq for an in-depth transcriptional profile of phloem-associated cells of both petioles and stems of potato. To expedite our analysis, we have taken advantage of the potato genome that has recently been fully sequenced and annotated. Out of the 27 k transcripts assembled that we identified, approximately 15 k were present in phloem-associated cells of petiole and stem with greater than ten reads. Among these genes, roughly 10 k are affected by photoperiod. Several RNAs from this day length-regulated group are also abundant in phloem cells of petioles and encode for proteins involved in signaling or transcriptional control. Approximately 22 % of the transcripts in phloem cells contained at least one binding motif for Pumilio, Nova, or polypyrimidine tract-binding proteins in their downstream sequences. Highlighting the predominance of binding processes identified in the gene ontology analysis of active genes from phloem cells, 78 % of the 464 RNA-binding proteins present in the potato genome were detected in our phloem transcriptome. As a reasonable alternative when phloem sap collection is not possible, LCM can be used to isolate RNA from specific cell types, and along with RNA-seq, provides practical access to expression profiles of
BACKGROUND: Animal and human studies suggest that inflammation is associated with behavioral disorders including aggression. We have recently shown that physical aggression of boys during childhood is strongly associated with reduced plasma levels of cytokines IL-1α, IL-4, IL-6, IL-8 and IL-10, later in early adulthood. This study tests the hypothesis that there is an association between differential DNA methylation regions in cytokine genes in T cell and monocyte DNA in adult subjects and a trajectory of physical aggression from childhood to adolescence. METHODOLOGY/PRINCIPAL FINDINGS: We compared the methylation profiles of the entire genomic loci encompassing the IL-1α, IL-6, IL-4, IL-10 and IL-8 genes and three of their regulatory transcription factor (TF) genes, NFkB1, NFAT5 and STAT6, in adult males on a chronic physical aggression trajectory (CPA) and males with the same background who followed a normal physical aggression trajectory (control group) from childhood to adolescence. We used the method of methylated DNA immunoprecipitation with comprehensive cytokine gene loci and TF loci microarray hybridization, statistical analysis and false discovery rate correction. We found differentially methylated regions to associate with CPA in both the cytokine loci and the transcription factor loci analyzed. Some of these differentially methylated regions were located in known regulatory regions whereas others, to our knowledge, were previously unknown as regulatory areas. However, using the ENCODE database, we were able to identify key regulatory elements in many of these regions that indicate that they might be involved in the regulation of cytokine expression. CONCLUSIONS: We provide here the first evidence for an association between differential DNA methylation in cytokines and their regulators in T cells and monocytes and male physical aggression.
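The false discovery rate correction mentioned in the abstract above can be illustrated with a short sketch. The abstract does not say which procedure was used, so the Benjamini-Hochberg step-up method shown here is an assumption, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: BH is used purely for illustration; the study's actual
    FDR procedure is not stated in the abstract.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy p-values, e.g. one per probed region (made-up numbers).
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A region would then be called differentially methylated when its adjusted p-value falls below the chosen FDR threshold.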
Sun, Ning; Zhao, Huimin
Transcription activator-like effector (TALE) nucleases (TALENs) have recently emerged as a revolutionary genome editing tool in many different organisms and cell types. The site-specific chromosomal double-strand breaks introduced by TALENs significantly increase the efficiency of genomic modification. The modular nature of the TALE central repeat domains enables researchers to tailor DNA recognition specificity with ease and target essentially any desired DNA sequence. Here, we comprehensively review the development of TALEN technology in terms of scaffold optimization, DNA recognition, and repeat array assembly. In addition, we provide some perspectives on the future development of this technology.
Taheri-Ghahfarokhi, Amir; Malaver-Ortega, Luis F; Sumer, Huseyin
Interest is increasing in transcription activator-like effector nucleases (TALENs) as a tool to introduce targeted double-strand breaks into the large genomes of human and animal cell lines. The resulting DNA lesions stimulate DNA repair pathways, namely the error-prone but dominant non-homologous end joining (NHEJ) and the accurate but less frequent homology-directed repair (HDR), and as a result targeted genes can be modified. Here, we describe a modified Golden-Gate cloning method for generating TALENs and also details for targeting genes in mouse embryonic stem cells. The protocol described here can be used for modifying the genome of a broad range of pluripotent cell lines.
Lux, Heike; Flammann, Heiko; Hafner, Mathias; Lux, Andreas
The paternally expressed gene PEG10 is a retrotransposon derived gene adapted through mammalian evolution located on human chromosome 7q21. PEG10 codes for at least two proteins, PEG10-RF1 and PEG10-RF1/2, by -1 frameshift translation. Overexpression or reinduced PEG10 expression was seen in malignancies, like hepatocellular carcinoma or B-cell acute and chronic lymphocytic leukemia. PEG10 was also shown to promote adipocyte differentiation. Experimental evidence suggests that the PEG10-RF1 protein is an inhibitor of apoptosis and mediates cell proliferation. Here we present new data on the genomic organization of PEG10 by identifying the major transcription start site, a new splice variant and report the cloning and analysis of 1.9 kb of the PEG10 promoter. Furthermore, we show for the first time that PEG10 translation is initiated at a non-AUG start codon upstream of the previously predicted AUG codon as well as at the AUG codon. The finding that PEG10 translation is initiated at different sites adds a new aspect to the already interesting feature of PEG10's -1 frameshift translation mechanism. It is now important to unravel the cellular functions of the PEG10 protein variants and how they are related to normal or pathological conditions. The generated promoter-reporter constructs can be used for future studies to investigate how PEG10 expression is regulated. In summary, our study provides new data on the genomic organization as well as expression and translation of PEG10, a prerequisite in order to study and understand the role of PEG10 in cancer, embryonic development and normal cell homeostasis.
Several transcription factors (TFs) coordinate to regulate expression of specific genes at the transcriptional level. In Arabidopsis thaliana it is estimated that approximately 10% of all genes encode TFs or TF-like proteins. It is important to identify target genes that are directly regulated by TFs in order to understand the complete picture of a plant’s transcriptome profile. Here, we investigate the role of the LONG HYPOCOTYL5 (HY5) transcription factor that acts as a regulator of photomorphogenesis. We used an in vitro genomic DNA binding assay coupled with immunoprecipitation and next-generation sequencing (gDB-seq) instead of the in vivo chromatin immunoprecipitation (ChIP)-based methods. The results demonstrate that the HY5-binding motif predicted here was similar to the motif reported previously and that in vitro HY5-binding loci largely overlapped with the HY5-targeted candidate genes identified in previous ChIP-chip analysis. By combining these results with microarray analysis, we identified hundreds of HY5-binding genes that were differentially expressed in hy5. We also observed delayed induction of some transcripts of HY5-binding genes in hy5 mutants in response to blue-light exposure after dark treatment. Thus, an in vitro gDNA-binding assay coupled with sequencing is a convenient and powerful method to bridge the gap between identifying TF binding potential and establishing function.
Imam, Saheed; Noguera, Daniel R; Donohue, Timothy J
Photosynthesis is a crucial biological process that depends on the interplay of many components. This work analyzed the gene targets for 4 transcription factors: FnrL, PrrA, CrpK and MppG (RSP_2888), which are known or predicted to control photosynthesis in Rhodobacter sphaeroides. Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) identified 52 operons under direct control of FnrL, illustrating its regulatory role in photosynthesis, iron homeostasis, nitrogen metabolism and regulation of sRNA synthesis. Using global gene expression analysis combined with ChIP-seq, we mapped the regulons of PrrA, CrpK and MppG. PrrA regulates ∼34 operons encoding mainly photosynthesis and electron transport functions, while CrpK, a previously uncharacterized Crp-family protein, regulates genes involved in photosynthesis and maintenance of iron homeostasis. Furthermore, CrpK and FnrL share similar DNA binding determinants, possibly explaining our observation of the ability of CrpK to partially compensate for the growth defects of a ΔFnrL mutant. We show that the Rrf2 family protein, MppG, plays an important role in photopigment biosynthesis, as part of an incoherent feed-forward loop with PrrA. Our results reveal a previously unrealized, high degree of combinatorial regulation of photosynthetic genes and significant cross-talk between their transcriptional regulators, while illustrating previously unidentified links between photosynthesis and the maintenance of iron homeostasis.
Rahm, Alanna Kulchak; Stuckey, Heather; Green, Jamie; Feldman, Lynn; Zallen, Doris T.; Bonhag, Michele; Segal, Michael M.; Fan, Audrey L.; Williams, Marc S.
This study reports on the responses of physicians who reviewed provider and patient versions of a genomic laboratory report designed to communicate results of whole genome sequencing. Semi‐structured interviews addressed concept communication, elements, and format of example genome reports. Analysis of the coded transcripts resulted in recognition of three constructs around communication of genome sequencing results: (1) Providers agreed that whole genomic sequencing results are complex and they welcomed a report that provided supportive interpretation information to accompany sequencing results; (2) Providers strongly endorsed a report that included active clinical guidance, such as reference to practice guidelines, if available; and (3) Providers valued the genomic report as a resource that would serve as the basis to facilitate communication of genome sequencing results with their patients and families. Providers valued both versions of the report, though they affirmed the need for a provider‐oriented report. Critical elements of the report included clear language to explain the result, as well as consolidated yet comprehensive prognostic information with clear guidance over time for the clinical care of the patient. Most importantly, it appears a report with this design has the potential not only to return results but also serves as a communication tool to help providers and patients discuss and coordinate care over time. © 2016 The Authors. American Journal of Medical Genetics Part A published by Wiley Periodicals, Inc. PMID:26842872
Zhou, D; Yang, R
Prokaryotes have complex mechanisms to regulate their gene transcription, through the action of transcription factors (TFs). This review deals with current strategies, approaches and challenges in the understanding of i) how to map the repertoires of TF and operon on a genome, ii) how to identify the specific cis-acting DNA elements and their DNA-binding TFs that are required for expression of a given gene, iii) how to define the regulon members of a given TF, iv) how a given TF interacts with its target promoters, v) how these TF-promoter DNA interactions constitute regulatory networks, and vi) how transcriptional regulatory networks can be reconstructed by the reverse-engineering methods. Our goal is to depict the power of newly developed genomic techniques and computational tools, alone or in combination, to dissect the genetic circuitry of transcription regulation, and how this has the tremendous potential to model the regulatory networks in the prokaryotic cells.
Gu Yong Q
Background Among the dietary essential amino acids, the most severely limiting in the cereals is lysine. Since cereals make up half of the human diet, lysine limitation has quality/nutritional consequences. The breakdown of lysine is controlled mainly by the catabolic bifunctional enzyme lysine ketoglutarate reductase - saccharopine dehydrogenase (LKR/SDH). The LKR/SDH gene has been reported to produce transcripts for the bifunctional enzyme and separate monofunctional transcripts. In addition to lysine metabolism, this gene has been implicated in a number of metabolic and developmental pathways, which along with its production of multiple transcript types and complex exon/intron structure suggest an important node in plant metabolism. Understanding more about the LKR/SDH gene is thus interesting both from an applied standpoint and for basic plant metabolism. Results The current report describes a wheat genomic fragment containing an LKR/SDH gene and adjacent genes. The wheat LKR/SDH genomic segment was found to originate from the A-genome of wheat, and EST analysis indicates all three LKR/SDH genes in hexaploid wheat are transcriptionally active. A comparison of a set of plant LKR/SDH genes suggests regions of greater sequence conservation likely related to critical enzymatic functions and metabolic controls. Although most plants contain only a single LKR/SDH gene per genome, poplar contains at least two functional bifunctional genes in addition to a monofunctional LKR gene. Analysis of ESTs finds evidence for monofunctional LKR transcripts in switchgrass, and monofunctional SDH transcripts in wheat, Brachypodium, and poplar. Conclusions The analysis of a wheat LKR/SDH gene and comparative structural and functional analyses among available plant genes provides new information on this important gene. Both the structure of the LKR/SDH gene and the immediately adjacent genes show lineage-specific differences between monocots and dicots, and
The Integrator complex has been recently identified as a key regulator of RNA Polymerase II-mediated transcription, with many functions including the processing of small nuclear RNAs, the pause-release and elongation of polymerase during the transcription of protein coding genes, and the biogenesis of enhancer derived transcripts. Moreover, some of its components also play a role in genome maintenance. Thus, it is reasonable to hypothesize that their functional impairment or altered expression can contribute to malignancies. Indeed, several studies have described the mutations or transcriptional alteration of some Integrator genes in different cancers. Here, to draw a comprehensive pan-cancer picture of the genomic and transcriptomic alterations for the members of the complex, we reanalyzed public data from The Cancer Genome Atlas. Somatic mutations affecting Integrator subunit genes and their transcriptional profiles have been investigated in about 11,000 patients and 31 tumor types. A general heterogeneity in the mutation frequencies was observed, mostly depending on tumor type. Despite the fact that we could not establish them as cancer drivers, INTS7 and INTS8 genes were highly mutated in specific cancers. A transcriptome analysis of paired (normal and tumor) samples revealed that the transcription of INTS7, INTS8, and INTS13 is significantly altered in several cancers. Experimental validation performed on primary tumors confirmed these findings.
Yokomori, Rui; Shimai, Kotaro; Nishitsuji, Koki; Suzuki, Yutaka; Kusakabe, Takehiro G; Nakai, Kenta
The tunicate Ciona intestinalis, an invertebrate chordate, has recently emerged as a powerful model organism for gene regulation analysis. However, few studies have been conducted to identify and characterize its transcription start sites (TSSs) and promoters at the genome-wide level. Here, using TSS-seq, we identified TSSs at the genome-wide scale and characterized promoters in C. intestinalis. Specifically, we identified TSS clusters (TSCs), high-density regions of TSS-seq tags, each of which appears to originate from an identical promoter. TSCs were found not only at known TSSs but also in other regions, suggesting the existence of many unknown transcription units in the genome. We also identified candidate promoters of 79 ribosomal protein (RP) genes, each of which had the major TSS in a polypyrimidine tract and showed a sharp TSS distribution like human RP gene promoters. Ciona RP gene promoters, however, did not appear to have typical TATA boxes, unlike human RP gene promoters. In Ciona non-RP promoters, two pyrimidine-purine dinucleotides, CA and TA, were frequently used as TSSs. Despite the absence of CpG islands, Ciona TATA-less promoters showed low expression specificity like CpG-associated human TATA-less promoters. By using TSS-seq, we also predicted trans-spliced gene TSSs and found that their downstream regions had higher G+T content than those of non-trans-spliced gene TSSs. Furthermore, we identified many putative alternative promoters, some of which were regulated in a tissue-specific manner. Our results provide valuable information about TSSs and promoter characteristics in C. intestinalis and will be helpful in future analysis of transcriptional regulation in chordates.
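The idea of a TSS cluster (a high-density region of TSS-seq tags) can be sketched as follows. This is a minimal gap-based grouping under stated assumptions, not the authors' actual clustering procedure; the function name `cluster_tss` and the parameters `max_gap` and `min_tags` are hypothetical and their default values are arbitrary.

```python
def cluster_tss(positions, max_gap=20, min_tags=3):
    """Group TSS tag positions (one strand, one chromosome) into clusters.

    Consecutive tags closer than or equal to `max_gap` bases fall into the
    same cluster; clusters with fewer than `min_tags` tags are discarded.
    Returns a list of (start, end, tag_count) tuples.
    """
    clusters = []
    current = []
    for pos in sorted(positions):
        # A gap larger than max_gap closes the current cluster.
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_tags:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    # Flush the final cluster.
    if len(current) >= min_tags:
        clusters.append((current[0], current[-1], len(current)))
    return clusters


if __name__ == "__main__":
    # Made-up 5' end positions of mapped tags.
    tags = [100, 102, 105, 300, 900, 905, 910, 915]
    print(cluster_tss(tags))
```

In this toy input the isolated tag at position 300 is dropped because it does not reach the minimum tag count.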
Ruiz-Llorente, Sergio; de Pau, Enrique Carrillo Santa; Sastre-Perona, Ana; Montero-Conde, Cristina; Gómez-López, Gonzalo; Fagin, James A.; Valencia, Alfonso; Pisano, David G.; Santisteban, Pilar
Background The transcription factor Pax8 is essential for the differentiation of thyroid cells. However, there are few data on genes transcriptionally regulated by Pax8 other than thyroid-related genes. To better understand the role of Pax8 in the biology of thyroid cells, we obtained transcriptional profiles of Pax8-silenced PCCl3 thyroid cells using whole genome expression arrays and integrated these signals with global cis-regulatory sequencing studies performed by ChIP-Seq analysis Result...
Sood, Archit; Jaiswal, Varun; Chanumolu, Sree Krishna; Malhotra, Nikhil; Pal, Tarun; Chauhan, Rajinder Singh
Jatropha (Jatropha curcas L.) and Castor bean (Ricinus communis) are oilseed crops of family Euphorbiaceae with the potential of producing high quality biodiesel and having industrial value. Both the bioenergy plants are becoming susceptible to various biotic stresses directly affecting the oil quality and content. No report exists as of today on analysis of Nucleotide Binding Site-Leucine Rich Repeat (NBS-LRR) gene repertoire and defense response transcription factors in both the plant species. In silico analysis of whole genomes and transcriptomes identified 47 new NBS-LRR genes in both the species and 122 and 318 defense response related transcription factors in Jatropha and Castor bean, respectively. The identified NBS-LRR genes and defense response transcription factors were mapped onto the respective genomes. Common and unique NBS-LRR genes and defense related transcription factors were identified in both the plant species. All NBS-LRR genes in both the species were characterized into Toll/interleukin-1 receptor NBS-LRRs (TNLs) and coiled-coil NBS-LRRs (CNLs), position on contigs, gene clusters and motifs and domains distribution. Transcript abundance or expression values were measured for all NBS-LRR genes and defense response transcription factors, suggesting their functional role. The current study provides a repertoire of NBS-LRR genes and transcription factors which can be used in not only dissecting the molecular basis of disease resistance phenotype but also in developing disease resistant genotypes in Jatropha and Castor bean through transgenic or molecular breeding approaches.
Saito, Natsumi; Maeda, Michihisa; Tanaka, Kan; Ishihama, Akira
Leucine-responsive regulatory protein (Lrp) is a transcriptional regulator for the genes involved in transport, biosynthesis and catabolism of amino acids in Escherichia coli. In order to identify the whole set of genes under the direct control of Lrp, we performed Genomic SELEX screening and identified a total of 314 Lrp-binding sites on the E. coli genome. As a result, the regulation target of Lrp was predicted to expand from the hitherto identified genes for amino acid metabolism to a set of novel target genes for utilization of amino acids for protein synthesis, including tRNAs, aminoacyl-tRNA synthases and rRNAs. Northern blot analysis indicated alteration of mRNA levels for at least some novel targets, including the aminoacyl-tRNA synthetase genes. Phenotype MicroArray of the lrp mutant indicated significant alteration in utilization of amino acids and peptides, whilst metabolome analysis showed variations in the concentration of amino acids in the lrp mutant. From these two datasets we realized a reverse correlation between amino acid levels and cell growth rate: fast-growing cells contain low-level amino acids, whilst a high level of amino acids exists in slow-growing cells. Taken together, we propose that Lrp is a global regulator of transcription of a large number of the genes involved in not only amino acid transport and metabolism, but also amino acid utilization. PMID:28348809
Lorenz, C.; Gesell, T.; Zimmermann, B.; Schoeberl, U.; Bilusic, I.; Rajkowitsch, L.; Waldsich, C.; von Haeseler, A.; Schroeder, R.
An unexpectedly high number of regulatory RNAs have been recently discovered that fine-tune the function of genes at all levels of expression. We employed Genomic SELEX, a method to identify protein-binding RNAs encoded in the genome, to search for further regulatory RNAs in Escherichia coli. We used the global regulator protein Hfq as bait, because it can interact with a large number of RNAs, promoting their interaction. The enriched SELEX pool was subjected to deep sequencing, and 8865 sequences were mapped to the E. coli genome. These short sequences represent genomic Hfq-aptamers and are part of potential regulatory elements within RNA molecules. The motif 5′-AAYAAYAA-3′ was enriched in the selected RNAs and confers low-nanomolar affinity to Hfq. The motif was confirmed to bind Hfq by DMS footprinting. The Hfq aptamers are 4-fold more frequent on the antisense strand of protein coding genes than on the sense strand. They were enriched opposite to translation start sites or opposite to intervening sequences between ORFs in operons. These results expand the repertoire of Hfq targets and also suggest that Hfq might regulate the expression of a large number of genes via interaction with cis-antisense RNAs. PMID:20348540
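Searching a sequence for the degenerate motif 5′-AAYAAYAA-3′ reported above can be sketched as below, where Y is the IUPAC code for a pyrimidine. The function names are hypothetical and the example sequences are invented; the sketch accepts both T and U so it works on DNA or RNA strings.

```python
import re

# IUPAC symbols needed for this motif; Y = pyrimidine (C, or T/U).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "[TU]", "U": "[TU]", "Y": "[CTU]"}


def motif_to_regex(motif):
    """Compile an IUPAC motif into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead keeps matches zero-width, so overlapping sites are reported.
    return re.compile("(?=(" + body + "))")


def find_motif(seq, motif="AAYAAYAA"):
    """Return 0-based start positions of every motif occurrence in seq."""
    pattern = motif_to_regex(motif)
    return [m.start() for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    print(find_motif("GGAACAAUAAGG"))   # RNA input
    print(find_motif("AACAACAACAA"))    # two overlapping sites
```

To look at the antisense strand, as the abstract does, one would run the same search on the reverse complement of the sequence.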
Transcription factors (TFs) are master gene products that regulate gene expression in response to a variety of stimuli. They interact with DNA in a sequence-specific manner using a variety of DNA-binding domain (DBD) modules. This allows them to properly position their second domain, called “effector domain”, to directly or indirectly recruit positively or negatively acting co-regulators including chromatin modifiers, thus modulating preinitiation complex formation as well as transcription elongation. At variance with the DBDs, which are comprised of well-defined and easily recognizable DNA binding motifs, effector domains are usually much less conserved and thus considerably more difficult to predict. Also not so easy to identify are the DNA-binding sites of TFs, especially on a genome-wide basis and in the case of overlapping binding regions. Another emerging issue, with many potential regulatory implications, is that of so-called “moonlighting” transcription factors, i.e., proteins with an annotated function unrelated to transcription and lacking any recognizable DBD or effector domain, that play a role in gene regulation as their second job. Starting from bioinformatic and experimental high-throughput tools for an unbiased, genome-wide identification and functional characterization of TFs (especially transcriptional activators), we describe both established (and usually well affordable) as well as newly developed platforms for DNA-binding site identification. Selected combinations of these search tools, some of which rely on next-generation sequencing approaches, allow delineating the entire repertoire of TFs and unconventional regulators encoded by any sequenced genome.
Pereira-Santana, Alejandro; Alcaraz, Luis David; Castaño, Enrique; Sanchez-Calderon, Lenin; Sanchez-Teyer, Felipe; Rodriguez-Zapata, Luis
NAC proteins constitute one of the largest groups of plant-specific transcription factors and are known to play essential roles in various developmental processes. They are also important in plant responses to stresses such as drought, soil salinity, cold, and heat, which adversely affect growth. The current knowledge regarding the distribution of NAC proteins in plant lineages comes from relatively small samplings from the available data. In the present study, we broadened the number of plant species containing the NAC family origin and evolution to shed new light on the evolutionary history of this family in angiosperms. A comparative genome analysis was performed on 24 land plant species, and NAC ortholog groups were identified by means of bidirectional BLAST hits. Large NAC gene families are found in those species that have experienced more whole-genome duplication events, pointing to an expansion of the NAC family with divergent functions in flowering plants. A total of 3,187 NAC transcription factors that clustered into six major groups were used in the phylogenetic analysis. Many orthologous groups were found in the monocot and eudicot lineages, but only five orthologous groups were found between P. patens and each representative taxa of flowering plants. These groups were called basal orthologous groups and likely expanded into more recent taxa to cope with their environmental needs. This analysis on the angiosperm NAC family represents an effort to grasp the evolutionary and functional diversity within this gene family while providing a basis for further functional research on vascular plant gene families.
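Identifying ortholog candidates by bidirectional BLAST hits, as described above, amounts to keeping gene pairs that are each other's best hit in both search directions. The sketch below assumes the BLAST results have already been parsed into (query, subject, bit score) tuples; the function names and gene identifiers are hypothetical, and the study's actual thresholds and filters are not given in the abstract.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are mutual best hits."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


if __name__ == "__main__":
    # Made-up hits between species A and species B.
    a_vs_b = [("A1", "B1", 250.0), ("A1", "B2", 90.0), ("A2", "B2", 180.0)]
    b_vs_a = [("B1", "A1", 245.0), ("B2", "A1", 95.0), ("B2", "A2", 60.0)]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Here only the A1/B1 pair survives, because B2's best hit in species A is A1 rather than A2.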
Zhang, Jialing; Yan, Bin; Späth, Stephan Stanislaw; Qun, Hu; Cornelius, Shaleeka; Guan, Daogang; Shao, Jiaofang; Hagiwara, Koichi; Van Waes, Carter; Chen, Zhong; Su, Xiulan; Bi, Yongyi
Colorectal cancer (CRC) is a heterogeneous disease that is associated with a gradual accumulation of genetic and epigenetic alterations. Among all CRC stages, stage II tumors are highly heterogeneous with a high relapse rate in about 20-25 % of stage II CRC patients following surgery. Thus, a comprehensive analysis of gene signatures to identify aggressive and metastatic phenotypes in stage II CRC is desired for a more accurate disease classification and outcome prediction. By utilizing a Cancer Array, containing 440 oncogenes and tumor suppressors to profile mRNA expression, we identified a larger number of differentially expressed genes in poorly differentiated stage II colorectal adenocarcinoma tissues, compared to their matched normal tissues. Ontology and Ingenuity Pathway Analysis (IPA) indicated that these genes are involved in functional mechanisms associated with several transcription factors. Genomic alterations of these genes were also investigated through The Cancer Genome Atlas (TCGA) database, utilizing 195 published CRC specimens. The percentage of genomic alterations in these genes was ranked based on their mRNA expression, copy number variations and mutations. This data was further combined with published microarray studies from a large set of CRC tumors classified based on prognostic features. This led to the identification of eight candidate genes including RPN2, HMGB1, AARS, IGFBP3, STAT1, HYOU1, NQO1 and PEA15 that were associated with the progressive phenotype. In particular, RPN2 and HMGB1 displayed a higher genomic alteration frequency in CRC, compared to eight other major solid cancers. Immunohistochemistry was performed on additional 78 stage I-IV CRC samples, where RPN2 protein immunostaining exhibited a significant association with stage III/IV tumors, distant metastasis, and poor differentiation, indicating that RPN2 expression is associated with poor prognosis. Further, our study revealed significant transcriptional regulatory
Lowder, Levi G; Zhang, Dengwei; Baltes, Nicholas J; Paul, Joseph W; Tang, Xu; Zheng, Xuelian; Voytas, Daniel F; Hsieh, Tzung-Fu; Zhang, Yong; Qi, Yiping
The relative ease, speed, and biological scope of clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated Protein9 (Cas9)-based reagents for genomic manipulations are revolutionizing virtually all areas of molecular biosciences, including functional genomics, genetics, applied biomedical research, and agricultural biotechnology. In plant systems, however, a number of hurdles currently exist that limit this technology from reaching its full potential. For example, significant plant molecular biology expertise and effort is still required to generate functional expression constructs that allow simultaneous editing, and especially transcriptional regulation, of multiple different genomic loci or multiplexing, which is a significant advantage of CRISPR/Cas9 versus other genome-editing systems. To streamline and facilitate rapid and wide-scale use of CRISPR/Cas9-based technologies for plant research, we developed and implemented a comprehensive molecular toolbox for multifaceted CRISPR/Cas9 applications in plants. This toolbox provides researchers with a protocol and reagents to quickly and efficiently assemble functional CRISPR/Cas9 transfer DNA constructs for monocots and dicots using Golden Gate and Gateway cloning methods. It comes with a full suite of capabilities, including multiplexed gene editing and transcriptional activation or repression of plant endogenous genes. We report the functionality and effectiveness of this toolbox in model plants such as tobacco (Nicotiana benthamiana), Arabidopsis (Arabidopsis thaliana), and rice (Oryza sativa), demonstrating its utility for basic and applied plant research. © 2015 American Society of Plant Biologists. All Rights Reserved.
Ui-Tei, Kumiko; Maruyama, Shohei; Nakano, Yuko
Genomic engineering using clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) protein is a promising approach for targeting the genomic DNA of virtually any organism in a sequence-specific manner. Recent remarkable advances in CRISPR/Cas technology have made it a feasible system for use in therapeutic applications and biotechnology. In the CRISPR/Cas system, a guide RNA (gRNA), interacting with the Cas protein, recognizes a genomic region with sequence complementarity, and the double-stranded DNA at the target site is cleaved by the Cas protein. A widely used gRNA is an RNA polymerase III (pol III)-driven single gRNA (sgRNA), which is produced by artificial fusion of CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA). However, we identified a TTTT stretch, known as a termination signal of RNA pol III, in the scaffold region of the sgRNA. Here, we show that an sgRNA carrying a TTTT stretch reduces the efficiency of sgRNA transcription due to premature transcriptional termination, and decreases the efficiency of genome editing. Unexpectedly, the prematurely terminated sgRNA may also have the adverse effect of inducing RNA interference. Such disadvantageous effects were avoided by substituting one base in the TTTT stretch.
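A check for the pol III termination signal described above can be sketched as a simple scan for runs of four or more T residues. The sequence below is a made-up example, not the scaffold studied by the authors, and the function name `find_t_runs` is hypothetical.

```python
import re
from typing import List, Tuple


def find_t_runs(seq: str, min_len: int = 4) -> List[Tuple[int, int]]:
    """Return (start, length) of each run of at least min_len T's.

    Positions are 0-based. RNA input is accepted by treating U as T.
    """
    dna = seq.upper().replace("U", "T")
    pattern = re.compile("T{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(dna)]


# Invented sgRNA-like sequence for illustration only.
toy_sgrna = "GACGTTAGCCATGGATCCAGGTTTTAGAGCTAGAAATAGC"
runs = find_t_runs(toy_sgrna)
print(runs)
```

Any reported run marks a position where a single-base substitution, as the abstract describes, could be considered; which base to change is a design decision that this sketch does not make.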
Zhi-Hua Liu; Dian Jiao; Xiao Sun
Traditional sequence analysis depends on sequence alignment. In this study, we analyzed various functional regions of the human genome based on sequence features, including word frequency, dinucleotide relative abundance, and base-base correlation. We analyzed human chromosome 22 and classified the upstream, exon, intron, downstream, and intergenic regions by principal component analysis and discriminant analysis of these features. The results show that the functional regions of the genome can be classified on the basis of sequence features and discriminant analysis.
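One of the alignment-free features named above, dinucleotide relative abundance, is commonly defined as the observed dinucleotide frequency divided by the product of the two mononucleotide frequencies. The sketch below assumes that single-strand definition; the abstract does not say whether the authors used a strand-symmetrized variant, and the function name is hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Dict


def dinucleotide_relative_abundance(seq: str) -> Dict[str, float]:
    """Compute rho_xy = f_xy / (f_x * f_y) for all 16 dinucleotides.

    Frequencies are taken from the given strand only. A dinucleotide whose
    bases are absent from the sequence gets 0.0.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(n - 1))
    rho: Dict[str, float] = {}
    for x, y in product("ACGT", repeat=2):
        fx, fy = mono[x] / n, mono[y] / n
        fxy = di[x + y] / (n - 1)
        rho[x + y] = fxy / (fx * fy) if fx > 0 and fy > 0 else 0.0
    return rho


rho = dinucleotide_relative_abundance("ACGTACGT")
print(round(rho["AC"], 3), rho["AA"])
```

The resulting 16-value vector per region is the kind of feature that could then be passed to principal component or discriminant analysis.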
Morbitzer, Robert; Römer, Patrick; Boch, Jens; Lahaye, Thomas
Proteins that can be tailored to bind desired DNA sequences are key tools for molecular biology. Previous studies suggested that DNA-binding specificity of transcription activator-like effectors (TALEs) from the bacterial genus Xanthomonas is defined by repeat-variable diresidues (RVDs) of tandem-arranged 34/35-amino acid repeat units. We have studied chimeras of two TALEs differing in RVDs and non-RVDs and found that, in contrast to the critical contributions by RVDs, non-RVDs had no major effect on the DNA-binding specificity of the chimeras. This finding suggests that one needs only to modify the RVDs to generate designer TALEs (dTALEs) to activate transcription of user-defined target genes. We used the scaffold of the TALE AvrBs3 and changed its RVDs to match either the tomato Bs4, the Arabidopsis EGL3, or the Arabidopsis KNAT1 promoter. All three dTALEs transcriptionally activated the desired promoters in a sequence-specific manner as mutations within the targeted DNA sequences abolished promoter activation. This study is unique in showing that chromosomal loci can be targeted specifically by dTALEs. We also engineered two AvrBs3 derivatives with four additional repeat units activating specifically either the pepper Bs3 or UPA20 promoter. Because AvrBs3 activates both promoters, our data show that addition of repeat units facilitates TALE-specificity fine-tuning. Finally, we demonstrate that the RVD NK mediates specific interaction with G nucleotides that thus far could not be targeted specifically by any known RVD type. In summary, our data demonstrate that the TALE scaffold can be tailored to target user-defined DNA sequences in whole genomes.
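The idea that RVDs alone define the target sequence can be illustrated with a small lookup table. Only the NK-to-G pairing is stated in the abstract above; the other pairings (NI to A, HD to C, NG to T, NN to G or A) are the commonly reported TALE code and are included here as an assumption. The function names are hypothetical.

```python
from typing import Iterable

# RVD -> IUPAC base. "R" means G or A. Only NK -> G comes from the abstract;
# the remaining entries are the commonly reported code (assumption).
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "R", "NK": "G"}

# Bases allowed by each IUPAC symbol used here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "N": "ACGT"}


def predicted_target(rvds: Iterable[str]) -> str:
    """Translate an ordered list of RVDs into an IUPAC target string."""
    return "".join(RVD_TO_BASE.get(rvd.upper(), "N") for rvd in rvds)


def matches(target: str, site: str) -> bool:
    """True if a concrete DNA site is compatible with the IUPAC target."""
    if len(target) != len(site):
        return False
    return all(base in IUPAC[symbol] for symbol, base in zip(target, site.upper()))


target = predicted_target(["NI", "HD", "NG", "NK", "NN"])
print(target, matches(target, "ACTGA"), matches(target, "ACTGC"))
```

Unknown RVDs map to N rather than raising an error, so a partially characterized repeat array still yields a usable, if less specific, target pattern.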
Dewey Colin N
Background: RNA-Seq is revolutionizing the way transcript abundances are measured. A key challenge in transcript quantification from RNA-Seq data is the handling of reads that map to multiple genes or isoforms. This issue is particularly important for quantification with de novo transcriptome assemblies in the absence of sequenced genomes, as it is difficult to determine which transcripts are isoforms of the same gene. A second significant issue is the design of RNA-Seq experiments, in terms of the number of reads, read length, and whether reads come from one or both ends of cDNA fragments. Results: We present RSEM, a user-friendly software package for quantifying gene and isoform abundances from single-end or paired-end RNA-Seq data. RSEM outputs abundance estimates, 95% credibility intervals, and visualization files and can also simulate RNA-Seq data. In contrast to other existing tools, the software does not require a reference genome. Thus, in combination with a de novo transcriptome assembler, RSEM enables accurate transcript quantification for species without sequenced genomes. On simulated and real data sets, RSEM has superior or comparable performance to quantification methods that rely on a reference genome. Taking advantage of RSEM's ability to effectively use ambiguously-mapping reads, we show that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads. On the other hand, estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene. Conclusions: RSEM is an accurate and user-friendly software tool for quantifying transcript abundances from RNA-Seq data. As it does not rely on the existence of a reference genome, it is particularly useful for quantification with de novo transcriptome assemblies. In addition, RSEM has enabled valuable guidance for cost
Bruder, Mark R.; Pyne, Michael E.; Moo-Young, Murray
ABSTRACT The discovery and exploitation of the prokaryotic adaptive immunity system based on clustered regularly interspaced short palindromic repeats (CRISPRs) and CRISPR-associated (Cas) proteins have revolutionized genetic engineering. CRISPR-Cas tools have enabled extensive genome editing as well as efficient modulation of the transcriptional program in a multitude of organisms. Progress in the development of genetic engineering tools for the genus Clostridium has lagged behind that of many other prokaryotes, presenting the CRISPR-Cas technology an opportunity to resolve a long-existing issue. Here, we applied the Streptococcus pyogenes type II CRISPR-Cas9 (SpCRISPR-Cas9) system for genome editing in Clostridium acetobutylicum DSM792. We further explored the utility of the SpCRISPR-Cas9 machinery for gene-specific transcriptional repression. For proof-of-concept demonstration, a plasmid-encoded fluorescent protein gene was used for transcriptional repression in C. acetobutylicum. Subsequently, we targeted the carbon catabolite repression (CCR) system of C. acetobutylicum through transcriptional repression of the hprK gene encoding HPr kinase/phosphorylase, leading to the coutilization of glucose and xylose, which are two abundant carbon sources from lignocellulosic feedstocks. Similar approaches based on SpCRISPR-Cas9 for genome editing and transcriptional repression were also demonstrated in Clostridium pasteurianum ATCC 6013. As such, this work lays a foundation for the derivation of clostridial strains for industrial purposes. IMPORTANCE After recognizing the industrial potential of Clostridium for decades, methods for the genetic manipulation of these anaerobic bacteria are still underdeveloped. This study reports the implementation of CRISPR-Cas technology for genome editing and transcriptional regulation in Clostridium acetobutylicum, which is arguably the most common industrial clostridial strain. The developed genetic tools enable simpler, more reliable
Reuß, Daniel R; Altenbuchner, Josef; Mäder, Ulrike; Rath, Hermann; Ischebeck, Till; Sappa, Praveen Kumar; Thürmer, Andrea; Guérin, Cyprien; Nicolas, Pierre; Steil, Leif; Zhu, Bingyao; Feussner, Ivo; Klumpp, Stefan; Daniel, Rolf; Commichau, Fabian M; Völker, Uwe; Stülke, Jörg
Understanding cellular life requires a comprehensive knowledge of the essential cellular functions, the components involved, and their interactions. Minimized genomes are an important tool to gain this knowledge. We have constructed strains of the model bacterium, Bacillus subtilis, whose genomes have been reduced by ∼36%. These strains are fully viable, and their growth rates in complex medium are comparable to those of wild type strains. An in-depth multi-omics analysis of the genome reduced strains revealed how the deletions affect the transcription regulatory network of the cell, translation resource allocation, and metabolism. A comparison of gene counts and resource allocation demonstrates drastic differences in the two parameters, with 50% of the genes using as little as 10% of translation capacity, whereas the 6% essential genes require 57% of the translation resources. Taken together, the results are a valuable resource on gene dispensability in B. subtilis, and they suggest the roads to further genome reduction to approach the final aim of a minimal cell in which all functions are understood.
Alic, Nazif; Felder, Thomas; Temple, Mark D; Gloeckner, Christian; Higgins, Vincent J; Briza, Peter; Dawes, Ian W
Free radicals can initiate the oxidation of polyunsaturated fatty acids in cells through the process of lipid peroxidation. The genome-wide transcriptional changes in Saccharomyces cerevisiae after treatment with the toxic lipid peroxidation product linoleic acid hydroperoxide (LoaOOH) were identified. High-dose treatment led to a switch in transcription from biosynthetic to protective functions. This response encompassed a set of genes stimulated predominantly by LoaOOH, and not by other oxidants or heat shock, which contained components of the pleiotropic drug resistance system. The dose dependence of the transcriptional response revealed that large and widespread changes occur only in response to higher doses. Pretreatment of cells with sublethal doses of LoaOOH induces resistance to an otherwise lethal dose through the process of adaptation. Adaptive doses elicited a more subtle transcriptional response affecting metabolic functions, including an increase in the capacity for detoxification and downregulation of the rate of protein synthesis. Surprisingly, the cellular response to adaptive doses did not include induction of oxidative-stress defense enzymes nor of transcripts involved in general cellular defense systems.
Caygill, Christine P J; Gatenby, Piers A C; Herceg, Zdenko; Lima, Sheila C S; Pinto, Luis F R; Watson, Anthony; Wu, Ming-Shiang
The following, from the 12th OESO World Conference: Cancers of the Esophagus, includes commentaries on comparative genomic analysis of esophageal cancers: genomic polymorphisms, the genetic and epigenetic drivers in esophageal cancers, and the collection of data in the UK Barrett's Oesophagus Registry.
Wang, Yukun; Qiao, Linyi; Bai, Jianfang; Wang, Peng; Duan, Wenjing; Yuan, Shaohua; Yuan, Guoliang; Zhang, Fengting; Zhang, Liping; Zhao, Changping
The JASMONATE-ZIM DOMAIN (JAZ) repressor family proteins are jasmonate co-receptors and transcriptional repressors in the jasmonic acid (JA) signaling pathway, and they play important roles in regulating the growth and development of plants. Recently, a growing number of studies on the JAZ gene family have been reported in many plants. Although the genome sequencing of common wheat (Triticum aestivum L.) and its relatives is complete, little is known about this gene family in wheat. Fourteen JAZ genes were identified in the wheat genome. Structural analysis revealed that the TaJAZ proteins in wheat were as conserved as those in other plants, but had their own structural characteristics. By phylogenetic analysis, all JAZ proteins from wheat and other plants were clustered into 11 sub-groups (G1-G11), and TaJAZ proteins shared a high degree of similarity with some JAZ proteins from Aegilops tauschii, Brachypodium distachyon and Oryza sativa. The Ka/Ks ratios of TaJAZ genes ranged from 0.0016 to 0.6973, suggesting that the TaJAZ family had undergone purifying selection in wheat. Gene expression patterns obtained by quantitative real-time PCR (qRT-PCR) revealed differential temporal and spatial regulation of TaJAZ genes under abiotic stress treatments of high salinity, drought, cold and phytohormone. Among these, TaJAZ7, 8 and 12 were specifically expressed in the anther tissues of the thermosensitive genic male sterile (TGMS) wheat line BS366 and the normal control wheat line Jing411. Compared with the gene expression patterns in the normal wheat line Jing411, TaJAZ7, 8 and 12 had different expression patterns in abnormally dehiscent anthers of BS366 at the heading stage 6, suggesting that specific up- or down-regulation of these genes might be associated with the abnormal anther dehiscence in the TGMS wheat line. This study analyzed the size and composition of the JAZ gene family in wheat, and investigated stress responsive and differential tissue-specific expression profiles of each
Background: Clostridium beijerinckii is a prominent solvent-producing microbe that has great potential for the biofuel and chemical industries. Although transcriptional analysis is essential to understand gene functions and regulation and thus elucidate proper strategies for further strain improvement, limited information is available on genome-wide transcriptional analysis of C. beijerinckii. Results: The genome-wide transcriptional dynamics of C. beijerinckii NCIMB 8052 over a batch fermentation process was investigated using high-throughput RNA-Seq technology. The gene expression profiles indicated that the glycolysis genes were highly expressed throughout the fermentation, with comparatively more active expression during the acidogenesis phase. The expression of acid formation genes was down-regulated at the onset of solvent formation, in accordance with the metabolic pathway shift from acidogenesis to solventogenesis. The acetone formation gene (adc), as a part of the sol operon, exhibited highly coordinated expression with the other sol genes. Of the more than 20 genes encoding alcohol dehydrogenase in C. beijerinckii, Cbei_1722 and Cbei_2181 were highly up-regulated at the onset of solventogenesis, corresponding to their key roles in primary alcohol production. Most sporulation genes in C. beijerinckii 8052 demonstrated temporal expression patterns similar to those observed in B. subtilis and C. acetobutylicum, while the sporulation sigma factor genes sigE and sigG exhibited accelerated and stronger expression in C. beijerinckii 8052, which is consistent with the more rapid forespore and endospore development in this strain. Global expression patterns for specific gene functional classes were examined using self-organizing map analysis. The genes associated with specific functional classes demonstrated global expression profiles corresponding to the cell physiological variation and metabolic pathway switch. Conclusions: The results from this
Akhilesh K. Tyagi; Jitendra P. Khurana; Paramjit Khurana; Saurabh Raghuvanshi; Anupama Gaur; Anita Kapur; Vikrant Gupta; Dibyendu Kumar; V. Ravi; Shubha Vij; Parul Khurana; Sulabha Sharma
Rice is an excellent system for plant genomics as it represents a modest size genome of 430 Mb. It feeds more than half the population of the world. Draft sequences of the rice genome, derived by whole-genome shotgun approach at relatively low coverage (4–6 X), were published and the International Rice Genome Sequencing Project (IRGSP) declared high quality (>10 X), genetically anchored, phase 2 level sequence in 2002. In addition, phase 3 level finished sequence of chromosomes 1, 4 and 10 (out of 12 chromosomes of rice) has already been reported by scientists from IRGSP consortium. Various estimates of genes in rice place the number at > 50,000. Already, over 28,000 full-length cDNAs have been sequenced, most of which map to genetically anchored genome sequence. Such information is very useful in revealing novel features of macro- and micro-level synteny of rice genome with other cereals. Microarray analysis is unraveling the identity of rice genes expressing in temporal and spatial manner and should help target candidate genes useful for improving traits of agronomic importance. Simultaneously, functional analysis of rice genome has been initiated by marker-based characterization of useful genes and employing functional knock-outs created by mutation or gene tagging. Integration of this enormous information is expected to catalyze tremendous activity on basic and applied aspects of rice genomics.
The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of L. helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genomes of five L. helveticus strains were sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential, including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles, including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones.
Jakobi, Tobias; Brinkrolf, Karina; Tauch, Andreas; Noll, Thomas; Stoye, Jens; Pühler, Alfred; Goesmann, Alexander
Chinese hamster ovary (CHO) cell lines are one of the major production tools for monoclonal antibodies, recombinant proteins, and therapeutics. Although many efforts have significantly improved the availability of sequence information for CHO cells in the last years, forthcoming draft genomes still lack the information depth known from the mouse or human genomes. Many genes annotated for CHO cells and the Chinese hamster reference genome still are in silico predictions, only insufficiently verified by biological experiments. The correct annotation of transcription start sites (TSSs) is of special interest for CHO cells, as these directly define the location of the eukaryotic core promoter. Our study aims to elucidate these largely unexplored regions, trying to shed light on promoter landscapes in the Chinese hamster genome. Based on a 5' enriched dual library RNA sequencing approach 6547 TSSs were identified, of which over 90% were assigned to known genes. These TSSs were used to perform extensive promoter studies using a novel, modular bioinformatics pipeline, incorporating analyses of important regulatory elements of the eukaryotic core promoter on per-gene level and on genomic scale.
Suryamohan, Kushal; Halfon, Marc S.
Gene expression is regulated through the activity of transcription factors and chromatin modifying proteins acting on specific DNA sequences, referred to as cis-regulatory elements. These include promoters, located at the transcription initiation sites of genes, and a variety of distal cis-regulatory modules (CRMs), the most common of which are transcriptional enhancers. Because regulated gene expression is fundamental to cell differentiation and acquisition of new cell fates, identifying, characterizing, and understanding the mechanisms of action of CRMs is critical for understanding development. CRM discovery has historically been challenging, as CRMs can be located far from the genes they regulate, have few readily-identifiable sequence characteristics, and for many years were not amenable to high-throughput discovery methods. However, the recent availability of complete genome sequences and the development of next-generation sequencing methods has led to an explosion of both computational and empirical methods for CRM discovery in model and non-model organisms alike. Experimentally, CRMs can be identified through chromatin immunoprecipitation directed against transcription factors or histone post-translational modifications, identification of nucleosome-depleted “open” chromatin regions, or sequencing-based high-throughput functional screening. Computational methods include comparative genomics, clustering of known or predicted transcription factor binding sites, and supervised machine-learning approaches trained on known CRMs. All of these methods have proven effective for CRM discovery, but each has its own considerations and limitations, and each is subject to a greater or lesser number of false-positive identifications. Experimental confirmation of predictions is essential, although shortcomings in current methods suggest that additional means of validation need to be developed. PMID:25704908
Hemorrhagic stroke is a life-threatening disease characterized by a sudden rupture of cerebral blood vessels, and it is widely believed that neural cell death occurs after exposure to blood metabolites or subsequently damaged cells. Neural stem cells (NSCs), which maintain neurogenesis and are found in the subgranular zone and subventricular zone, are thought to be an endogenous neuroprotective mechanism for these brain injuries. However, due to the complexity of NSCs and their microenvironment, current strategies cannot satisfactorily enhance functional recovery after hemorrhagic stroke. It is well known that transcriptional and genomic pathways play important roles in ensuring the normal functions of NSCs, including proliferation, migration, differentiation, and neural reconnection. Recently, emerging evidence from the use of new technologies such as next-generation sequencing and transcriptome profiling has provided insight into our understanding of genomic function and regulation of NSCs. In the present article, we summarize and present the current data on the control of NSCs at both the transcriptional and genomic levels. Using bioinformatics methods, we sought to predict novel therapeutic targets of endogenous neurogenesis and exogenous NSC transplantation for functional recovery after hemorrhagic stroke, which could also advance our understanding of its pathophysiology.
Tang, Zhonghui; Luo, Oscar Junhong; Li, Xingwang; Zheng, Meizhen; Zhu, Jacqueline Jufen; Szalaj, Przemyslaw; Trzaskoma, Pawel; Magalska, Adriana; Wlodarczyk, Jakub; Ruszczycki, Blazej; Michalski, Paul; Piecuch, Emaly; Wang, Ping; Wang, Danjuan; Tian, Simon Zhongyuan; Penrad-Mobayed, May; Sachs, Laurent M.; Ruan, Xiaoan; Wei, Chia-Lin; Liu, Edison T.; Wilczynski, Grzegorz M.; Plewczynski, Dariusz; Li, Guoliang; Ruan, Yijun
Summary Spatial genome organization and its effect on transcription remains a fundamental question. We applied an advanced ChIA-PET strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CTCF and RNAPII with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes towards CTCF-foci for coordinated transcription. Furthermore, we show that haplotype-variants and allelic-interactions have differential effects on chromosome configuration influencing gene expression and may provide mechanistic insights into functions associated with disease susceptibility. 3D-genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D-genome strategy thus provides unique insights in the topological mechanism of human variations and diseases. PMID:26686651
Background: In the post-genomic era, comprehension of cellular processes and systems requires global and non-targeted approaches to handle vast amounts of biological information. Results: The present study predicts transcription units (TUs) in Bacillus subtilis, based on an integrated approach involving DNA sequence and transcriptome analyses. First, co-expressed gene clusters are predicted by calculating the Pearson correlation coefficients of adjacent genes for all the genes in a series that are transcribed in the same direction with no intervening gene transcribed in the opposite direction. Transcription factor (TF) binding sites are then predicted by detecting statistically significant TF binding sequences on the genome using a position weight matrix. This matrix is a convenient way to identify sites that are more highly conserved than others in the entire genome because any sequence that differs from a consensus sequence has a lower score. We identify genes regulated by each of the TFs by comparing gene expression between wild-type and TF mutants using a one-sided test. By applying the integrated approach to 11 σ factors and 17 TFs of B. subtilis, we are able to identify fewer candidates for genes regulated by the TFs than were identified using any single approach, and also detect the known TUs efficiently. Conclusion: This integrated approach is, therefore, an efficient tool for narrowing searches for candidate genes regulated by TFs, identifying TUs, and estimating roles of the σ factors and TFs in cellular processes and functions of genes composing the TUs.
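Two of the building blocks named in this abstract, the Pearson correlation between expression profiles of adjacent genes and position weight matrix (PWM) scoring, can be sketched as follows. This is a minimal illustration under stated assumptions: the PWM is a log-odds matrix with a uniform background and a pseudocount of 1, the example binding sites are invented, and the statistical significance testing used by the authors is not reproduced.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def build_pwm(sites: List[str], pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Log2-odds PWM from aligned sites, assuming a uniform 0.25 background."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({b: math.log2(((column.count(b) + pseudocount) / total) / 0.25)
                    for b in "ACGT"})
    return pwm


def score(pwm: List[Dict[str, float]], window: str) -> float:
    """Sum of per-position log-odds scores for one window."""
    return sum(col[base] for col, base in zip(pwm, window))


def best_site(pwm: List[Dict[str, float]], seq: str) -> Tuple[int, float]:
    """Highest-scoring window as (0-based start, score)."""
    width = len(pwm)
    scored = [(score(pwm, seq[i:i + width]), i) for i in range(len(seq) - width + 1)]
    top_score, top_pos = max(scored)
    return top_pos, top_score


# Invented aligned binding sites and a toy promoter fragment.
pwm = build_pwm(["TTGACA", "TTGACA", "TTGATA"])
position, _ = best_site(pwm, "GGGTTGACAGG")
print(position, round(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]), 3))
```

A window that departs from the consensus loses score at each mismatched position, which is the property the abstract relies on when ranking candidate sites.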
Tian, Wenlan; Paudel, Dev
Jatropha (Jatropha curcas L.) is an economically important species with a great potential for biodiesel production. To enrich the jatropha genomic databases and resources for microgravity studies, we sequenced and annotated the transcriptome of jatropha and developed SSR and SNP markers from the transcriptome sequences. In total 1,714,433 raw reads with an average length of 441.2 nucleotides were generated. De novo assembling and clustering resulted in 115,611 uniquely assembled sequences (UASs) including 21,418 full-length cDNAs and 23,264 new jatropha transcript sequences. The whole set of UASs were fully annotated, out of which 59,903 (51.81%) were assigned with gene ontology (GO) term, 12,584 (10.88%) had orthologs in Eukaryotic Orthologous Groups (KOG), and 8,822 (7.63%) were mapped to 317 pathways in six different categories in Kyoto Encyclopedia of Genes and Genome (KEGG) database, and it contained 3,588 putative transcription factors. From the UASs, 9,798 SSRs were discovered with AG/CT as the most frequent (45.8%) SSR motif type. Further 38,693 SNPs were detected and 7,584 remained after filtering. This UAS set has enriched the current jatropha genomic databases and provided a large number of genetic markers, which can facilitate jatropha genetic improvement and many other genetic and biological studies. PMID:28154822
Sarachana, Tewarit; Hu, Valerie W
We have recently identified the nuclear hormone receptor RORA (retinoic acid-related orphan receptor-alpha) as a novel candidate gene for autism spectrum disorder (ASD). Our independent cohort studies have consistently demonstrated the reduction of RORA transcript and/or protein levels in blood-derived lymphoblasts as well as in the postmortem prefrontal cortex and cerebellum of individuals with ASD. Moreover, we have also shown that RORA has the potential to be under negative and positive regulation by androgen and estrogen, respectively, suggesting the possibility that RORA may contribute to the male bias of ASD. However, little is known about transcriptional targets of this nuclear receptor, particularly in humans. Here we identify transcriptional targets of RORA in human neuronal cells on a genome-wide level using chromatin immunoprecipitation (ChIP) with an anti-RORA antibody followed by whole-genome promoter array (chip) analysis. Selected potential targets of RORA were then validated by an independent ChIP followed by quantitative PCR analysis. To further demonstrate that reduced RORA expression results in reduced transcription of RORA targets, we determined the expression levels of the selected transcriptional targets in RORA-deficient human neuronal cells, as well as in postmortem brain tissues from individuals with ASD who exhibit reduced RORA expression. The ChIP-on-chip analysis reveals that RORA1, a major isoform of RORA protein in human brain, can be recruited to as many as 2,764 genomic locations corresponding to promoter regions of 2,544 genes across the human genome. Gene ontology analysis of this dataset of genes that are potentially directly regulated by RORA1 reveals statistically significant enrichment in biological functions negatively impacted in individuals with ASD, including neuronal differentiation, adhesion and survival, synaptogenesis, synaptic transmission and plasticity, and axonogenesis, as well as higher level functions such as
Thermococcus gammatolerans, the most radioresistant archaeon known to date, is an anaerobic and hyperthermophilic sulfur-reducing organism living in deep-sea hydrothermal vents. The mechanisms underlying archaeal metal tolerance in such metal-rich ecosystems are still poorly documented. We showed that T. gammatolerans exhibits high resistance to cadmium (Cd), cobalt (Co) and zinc (Zn), a weaker tolerance to nickel (Ni), copper (Cu) and arsenate (AsO4), and that cells exposed to 1 mM Cd exhibit a cellular Cd concentration of 67 µM. A time-dependent transcriptomic analysis using microarrays was performed at a non-toxic (100 µM) and a toxic (1 mM) Cd dose. The reliability of the microarray data was strengthened by real-time RT-PCR validations. Altogether, 114 Cd-responsive genes were revealed, and a substantial subset of these genes is related to metal homeostasis, drug detoxification, re-oxidization of cofactors and ATP production. This first genome-wide expression profiling study of archaeal cells challenged with Cd showed that T. gammatolerans withstands induced stress through pathways observed in both prokaryotes and eukaryotes but also through new and original strategies. T. gammatolerans cells challenged with 1 mM Cd basically promote: 1) the induction of several transporter/permease-encoding genes, probably to detoxify the cell; 2) the upregulation of genes encoding Fe transporters, likely to compensate for Cd damage to iron-containing proteins; 3) the induction of genes encoding membrane-bound hydrogenase (Mbh) and membrane-bound hydrogenlyase (Mhy2) subunits involved in recycling reduced cofactors and/or in proton translocation for energy production. In contrast to other organisms, redox homeostasis genes appear constitutively expressed and only a few genes encoding DNA repair proteins are regulated. We compared the expression of 27 Cd-responsive genes in other stress conditions (Zn, Ni, heat shock, γ-rays), and showed that the Cd transcriptional pattern is
Dang Thanh Hai; Nguyen Dai Thanh; Pham Thi Minh Trang; Le Si Quang; Phan Thi Thu Hang; Dang Cao Cuong; Hoang Kim Phuc; Nguyen Huu Duc; Do Duc Dong; Bui Quang Minh; Pham Bao Son; Le Sy Vinh
We here present the first whole genome analysis of an anonymous Kinh Vietnamese (KHV) trio whose genomes were deeply sequenced to 30-fold average coverage. The resulting short reads covered 99.91% of the human reference genome (GRCh37d5). We identified 4,719,412 SNPs and 827,385 short indels that satisfied the Mendelian inheritance law. Among them, 109,914 (2.3%) SNPs and 59,119 (7.1%) short indels were novel. We also detected 30,171 structural variants, of which 27,604 (91.5%) were large indels. There were 6,681 large indels in the range 0.1–100 kbp occurring in the child genome that were also confirmed in either the father or mother genome. We compared these large indels against the DGV database and found that 1,499 (22.44%) were KHV specific. De novo assembly of high-quality unmapped reads yielded 789 contigs with length ≥ 300 bp. There were 235 contigs from the child genome, of which 199 (84.7%) significantly matched at least one contig from the father or mother genome. Blasting these 199 contigs against other alternative human genomes revealed 4 novel contigs. The novel variants identified in our study demonstrate the necessity of conducting more genome-wide studies not only for Kinh but also for other ethnic groups in Vietnam.
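The filter on variants that "satisfied the Mendelian inheritance law" can be illustrated with a minimal sketch. It assumes biallelic or multiallelic autosomal genotypes given as unordered pairs of allele strings; the function name `is_mendelian_consistent` is hypothetical and is not taken from the study, whose actual pipeline is not described in the abstract.

```python
from itertools import product


def is_mendelian_consistent(child, father, mother):
    """Return True if the child's unordered genotype can be formed by
    taking exactly one allele from the father and one from the mother.

    Each genotype is a pair of allele strings, e.g. ("A", "G").
    Assumption: autosomal site, no missing calls, no de novo mutation.
    """
    child_sorted = tuple(sorted(child))
    return any(
        tuple(sorted((f_allele, m_allele))) == child_sorted
        for f_allele, m_allele in product(father, mother)
    )


if __name__ == "__main__":
    # Heterozygous child from two opposite homozygous parents: consistent.
    print(is_mendelian_consistent(("A", "G"), ("A", "A"), ("G", "G")))
    # Child G/G cannot arise if the father carries no G allele.
    print(is_mendelian_consistent(("G", "G"), ("A", "A"), ("A", "G")))
```

A real trio analysis would additionally need to handle sex chromosomes, missing genotypes and genotype-quality thresholds, none of which are modeled here.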
The present study attempted to update and revise the eutherian kallikrein genes implicated in major physiological and pathological processes and in medical molecular diagnostics. Using a eutherian comparative genomic analysis protocol and freely available genomic sequence assemblies, tests of reliability of eutherian public genomic sequences were used to annotate the most comprehensive curated third-party gene data set of eutherian kallikrein genes, including 121 complete coding sequences among 335 potential coding sequences. The present analysis is the first to describe 13 major gene clusters of eutherian kallikrein genes and to explain their differential gene expansion patterns. An updated classification and nomenclature of eutherian kallikrein genes is proposed as a new framework for future experiments.
Background: For genes that have been successfully delineated within the human genome sequence, most regulatory sequences remain to be elucidated. The annotation and interpretation process requires additional data resources and significant improvements in computational methods for the detection of regulatory regions. One approach of growing popularity is based on the preferential conservation of functional sequences over the course of evolution by selective pressure, termed 'phylogenetic footprinting'. Mutations are more likely to be disruptive if they appear in functional sites, resulting in a measurable difference in evolution rates between functional and non-functional genomic segments. Results: We have devised a flexible suite of methods for the identification and visualization of conserved transcription-factor-binding sites. The system reports those putative transcription-factor-binding sites that are both situated in conserved regions and located as pairs of sites in equivalent positions in alignments between two orthologous sequences. An underlying collection of metazoan transcription-factor-binding profiles was assembled to facilitate the study. This approach results in a significant improvement in the detection of transcription-factor-binding sites because of an increased signal-to-noise ratio, as demonstrated with two sets of promoter sequences. The method is implemented as a graphical web application, ConSite, which is at the disposal of the scientific community at http://www.phylofoot.org/. Conclusions: Phylogenetic footprinting dramatically improves the predictive selectivity of bioinformatic approaches to the analysis of promoter sequences. ConSite delivers unparalleled performance using a novel database of high-quality binding models for metazoan transcription factors. With a dynamic interface, this bioinformatics tool provides broad access to promoter analysis with phylogenetic footprinting.
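The filtering idea described in the Results (keep only sites that lie in conserved regions and occur at equivalent alignment positions in both orthologs) might be sketched as follows. This is an illustrative simplification and not ConSite's implementation: it matches an exact motif string rather than scoring binding profiles, and the function names, the window size and the identity cutoff are all assumptions.

```python
def window_identity(aln_a, aln_b, window):
    """Percent identity for every window start along a pairwise alignment.

    aln_a and aln_b are equal-length aligned strings; '-' marks a gap.
    A column counts as a match only if both residues are equal and not gaps.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    matches = [a == b and a != "-" for a, b in zip(aln_a, aln_b)]
    return [
        100.0 * sum(matches[i:i + window]) / window
        for i in range(len(matches) - window + 1)
    ]


def conserved_site_pairs(aln_a, aln_b, motif, window=10, cutoff=70.0):
    """Alignment columns where `motif` occurs in both sequences at the same
    position and at least one window fully containing the site reaches
    the identity cutoff. Assumes len(motif) <= window.
    """
    ident = window_identity(aln_a, aln_b, window)
    k = len(motif)
    hits = []
    for i in range(len(aln_a) - k + 1):
        if aln_a[i:i + k] == motif and aln_b[i:i + k] == motif:
            first = max(0, i + k - window)   # earliest window covering the site
            last = min(i, len(ident) - 1)    # latest window covering the site
            if any(ident[s] >= cutoff for s in range(first, last + 1)):
                hits.append(i)
    return hits


if __name__ == "__main__":
    human = "ACGTGGAAGTCCATTT"
    mouse = "ACGTGGAAGTCGATTT"
    print(conserved_site_pairs(human, mouse, "GGAA"))
```

The second example in the test below shows the point of the conservation filter: a motif shared by chance in an otherwise divergent region is discarded.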
Hollenhorst, Peter C.; McIntosh, Lawrence P.; Graves, Barbara J.
ETS proteins are a group of evolutionarily related, DNA-binding transcriptional factors. These proteins direct gene expression in diverse normal and disease states by binding to specific promoters and enhancers and facilitating assembly of other components of the transcriptional machinery. The highly conserved DNA-binding ETS domain defines the family and is responsible for specific recognition of a common sequence motif, 5′-GGA(A/T)-3′. Attaining specificity for biological regulation in such a family is thus a conundrum. We present the current knowledge of routes to functional diversity and DNA binding specificity, including divergent properties of the conserved ETS and PNT domains, the involvement of flanking structured and unstructured regions appended to these dynamic domains, posttranslational modifications, and protein partnerships with other DNA-binding proteins and coregulators. The review emphasizes recent advances from biochemical and biophysical approaches, as well as insights from genomic studies that detect ETS-factor occupancy in living cells. PMID:21548782
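To make the specificity problem concrete, the following sketch scans a sequence for the core motif 5′-GGA(A/T)-3′ on both strands. The helper names are hypothetical, and exact matching of the 4-bp core is a deliberate simplification; real ETS-site prediction would use longer, weighted binding models.

```python
import re

# Lookahead so that every start position is reported, even if matches touch.
ETS_CORE = re.compile(r"(?=(GGA[AT]))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_ets_cores(seq):
    """Return sorted (forward_start, strand, core) tuples for each GGA(A/T).

    forward_start is the 0-based leftmost coordinate of the 4-bp site on the
    given (forward) strand, regardless of which strand carries the motif.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group(1)) for m in ETS_CORE.finditer(seq)]
    for m in ETS_CORE.finditer(reverse_complement(seq)):
        # Map the reverse-strand start back to forward-strand coordinates.
        hits.append((n - m.start() - 4, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    for hit in find_ets_cores("ACCGGAAGTTTCCA"):
        print(hit)
```

Because a 4-bp degenerate core is expected by chance roughly once every 128 bp per strand in random sequence, such a scan returns many sites, which is the conundrum the review addresses.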
Kang, Bo; Zhou, Yanwen; Zheng, Min; Wang, Ying-Jie
The ligand-activated transcription factor aryl hydrocarbon receptor (AhR) was recently shown to play a key role in embryogenesis and tumorigenesis (Feng et al. , Safe et al. ), and 2-(1'H-indole-3'-carbonyl)-thiazole-4-carboxylic acid methyl ester (ITE) (Song et al. ) is an endogenous AhR ligand that possesses anti-tumor activity. To gain insight into how ITE acts via the AhR in embryogenesis and tumorigenesis, we analyzed the genome-wide transcriptional profiles of the following three groups of cells: human glioblastoma U87 parental cells, U87 tumor sphere cells treated with vehicle (DMSO), and U87 tumor sphere cells treated with ITE. Here, we provide the details of the sample gathering strategy and show the quality controls and analyses associated with our gene array data, deposited in the Gene Expression Omnibus (GEO) under accession code GSE67986.
Since the draft human genome sequence was first made public in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples of large-scale studies of human genome variation: 1) HapMap Data (1,417 individuals) (http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/2010-08_phaseII+III/forward/), 2) HGDP (Human Genome Diversity Project) Data (940 individuals) (http://www.hagsc.org/hgdp/files.html), 3) 1000 Genomes Data (2,504 individuals) (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). If all three data sets can be integrated into a single volume of data, a more detailed analysis of human genome variation becomes possible for a total of 4,861 individuals (= 1,417 + 940 + 2,504). We successfully integrated these three data sets using information on the reference human genome sequence and conducted a big data analysis. In particular, we constructed a phylogenetic tree of about 5,000 human individuals at the genome level. As a result, we were able to identify clusters of ethnic groups, with detectable admixture, that could not be identified by analyzing each of the three data sets separately. Here, we report the outcome of this kind of big data analysis and discuss the evolutionary significance of human genomic variation. Note that the present study was conducted in collaboration with Katsuhiko Mineta and Kosuke Goto at KAUST.
Calcium is a universal second messenger that plays an important role in regulatory processes in eukaryotic cells. To understand calcium-dependent signaling in malaria parasites, we analyzed transcriptional responses of Plasmodium falciparum to two calcium ionophores (A23187 and ionomycin) that cause redistribution of intracellular calcium within the cytoplasm. While ionomycin induced a specific transcriptional response defined by up- or downregulation of a narrow set of genes, A23187 caused a developmental arrest in the schizont stage. In addition, we observed a dramatic decrease of mRNA levels of the transcripts encoded by the apicoplast genome during the exposure of P. falciparum to both calcium ionophores. Neither of the ionophores caused any disruptions to the DNA replication or the overall apicoplast morphology. This suggests that the mRNA downregulation reflects direct inhibition of the apicoplast gene transcription. Next, we identify a nuclear encoded protein with a calcium binding domain (EF-hand) that is localized to the apicoplast. Overexpression of this protein (termed PfACBP1) in P. falciparum cells mediates an increased resistance to the ionophores which suggests its role in calcium-dependent signaling within the apicoplast. Our data indicate that the P. falciparum apicoplast requires calcium-dependent signaling that involves a novel protein PfACBP1.
Hypoxia-inducible factors (HIFs) are a family of evolutionarily conserved alpha-beta heterodimeric transcription factors that induce a wide range of genes in response to low oxygen tension. Molecular mechanisms that mediate oxygen-dependent HIF regulation operate at the level of the alpha subunit, controlling protein stability, subcellular localization, and transcriptional coactivator recruitment. We have conducted an unbiased genome-wide RNA interference (RNAi) screen in Drosophila cells aimed at the identification of genes required for HIF activity. After 3 rounds of selection, 30 genes emerged as critical HIF regulators in hypoxia, most of which had not been previously associated with HIF biology. The list of genes includes components of chromatin remodeling complexes, transcription elongation factors, and translational regulators. One remarkable hit was the argonaute 1 (ago1) gene, a central element of the microRNA (miRNA) translational silencing machinery. Further studies confirmed the physiological role of the miRNA machinery in HIF-dependent transcription. This study reveals the occurrence of novel mechanisms of HIF regulation, which might contribute to developing novel strategies for therapeutic intervention of HIF-related pathologies, including heart attack, cancer, and stroke.
Constance Qiao Xin Yeo
The p53 tumor suppressor maintains genomic stability, typically acting through cell-cycle arrest, senescence, and apoptosis. We discovered a function of p53 in preventing conflicts between transcription and replication, independent of its canonical roles. p53 deficiency sensitizes cells to Topoisomerase II (Topo II) inhibitors, resulting in DNA damage arising spontaneously during replication. Topoisomerase IIα (TOP2A)-DNA complexes preferentially accumulate in isogenic p53 mutant or knockout cells, reflecting an increased recruitment of TOP2A to regulate DNA topology. We propose that p53 acts to prevent DNA topological stress originating from transcription during the S phase and, therefore, promotes normal replication fork progression. Consequently, replication fork progression is impaired in the absence of p53, which is reversed by transcription inhibition. Pharmacologic inhibition of transcription also attenuates DNA damage and decreases Topo-II-DNA complexes, restoring cell viability in p53-deficient cells. Together, our results demonstrate a function of p53 that may underlie its role in tumor suppression.
Background: Heat shock transcriptional factors (Hsfs) play a crucial role in plant responses to biotic and abiotic stress conditions and in plant growth and development. Apple (Malus domestica Borkh) is an economically important fruit tree whose genome has been fully sequenced. So far, no detailed characterization of the Hsf gene family is available for this crop plant. Results: A genome-wide analysis was carried out in Malus domestica to identify heat shock transcriptional factor (Hsf) genes, named MdHsfs. Twenty-five MdHsfs were identified and classified in three main groups (class A, B and C) according to their structural characteristics and to the phylogenetic comparison with Arabidopsis thaliana and Populus trichocarpa. Chromosomal duplications were analyzed and segmental duplications were shown to have occurred more frequently in the expansion of Hsf genes in the apple genome. Furthermore, MdHsfs transcripts were detected in several apple organs, and expression changes were observed by quantitative real-time PCR (qRT-PCR) analysis in developing flowers and fruits as well as in leaves, harvested from trees grown in the field and exposed to naturally increased temperatures. Conclusions: The apple genome comprises 25 full-length Hsf genes. The data obtained from this investigation contribute to a better understanding of the complexity of the Hsf gene family in apple, and provide the basis for further studies to dissect Hsf function during development as well as in response to environmental stimuli.
Yi, Peishan; Li, Wei; Ou, Guangshuo
The nematode Caenorhabditis elegans has been a powerful model system for biomedical research in the past decades; however, efficient genetic tools for gene knockout, knock-in or conditional gene mutation are still in demand. Transcription activator-like effector nucleases (TALENs), which comprise a sequence-specific DNA-binding domain fused to a FokI nuclease domain, facilitate targeted genome editing in various cell types and organisms. Here we summarize recent progress and protocols using TALENs in C. elegans to generate gene mutations and knock-ins in the germ line and conditional gene knockouts in somatic tissues.
Haakonsson, Anders Kristian; Stahl Madsen, Maria; Nielsen, Ronni; Sandelin, Albin; Mandrup, Susanne
Peroxisome proliferator-activated receptor γ (PPARγ) is a master regulator of adipocyte differentiation, and genome-wide studies indicate that it is involved in the induction of most adipocyte genes. Here we report, for the first time, the acute effects of the synthetic PPARγ agonist rosiglitazone on the transcriptional network of PPARγ in adipocytes. Treatment with rosiglitazone for 1 hour leads to acute transcriptional activation as well as repression of a number of genes as determined by genome-wide RNA polymerase II occupancy. Unlike what has been shown for many other nuclear receptors, agonist treatment does not lead to major changes in the occurrence of PPARγ binding sites. However, rosiglitazone promotes PPARγ occupancy at many preexisting sites, and this is paralleled by increased occupancy of the mediator subunit MED1. The increase in PPARγ and MED1 binding is correlated with an increase in transcription of nearby genes, indicating that rosiglitazone, in addition to activating the receptor, also promotes its association with DNA, and that this is causally linked to recruitment of mediator and activation of genes. Notably, both rosiglitazone-activated and -repressed genes are induced during adipogenesis. However, rosiglitazone-activated genes are markedly more associated with PPARγ than repressed genes and are highly dependent on PPARγ for expression in adipocytes. By contrast, repressed genes are associated with the other key adipocyte transcription factor CCAAT-enhancer binding protein α (C/EBPα), and their expression is more dependent on C/EBPα. This suggests that the relative occupancies of PPARγ and C/EBPα are critical for whether genes will be induced or repressed by PPARγ agonist.
Grossman, Sharon R.; Zhang, Xiaolan; Wang, Li; Engreitz, Jesse; Melnikov, Alexandre; Rogov, Peter; Tewhey, Ryan; Isakova, Alina; Deplancke, Bart; Bernstein, Bradley E.; Mikkelsen, Tarjei S.; Lander, Eric S.
Enhancers regulate gene expression through the binding of sequence-specific transcription factors (TFs) to cognate motifs. Various features influence TF binding and enhancer function—including the chromatin state of the genomic locus, the affinities of the binding site, the activity of the bound TFs, and interactions among TFs. However, the precise nature and relative contributions of these features remain unclear. Here, we used massively parallel reporter assays (MPRAs) involving 32,115 natural and synthetic enhancers, together with high-throughput in vivo binding assays, to systematically dissect the contribution of each of these features to the binding and activity of genomic regulatory elements that contain motifs for PPARγ, a TF that serves as a key regulator of adipogenesis. We show that distinct sets of features govern PPARγ binding vs. enhancer activity. PPARγ binding is largely governed by the affinity of the specific motif site and higher-order features of the larger genomic locus, such as chromatin accessibility. In contrast, the enhancer activity of PPARγ binding sites depends on varying contributions from dozens of TFs in the immediate vicinity, including interactions between combinations of these TFs. Different pairs of motifs follow different interaction rules, including subadditive, additive, and superadditive interactions among specific classes of TFs, with both spatially constrained and flexible grammars. Our results provide a paradigm for the systematic characterization of the genomic features underlying regulatory elements, applicable to the design of synthetic regulatory elements or the interpretation of human genetic variation. PMID:28137873
Transcription activator-like effectors (TALEs) can be used as DNA-targeting modules by engineering their repeat domains to dictate user-selected sequence specificity. TALEs have been shown to function as site-specific transcriptional activators in a variety of cell types and organisms. TALE nucleases (TALENs), generated by fusing the FokI cleavage domain to TALE, have been used to create genomic double-strand breaks. The identity of the TALE repeat variable di-residues, their number, and their order dictate the DNA sequence specificity. Because TALE repeats are nearly identical, their assembly by cloning or even by synthesis is challenging and time consuming. Here, we report the development and use of a rapid and straightforward approach for the construction of designer TALE (dTALE) activators and nucleases with user-selected DNA target specificity. Using our plasmid set of 100 repeat modules, researchers can assemble repeat domains for any 14-nucleotide target sequence in one sequential restriction-ligation cloning step and in only 24 h. We generated several custom dTALEs and dTALENs with new target sequence specificities and validated their function by transient expression in tobacco leaves and in vitro DNA cleavage assays, respectively. Moreover, we developed a web tool, called idTALE, to facilitate the design of dTALENs and the identification of their genomic targets and potential off-targets in the genomes of several model species. Our dTALE repeat assembly approach along with the web tool idTALE will expedite genome-engineering applications in a variety of cell types and organisms including plants. © 2012 Springer Science+Business Media B.V.
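The statement that the identity, number and order of the repeat variable di-residues (RVDs) dictate DNA specificity can be illustrated with a minimal sketch. The abstract does not list the RVD code itself; the mapping below (NI for A, HD for C, NN for G, NG for T) is the commonly reported canonical code and is used here as an assumption, and the function name `rvds_for_target` is hypothetical and unrelated to the idTALE tool.

```python
# Canonical one-repeat-per-base RVD code (assumption, not from the abstract).
# Note that NN is often reported to also tolerate A.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(target):
    """Return the ordered list of RVDs for a DNA target, one repeat per base."""
    target = target.upper()
    unknown = sorted(set(target) - set(RVD_FOR_BASE))
    if unknown:
        raise ValueError(f"unsupported base(s) in target: {unknown}")
    return [RVD_FOR_BASE[base] for base in target]


if __name__ == "__main__":
    # A 14-nucleotide example target, matching the repeat-domain length
    # mentioned in the abstract; the sequence itself is made up.
    example_target = "ACGTACGTACGTAC"
    print("-".join(rvds_for_target(example_target)))
```

Reading the list in order gives the repeat array to assemble, which is why nearly identical repeats differing only in their RVDs make cloning or synthesis laborious.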
Yun Peng Cao
The MYB family is one of the largest families of transcription factors in plants. Although some MYBs have been reported to play roles in secondary metabolism, no comprehensive study of the MYB family in Chinese pear (Pyrus bretschneideri Rehd.) has been reported. In the present study, we performed genome-wide analysis of MYB genes in Chinese pear, designated as PbMYBs, including analyses of their phylogenetic relationships, structures, chromosomal locations, promoter regions, GO annotations and collinearity. A total of 129 PbMYB genes were identified in the pear genome and were divided into 31 subgroups based on phylogenetic analysis. These PbMYBs were unevenly distributed among 16 chromosomes (of a total of 17 chromosomes). The occurrence of gene duplication events indicated that whole-genome duplication and segmental duplication likely played key roles in expansion of the PbMYB gene family. Ka/Ks analysis suggested that the duplicated PbMYBs mainly experienced purifying selection with restrictive functional divergence after the duplication events. Interspecies microsynteny analysis revealed maximum orthology between pear and peach, followed by plum and strawberry. Subsequently, the expression patterns of 20 PbMYB genes that may be involved in lignin biosynthesis according to their phylogenetic relationships were examined throughout fruit development. Among the twenty genes examined, PbMYB25 and PbMYB52 exhibited expression patterns consistent with the typical variations in the lignin content previously reported. Moreover, sub-cellular localization analysis revealed that the two proteins PbMYB25 and PbMYB52 were localized to the nucleus. Altogether, PbMYB25 and PbMYB52 were inferred to be candidate genes involved in the regulation of lignin biosynthesis during the development of pear fruit. This study provides useful information for further functional analysis of the MYB gene family in pear.
Mathilde de Taffin
Collier, the single Drosophila COE (Collier/EBF/Olf-1) transcription factor, is required in several developmental processes, including head patterning and specification of muscle and neuron identity during embryogenesis. To identify direct Collier (Col) targets in different cell types, we used ChIP-seq to map Col binding sites throughout the genome, at mid-embryogenesis. In vivo Col binding peaks were associated with 415 potential direct target genes. Gene Ontology analysis revealed a strong enrichment in proteins with DNA binding and/or transcription-regulatory properties. Characterization of a selection of candidates, using transgenic CRM-reporter assays, identified direct Col targets in dorso-lateral somatic muscles and specific neuron types in the central nervous system. These data brought new evidence that direct control by Col of the expression of the transcription regulators apterous and eyes-absent (eya) is critical to specifying neuronal identities. They also showed that cross-regulation between col and eya in muscle progenitor cells is required for specification of muscle identity, revealing a new parallel between the myogenic regulatory networks operating in Drosophila and vertebrates. Col regulation of eya, both in specific muscle and neuronal lineages, may illustrate one mechanism behind the evolutionary diversification of Col biological roles.
Nolte, Mark J; Wang, Ying; Deng, Jian Min; Swinton, Paul G; Wei, Caimiao; Guindani, Michele; Schwartz, Robert J; Behringer, Richard R
Transcriptional enhancers are genomic sequences bound by transcription factors that act together with basal transcriptional machinery to regulate gene transcription. Several high-throughput methods have generated large datasets of tissue-specific enhancer sequences with putative roles in developmental processes. However, few enhancers have been deleted from the genome to determine their roles in development. To understand the roles of two enhancers active in the mouse embryonic limb bud we deleted them from the genome. Although the genes regulated by these enhancers are unknown, they were selected because they were identified in a screen for putative limb bud-specific enhancers associated with p300, an acetyltransferase that participates in protein complexes that promote active transcription, and because the orthologous human enhancers (H1442 and H280) drive distinct lacZ expression patterns in limb buds of embryonic day (E) 11.5 transgenic mice. We show that the orthologous mouse sequences, M1442 and M280, regulate dynamic expression in the developing limb. Although significant transcriptional differences in enhancer-proximal genes in embryonic limb buds accompany the deletion of M1442 and M280 no gross limb malformations during embryonic development were observed, demonstrating that M1442 and M280 are not required for mouse limb development. However, M280 is required for the development and/or maintenance of body size; M280 mice are significantly smaller than controls. M280 also harbors an "ultraconserved" sequence that is identical between human, rat, and mouse. This is the first report of a phenotype resulting from the deletion of an ultraconserved element. These studies highlight the importance of determining enhancer regulatory function by experiments that manipulate them in situ and suggest that some of an enhancer's regulatory capacities may be developmentally tolerated rather than developmentally required.
Junges, Ângela; Boldo, Juliano Tomazzoni; Souza, Bárbara Kunzler; Guedes, Rafael Lucas Muniz; Sbaraini, Nicolau; Kmetzsch, Lívia; Thompson, Claudia Elizabeth; Staats, Charley Christian; de Almeida, Luis Gonzaga Paula; de Vasconcelos, Ana Tereza Ribeiro; Vainstein, Marilene Henning; Schrank, Augusto
Fungal chitin metabolism involves diverse processes such as metabolically active cell wall maintenance, basic nutrition, and different aspects of virulence. Chitinases are enzymes belonging to the glycoside hydrolase family 18 (GH18) and 19 (GH19) and are responsible for the hydrolysis of β-1,4-linkages in chitin. This linear homopolymer of N-acetyl-β-D-glucosamine is an essential constituent of fungal cell walls and arthropod exoskeletons. Several chitinases have been directly implicated in structural, morphogenetic, autolytic and nutritional activities of fungal cells. In the entomopathogen Metarhizium anisopliae, chitinases are also involved in virulence. Filamentous fungi genomes exhibit a higher number of chitinase-coding genes than bacteria or yeasts. The survey performed in the M. anisopliae genome has successfully identified 24 genes belonging to glycoside hydrolase family 18, including three previously experimentally determined chitinase-coding genes named chit1, chi2 and chi3. These putative chitinases were classified based on domain organization and phylogenetic analysis into the previously described A, B and C chitinase subgroups, and into a new subgroup D. Moreover, three GH18 proteins could be classified as putative endo-N-acetyl-β-D-glucosaminidases, enzymes that are associated with deglycosylation and were therefore assigned to a new subgroup E. The transcriptional profile of the GH18 genes was evaluated by qPCR with RNA extracted from eight culture conditions, representing different stages of development or different nutritional states. The transcripts from the GH18 genes were detected in at least one of the different M. anisopliae developmental stages, thus validating the proposed genes. Moreover, not all members from the same chitinase subgroup presented equal patterns of transcript expression under the eight distinct conditions studied. The determination of M. anisopliae chitinases and ENGases and a more detailed study concerning the enzymes
Salentijn Elma MJ
Background: Celiac disease (CD) is caused by an uncontrolled immune response to gluten, a heterogeneous mixture of wheat storage proteins. The CD toxicity of these proteins and their derived peptides depends on the presence of specific T-cell epitopes (9-mer peptides; CD epitopes) that mediate the stimulation of HLA-DQ2/8 restricted T-cells. Next to the thoroughly characterized major T-cell epitopes derived from the α-gliadin fraction of gluten, γ-gliadin peptides are also known to stimulate T-cells of celiac disease patients. To pinpoint CD-toxic γ-gliadins in hexaploid bread wheat, we examined the variation of T-cell epitopes involved in CD in γ-gliadin transcripts of developing bread wheat grains. Results: A detailed analysis of the genetic variation present in γ-gliadin transcripts of bread wheat (T. aestivum, allo-hexaploid, carrying the A, B and D genomes), together with genomic γ-gliadin sequences from ancestrally related diploid wheat species, enabled the assignment of sequence variants to one of the three genomic γ-gliadin loci, Gli-A1, Gli-B1 or Gli-D1. Almost half of the γ-gliadin transcripts of bread wheat (49%) were assigned to locus Gli-D1. Transcripts from each locus differed in CD epitope content and composition. The Gli-D1 transcripts contained the highest frequency of canonical CD epitope cores (on average 10.1 per transcript), followed by the Gli-A1 transcripts (8.6) and the Gli-B1 transcripts (5.4). The natural variants of the major CD epitope from γ-gliadins, DQ2-γ-I, showed variation in their capacity to induce in vitro proliferation of a DQ2-γ-I specific and HLA-DQ2 restricted T-cell clone. Conclusions: Evaluating the CD epitopes derived from γ-gliadins in their natural context of flanking protein variation, genome specificity and transcript frequency is a significant step towards accurate quantification of the CD toxicity of bread wheat. This approach can be used to predict relative levels of CD toxicity of
Changes in nucleotide sequences, or mutations, accumulate from generation to generation in the genomes of all living organisms. The mutations can be advantageous, deleterious, or neutral. The goal of this project is to determine the number of advantageous mutations it takes to get human (Homo sapiens) DNA from the DNA of genetically distinct organisms. We do this by collecting the genomic data of such organisms, and estimating the number of mutations it takes to transform yeast (Saccharomyces cerevisiae) DNA into the DNA of a human. We calculate the typical number of mutations occurring annually from the organism's average life span and the average mutation rate. This allows us to determine the total number of mutations as well as the probability of advantageous mutations. Not surprisingly, this probability proves to be fairly small. A more precise estimate can be determined by accounting for the differences in the chromosomal structure and phenomena like horizontal gene transfer.
Yi Jiang; Biao Zeng; Hainan Zhao; Mei Zhang; Shaojun Xie; Jinsheng Lai
Transcription factors (TFs) are important regulators of gene expression. To better understand TF-encoding genes in maize (Zea mays L.), a genome-wide TF prediction was performed using the updated B73 reference genome. A total of 2,298 TF genes were identified, which can be classified into 56 families. The largest family, known as the MYB superfamily, comprises 322 MYB and MYB-related TF genes. The expression patterns of 2,014 (87.64%) TF genes were examined using RNA-seq data, which resulted in the identification of a subset of TFs that are specifically expressed in particular tissues (including root, shoot, leaf, ear, tassel and kernel). Similarly, 98 kernel-specific TF genes were further analyzed, and it was observed that 29 of the kernel-specific genes were preferentially expressed in the early kernel developmental stage, while 69 of the genes were expressed in the late kernel developmental stage. Identification of these TFs, particularly the tissue-specific ones, provides important information for the understanding of development and transcriptional regulation of maize.
Genetic screening identified a suppressor of ros1-1, a mutant of REPRESSOR OF SILENCING1 (ROS1; encoding a DNA demethylation protein). The suppressor is a mutation in the gene encoding the largest subunit of replication factor C (RFC1). This mutation of RFC1 reactivates the unlinked 35S-NPTII transgene, which is silenced in ros1, and also increases expression of the pericentromeric Athila retrotransposons named transcriptional silent information in a DNA methylation-independent manner. rfc1 is more sensitive than the wild type to the DNA-damaging agent methylmethane sulphonate and to the DNA inter- and intra-strand cross-linking agent cisplatin. The rfc1 mutant constitutively expresses the G2/M-specific cyclin CycB1;1 and other DNA repair-related genes. Treatment with DNA-damaging agents mimics the rfc1 mutation in releasing the silenced 35S-NPTII, suggesting that spontaneously induced genomic instability caused by the rfc1 mutation might partially contribute to the released transcriptional gene silencing (TGS). The frequency of somatic homologous recombination is significantly increased in the rfc1 mutant. Interestingly, ros1 mutants show increased telomere length, but rfc1 mutants show decreased telomere length and reduced expression of telomerase. Our results suggest that RFC1 helps mediate genomic stability and TGS in Arabidopsis thaliana. © 2010 American Society of Plant Biologists.
Pereira-Santana, Alejandro; Alcaraz, Luis David; Castaño, Enrique; Sanchez-Calderon, Lenin; Sanchez-Teyer, Felipe; Rodriguez-Zapata, Luis
NAC proteins constitute one of the largest groups of plant-specific transcription factors and are known to play essential roles in various developmental processes. They are also important in plant responses to stresses such as drought, soil salinity, cold, and heat, which adversely affect growth. The current knowledge regarding the distribution of NAC proteins in plant lineages comes from relatively small samplings from the available data. In the present study, we broadened the number of plant species used to study the origin and evolution of the NAC family, to shed new light on the evolutionary history of this family in angiosperms. A comparative genome analysis was performed on 24 land plant species, and NAC ortholog groups were identified by means of bidirectional BLAST hits. Large NAC gene families are found in those species that have experienced more whole-genome duplication events, pointing to an expansion of the NAC family with divergent functions in flowering plants. A total of 3,187 NAC transcription factors that clustered into six major groups were used in the phylogenetic analysis. Many orthologous groups were found in the monocot and eudicot lineages, but only five orthologous groups were found between P. patens and each representative taxon of flowering plants. These groups were called basal orthologous groups and likely expanded into more recent taxa to cope with their environmental needs. This analysis of the angiosperm NAC family represents an effort to grasp the evolutionary and functional diversity within this gene family while providing a basis for further functional research on vascular plant gene families. PMID:26569117
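The abstract above identifies ortholog groups "by means of bidirectional BLAST hits" without giving details. A minimal sketch of the general reciprocal-best-hit idea follows, assuming each BLAST run has already been reduced to (query, subject, bit score) tuples; the function names, the tuple layout and the toy identifiers are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple

# Assumed input shape: (query id, subject id, bit score), one tuple per BLAST hit.
Hit = Tuple[str, str, float]


def best_hits(hits: List[Hit]) -> Dict[str, str]:
    """Map each query to the subject of its highest-scoring hit."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: List[Hit], b_vs_a: List[Hit]) -> List[Tuple[str, str]]:
    """Return pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical toy data: species A proteins searched against species B, and the reverse.
a_vs_b = [("A1", "B1", 200.0), ("A1", "B2", 150.0), ("A2", "B2", 90.0)]
b_vs_a = [("B1", "A1", 195.0), ("B2", "A1", 140.0)]
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

In this toy input only A1 and B1 choose each other, so A2 is left without a putative ortholog. Real pipelines usually add filters such as e-value or alignment coverage cutoffs, which this sketch omits.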
Describes the introduction of virtual, or digital, reference service at the University of New Brunswick libraries. Highlights include analyzing transcripts from LIVE (Library Information in a Virtual Environment); reference question types; ACRL (Association of College and Research Libraries) information literacy competency standards; and the Big 6…
Hatfull, Graham F.; Jacobs-Sera, Deborah; Lawrence, Jeffrey G.; Pope, Welkin H.; Russell, Daniel A.; Ko, Ching-Chung; Weber, Rebecca J.; Patel, Manisha C.; Germane, Katherine L.; Edgar, Robert H.; Hoyte, Natasha N.; Bowman, Charles A.; Tantoco, Anthony T.; Paladin, Elizabeth C.; Myers, Marlana S.; Smith, Alexis L.; Grace, Molly S.; Pham, Thuy T.; O'Brien, Matthew B.; Vogelsberger, Amy M.; Hryckowian, Andrew J.; Wynalek, Jessica L.; Donis-Keller, Helen; Bogel, Matt W.; Peebles, Craig L.; Cresawn, Steve G.; Hendrix, Roger W.
Mycobacteriophages are viruses that infect mycobacterial hosts. Expansion of a collection of sequenced phage genomes to a total of sixty – all infecting a common bacterial host – provides further insight into their diversity and evolution. Of the sixty phage genomes, 55 can be grouped into nine clusters according to their nucleotide sequence similarities, five of which can be further divided into subclusters; five genomes do not cluster with other phages. The sequence diversity between genomes within a cluster varies greatly; for example, the six genomes in cluster D share more than 97.5% average nucleotide similarity with each other. In contrast, similarity between the two genomes in Cluster I is barely detectable by diagonal plot analysis. The total of 6,858 predicted ORFs have been grouped into 1523 phamilies (phams) of related sequences, 46% of which possess only a single member. Only 18.8% of the phams have sequence similarity to non-mycobacteriophage database entries and fewer than 10% of all phams can be assigned functions based on database searching or synteny. Genome clustering facilitates the identification of genes that are in greatest genetic flux and are more likely to have been exchanged horizontally in relatively recent evolutionary time. Although mycobacteriophage genes exhibit smaller average size than genes of their host (205 residues compared to 315), phage genes in higher flux average only ∼100 amino acids, suggesting that the primary units of genetic exchange correspond to single protein domains. PMID:20064525
Dosch, E; Zöller, B; Redmann-Müller, I; Nanda, I; Schmid, M; Viciano-Gofferge, A; Jungwirth, C
The chicken interferon consensus sequence binding protein (ChICSBP) gene spans over 9 kb of DNA and consists, like its murine homolog, of nine exons. The first untranslated exon was identified by 5'-RACE technology. The second exon contains the translation initiation codon. Canonical consensus splice sites are found at every exon/intron junction. The introns are generally smaller than their mammalian counterparts. The ChICSBP and ChIRF-1 genes have been mapped by fluorescence in situ hybridization to different microchromosomes. The transcription start site has been mapped by primer extension. Inspection of the DNA sequence of a genomic clone containing the first exon and the region 1700 bp upstream revealed several potential cis-regulatory elements of transcription. The ChICSBP mRNA is induced by recombinant ChIFN type I and ChIFN-gamma. A palindromic IFN regulatory element (pIRE) with high sequence homology to gamma activation site (GAS) sequences was functionally required in transient transfection assays for the induction of transcription by ChIFN-gamma.
Antti Ylipää; Olli Yli-Harja; Wei Zhang; Matti Nykter
Identification of genetic signatures is the main objective for many computational oncology studies. The signature usually consists of numerous genes that are differentially expressed between two clinically distinct groups of samples, such as tumor subtypes. Prospectively, many signatures have been found to generalize poorly to other datasets and, thus, have rarely been accepted into clinical use. Recognizing the limited success of traditionally generated signatures, we developed a systems biology-based framework for robust identification of key transcription factors and their genomic regulatory neighborhoods. Application of the framework to study the differences between gastrointestinal stromal tumor (GIST) and leiomyosarcoma (LMS) resulted in the identification of nine transcription factors (SRF, NKX2-5, CCDC6, LEF1, VDR, ZNF250, TRIM63, MAF, and MYC). Functional annotations of the obtained neighborhoods identified the biological processes which the key transcription factors regulate differently between the tumor types. Analyzing the differences in the expression patterns using our approach resulted in a more robust genetic signature and more biological insight into the diseases compared to a traditional genetic signature.
Konermann, Silvana; Brigham, Mark D; Trevino, Alexandro E; Joung, Julia; Abudayyeh, Omar O; Barcena, Clea; Hsu, Patrick D; Habib, Naomi; Gootenberg, Jonathan S; Nishimasu, Hiroshi; Nureki, Osamu; Zhang, Feng
Systematic interrogation of gene function requires the ability to perturb gene expression in a robust and generalizable manner. Here we describe structure-guided engineering of a CRISPR-Cas9 complex to mediate efficient transcriptional activation at endogenous genomic loci. We used these engineered Cas9 activation complexes to investigate single-guide RNA (sgRNA) targeting rules for effective transcriptional activation, to demonstrate multiplexed activation of ten genes simultaneously, and to upregulate long intergenic non-coding RNA (lincRNA) transcripts. We also synthesized a library consisting of 70,290 guides targeting all human RefSeq coding isoforms to screen for genes that, upon activation, confer resistance to a BRAF inhibitor. The top hits included genes previously shown to be able to confer resistance, and novel candidates were validated using individual sgRNA and complementary DNA overexpression. A gene expression signature based on the top screening hits correlated with markers of BRAF inhibitor resistance in cell lines and patient-derived samples. These results collectively demonstrate the potential of Cas9-based activators as a powerful genetic perturbation technology.
Hart, D; Frerichs, G N; Rambaut, A; Onions, D E
The complete genome of the snakehead fish retrovirus has been cloned and sequenced, and its transcriptional profile in cell culture has been determined. The 11.2-kb provirus displays a complex expression pattern capable of encoding accessory proteins and is unique in the predicted location of the env initiation codon and signal peptide upstream of gag and the common splice donor site. The virus is distinguishable from all known retrovirus groups by the presence of an arginine tRNA primer binding site. The coding regions are highly divergent and show a number of unusual characteristics, including a large Gag coiled-coil region, a Pol domain of unknown function, and a long, lentiviral-like, Env cytoplasmic domain. Phylogenetic analysis of the Pol sequence emphasizes the divergent nature of the virus from the avian and mammalian retroviruses. The snakehead virus is also distinct from a previously characterized complex fish retrovirus, suggesting that discrete groups of these viruses have yet to be identified in the lower vertebrates.
Cao, Minh Duc; Allison, Lloyd; Dix, Trevor
Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirements. Another class of tree construction methods, namely distance-based methods, requires a measure of distance between any two genomes. Some measures, such as evolutionary edit distance of gene order and gene content, are computationally expensive or do not perform well when the gene content of the organisms is similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.
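To illustrate the general idea of a compression-based distance between sequences, the sketch below computes the normalized compression distance. It is a stand-in only: it swaps in the general-purpose zlib compressor rather than the expert model compressor named in the abstract, and the abstract's own distance formula may differ. The toy sequences and helper names are hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (stand-in for a DNA-specific compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: smaller when y adds little new information to x."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical toy sequences; real genomes are far longer than zlib's 32 KB window,
# which is one reason a general-purpose compressor is only an illustration here.
rng = random.Random(0)
seq_a = "".join(rng.choice("ACGT") for _ in range(2000)).encode()
seq_b = "".join(rng.choice("ACGT") for _ in range(2000)).encode()

# seq_c is seq_a with one substitution every 200 bases, mimicking a close relative.
mutated = bytearray(seq_a)
for i in range(0, len(mutated), 200):
    mutated[i] = ord("C") if mutated[i] == ord("A") else ord("A")
seq_c = bytes(mutated)

close = ncd(seq_a, seq_c)    # related pair
distant = ncd(seq_a, seq_b)  # unrelated pair
```

A matrix of such pairwise distances could then be passed to any distance-based tree builder, for example neighbor joining.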
Gene targeting is a powerful genome engineering tool that can be used for a variety of biotechnological applications. Genomic double-strand DNA breaks generated by engineered site-specific nucleases can stimulate gene targeting. Hybrid nucleases are composed of a DNA-binding module and a DNA-cleavage module. Zinc finger nucleases have been used to generate double-strand DNA breaks, but they suffer from failures and lack of reproducibility. The transcription activator-like effectors (TALEs) from plant pathogenic Xanthomonas contain a unique type of DNA-binding domain that binds specific DNA targets. The purpose of this study is to generate novel sequence-specific nucleases by fusing a de novo engineered Hax3 TALE-based DNA-binding domain to a FokI cleavage domain. Our data show that the de novo engineered TALE nuclease can bind to its target sequence and create double-strand DNA breaks in vitro. We also show that the de novo engineered TALE nuclease is capable of generating double-strand DNA breaks in its target sequence in vivo, when transiently expressed in Nicotiana benthamiana leaves. In conclusion, our data demonstrate that TALE-based hybrid nucleases can be tailored to bind a user-selected DNA sequence and generate site-specific genomic double-strand DNA breaks. TALE-based hybrid nucleases hold much promise as powerful molecular tools for gene targeting applications.
Sollier, Julie; Stork, Caroline Townsend; García-Rubio, María L.; Paulsen, Renee D.; Aguilera, Andrés; Cimprich, Karlene A.
R-loops, consisting of an RNA-DNA hybrid and displaced single-stranded DNA, are physiological structures that regulate various cellular processes occurring on chromatin. Intriguingly, changes in R-loop dynamics have also been associated with DNA damage accumulation and genome instability; however, the mechanisms underlying R-loop-induced DNA damage remain unknown. Here we demonstrate in human cells that R-loops induced by the absence of diverse RNA processing factors, including the RNA/DNA helicases Aquarius (AQR) and Senataxin (SETX), or by the inhibition of topoisomerase I, are actively processed into DNA double-strand breaks (DSBs) by the nucleotide excision repair endonucleases XPF and XPG. Surprisingly, DSB formation requires the transcription-coupled nucleotide excision repair (TC-NER) factor Cockayne syndrome group B (CSB), but not the global genome repair protein XPC. These findings reveal an unexpected and potentially deleterious role for TC-NER factors in driving R-loop-induced DNA damage and genome instability. PMID:25435140
Kyrpides, Nikos C.; Markowitz, Victor M.
Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries and occurrence profile tools for comparing genes, pathways and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.
Masuda, Takao; Sato, Yoko; Huang, Yu-Lun; Koi, Satoshi; Takahata, Tatsuro; Hasegawa, Atsuhiko; Kawai, Gota; Kannagi, Mari
Retroviral reverse transcription is accomplished by sequential strand-transfers of partial cDNA intermediates copied from viral genomic RNA. Here, we revealed an unprecedented role of 5′-end guanosine (G) of HIV-1 genomic RNA for reverse transcription. Based on current consensus for HIV-1 transcription initiation site, HIV-1 transcripts possess a single G at 5′-ends (G1-form). However, we found that HIV-1 transcripts with additional Gs at 5′-ends (G2- and G3-forms) were abundantly expressed in infected cells by using alternative transcription initiation sites. The G2- and G3-forms were also detected in the virus particle, although the G1-form predominated. To address biological impact of the 5′-G number, we generated HIV clone DNA to express the G1-form exclusively by deleting the alternative initiation sites. Virus produced from the clone showed significantly higher strand-transfer of minus strong-stop cDNA (-sscDNA). The in vitro assay using synthetic HIV-1 RNAs revealed that the abortive forms of -sscDNA were abundantly generated from the G3-form RNA, but dramatically reduced from the G1-form. Moreover, the strand-transfer of -sscDNA from the G1-form was prominently stimulated by HIV-1 nucleocapsid. Taken together, our results demonstrated that the 5′-G number that corresponds to HIV-1 transcription initiation site was critical for successful strand-transfer of -sscDNA during reverse transcription. PMID:26631448
Philippe, Nicolas; Bou Samra, Elias; Boureux, Anthony; Mancheron, Alban; Rufflé, Florence; Bai, Qiang; De Vos, John; Rivals, Eric; Commes, Thérèse
Recent sequencing technologies that allow massive parallel production of short reads are the method of choice for transcriptome analysis. Particularly, digital gene expression (DGE) technologies produce a large dynamic range of expression data by generating short tag signatures for each cell transcript. These tags can be mapped back to a reference genome to identify new transcribed regions that can be further covered by RNA-sequencing (RNA-Seq) reads. Here, we applied an integrated bioinformatics approach that combines DGE tags, RNA-Seq, tiling array expression data and species-comparison to explore new transcriptional regions and their specific biological features, particularly tissue expression or conservation. We analysed tags from a large DGE data set (designated as 'TranscriRef'). We then annotated 750,000 tags that were uniquely mapped to the human genome according to Ensembl. We retained transcripts originating from both DNA strands and categorized tags corresponding to protein-coding genes, antisense, intronic- or intergenic-transcribed regions and computed their overlap with annotated non-coding transcripts. Using this bioinformatics approach, we identified ∼34,000 novel transcribed regions located outside the boundaries of known protein-coding genes. As demonstrated using sequencing data from human pluripotent stem cells for biological validation, the method could be easily applied for the selection of tissue-specific candidate transcripts. DigitagCT is available at http://cractools.gforge.inria.fr/softwares/digitagct.
Mikalsen, B; Fosby, B; Wang, J; Hammarström, C; Bjaerke, H; Lundström, M; Kasprzycka, M; Scott, H; Line, P-D; Haraldsen, G
Transcriptome analyses of organ transplants have until now usually focused on whole tissue samples containing activation profiles from different cell populations. Here, we enriched endothelial cells from rat cardiac allografts and isografts, establishing their activation profile at baseline and on days 2, 3 and 4 after transplantation. Modulated transcripts were assigned to three categories based on their regulation profile in allografts and isografts. Categories A and B contained the majority of transcripts and showed similar regulation in both graft types, appearing to represent responses to surgical trauma. By contrast, category C contained transcripts that were partly allograft-specific and to a large extent associated with interferon-gamma-responsiveness. Several transcripts were verified by immunohistochemical analysis of graft lesions, among them the matricellular protein periostin, which was one of the most highly upregulated transcripts but has not been associated with transplantation previously. In conclusion, the majority of the differentially expressed genes in graft endothelial cells are affected by the transplantation procedure whereas relatively few are associated with allograft rejection.
Wernersson, Rasmus; Frogne, Thomas; Rescan, Claude
transcript, which has recently been reported to be among the highest expressed transcripts in human pancreatic beta cells and its protein indicated as a novel autoantigen in Type 1 Diabetes. Results: Through RNA sequencing and variant specific qPCR analyses we demonstrate that the true abundance of INS-IGF2...... proteomics analysis we could not demonstrate INS-IGF2 protein in samples of human islets nor in EndoC-βH1. Conclusions: Sequence features, such as fusion transcripts spanning multiple genes can lead to unexpected results in gene expression analysis, and care must be taken in generating and interpreting...
Singh, Anil Kumar; Sharma, Vishal; Pal, Awadhesh Kumar; Acharya, Vishal; Ahuja, Paramvir Singh
NAC [no apical meristem (NAM), Arabidopsis thaliana transcription activation factor (ATAF1/2) and cup-shaped cotyledon (CUC2)] proteins belong to one of the largest plant-specific transcription factor (TF) families and play important roles in plant developmental processes, response to biotic and abiotic cues, and hormone signalling. Our genome-wide analysis identified 110 StNAC genes in potato encoding 136 proteins, including 14 membrane-bound TFs. The physical map positions of StNAC genes on the 12 potato chromosomes were non-random, and 40 genes were found to be distributed in 16 clusters. The StNAC proteins were phylogenetically clustered into 12 subgroups. Phylogenetic analysis of StNACs along with their Arabidopsis and rice counterparts divided these proteins into 18 subgroups. Our comparative analysis has also identified 36 putative TNAC proteins, which appear to be restricted to the Solanaceae family. In silico expression analysis, using Illumina RNA-seq transcriptome data, revealed tissue-specific, biotic and abiotic stress-responsive, and hormone-responsive expression profiles of StNAC genes. Several StNAC genes, including StNAC072 and StNAC101 that are orthologs of the known stress-responsive Arabidopsis RESPONSIVE TO DEHYDRATION 26 (RD26), were identified as highly abiotic stress responsive. Quantitative real-time polymerase chain reaction analysis largely corroborated the expression profile of StNAC genes as revealed by the RNA-seq data. Taken together, this analysis points to putative functions of several StNAC TFs, which will provide a blueprint for their functional characterization and utilization in potato improvement.
Liu, Yi; Han, Dali; Han, Yixing; Yan, Zheng; Xie, Bin; Li, Jing; Qiao, Nan; Hu, Haiyang; Khaitovich, Philipp; Gao, Yuan; Han, Jing-Dong J
Rhesus macaque is a widely used primate model organism. Its genome annotations are however still largely comparative computational predictions derived mainly from human genes, which precludes studies on the macaque-specific genes, gene isoforms or their regulations. Here we took advantage of histone H3 lysine 4 trimethylation (H3K4me3)'s ability to mark transcription start sites (TSSs) and the recently developed ChIP-Seq and RNA-Seq technology to survey the transcript structures. We generated 14,013,757 sequence tags by H3K4me3 ChIP-Seq and obtained 17,322,358 paired end reads for mRNA, and 10,698,419 short reads for sRNA from the macaque brain. By integrating these data with genomic sequence features and extending and improving a state-of-the-art TSS prediction algorithm, we ab initio predicted and verified 17,933 of previously electronically annotated TSSs at 500-bp resolution. We also predicted approximately 10,000 novel TSSs. These provide an important rich resource for close examination of the species-specific transcript structures and transcription regulations in the Rhesus macaque genome. Our approach exemplifies a relatively inexpensive way to generate a reasonably reliable TSS map for a large genome. It may serve as a guiding example for similar genome annotation efforts targeted at other model organisms.
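As a rough illustration of what checking predicted TSSs against annotated ones "at 500-bp resolution" could look like, the sketch below treats a prediction as supported when an annotated TSS lies within 500 bp on the same chromosome. That reading of the phrase, the single-chromosome simplification, the function name and the coordinates are all assumptions made for illustration, not the authors' procedure.

```python
from bisect import bisect_left
from typing import List


def match_within(predicted: List[int], annotated: List[int], window: int = 500) -> List[int]:
    """Return predicted positions that lie within `window` bp of any annotated position.

    Positions are assumed to be coordinates on a single chromosome and strand.
    """
    annotated = sorted(annotated)
    matched = []
    for pos in predicted:
        i = bisect_left(annotated, pos)
        # Only the nearest annotated sites on either side of pos can be within range.
        neighbors = annotated[max(i - 1, 0): i + 1]
        if any(abs(pos - site) <= window for site in neighbors):
            matched.append(pos)
    return matched


# Hypothetical coordinates for illustration only.
annotated_tss = [1000, 5000, 12000]
predicted_tss = [1200, 5600, 11600, 20000]
supported = match_within(predicted_tss, annotated_tss)
novel = [p for p in predicted_tss if p not in supported]
```

Sorting once and using binary search keeps the lookup fast when tens of thousands of sites are compared, which is the scale reported in the abstract.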
BACKGROUND: The androgen receptor (AR) is a steroid-activated transcription factor that binds at specific DNA locations and plays a key role in the etiology of prostate cancer. While numerous studies have identified a clear connection between AR binding and expression of target genes for a limited number of loci, high-throughput elucidation of these sites allows for a deeper understanding of the complexities of this process. METHODOLOGY/PRINCIPAL FINDINGS: We have mapped 189 AR occupied regions (ARORs) and 1,388 histone H3 acetylation (AcH3) loci to a 3% continuous stretch of human genomic DNA using chromatin immunoprecipitation (ChIP) microarray analysis. Of 62 highly reproducible ARORs, 32 (52%) were also marked by AcH3. While the number of ARORs detected in prostate cancer cells exceeded the number of nearby DHT-responsive genes, the AcH3 mark defined a subclass of ARORs much more highly associated with such genes: 12% of the genes flanking AcH3+ARORs were DHT-responsive, compared to only 1% of genes flanking AcH3-ARORs. Most ARORs contained enhancer activities as detected in luciferase reporter assays. Analysis of the AROR sequences, followed by site-directed ChIP, identified binding sites for AR transcriptional coregulators FoxA1, CEBPbeta, NFI and GATA2, which had diverse effects on endogenous AR target gene expression levels in siRNA knockout experiments. CONCLUSIONS/SIGNIFICANCE: We suggest that only some ARORs function under the given physiological conditions, utilizing diverse mechanisms. This diversity points to differential regulation of gene expression by the same transcription factor related to the chromatin structure.
Genome-wide microarray analysis (Affymetrix array) was used (i) to determine whether only one gene, the cytochrome P450 enzyme Cyp6g1, is differentially transcribed in dichlorodiphenyltrichloroethane (DDT)-resistant vs. -susceptible Drosophila; and (ii) to profile common genes differentially transcribed across a DDT-resistant field isolate [Rst(2)DDTWisconsin] and a laboratory DDT-selected population [Rst(2)DDT91-R]. Statistical analysis (ANOVA model) identified 158 probe sets that were diffe...
Cruz-Plancarte, Indira; Cazares, Adrián; Guarneros, Gabriel
Previously, a collection of virulent phages infecting Pseudomonas aeruginosa was isolated from open water reservoirs and residual waters. Here, we described the comparative genomics of a set of five related phages from the collection, the physical structure of the genome, the structural proteomics of the virion, and the transcriptional program of archetypal phage PaMx41. The phage genomes were closely associated with each other and with those of two other P. aeruginosa phages, 119X and PaP2, which were previously filed in the databases. Overall, the genomes were approximately 43 kb, harboring 53 conserved open reading frames (ORFs) and three short ORFs in indel regions and containing 45% GC content. The genome of PaMx41 was further characterized as a linear, terminally redundant DNA molecule. A total of 16 ORFs were associated with putative functions, including nucleic acid metabolism, morphogenesis, and lysis, and eight virion proteins were identified through mass spectrometry. However, the coding sequences without assigned functions represent 70% of the ORFs. The PaMx41 transcription program was organized in early, middle, and late expressed genomic modules, which correlated with regions containing functionally related genes. The high genomic conservation among these distantly isolated phages suggests that these viruses undergo selective pressure to remain unchanged. The 119X lineage represents a unique set of phages that corresponds to a novel phage group. The features recognized in the genomes and the broad host range of clinical strains suggest that these phages are candidates for therapy applications.
Liu, Sheng-Rui; Li, Wen-Yang; Long, Dang; Hu, Chun-Gen; Zhang, Jin-Zhi
Microsatellites or simple sequence repeats (SSRs) are one of the most popular sources of genetic markers and play a significant role in plant genetics and breeding. In this study, we identified citrus SSRs in the genome of Clementine mandarin and analyzed their frequency and distribution in different genomic regions. A total of 80,708 SSRs were detected in the genome with an overall density of 268 SSRs/Mb. While di-nucleotide repeats were the most frequent microsatellites in genomic DNA sequence, tetra-nucleotides, which had more repeat units than any other SSR types, had the highest cumulative sequence length. We identified 6,834 transcripts as containing 8,989 SSRs in 33,929 Clementine mandarin transcripts, among which tri-nucleotide motifs (36.0%) were the most common, followed by di-nucleotide (26.9%) and hexa-nucleotide motifs (15.1%). The motif AG (16.7%) was most abundant among these SSRs, while motifs AAG (6.6%), AAT (5.0%), and TAG (2.2%) were most common among tri-nucleotides. Functional categorization of transcripts containing SSRs revealed that 5,879 (86.0%) of such transcripts had homology with known proteins. GO and KEGG annotation revealed that transcripts containing SSRs were implicated in diverse biological processes in plants, including binding, development, transcription, and protein degradation. When 27 genomic and 78 randomly selected SSRs were tested on Clementine mandarin, 95 SSRs revealed polymorphism. These 95 SSRs were further deployed on 18 genotypes of the three genera of Rutaceae for the genetic diversity assessment; genomic SSRs generally show low transferability in comparison to SSRs developed from expressed sequences. These transcript-markers identified in our study may provide a valuable genetic and genomic tool for further genetic research and varietal development in citrus, such as diversity study, QTL mapping, molecular breeding, comparative mapping and other genetic analyses.
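A minimal sketch of how perfect SSRs might be located with a regular expression, and how a density figure such as 268 SSRs/Mb is computed. The abstract does not state the detection criteria, so the minimum repeat counts and the function names below are assumptions chosen only for illustration.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    # Hypothetical minimum repeat counts per motif length (not from the study).
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "AGAG").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

def ssr_density_per_mb(n_ssrs, genome_bp):
    """SSRs per megabase of sequence."""
    return n_ssrs / (genome_bp / 1e6)
```

Taken at face value, 80,708 SSRs at 268 SSRs/Mb corresponds to roughly 301 Mb of scanned sequence (80,708 / 268).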
Yount, Boyd; Roberts, Rhonda S.; Lindesmith, Lisa; Baric, Ralph S.
Live virus vaccines provide significant protection against many detrimental human and animal diseases, but reversion to virulence by mutation and recombination has reduced their appeal. Using severe acute respiratory syndrome coronavirus as a model, we engineered a different transcription regulatory circuit and isolated recombinant viruses. The transcription network allowed for efficient expression of the viral transcripts and proteins, and the recombinant viruses replicated to WT levels. Recombinant genomes were then constructed that contained mixtures of the WT and mutant regulatory circuits, reflecting recombinant viruses that might occur in nature. Although viable viruses could readily be isolated from WT and recombinant genomes containing homogeneous transcription circuits, chimeras that contained mixed regulatory networks were invariably lethal, because viable chimeric viruses were not isolated. Mechanistically, mixed regulatory circuits promoted inefficient subgenomic transcription from inappropriate start sites, resulting in truncated ORFs and effectively minimizing viral structural protein expression. Engineering regulatory transcription circuits of intercommunicating alleles successfully introduces genetic traps into a viral genome that are lethal in RNA recombinant progeny viruses. regulation | systems biology | vaccine design
Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.
Ultraviolet (UV) radiation from sunlight represents a constant threat to genome stability by generating modified DNA bases such as cyclobutane pyrimidine dimers (CPD) and pyrimidine-pyrimidone (6-4) photoproducts (6-4PP). If unrepaired, these lesions can have deleterious effects, including skin cancer. Mammalian cells are able to neutralize UV-induced photolesions through nucleotide excision repair (NER). The NER pathway has multiple components including seven xeroderma pigmentosum (XP) proteins (XPA to XPG) and numerous auxiliary factors, including ataxia telangiectasia and Rad3-related (ATR) protein kinase and RCC1 like domain (RLD) and homologous to the E6-AP carboxyl terminus (HECT) domain containing E3 ubiquitin protein ligase 2 (HERC2). In this review we highlight recent data on the transcriptional and posttranslational regulation of NER activity.
Lanza, Val F; Baquero, Fernando; de la Cruz, Fernando; Coque, Teresa M
AcCNET (Accessory genome Constellation Network) is a Perl application that aims to compare accessory genomes of a large number of genomic units, both at qualitative and quantitative levels. Using the proteomes extracted from the analysed genomes, AcCNET creates a bipartite network compatible with standard network analysis platforms. AcCNET allows merging phylogenetic and functional information about the concerned genomes, thus improving the capability of current methods of network analysis. The AcCNET bipartite network opens a new perspective to explore the pangenome of bacterial species, focusing on the accessory genome behind the idiosyncrasy of a particular strain and/or population.
Bovine spongiform encephalopathy (BSE) is a fatal, transmissible, neurodegenerative disease of cattle. To date, the disease process is still poorly understood. In this study, brain tissue samples from animals naturally infected with BSE were analysed to identify differentially regulated genes using Affymetrix GeneChip Bovine Genome Arrays. A total of 230 genes were shown to be differentially regulated and many of these genes encode proteins involved in immune response, apoptosis, cell adhesion, stress response and transcription. Seventeen genes are associated with the endoplasmic reticulum (ER) and 10 of these 17 genes are involved in stress related responses including ER chaperones, Grp94 and Grp170. Western blotting analysis showed that another ER chaperone, Grp78, was up-regulated in BSE. Up-regulation of these three chaperones strongly suggests the presence of ER stress and the activation of the unfolded protein response (UPR) in BSE. The occurrence of ER stress was also supported by changes in gene expression for cytosolic proteins, such as the chaperone pair of Hsp70 and DnaJ. Many genes associated with the ubiquitin-proteasome pathway and the autophagy-lysosome system were differentially regulated, indicating that both pathways might be activated in response to ER stress. A model is presented to explain the mechanisms of prion neurotoxicity using these ER stress related responses. Clustering analysis showed that the differently regulated genes found from the naturally infected BSE cases could be used to predict the infectious status of the samples experimentally infected with BSE from the previous study and vice versa. Proof-of-principle gene expression biomarkers were found to represent BSE using 10 genes with 94% sensitivity and 87% specificity.
Close Timothy J
Background: Rice and barley are both members of Poaceae (grass family) but have a marked difference in salt tolerance. The molecular mechanism underlying this difference was previously unexplored. This study employs a comparative genomics approach to identify analogous and contrasting gene expression patterns between rice and barley. Results: A hierarchical clustering approach identified several interesting expression trajectories among rice and barley genotypes. There were no major conserved expression patterns between the two species in response to salt stress. A wheat salt-stress dataset was queried for comparison with rice and barley. Roughly one-third of the salt-stress responses of barley were conserved with wheat while overlap between wheat and rice was minimal. These results demonstrate that, at transcriptome level, rice is strikingly different compared to the more closely related barley and wheat. This apparent lack of analogous transcriptional programs in response to salt stress is further highlighted through close examination of genes associated with root growth and development. Conclusion: The analysis provides support for the hypothesis that conservation of transcriptional signatures in response to environmental cues depends on the genetic similarity among the genotypes within a species, and on the phylogenetic distance between the species.
Ueki, Toshiyuki; Lovley, Derek R
Geobacter species play important roles in bioremediation of contaminated environments and in electricity production from waste organic matter in microbial fuel cells. To better understand physiology of Geobacter species, expression and function of citrate synthase, a key enzyme in the TCA cycle that is important for organic acid oxidation in Geobacter species, was investigated. Geobacter sulfurreducens did not require citrate synthase for growth with hydrogen as the electron donor and fumarate as the electron acceptor. Expression of the citrate synthase gene, gltA, was repressed by a transcription factor under this growth condition. Functional and comparative genomics approaches, coupled with genetic and biochemical assays, identified a novel transcription factor termed HgtR that acts as a repressor for gltA. Further analysis revealed that HgtR is a global regulator for genes involved in biosynthesis and energy generation in Geobacter species. The hgtR gene was essential for growth with hydrogen, during which hgtR expression was induced. These findings provide important new insights into the mechanisms by which Geobacter species regulate their central metabolism under different environmental conditions.
Cvijovic, M.; Olivares Hernandez, Roberto; Agren, R.
models. Systematic analysis of biological processes by means of modelling and simulations has made the identification of metabolic networks and prediction of metabolic capabilities under different conditions possible. For facilitating such systemic analysis, we have developed the BioMet Toolbox, a web......-based resource for stoichiometric analysis and for integration of transcriptome and interactome data, thereby exploiting the capabilities of genome-scale metabolic models. The BioMet Toolbox provides an effective user-friendly way to perform linear programming simulations towards maximized or minimized growth...... rates, substrate uptake rates and metabolic production rates by detecting relevant fluxes, simulate single and double gene deletions or detect metabolites around which major transcriptional changes are concentrated. These tools can be used for high-throughput in silico screening and allows fully...
Song, Giltae; Dickins, Benjamin J A; Demeter, Janos; Engel, Stacia; Gallagher, Jennifer; Choe, Kisurb; Dunn, Barbara; Snyder, Michael; Cherry, J Michael
The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.
Thanks to high-throughput experiments, biological conditions can be investigated at both the entire genomic and transcriptomic levels. In addition, protein-protein interaction (PPI) data are widely available for well-studied organisms, such as human. In this chapter, we will present an integrative approach that makes use of these data to find the PPI module involving the key regulated transcription factors shared by a number of given conditions. These conditions could be, for instance, different cancer types. Briefly, for the studied conditions, we need to identify commonly affected chromosomal regions subjected to copy number alterations, together with a list of differentially expressed genes in each condition. Transcription factor activity will be inferred from these regulated gene lists. Then, we will define TFs for which the activity could be explained by an associative effect of both loci copy number alteration and gene expression levels of their coding genes. PPI networks can then be mined using appropriate algorithms to find the significant module that connects those TFs together. This module could be viewed as the minimal connected network of TFs, the regulation of which is shared between the investigated conditions.
Wenger, A. M.
The human genome encodes 1500-2000 different transcription factors (TFs). ChIP-seq is revealing the global binding profiles of a fraction of TFs in a fraction of their biological contexts. These data show that the majority of TFs bind directly next to a large number of context-relevant target genes, that most binding is distal, and that binding is context specific. Because of the effort and cost involved, ChIP-seq is seldom used in search of novel TF function. Such exploration is instead done using expression perturbation and genetic screens. Here we propose a comprehensive computational framework for transcription factor function prediction. We curate 332 high-quality nonredundant TF binding motifs that represent all major DNA binding domains, and improve cross-species conserved binding site prediction to obtain 3.3 million conserved, mostly distal, binding site predictions. We combine these with 2.4 million facts about all human and mouse gene functions, in a novel statistical framework, in search of enrichments of particular motifs next to groups of target genes of particular functions. Rigorous parameter tuning and a harsh null are used to minimize false positives. Our novel PRISM (predicting regulatory information from single motifs) approach obtains 2543 TF function predictions in a large variety of contexts, at a false discovery rate of 16%. The predictions are highly enriched for validated TF roles, and 45 of 67 (67%) tested binding site regions in five different contexts act as enhancers in functionally matched cells.
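The abstract does not describe PRISM's statistic, so the following is not the authors' method. It is a generic hypergeometric enrichment test, sketched under the assumption that one asks whether genes near a motif's predicted binding sites are over-represented in a functional gene set. The function and parameter names are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """P(X >= k) for a hypergeometric draw.

    N: total genes considered
    K: genes annotated with the function of interest
    n: genes with a predicted binding site for the motif nearby
    k: genes that are both annotated and have a nearby site
    """
    upper = min(K, n)
    # Sum the upper tail; comb() returns 0 when a term is impossible.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)
```

A small p-value suggests the motif's targets are enriched for the function; any real analysis would also need multiple-testing control, which the abstract summarises as a false discovery rate of 16%.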
Wen, Chang-Long; Cheng, Qing; Zhao, Liqun; Mao, Aijun; Yang, Jingjing; Yu, Shuancang; Weng, Yiqun; Xu, Yong
Cucumber is vulnerable to many foliage diseases. Recent studies reported cloning of candidate genes for several diseases in cucumber; however, the exact defence mechanisms remain unclear. Dof genes have been shown to play significant roles in plant growth, development, and responses to biotic and abiotic stresses. Dof genes coding for plant-specific transcription factors can promote large-scale expression of defence-related genes at whole genome level. The genes in the family have been identified and characterized in several plant species, but not in cucumber. In the present study, we identified 36 CsDof members from the cucumber draft genomes which could be classified into eight groups. The proportions of the CsDof family genes, duplication events, chromosomal locations, cis-elements and miRNA target sites were comprehensively investigated. Consequently, we analysed the expression patterns of CsDof genes in specific tissues and their response to two biotic stresses (watermelon mosaic virus and downy mildew). These results indicated that CsDof may be involved in resistance to biotic stresses in cucumber.
David L Bernick
Within the domain Archaea, the CRISPR immune system appears to be nearly ubiquitous based on computational genome analyses. Initial studies in bacteria demonstrated that the CRISPR system targets invading plasmid and viral DNA. Recent experiments in the model archaeon Pyrococcus furiosus uncovered a novel RNA-targeting variant of the CRISPR system potentially unique to archaea. Because our understanding of CRISPR system evolution in other archaea is limited, we have taken a comparative genomic and transcriptomic view of the CRISPR arrays across six diverse species within the crenarchaeal genus Pyrobaculum. We present transcriptional data from each of four species in the genus (P. aerophilum, P. islandicum, P. calidifontis, P. arsenaticum), analyzing mature CRISPR-associated small RNA abundance from over 20 arrays. Within the genus, there is remarkable conservation of CRISPR array structure, as well as unique features that have not been studied in other archaeal systems. These unique features include: a nearly invariant CRISPR promoter, conservation of direct repeat families, the 5' polarity of CRISPR-associated small RNA abundance, and a novel CRISPR-specific association with homologues of nurA and herA. These analyses provide a genus-level evolutionary perspective on archaeal CRISPR systems, broadening our understanding beyond existing non-comparative model systems.
Hazelett, Dennis J; Lakeland, Daniel L; Weiss, Joseph B
A new method was developed for identifying novel transcription factor regulatory targets based on calculating Local Affinity Density. Techniques from the signal-processing field were used, in particular the Hann digital filter, to calculate the relative binding affinity of different regions based on previously published in vitro binding data. To illustrate this approach, the complete genomes of Drosophila melanogaster and D. pseudoobscura were analyzed for binding sites of the homeodomain protein Tinman, an essential heart development gene in both Drosophila and mouse. The significant binding regions were identified relative to genomic background and assigned to putative target genes. Valid candidates common to both species of Drosophila were selected as a test of conservation. The new method was more sensitive than cluster searches for conserved binding motifs with respect to positive identification of known Tinman targets. Our Local Affinity Density method also identified a significantly greater proportion of Tinman-coexpressed genes than equivalent, optimized cluster searching. In addition, this new method predicted a significantly greater than expected number of genes with previously published RNAi phenotypes in the heart. Algorithms were implemented in Python, LISP, R and maxima, using MySQL to access locally mirrored sequence data from Ensembl (D. melanogaster release 4.3) and flybase (D. pseudoobscura). All code is licensed under GPL and freely available at http://www.ohsu.edu/cellbio/dev_biol_prog/affinitydensity/.
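The idea of smoothing per-position binding affinities with a Hann filter can be sketched as follows. This is an illustration under stated assumptions, not the authors' released code: the per-base affinity array, the window length of 101, and the function name are hypothetical, and only NumPy's `hanning` and `convolve` are used.

```python
import numpy as np

def local_affinity_density(affinity, window=101):
    """Smooth per-position affinity scores with a normalized Hann window.

    affinity: 1-D sequence of per-base binding affinity estimates, at least
              `window` positions long so that mode="same" preserves its length.
    window:   odd window length in bp (an illustrative choice).
    """
    w = np.hanning(window)
    w = w / w.sum()  # unit area, so the output is a weighted local mean
    return np.convolve(affinity, w, mode="same")
```

Positions where the smoothed density stands out against the genome-wide distribution would then be candidates for the "significant binding regions" the abstract mentions; how that background comparison is made is not specified there.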
Luo, C; Lu, X; Stubbs, L; Kim, J
YY2 was originally identified due to its unusual similarity to the evolutionarily well conserved, zinc-finger gene YY1. In this study, we have determined the evolutionary origin and conservation of YY2 using comparative genomic approaches. Our results indicate that YY2 is a retroposed copy of YY1 that has been inserted into another gene locus named Mbtps2 (membrane-bound transcription factor protease site 2). This retroposition is estimated to have occurred after the divergence of placental mammals from other vertebrates based on the detection of YY2 only in the placental mammals. The N-terminal and C-terminal regions of YY2 have evolved under different selection pressures. The N-terminal region has evolved at a very fast pace with very limited functional constraints whereas the DNA-binding, C-terminal region still maintains very similar sequence structure as YY1 and is also well conserved among placental mammals. In situ hybridizations using different adult mouse tissues indicate that mouse YY2 is expressed at relatively low levels in Purkinje and granular cells of cerebellum, and neuronal cells of cerebrum, but at very high levels in testis. The expression levels of YY2 are much lower than those of YY1, but the overall spatial expression patterns are similar to those of Mbtps2, suggesting a possible shared transcriptional control between YY2 and Mbtps2. Taken together, the formation and evolution of YY2 represent a very unusual case where a transcription factor was first retroposed into another gene locus encoding a protease and survived with different selection schemes and expression patterns.
Damiani, A M; Jang, H K; Matsumura, T; Yokoyama, N; Miyazawa, T; Mikami, T
To map the transcripts encoding the equine herpesvirus type 4 (EHV-4) glycoproteins I (gI) and E (gE), transcriptional analyses were performed at the right part of the unique short segment of EHV-4 genome. The results revealed that the gI gene is encoded by a 1.6-kb transcript which is 3' coterminal with a 3.0-kb gD mRNA while the gE gene is encoded by two transcripts of 3.5- and 2.4-kb in size. The transcriptional patterns described in this study for the EHV-4 gI and gE are similar to those found in the equivalent region of herpes simplex virus type 1 and feline herpesvirus type 1. Characterization of EHV-4 gI and gE glycoprotein genes may facilitate future studies to define their roles in the EHV-4 infection.
Zhang, Yong; Zhang, Feng; Li, Xiaohong; Baller, Joshua A; Qi, Yiping; Starker, Colby G; Bogdanove, Adam J; Voytas, Daniel F
The ability to precisely engineer plant genomes offers much potential for advancing basic and applied plant biology. Here, we describe methods for the targeted modification of plant genomes using transcription activator-like effector nucleases (TALENs). Methods were optimized using tobacco (Nicotiana tabacum) protoplasts and TALENs targeting the acetolactate synthase (ALS) gene. Optimal TALEN scaffolds were identified using a protoplast-based single-strand annealing assay in which TALEN cleavage creates a functional yellow fluorescent protein gene, enabling quantification of TALEN activity by flow cytometry. Single-strand annealing activity data for TALENs with different scaffolds correlated highly with their activity at endogenous targets, as measured by high-throughput DNA sequencing of polymerase chain reaction products encompassing the TALEN recognition sites. TALENs introduced targeted mutations in ALS in 30% of transformed cells, and the frequencies of targeted gene insertion approximated 14%. These efficiencies made it possible to recover genome modifications without selection or enrichment regimes: 32% of tobacco calli generated from protoplasts transformed with TALEN-encoding constructs had TALEN-induced mutations in ALS, and of 16 calli characterized in detail, all had mutations in one allele each of the duplicate ALS genes (SurA and SurB). In calli derived from cells treated with a TALEN and a 322-bp donor molecule differing by 6 bp from the ALS coding sequence, 4% showed evidence of targeted gene replacement. The optimized reagents implemented in plant protoplasts should be useful for targeted modification of cells from diverse plant species and using a variety of means for reagent delivery.
Genome replication and transcription of Tomato spotted wilt virus (TSWV, genus Tospovirus) follows in most aspects the general rules for negative strand RNA viruses with segmented genomes. One common feature is the occurrence of "cap snatching" during transcription initiation. During this process,
Zhukova, Anna; Fernandes, Luis Guilherme; Hugon, Perrine; Pappas, Christopher J.; Sismeiro, Odile; Coppée, Jean-Yves; Becavin, Christophe; Malabat, Christophe; Eshghi, Azad; Zhang, Jun-Jie; Yang, Frank X.; Picardeau, Mathieu
Leptospira are emerging zoonotic pathogens transmitted from animals to humans typically through contaminated environmental sources of water and soil. Regulatory pathways of pathogenic Leptospira spp. underlying the adaptive response to different hosts and environmental conditions remain elusive. In this study, we provide the first global Transcriptional Start Site (TSS) map of a Leptospira species. RNA was obtained from the pathogen Leptospira interrogans grown at 30°C (optimal in vitro temperature) and 37°C (host temperature) and selectively enriched for 5′ ends of native transcripts. A total of 2865 and 2866 primary TSS (pTSS) were predicted in the genome of L. interrogans at 30 and 37°C, respectively. The majority of the pTSSs were located between 0 and 10 nucleotides from the translational start site, suggesting that leaderless transcripts are a common feature of the leptospiral translational landscape. Comparative differential RNA-sequencing (dRNA-seq) analysis revealed conservation of most pTSS at 30 and 37°C. Promoter prediction algorithms allowed the identification of the binding sites of the alternative sigma factor sigma 54. However, other motifs were not identified, indicating that Leptospira consensus promoter sequences are inherently different from the Escherichia coli model. RNA sequencing also identified 277 and 226 putative small regulatory RNAs (sRNAs) at 30 and 37°C, respectively, including eight sRNAs validated by Northern blots. These results provide the first global view of TSS and the repertoire of sRNAs in L. interrogans. These data will establish a foundation for future experimental work on gene regulation under various environmental conditions including those in the host. PMID:28154810
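The 0 to 10 nucleotide criterion for leaderless transcripts can be expressed as a small check on the distance between a primary TSS and the translational start. The coordinate convention (1-D integer positions, with strand given as "+" or "-") and the function name are assumptions for illustration; only the 0-10 nt window comes from the abstract.

```python
def is_leaderless(tss, start_codon, strand="+", max_utr=10):
    """True if the 5' UTR (TSS to translational start) is 0..max_utr nt long.

    tss, start_codon: integer genome coordinates (hypothetical convention).
    strand: "+" if the gene runs toward higher coordinates, "-" otherwise.
    """
    utr_len = start_codon - tss if strand == "+" else tss - start_codon
    # A negative length means the TSS lies downstream of the start codon.
    return 0 <= utr_len <= max_utr
```

For instance, a pTSS at position 100 with a start codon at 105 on the plus strand would be called leaderless, while one with a start codon at 150 would not.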
The teosinte branched1/cycloidea/proliferating cell factor (TCP) gene family is a plant-specific transcription factor family that participates in the control of plant development by regulating cell proliferation. However, no report is currently available about this gene family in turnips (Brassica rapa ssp. rapa). In this study, a genome-wide analysis of TCP genes was performed in turnips. Thirty-nine TCP genes were identified in the turnip genome and were distributed on 10 chromosomes. Phylogenetic analysis clearly showed that the family was classified into two clades: class I and class II. Gene structure and conserved motif analysis showed that genes of the same clade have similar gene structures and conserved motifs. The expression profiles of the 39 TCP genes were determined through quantitative real-time PCR. Most CIN-type BrrTCP genes were highly expressed in leaf. The members of the CYC/TB1 subclade were highly expressed in flower bud and weakly expressed in root. By contrast, the class I clade showed more widespread but less tissue-specific expression patterns. Yeast two-hybrid data show that BrrTCP proteins preferentially formed heterodimers. The function of BrrTCP2 was confirmed through ectopic expression of BrrTCP2 in wild-type Arabidopsis and in a loss-of-function ortholog mutant of Arabidopsis. Overexpression of BrrTCP2 in wild-type Arabidopsis resulted in diminished leaf size. Overexpression of BrrTCP2 in the tcp2/4/10 triple mutant restored the leaf phenotype of tcp2/4/10 to that of the wild type. This comprehensive analysis of the turnip TCP gene family provides a foundation for further study of the roles of TCP genes in turnips.
Matson, Eric G.; Rosenthal, Adam Z.; Zhang, Xinning; Leadbetter, Jared R.
When prokaryotic cells acquire mutations, encounter translation-inhibiting substances, or experience adverse environmental conditions that limit their ability to synthesize proteins, transcription can become uncoupled from translation. Such uncoupling is known to suppress transcription of protein-encoding genes in bacteria. Here we show that the trace element selenium controls transcription of the gene for the selenocysteine-utilizing enzyme formate dehydrogenase (fdhFSec) through a translation-coupled mechanism in the termite gut symbiont Treponema primitia, a member of the bacterial phylum Spirochaetes. We also evaluated changes in genome-wide transcriptional patterns caused by selenium limitation and by generally uncoupling translation from transcription via antibiotic-mediated inhibition of protein synthesis. We observed that inhibiting protein synthesis in T. primitia influences transcriptional patterns in unexpected ways. In addition to suppressing transcription of certain genes, the expected consequence of inhibiting protein synthesis, we found numerous examples in which transcription of genes and operons is truncated far downstream from putative promoters, is unchanged, or is even stimulated overall. These results indicate that gene regulation in bacteria allows for specific post-initiation transcriptional responses during periods of limited protein synthesis, which may depend both on translational coupling and on unclassified intrinsic elements of protein-encoding genes. PMID:24222491
Sang Woo Seo
Three transcription factors (TFs), OxyR, SoxR, and SoxS, play a critical role in transcriptional regulation of the defense system for oxidative stress in bacteria. However, their full genome-wide regulatory potential is unknown. Here, we perform a genome-scale reconstruction of the OxyR, SoxR, and SoxS regulons in Escherichia coli K-12 MG1655. Integrative data analysis reveals that a total of 68 genes in 51 transcription units (TUs) belong to these regulons. Among them, 48 genes showed more than 2-fold changes in expression level under single-TF-knockout conditions. This reconstruction expands the genome-wide roles of these factors to include direct activation of genes related to amino acid biosynthesis (methionine and aromatic amino acids), cell wall synthesis (lipid A biosynthesis and peptidoglycan growth), and divalent metal ion transport (Mn2+, Zn2+, and Mg2+). Investigating the co-regulation of these genes with other stress-response TFs reveals that they are independently regulated by stress-specific TFs.
Yun E Wang
Mitochondria contain a 16.6 kb circular genome encoding 13 proteins as well as mitochondrial tRNAs and rRNAs. Copies of the genome are organized into nucleoids containing both DNA and proteins, including the machinery required for mtDNA replication and transcription. The transcription factor TFAM is critical for initiation of transcription and replication of the genome, and is also thought to perform a packaging function. Although specific binding sites required for initiation of transcription have been identified in the D-loop, little is known about the characteristics of TFAM binding in its nonspecific packaging state. In addition, it is unclear whether TFAM also plays a role in the regulation of nuclear gene expression. Here we investigate these questions by using ChIP-seq to directly localize TFAM binding to DNA in human cells. Our results demonstrate that TFAM uniformly coats the whole mitochondrial genome, with no evidence of robust TFAM binding to the nuclear genome. Our study represents the first high-resolution assessment of TFAM binding on a genome-wide scale in human cells.
Ni, Yang; Stingo, Francesco C; Baladandayuthapani, Veerabhadran
Rapid development of genome-wide profiling technologies has made it possible to conduct integrative analysis on genomic data from multiple platforms. In this study, we develop a novel integrative Bayesian network approach to investigate the relationships between genetic and epigenetic alterations as well as how these mutations affect a patient's clinical outcome. We take a Bayesian network approach that admits a convenient decomposition of the joint distribution into local distributions. Exploiting the prior biological knowledge about regulatory mechanisms, we model each local distribution as linear regressions. This allows us to analyze multi-platform genome-wide data in a computationally efficient manner. We illustrate the performance of our approach through simulation studies. Our methods are motivated by and applied to a multi-platform glioblastoma dataset, from which we reveal several biologically relevant relationships that have been validated in the literature as well as new genes that could potentially be novel biomarkers for cancer progression.
Riley, Robert; Salamov, Asaf; Henrissat, Bernard; Nagy, Laszlo; Brown, Daren; Held, Benjamin; Baker, Scott; Blanchette, Robert; Boussau, Bastien; Doty, Sharon L.; Fagnan, Kirsten; Floudas, Dimitris; Levasseur, Anthony; Manning, Gerard; Martin, Francis; Morin, Emmanuelle; Otillar, Robert; Pisabarro, Antonio; Walton, Jonathan; Wolfe, Ken; Hibbett, David; Grigoriev, Igor
Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprotrophs, including the majority of wood-decaying and ectomycorrhizal species. To better understand the genetic diversity of this phylum, we compared the genomes of 35 basidiomycetes, including 6 newly sequenced genomes. These genomes span extremes of genome size, gene number, and repeat content. Analysis of core genes reveals that some 48 percent of basidiomycete proteins are unique to the phylum, with nearly half of those (22 percent) found in only one organism. Correlations between lifestyle and certain gene families are evident. Phylogenetic patterns of plant biomass-degrading genes in Agaricomycotina suggest a continuum rather than a dichotomy between the white rot and brown rot modes of wood decay. Based on phylogenetically informed PCA analysis of wood decay genes, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has typical ligninolytic class II fungal peroxidases (PODs). This prediction is supported by growth assays in which both fungi exhibit wood decay with white rot-like characteristics. Based on this, we suggest that the white/brown rot dichotomy may be inadequate to describe the full range of wood-decaying fungi. Analysis of the rate of discovery of proteins with no or few homologs suggests the value of continued sequencing of basidiomycete fungi.
Nielsen, Ronni; Mandrup, Susanne
The recent advances in high-throughput sequencing combined with various other technologies have allowed detailed and genome-wide insight into the transcriptional networks that control adipogenesis. Chromatin immunoprecipitation (ChIP) combined with high-throughput sequencing (ChIP-seq) is one...
Clausing, Emanuel; Mayer, Andreas; Chanarat, Sittinan
foci. Interestingly, the DNA damage sensitivity of an rfa1 mutant was suppressed by bur1 mutation, further underscoring a functional link between these two protein complexes. The transcription elongation factor Bur1-Bur2 interacts with RPA and maintains genome integrity during DNA replication stress....
Alternative splicing generates multiple transcript and protein isoforms from the same gene and thus is important in gene expression regulation. To date, RNA-sequencing (RNA-seq) is the standard method for quantifying changes in alternative splicing on a genome-wide scale. Understanding the current limitations of RNA-seq is crucial for reliable analysis, and the lack of high-quality, comprehensive transcriptomes for most species, including model organisms such as Arabidopsis, is a major constraint on accurate quantification of transcript isoforms. To address this, we designed a novel pipeline with stringent filters and assembled a comprehensive Reference Transcript Dataset for Arabidopsis (AtRTD2) containing 82,190 non-redundant transcripts from 34,212 genes. Extensive experimental validation showed that AtRTD2 and its modified version, AtRTD2-QUASI, for use in Quantification of Alternatively Spliced Isoforms, outperform other available transcriptomes in RNA-seq analysis. This strategy can be implemented in other species to build a pipeline for transcript-level expression and alternative splicing analyses.
Peng, Sen; Dhruv, Harshil; Armstrong, Brock; Salhia, Bodour; Legendre, Christophe; Kiefer, Jeffrey; Parks, Julianna; Virk, Selene; Sloan, Andrew E; Ostrom, Quinn T; Barnholtz-Sloan, Jill S; Tran, Nhan L; Berens, Michael E
To elucidate molecular features associated with disproportionate survival of glioblastoma (GB) patients, we conducted deep genomic comparative analysis of a cohort of patients receiving standard therapy (surgery plus concurrent radiation and temozolomide); "GB outliers" were identified: long-term survivor of 33 months (LTS; n = 8) versus short-term survivor of 7 months (STS; n = 10). We implemented exome, RNA, whole genome sequencing, and DNA methylation for collection of deep genomic data from STS and LTS GB patients. LTS GB showed frequent chromosomal gains in 4q12 (platelet derived growth factor receptor alpha and KIT) and 12q14.1 (cyclin-dependent kinase 4), and deletion in 19q13.33 (BAX, branched chain amino-acid transaminase 2, and cluster of differentiation 33). STS GB showed frequent deletion in 9p11.2 (forkhead box D4-like 2 and aquaporin 7 pseudogene 3) and 22q11.21 (Hypermethylated In Cancer 2). LTS GB showed 2-fold more frequent copy number deletions compared with STS GB. Gene expression differences showed the STS cohort with altered transcriptional regulators: activation of signal transducer and activator of transcription (STAT)5a/b, nuclear factor-kappaB (NF-κB), and interferon-gamma (IFNG), and inhibition of mitogen-activated protein kinase (MAPK1), extracellular signal-regulated kinase (ERK)1/2, and estrogen receptor (ESR)1. Expression-based biological concepts prominent in the STS cohort include metabolic processes, anaphase-promoting complex degradation, and immune processes associated with major histocompatibility complex class I antigen presentation; the LTS cohort features genes related to development, morphogenesis, and the mammalian target of rapamycin signaling pathway. Whole genome methylation analyses showed that a methylation signature of 89 probes distinctly separates LTS from STS GB tumors. We posit that genomic instability is associated with longer survival of GB (possibly with vulnerability to standard therapy); conversely, genomic
The constitutive expression of the high-risk HPV E6 and E7 viral oncogenes is the major cause of cervical cancer. To comprehensively explore the composition of HPV16 early transcripts and their genomic annotation, cervical squamous epithelial tissues from 40 HPV16-infected patients were collected for analysis of papillomavirus oncogene transcripts (APOT). We observed different transcription patterns of HPV16 oncogenes in the progression of cervical lesions to cervical cancer and identified one novel transcript. Multiple-integration events in the tissues of cervical carcinoma (CxCa) are significantly more frequent than in those of low-grade squamous intraepithelial lesions (LSIL) and high-grade squamous intraepithelial lesions (HSIL). Moreover, most cellular genes within or near these integration sites are cancer-associated genes. Taken together, this study suggests that multiple integrations of the HPV genome during persistent viral infection, which alter the expression patterns of viral oncogenes and integration-related cellular genes, play a crucial role in the progression of cervical lesions to cervical cancer.
The basic helix-loop-helix (bHLH) transcription factors are one of the largest families of gene regulatory proteins and play crucial roles in genetic, developmental and physiological processes in eukaryotes. Here, we conducted a survey of the Sus scrofa genome and identified 109 putative bHLH transcription factor members belonging to super-groups A, B, C, D, E, and F, while four members were orphan genes. We identified the 6 most significantly enriched KEGG pathways and the 116 most significant GO annotation categories. Further comprehensive surveys of the human genome and 12 other medical databases identified 72 significantly enriched biological pathways associated with these 113 pig bHLH transcription factors. From the functional protein association network analysis, 93 hub proteins were identified, and 55 hub proteins created a tight network or a functional module within their protein families. In particular, 20 hub proteins were found to be highly connected in the functional interaction network. The present study deepens our understanding of, and provides insights into, the evolution and functional aspects of animal bHLH proteins, and should serve as a solid foundation for further analyses of specific bHLH transcription factors in the pig and other mammals.
Villacreses, Javier; Rojas-Herrera, Marcelo; Sánchez, Carolina; Hewstone, Nicole; Undurraga, Soledad F; Alzate, Juan F; Manque, Patricio; Maracaja-Coutinho, Vinicius; Polanco, Victor
Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 share 66%-73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV), Petuvirus genus. ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance to Petuvirus. Overall, our data suggest that AcV1 could be a new member of the Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant.
Cong-Qing Wu; Hong-Hong Hu; Ya Zeng; Da-Cheng Liang; Ka-Bin Xie; Jian-Wei Zhang; Zhao-Hui Chu; Li-Zhong Xiong
Numerous studies have shown that an array of transcription factors has a role in regulating plant responses to environmental stresses. Only a small portion of them, however, have been identified or characterized. More than 2,300 putative transcription factors were predicted in the rice genome, and more than half of them were supported by expressed sequences. In an attempt to identify novel transcription factors involved in the stress responses, a cDNA array containing 753 putative rice transcription factors was generated to analyze the transcript profiles of these genes under drought and salinity stresses and abscisic acid treatment at the seedling stage of rice. About 80% of these transcription factors showed detectable levels of transcript in seedling leaves. A total of 18 up-regulated transcription factors and 29 down-regulated transcription factors were detected, with fold changes from 2.0 to 20.5 in at least one stress treatment. Most of these stress-responsive genes have not been reported previously, and the expression patterns of five genes under stress conditions were further analyzed by RNA gel blot analysis. These novel stress-responsive transcription factors provide new opportunities to study the regulation of gene expression in plants under stress conditions.
Iwata, A; Ueda, S; Ishihama, A; Hirai, K
The number of 132-bp tandem direct repeats within the long inverted repeat region of the Marek's disease virus type 1 (MDV1) genome increases concomitantly with the loss of oncogenicity during serial passages in cultured cells. Twelve clones carrying the 132-bp sequence were isolated from a cDNA library constructed from chicken embryo fibroblasts infected with the MDV1 Md5 strain. Through sequence analysis of a cDNA clone and primer extension analysis, the corresponding mRNA was found to be a linear transcript which included the two 132-bp tandem direct repeats. Two open reading frames were found in this transcript. One had a weak homology with v-fms. The other should increase in size concomitantly with expansion of the 132-bp tandem direct repeat. PCR analysis of both cDNA clones and RNA gave amplified products which were as large as that produced from the genomic clone, indicating that a majority of mRNA from this region is composed of unspliced transcripts.
Anderson, Letícia; Pierce, Raymond J; Verjovski-Almeida, Sergio
Schistosoma mansoni is a human endoparasite with a complex life cycle that also infects an invertebrate mollusk intermediate host and exhibits many diverse phenotypes. Its complexity is reflected in a large genome and different transcriptome profiles specific to each life cycle stage. Epigenetic regulation of gene expression, such as the post-translational modification of histones, has a significant impact on phenotypes, and this information storage function resides primarily at histone tails, which results in a varied histone code. Evidence of transcription of the different histone families at all life stages of the parasite was detected by a survey of transcriptome databases; manual curation of each gene prediction at the genome sequence level showed errors in the coding sequences of three of them. The biogenesis of histones is coupled to DNA replication, and a detailed in silico analysis of the specialized machinery of histone mRNA processing in the S. mansoni genome reveals that it is as conserved as in other eukaryotes, consisting of transcription factors and stem-loop binding proteins which recognize the stem-loop structure at the histone mRNA 3'UTR. Histone modifying enzymes (HMEs) such as histone acetyltransferases, methyltransferases and deacetylases (HDACs) have been described in S. mansoni, and their potential as new therapeutic targets was evidenced by the apoptotic phenotype that resulted from HDAC inhibition. However, the overall regulation of transcription coupled with gene expression profiles correlated to histone modifications has not yet been characterized. Besides the interaction of HMEs with histones, many factors involved in cellular processes are known to bind to histones, and were identified here by an in silico analysis of the S. mansoni genome. Knowledge of the histone families opens up perspectives for further studies that will lead to a better identification of their post-translational modifications, their gene regulation and to the
Rebound of HIV viremia after interruption of anti-retroviral therapy is due to the small population of CD4+ T cells that remain latently infected. HIV-1 transcription is the main process controlling post-integration latency. Regulation of HIV-1 transcription takes place at both initiation and elongation levels. Pausing of RNA polymerase II at the 5' end of HIV-1 transcribed region (5'HIV-TR), which is immediately downstream of the transcription start site, plays an important role in the regulation of viral expression. The activation of HIV-1 transcription correlates with the rearrangement of a positioned nucleosome located at this region. These two facts suggest that the 5'HIV-TR contributes to inhibit basal transcription of those HIV-1 proviruses that remain latently inactive. However, little is known about the cell elements mediating the repressive role of the 5'HIV-TR. We performed a genetic analysis of this phenomenon in Saccharomyces cerevisiae after reconstructing a minimal HIV-1 transcriptional system in this yeast. Unexpectedly, we found that the critical role played by the 5'HIV-TR in maintaining low levels of basal transcription in yeast is mediated by FACT, Spt6, and Chd1, proteins so far associated with chromatin assembly and disassembly during ongoing transcription. We confirmed that this group of factors plays a role in HIV-1 postintegration latency in human cells by depleting the corresponding human orthologs with shRNAs, both in HIV latently infected cell populations and in particular single-integration clones, including a latent clone with a provirus integrated in a highly transcribed gene. Our results indicate that chromatin reassembly factors participate in the establishment of the equilibrium between activation and repression of HIV-1 when it integrates into the human genome, and they open the possibility of considering these factors as therapeutic targets of HIV-1 latency.
Qibin Luo; Qing Zhou; Xiaomin Yu; Hongbin Lin; Songnian Hu; Jun Yu
MicroRNAs (miRNAs) are endogenous 22-nt RNAs, which play important regulatory roles by post-transcriptional gene silencing. A computational strategy has been developed for the identification of conserved miRNAs, based on features of known metazoan miRNAs, in the red flour beetle (Tribolium castaneum), which is regarded as one of the major laboratory models of arthropods. Among 118 putative miRNAs predicted in the red flour beetle, 47% are harbored by known protein-coding genes (intronic miRNAs) and 53% are located outside such genes (intergenic miRNAs). There are 31 intronic miRNAs in the same transcriptional orientation as their host genes, which may share RNA polymerase II and spliceosomal machinery with their host genes for their biogenesis. A hypothetical feedback model has been proposed based on the analysis of the relationship between intronic miRNAs and their host genes in the development of the red flour beetle.
Salamov, Asaf; Grigoriev, Igor
Transcription factors (TFs) are proteins that regulate the transcription of genes by binding to specific DNA sequences. Based on the literature (Shelest, 2008; Weirauch and Hughes, 2011), we collected and manually curated a list of DBD Pfam domains (62 DBD domains in total). We looked for the distribution of TFs in 395 fungal genomes and, additionally, in plant genomes (Phytozome), prokaryotes (IMG), and some animal (metazoan) and protist genomes.
Ramos, Miguel Jesus Nunes; Coito, João Lucas; Fino, Joana; Cunha, Jorge; Silva, Helena; de Almeida, Patrícia Gomes; Costa, Maria Manuela Ribeiro; Amâncio, Sara; Paulo, Octávio S; Rocheta, Margarida
RNA-seq of Vitis during early stages of bud development, in male, female and hermaphrodite flowers, identified new loci outside of annotated gene models, suggesting their involvement in sex establishment. The molecular mechanisms responsible for flower sex specification remain unclear for most plant species. In the case of V. vinifera ssp. vinifera, it is not fully understood what determines hermaphroditism in the domesticated subspecies and male or female flowers in wild dioecious relatives (Vitis vinifera ssp. sylvestris). Here, we describe a de novo assembly of the transcriptome of three flower developmental stages from the three Vitis vinifera flower types. The validation of the de novo assembly showed a correlation of 0.825. The main goals of this work were the identification of V. v. sylvestris exclusive transcripts and the characterization of differential gene expression during flower development. RNA from several flower developmental stages was used previously to generate Illumina sequence reads. Through a sequential de novo assembly strategy, one comprehensive transcriptome comprising 95,516 non-redundant transcripts was assembled. From this dataset, 81,064 transcripts were annotated to the V. v. vinifera reference transcriptome and 11,084 were annotated against the V. v. vinifera reference genome. Moreover, we found 3368 transcripts that could not be mapped to the Vitis reference genome. From all the non-redundant transcripts that were assembled, bioinformatics analysis identified 133 transcripts specific to V. v. sylvestris and 516 transcripts differentially expressed among the three flower types. The detection of transcription from areas of the genome not currently annotated suggests active transcription of previously unannotated genomic loci during early stages of bud development.
Leela, J Krishna; Syeda, Aisha H; Anupama, K; Gowrishankar, J
Two pathways of transcription termination, factor-independent and -dependent, exist in bacteria. The latter pathway operates on nascent transcripts that are not simultaneously translated and requires factors Rho, NusG, and NusA, each of which is essential for viability of WT Escherichia coli. NusG and NusA are also involved in antitermination of transcription at the ribosomal RNA operons, as well as in regulating the rates of transcription elongation of all genes. We have used a bisulfite-sensitivity assay to demonstrate genome-wide increase in the occurrence of RNA-DNA hybrids (R-loops), including from antisense and read-through transcripts, in a nusG missense mutant defective for Rho-dependent termination. Lethality associated with complete deficiency of Rho and NusG (but not NusA) was rescued by ectopic expression of an R-loop-helicase UvsW, especially so on defined growth media. Our results suggest that factor-dependent transcription termination subserves a surveillance function to prevent translation-uncoupled transcription from generating R-loops, which would block replication fork progression and therefore be lethal, and that NusA performs additional essential functions as well in E. coli. Prevention of R-loop-mediated transcription-replication conflicts by cotranscriptional protein engagement of nascent RNA is emerging as a unifying theme among both prokaryotes and eukaryotes.
Suchland, Robert J; Jeffrey, Brendan M; Xia, Minsheng; Bhatia, Ajay; Chu, Hencelyn G; Rockey, Daniel D; Stamm, Walter E
Clinical isolates of Chlamydia trachomatis that lack IncA on their inclusion membrane form nonfusogenic inclusions and have been associated with milder, subclinical infections in patients. The molecular events associated with the generation of IncA-negative strains and their roles in chlamydial sexually transmitted infections are not clear. We explored the biology of the IncA-negative strains by analyzing their genomic structure, transcription, and growth characteristics in vitro and in vivo in comparison with IncA-positive C. trachomatis strains. Three clinical samples were identified that contained a mixture of IncA-positive and -negative same-serovar C. trachomatis populations, and two more such pairs were found in serial isolates from persistently infected individuals. Genomic sequence analysis of individual strains from each of two serovar-matched pairs showed that these pairs were very similar genetically. In contrast, the genome sequence of an unmatched IncA-negative strain contained over 5,000 nucleotide polymorphisms relative to the genome sequence of a serovar-matched but otherwise unlinked strain. Transcriptional analysis, in vitro culture kinetics, and animal modeling demonstrated that IncA-negative strains isolated in the presence of a serovar-matched wild-type strain are phenotypically more similar to the wild-type strain than are IncA-negative strains isolated in the absence of a serovar-matched wild-type strain. These studies support a model suggesting that a change from an IncA-positive strain to the previously described IncA-negative phenotype may involve multiple steps, the first of which involves a translational inactivation of incA, associated with subsequent unidentified steps that lead to the observed decrease in transcript level, differences in growth rate, and differences in mouse infectivity.
BACKGROUND: Congenital hypothyroidism from thyroid dysgenesis (CHTD) is predominantly a sporadic disease characterized by defects in the differentiation, migration or growth of thyroid tissue. Of these defects, incomplete migration resulting in ectopic thyroid tissue is the most common (up to 80%). Germinal mutations in the thyroid-related transcription factors NKX2.1, FOXE1, PAX-8, and NKX2.5 have been identified in only 3% of patients with sporadic CHTD. Moreover, a survey of monozygotic twins yielded a discordance rate of 92%, suggesting that somatic events, genetic or epigenetic, probably play an important role in the etiology of CHTD. METHODOLOGY/PRINCIPAL FINDINGS: To assess the role of somatic genetic or epigenetic processes in CHTD, we analyzed gene expression, genome-wide methylation, and structural genome variations in normal versus ectopic thyroid tissue. In total, 1011 genes were more than two-fold induced or repressed. The expression array was validated by quantitative real-time RT-PCR for 100 genes. After correction for differences in thyroid activation state, 19 genes were exclusively associated with thyroid ectopy, among which genes involved in embryonic development (e.g. TXNIP) and in the Wnt pathway (e.g. SFRP2 and FRZB) were observed. None of the thyroid-related transcription factors (FOXE1, HHEX, NKX2.1, NKX2.5) showed decreased expression, whereas PAX8 expression was associated with thyroid activation state. Finally, the expression profile was independent of promoter and CpG island methylation and of structural genome variations. CONCLUSIONS/SIGNIFICANCE: This is the first integrative molecular analysis of ectopic thyroid tissue. Ectopic thyroids show differential gene expression compared to normal thyroids, although the molecular basis could not be defined. Replication of this pilot study in a larger cohort could lead to unraveling the elusive cause of defective thyroid migration during embryogenesis.
查向东; 周立志; 黄河胜; 刘兢; 徐康森
To better understand the accelerated evolution of snake venom C-type lectin-like proteins (CTL-like proteins) and to investigate their structure-function relationships, PCR was conducted to amplify cDNAs coding for the β chains of snake venom CTL-like proteins and the genomic DNA of agkisasin β. The reaction products were cloned and sequenced. The causal relationships between the sequences and the experimental conditions were analyzed. Dot plot analysis and multiple alignments were also performed. The results suggested the existence of a post-transcriptional processing event that was essentially homologous recombination at the RNA level, which might play an important role in the diversity of snake venom CTL-like proteins. This inference would provide a novel perspective for explaining a challenging problem of the post-genomic era: the discrepancy between the limited number of genes and the large collection of cDNAs, which is most prominent with regard to certain snake venom proteins. The genomic DNA sequence of a snake venom C-type lectin-like protein is reported for the first time, and no introns were found in the coding region.
and dhurrin, which have not previously been characterized in blueberries. There are more than 44,500 spider species with distinct habitats and unique characteristics. Spiders are masters of producing silk webs to catch prey and using venom to neutralize. The exploration of the genetics behind these properties...... japonicus (Lotus), Vaccinium corymbosum (blueberry), Stegodyphus mimosarum (spider) and Trifolium occidentale (clover). From a bioinformatics data analysis perspective, my work can be divided into three parts; genome annotation, small RNA, and gene expression analysis. Lotus is a legume of significant...... has just started. We have assembled and annotated the first two spider genomes to facilitate our understanding of spiders at the molecular level. The need for analyzing the large and increasing amount of sequencing data has increased the demand for efficient, user friendly, and broadly applicable...
Erica Bree Rosenblum
Emerging infectious diseases are of great concern for both wildlife and humans. Several highly virulent fungal pathogens have recently been discovered in natural populations, highlighting the need for a better understanding of fungal-vertebrate host-pathogen interactions. Because most fungal pathogens are not fatal in the absence of other predisposing conditions, host-pathogen dynamics for deadly fungal pathogens are of particular interest. The chytrid fungus Batrachochytrium dendrobatidis (hereafter Bd) infects hundreds of species of frogs in the wild. It is found worldwide and is a significant contributor to the current global amphibian decline. However, the mechanism by which Bd causes death in amphibians, and the response of the host to Bd infection, remain largely unknown. Here we use whole-genome microarrays to monitor the transcriptional responses to Bd infection in the model frog species Silurana (Xenopus) tropicalis, which is susceptible to chytridiomycosis. To elucidate the immune response to Bd and evaluate the physiological effects of chytridiomycosis, we measured gene expression changes in several tissues (liver, skin, spleen) following exposure to Bd. We detected a strong transcriptional response for genes involved in physiological processes that can help explain some clinical symptoms of chytridiomycosis at the organismal level. However, we detected surprisingly little evidence of an immune response to Bd exposure, suggesting that this susceptible species may not be mounting efficient innate and adaptive immune responses against Bd. The weak immune response may be partially explained by the thermal conditions of the experiment, which were optimal for Bd growth. However, many immune genes exhibited decreased expression in Bd-exposed frogs compared to control frogs, suggesting a more complex effect of Bd on the immune system than simple temperature-mediated immune suppression. This study generates important baseline data for ongoing
Simon, Jeffrey A.; Kingston, Robert E.
Summary Polycomb repressive complexes are conserved chromatin regulators with key roles in multicellular development, stem cell biology, and cancer. New findings advance molecular understanding of how they target to sites of action, interact with and alter local chromatin to silence genes, and maintain silencing in successive generations of proliferating cells. Chromatin modification by Polycomb proteins provides an essential strategy for gene silencing in higher eukaryotes. Polycomb repressive complexes (PRCs) silence many key developmental regulators and are centrally integrated in the transcriptional circuitry of embryonic and adult stem cells. PRC2 trimethylates histone H3 on lysine-27 (H3-K27me3) and PRC1-type complexes ubiquitylate histone H2A and compact polynucleosomes. How PRCs and these signature activities are deployed to select and silence genomic targets is the subject of intense current investigation. We review recent advances on targeting, modulation, and functions of PRC1 and PRC2, and we consider progress on defining transcriptional steps impacted in Polycomb silencing. Key recent findings demonstrate PRC1 targeting independent of H3-K27me3 and emphasize nonenzymatic PRC1-mediated compaction. We also evaluate expanding connections between Polycomb machinery and non-coding RNAs. Exciting new studies supply the first systematic analyses of what happens to Polycomb complexes, and associated histone modifications, during the wholesale chromatin reorganizations that accompany DNA replication and mitosis. The stage is now set to reveal fundamental epigenetic mechanisms that determine how Polycomb target genes are silenced and how Polycomb silence is preserved through cell cycle progression. PMID:23473600
Polyploidization as the consequence of 2n gamete formation is a prominent mechanism in plant evolution. Studying its effects on the genome, and on genome expression, has both basic and applied interest. We crossed two diploid (2n = 2x = 16) Medicago sativa plants, a subsp. falcata seed parent and a coerulea × falcata pollen parent, that form a mixture of n and 2n eggs and pollen, respectively. Such a cross produced full-sib diploid and tetraploid (2n = 4x = 32) hybrids, the latter being the result of bilateral sexual polyploidization (BSP). These unique materials allowed us to investigate the effects of BSP, and to separate the effect of intraspecific hybridization from those of polyploidization by comparing 2x with 4x full-sib progeny plants. Simple sequence repeat marker segregation demonstrated tetrasomic inheritance for all chromosomes but one, demonstrating that these neotetraploids are true autotetraploids. BSP brought about increased biomass, earlier flowering, higher seed set and weight, and larger leaves with larger cells. Microarray analyses with M. truncatula gene chips showed that several hundred genes, related to diverse metabolic functions, changed their expression level as a consequence of polyploidization. In addition, cytosine methylation increased in 2x, but not in 4x, hybrids. Our results indicate that sexual polyploidization induces significant transcriptional novelty, possibly mediated in part by DNA methylation, and phenotypic novelty that could underpin improved adaptation and reproductive success of tetraploid M. sativa with respect to its diploid progenitor. These polyploidy-induced changes may have promoted the adoption of tetraploid alfalfa in agriculture.
Tsai, M F; Lo, C F; van Hulten, M C; Tzeng, H F; Chou, C M; Huang, C J; Wang, C H; Lin, J Y; Vlak, J M; Kou, G H
The causative agent of white spot syndrome (WSS) is a large double-stranded DNA virus, WSSV, which is probably a representative of a new genus, provisionally called Whispovirus. From previously constructed WSSV genomic libraries of a Taiwan WSSV isolate, clones with open reading frames (ORFs) that encode proteins with significant homology to the class I ribonucleotide reductase large (RR1) and small (RR2) subunits were identified. WSSV rr1 and rr2 potentially encode 848 and 413 amino acids, respectively. RNA was isolated from WSSV-infected shrimp at different times after infection and Northern blot analysis with rr1- and rr2-specific riboprobes found major transcripts of 2.8 and 1.4 kb, respectively. 5' RACE showed that the major rr1 transcript started at a position of -84 (C) relative to the ATG translational start, while transcription of the rr2 gene started at nucleotide residue -68 (T). A consensus motif containing the transcriptional start sites for rr1 and rr2 was observed (TCAc/tTC). Northern blotting and RT-PCR showed that the transcription of rr1 and rr2 started 4-6 h after infection and continued for at least 60 h. The rr1 and rr2 genes thus appear to be WSSV "early genes."
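The consensus motif reported above can be searched for with a short pattern match. The following is a minimal sketch, assuming the lowercase "c/t" in TCAc/tTC means "either C or T at that position"; the sequence `toy_upstream` is a hypothetical example and is not taken from the WSSV genome.

```python
import re

# Consensus TCAc/tTC, reading "c/t" as "C or T" (an assumption about the notation).
# The lookahead lets overlapping occurrences be reported as well.
CONSENSUS = re.compile(r"(?=(TCA[CT]TC))")


def find_consensus(sequence):
    """Return (0-based start, matched hexamer) for every consensus occurrence."""
    return [(m.start(), m.group(1)) for m in CONSENSUS.finditer(sequence.upper())]


# Hypothetical upstream fragment used only to exercise the function.
toy_upstream = "GGATTTCACTCAAGGTCATTCGG"
toy_hits = find_consensus(toy_upstream)

for start, hexamer in toy_hits:
    print(start, hexamer)
```

A match only indicates that the hexamer is present; whether it marks a genuine transcriptional start site still has to be established experimentally, as the 5' RACE in the abstract does.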
Enriquez, Judith Guevarra
Content analysis has dominated computer-mediated communication and educational technology studies for some time, and a review of its practices applied to online corpus of data or messages is overdue. We are confronted with complexity given the various foci, nuances and models for theorising learning and applying methods. One common suggestion to…
The eye of the fruit fly Drosophila melanogaster provides a highly tractable genetic model system for the study of animal development, and many genes that regulate Drosophila eye formation have homologs implicated in human development and disease. Among these is the homeobox gene sine oculis (so), which encodes a homeodomain transcription factor (TF) that is both necessary for eye development and sufficient to reprogram a subset of cells outside the normal eye field toward an eye fate. We have performed a genome-wide analysis of So binding to DNA prepared from developing Drosophila eye tissue in order to identify candidate direct targets of So-mediated transcriptional regulation, as described in our recent article. The data are available from the NCBI Gene Expression Omnibus (GEO) with the accession number GSE52943. Here we describe the methods, data analysis, and quality control of our So ChIP-seq dataset.
Ravlić, Sanda; Žučko, Jurica; Tanković, Mirta Smodlaka; Fafanđel, Maja; Bihari, Nevenka
Cytochrome P450 enzymes (CYPs) are essential components of the cellular detoxification system. We identified and characterized seven new cytochrome P450 gene transcript clusters in populations of the bivalve mollusc Mytilus galloprovincialis from three different locations. The phylogenetic analysis identified all transcripts as clusters within the CYP4 branch. The identified clusters, each comprising a number of transcript variants, were designated CYP4Y1, Y2, Y3, Y4, Y5, Y6 and Y7. Transcript clusters CYP4Y2 and Y7, and CYP4Y5 and Y6, showed site specificity, while the transcript clusters CYP4Y1, Y3 and Y4 were present at all investigated locations. Comparison of the deduced amino acid sequences of the transcripts with CYP4s from vertebrate and invertebrate species showed high conservation of the residues and domains essential to the putative functions of the enzyme, such as terminal ω-hydroxylation and prostaglandin hydroxylation. Our results suggest a great expansion of the CYP4Y cDNAs, indicative of CYP4 proteins, in the mussel M. galloprovincialis, presumably as a response to different environmental conditions.
Tovar, Hugo; García-Herrera, Rodrigo; Espinal-Enríquez, Jesús; Hernández-Lemus, Enrique
Gene regulatory networks account for the delicate mechanisms that control gene expression. Under certain circumstances, gene regulatory programs may give rise to amplification cascades. Such transcriptional cascades are events in which activation of key-responsive transcription factors called master regulators triggers a series of gene expression events. The action of transcriptional master regulators is then important for the establishment of certain programs like cell development and differentiation. However, such cascades have also been related to the onset and maintenance of cancer phenotypes. Here we present a systematic implementation of a series of algorithms aimed at the inference of a gene regulatory network and analysis of transcriptional master regulators in the context of primary breast cancer cells. Such studies were performed in a highly curated database of 880 microarray gene expression experiments on biopsy-captured tissue corresponding to primary breast cancer and healthy controls. Biological function and biochemical pathway enrichment analyses were also performed to study the role that the processes controlled, at the transcriptional level, by such master regulators may have in relation to primary breast cancer. We found that transcription factors such as AGTR2, ZNF132, TFDP3 and others are master regulators in this gene regulatory network. Sets of genes controlled by these regulators are involved in processes that are well-known hallmarks of cancer. This kind of analysis may help to understand the most upstream events in the development of phenotypes, in particular those regarding cancer biology.
The regulation of gene expression is essential for normal functioning of biological systems in every form of life. Gene expression is primarily controlled at the level of transcription, especially at the phase of initiation. Non-coding RNAs are one of the major players at every level of genetic regulation, including the control of chromatin organisation, transcription, various post-transcriptional processes and translation. In this study, the Transcriptional Interference Network (TIN) hypothesis was put forward in an attempt to explain the global expression of antisense RNAs and the overall occurrence of tandem gene clusters in the genomes of various biological systems ranging from viruses to mammalian cells. The TIN hypothesis suggests the existence of a novel layer of genetic regulation, based on the interactions between the transcriptional machineries of neighbouring genes at their overlapping regions, which are assumed to play a fundamental role in coordinating gene expression within a cluster of functionally-linked genes. It is claimed that the transcriptional overlaps between adjacent genes are much more widespread in genomes than is thought today. The Waterfall model of the TIN hypothesis postulates a unidirectional effect of upstream genes on the transcription of downstream genes within a cluster of tandemly-arrayed genes, while the Seesaw model proposes a mutual interdependence of gene expression between the oppositely-oriented genes. The TIN represents an auto-regulatory system with an exquisitely timed and highly synchronised cascade of gene expression in functionally-linked genes located in close physical proximity to each other. In this study, we focused on herpesviruses. The reason for this lies in the compressed nature of viral genes, which allows a tight regulation and an easier investigation of the transcriptional interactions between genes. However, I believe that the same or similar principles can be applied to cellular
Ortet, Philippe; De Luca, Gilles; Whitworth, David E; Barakat, Mohamed
Transcription factors (TFs) are DNA-binding proteins that regulate gene expression by activating or repressing transcription. Some have housekeeping roles, while others regulate the expression of specific genes in response to environmental change. The majority of TFs are multi-domain proteins, and they can be divided into families according to their domain organisation. There is a need for user-friendly, rigorous and consistent databases to allow researchers to overcome the inherent variability in annotation between genome sequences. P2TF (Predicted Prokaryotic Transcription Factors) is an integrated and comprehensive database relating to transcription factor proteins. The current version of the database contains 372,877 TFs from 1,987 completely sequenced prokaryotic genomes and 43 metagenomes. The database provides annotation, classification and visualisation of TF genes and their genetic context, providing researchers with a one-stop shop in which to investigate TFs. The P2TF database analyses TFs in both predicted proteomes and reconstituted ORFeomes, recovering approximately 3% more TF proteins than just screening predicted proteomes. Users are able to search the database with sequence or domain architecture queries, and resulting hits can be aligned to investigate evolutionary relationships and conservation of residues. To increase utility, all searches can be filtered by taxonomy, TF genes can be added to the P2TF cart, and gene lists can be exported for external analysis in a variety of formats. P2TF is an open resource for biologists, allowing exploration of all TFs within prokaryotic genomes and metagenomes. The database enables a variety of analyses, and results are presented for user exploration as an interactive web interface, which provides different ways to access and download the data. The database is freely available at http://www.p2tf.org/.
Objective – In order to better contextualize library data about patron satisfaction with reference services, we analyzed an existing corpus of chat transcripts. Having conducted a similar analysis in 2010, we also compared librarian behaviors over time. Methods – Drawing from the library literature, we identified a set of librarian behaviors closely associated with patron satisfaction. These behaviors include listening to and understanding patrons’ needs, inviting patrons to use the service again, and providing instruction or completing a search for patrons. Analysis of the chat transcripts included establishing a coding schema, applying these codes to individual chat transcripts, and analyzing these codes across the corpus of transcripts for frequency and correlation with other codes. The currently presented analysis used chat transcripts from the fall of 2013 and seeks changes in librarian behavior over time in order to gauge the success of establishing best practices and improving training standardization over the last three years. Results – The analysis shows that librarian behaviors have changed over time, pointing to what campus librarians are doing well, and that implementation of best practices at a campus level after the 2010 analysis may have increased these positive behaviors. The analysis also shows opportunities for further standardization and reinforcement of best practices. Conclusion – Qualitative analysis of already-collected data serves as a model for other units and suggests areas for process improvement, including enhanced coder training and code schema design. Further analysis of chat patrons’ questions is also warranted, including investigation of the relationship between subject- and location-specific questions and referrals.
Jun-Li FENG; Shao-Ning CHEN; Xiang-Shan TANG; Xian-Feng DING; Zhi-You DU; Ji-Shuang CHEN
A real-time RT-PCR procedure using the green fluorescent dye SYBR Green I was developed for determining the absolute and relative copy numbers of cucumber mosaic virus (CMV) genomic RNAs contained in purified virions. Primers specific to each CMV ORF were designed and selected, and sequences varying in length from 61 to 153 bp were then amplified. Using dilution series of CMV genomic RNAs prepared by in vitro transcription as the standard samples, a good linear correlation was observed between their threshold cycle (Ct) values and the logarithms of the initial template amounts. The copies of genomic RNA 1, RNA 2, RNA 3 and the subgenomic RNA 4 in CMV virions were quantified by this method, and the ratios were about […]. Our work is the first report concerning the relative amounts of different RNA fragments in virions of CMV, a virus with a tripartite genome.
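The linear relationship between Ct and the logarithm of the starting template amount is what makes absolute quantification possible. The sketch below, under stated assumptions, fits a standard curve by ordinary least squares and inverts it to estimate copy numbers. The dilution series and Ct values are invented for illustration and are not data from the study; the function names are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the initial copy number."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical ten-fold dilution series of an in vitro transcript standard.
standard_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_ct = [30.0, 26.7, 23.4, 20.1, 16.8]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
estimated = copies_from_ct(23.4, slope, intercept)
efficiency = amplification_efficiency(slope)
print(f"slope={slope:.2f} intercept={intercept:.2f} efficiency={efficiency:.2f}")
```

Relative amounts of the different RNAs would then follow by dividing the copy numbers estimated for each target, provided each target has its own standard curve.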
Wei, Wei; Hu, Yang; Cui, Meng-Yuan; Han, Yong-Tao; Gao, Kuan; Feng, Jia-Yue
Plant-specific TEOSINTE BRANCHED 1, CYCLOIDEA, and PROLIFERATING CELL FACTORS (TCP) transcription factors play versatile functions in multiple processes of plant growth and development. However, no systematic study has been performed in strawberry. In this study, 19 FvTCP genes were identified in the diploid woodland strawberry (Fragaria vesca) accession Heilongjiang-3. Phylogenetic analysis suggested that the FvTCP genes were classified into two main classes, with the second class further divided into two subclasses, which was supported by the exon-intron organizations and the conserved motif structures. Promoter analysis revealed various cis-acting elements related to growth and development, hormone and/or stress responses. We analyzed FvTCP gene transcript accumulation patterns in different tissues and fruit developmental stages. Among them, 12 FvTCP genes exhibited distinct tissue-specific transcript accumulation patterns. Eleven FvTCP genes were down-regulated in different fruit developmental stages, while five FvTCP genes were up-regulated. Transcripts of FvTCP genes also varied with different subcultural propagation periods and were induced by hormone treatments and biotic and abiotic stresses. Subcellular localization analysis showed that six FvTCP-GFP fusion proteins showed distinct localizations in Arabidopsis mesophyll protoplasts. Notably, transient over-expression of FvTCP9 in strawberry fruits dramatically affected the expression of a series of genes implicated in fruit development and ripening. Taken together, the present study may provide the basis for functional studies to reveal the role of this gene family in strawberry growth and development. PMID:28066489
Yin, Haifeng; Nichols, Teresa D.; Horowitz, Jonathan M.
The Sp family of transcription factors comprises nine members, Sp1-9, that share a highly conserved DNA-binding domain. Sp2 is a poorly characterized member of this transcription factor family that is widely expressed in murine and human cell lines yet exhibits little DNA-binding or trans-activation activity in these settings. As a prelude to the generation of a “knock-out” mouse strain, we isolated a mouse Sp2 cDNA and performed a detailed analysis of Sp2 transcription in embryonic and adult mouse tissues. We report that (1) the 5′ untranslated region of Sp2 is subject to alternative splicing, (2) Sp2 transcription is regulated by at least two promoters that differ in their cell-type specificity, (3) one Sp2 promoter is highly active in nine mammalian cell lines and strains and is regulated by at least five discrete stimulatory and inhibitory elements, (4) a variety of sub-genomic messages are synthesized from the Sp2 locus in a tissue- and cell type-specific fashion and these transcripts have the capacity to encode a novel partial-Sp2 protein, and (5) RNA in situ hybridization assays indicate that Sp2 is widely expressed during mouse embryogenesis, particularly in the embryonic brain, and robust Sp2 expression occurs in neurogenic regions of the post-natal and adult brain. PMID:20353838
Flowering is an important agronomic trait that determines crop yield. Soybean is a major oilseed legume crop used for human and animal feed. Legumes have unique vegetative and floral complexities. Our understanding of the molecular basis of flower initiation and development in legumes is limited. Here, we address this by using a computational approach to examine flowering regulatory genes in the soybean genome in comparison to the most studied model plant, Arabidopsis. For this comparison, a genome-wide analysis of orthologue groups was performed, followed by an in silico gene expression analysis of the identified soybean flowering genes. Phylogenetic analyses of the gene families highlighted the evolutionary relationships among these candidates. Our study identified key flowering genes in soybean and indicates that the vernalisation and the ambient-temperature pathways seem to be the most variant in soybean. A comparison of the orthologue groups containing flowering genes indicated that, on average, each Arabidopsis flowering gene has 2-3 orthologous copies in soybean. Our analysis highlighted that the CDF3, VRN1, SVP, AP3 and PIF3 genes are paralogue-rich genes in soybean. Furthermore, the genome mapping of the soybean flowering genes showed that these genes are scattered randomly across the genome. A paralogue comparison indicated that the soybean genes comprising the largest orthologue group are clustered in a 1.4 Mb region on chromosome 16 of soybean. Furthermore, a comparison with the undomesticated soybean (Glycine soja) revealed that there are hundreds of SNPs that are associated with putative soybean flowering genes and that there are structural variants that may affect the genes of the light-signalling and ambient-temperature pathways in soybean. Our study provides a framework for the soybean flowering pathway and insights into the relationship and evolution of flowering genes between a short-day soybean and the long-day plant
Background: Identifying cis-regulatory elements is crucial to understanding gene expression, which highlights the importance of the computational detection of overrepresented transcription factor binding sites (TFBSs) in coexpressed or coregulated genes. However, this is a challenging problem, especially when considering higher eukaryotic organisms. Results: We have developed a method, named TFM-Explorer, that searches for locally overrepresented TFBSs in a set of coregulated genes, which are modeled by profiles provided by a database of position weight matrices. The novelty of the method is that it takes advantage of spatial conservation in the sequence and supports multiple species. The efficiency of the underlying algorithm and its robustness to noise allow weak regulatory signals to be detected in large heterogeneous data sets. Conclusion: TFM-Explorer provides an efficient way to predict TFBS overrepresentation in related sequences. Promising results were obtained in a variety of examples in human, mouse, and rat genomes. The software is publicly available at http://bioinfo.lifl.fr/TFM-Explorer.
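To make the idea of modeling binding sites with position weight matrices concrete, the following is a minimal sketch of generic log-odds matrix scanning. It is not TFM-Explorer's algorithm and does not compute local overrepresentation; the count matrix, pseudocount, uniform background, threshold and sequence are all hypothetical choices made for illustration.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores.

    counts is a list with one dict per motif position, mapping base -> count.
    A uniform background frequency is assumed for simplicity.
    """
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + pseudocount * len(BASES)
        matrix.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    seq = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Hypothetical counts for a 4-bp motif resembling "TGAC".
toy_counts = [
    {"T": 9, "A": 1},
    {"G": 10},
    {"A": 8, "C": 2},
    {"C": 10},
]
toy_matrix = log_odds_matrix(toy_counts)
toy_hits = scan("aaTGACttTGCCgg", toy_matrix, threshold=4.0)

for start, window, score in toy_hits:
    print(start, window, round(score, 2))
```

An overrepresentation test would go one step further and compare the number of such hits in the coregulated set against the number expected in background sequences.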
Donati, Andrew J; Jeon, Jeong-Min; Sangurdekar, Dipen; So, Jae-Seong; Chang, Woo-Suk
The rhizobial bacterium Bradyrhizobium japonicum functions as a nitrogen-fixing symbiont of the soybean plant (Glycine max). Plants are capable of producing an oxidative burst, a rapid proliferation of reactive oxygen species (ROS), as a defense mechanism against pathogenic and symbiotic bacteria. Therefore, B. japonicum must be able to resist such a defense mechanism to initiate nodulation. In this study, paraquat, a known superoxide radical-inducing agent, was used to investigate this response. Genome-wide transcriptional profiles were created for both prolonged exposure (PE) and fulminant shock (FS) conditions. These profiles revealed that 190 and 86 genes were up- and downregulated for the former condition, and that 299 and 105 genes were up- and downregulated for the latter condition, respectively (>2.0-fold; P ROS scavenging enzymes, such as superoxide dismutase and catalase, were not detected, suggesting constitutive expression of those genes by endogenous ROS. Various physiological tests, including exopolysaccharide (EPS), cellular protein, and motility characterization, were performed to corroborate the gene expression data. The results suggest that B. japonicum responds to tolerable oxidative stress during PE through enhanced motility, increased translational activity, and EPS production, in addition to the expression of genes involved in global stress responses, such as chaperones and sigma factors.
Background: The Complete Arabidopsis Transcript MicroArray (CATMA) initiative combines the efforts of laboratories in eight European countries to deliver gene-specific sequence tags (GSTs) for the Arabidopsis research community. The CATMA initiative offers the power and flexibility to regularly update the GST collection according to evolving knowledge about the gene repertoire. These GST amplicons can easily be reamplified and shared, subsets can be picked at will to print dedicated arrays, and the GSTs can be cloned and used for other functional studies. This ongoing initiative has already produced approximately 24,000 GSTs that have been made publicly available for spotted microarray printing and RNA interference. Results: GSTs from the CATMA version 2 repertoire (CATMAv2, created in 2002) were mapped onto the gene models from two independent Arabidopsis nuclear genome annotation efforts, TIGR5 and PSB-EuGène, to consolidate a list of genes that were targeted by previously designed CATMA tags. A total of 9,027 gene models were not tagged by any amplified CATMAv2 GST, and 2,533 amplified GSTs were no longer predicted to tag an updated gene model. To validate the efficacy of GST mapping criteria and design rules, the predicted and experimentally observed hybridization characteristics associated with GST features were correlated in transcript profiling datasets obtained with the CATMAv2 microarray, confirming the reliability of this platform. To complete the CATMA repertoire, all 9,027 gene models for which no GST had yet been designed were processed with an adjusted version of the Specific Primer and Amplicon Design Software (SPADS). A total of 5,756 novel GSTs were designed and amplified by PCR from genomic DNA. Together with the pre-existing GST collection, this new addition constitutes the CATMAv3 repertoire. It comprises 30,343 unique amplified sequences that tag 24,202 and 23,009 protein-encoding nuclear gene models in the TAIR6 and Eu
Rhesus macaque is a widely used primate model organism. Its genome annotations are however still largely comparative computational predictions derived mainly from human genes, which precludes studies on the macaque-specific genes, gene isoforms or their regulations. Here we took advantage of histone H3 lysine 4 trimethylation (H3K4me3)’s ability to mark transcription start sites (TSSs) and the recently developed ChIP-Seq and RNA-Seq technology to survey the transcript structures. We generated...
Spannagl, Manuel; Bader, Kai; Pfeifer, Matthias; Nussbaumer, Thomas; Mayer, Klaus F X
PGSB (Plant Genome and Systems Biology; formerly MIPS-Munich Institute for Protein Sequences) has been involved in developing, implementing and maintaining plant genome databases for more than a decade. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable datasets for model plant genomes as a backbone against which experimental data, e.g., from high-throughput functional genomics, can be organized and analyzed. In addition, genomes from both model and crop plants form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny) between related species on macro- and micro-levels. The genomes of many economically important Triticeae plants such as wheat, barley, and rye present a great challenge for sequence assembly and bioinformatic analysis due to their enormous complexity and large genome size. Novel concepts and strategies have been developed to deal with these difficulties and have been applied to the genomes of wheat, barley, rye, and other cereals. This includes the GenomeZipper concept, reference-guided exome assembly, and "chromosome genomics" based on flow cytometry sorted chromosomes.
Freitas, F Zanolli; Bertolini, M C
Glycogen synthase, an enzyme involved in glycogen biosynthesis, is regulated by phosphorylation and by the allosteric ligand glucose-6-phosphate (G6P). In addition, enzyme levels can be regulated by changes in gene expression. We recently cloned a cDNA for glycogen synthase (gsn) from Neurospora crassa, and showed that gsn transcription decreased when cells were exposed to heat shock (shifted from 30 °C to 45 °C). In order to understand the mechanisms that control gsn expression, we isolated the gene, including its 5' and 3' flanking regions, from the genome of N. crassa. An ORF of approximately 2.4 kb was identified, which is interrupted by four small introns (II-V). Intron I (482 bp) is located in the 5' UTR. Three putative Transcription Initiation Sites (TISs) were mapped, one of which lies downstream of a canonical TATA-box sequence (5'-TGTATAAA-3'). Analysis of the 5'-flanking region revealed the presence of putative transcription factor-binding sites, including Heat Shock Elements (HSEs) and STress Responsive Elements (STREs). The possible involvement of these motifs in the negative regulation of gsn transcription was investigated using Electrophoretic Mobility Shift Assays (EMSA) with nuclear extracts of N. crassa mycelium obtained before and after heat shock, and DNA fragments encompassing HSE and STRE elements from the 5'-flanking region. While elements within the promoter region are involved in transcription under heat shock, elements in the 5' UTR intron may participate in transcription during vegetative growth. The results thus suggest that N. crassa possesses trans-acting elements that interact with the 5'-flanking region to regulate gsn transcription during heat shock and vegetative growth.
Griffin, Bryan D; Nagy, Eva
Recombinant fowl adenoviruses (FAdVs) have been successfully used as veterinary vaccine vectors. However, insufficient definitions of the protein-coding and non-coding regions and an incomplete understanding of virus-host interactions limit the progress of next-generation vectors. FAdVs are known to cause several diseases of poultry. Certain isolates of species FAdV-C are the aetiological agent of inclusion body hepatitis/hydropericardium syndrome (IBH/HPS). In this study, we report the complete 45667 bp genome sequence of FAdV-4 of species FAdV-C. Assessment of the protein-coding potential of FAdV-4 was carried out with the Bio-Dictionary-based Gene Finder together with an evaluation of sequence conservation among species FAdV-A and FAdV-D. On this basis, 46 potentially protein-coding ORFs were identified. Of these, 33 and 13 ORFs were assigned high and low protein-coding potential, respectively. Homologues of the ancestral adenoviral genes were, with few exceptions, assigned high protein-coding potential. ORFs that were unique to the FAdVs were differentiated into high and low protein-coding potential groups. Notable putative genes with high protein-coding capacity included the previously unreported fiber 1, hypothetical 10.3K and hypothetical 10.5K genes. Transcript analysis revealed that several of the small ORFs less than 300 nt in length that were assigned low coding potential contributed to upstream ORFs (uORFs) in important mRNAs, including the ORF22 mRNA. Subsequent analysis of the previously reported transcripts of FAdV-1, FAdV-9, human adenovirus 2 and bovine adenovirus 3 identified widespread uORFs in AdV mRNAs that have the potential to act as important translational regulatory elements.
Kimberly D Spradling
The baboon is an invaluable model for the study of human health and disease, including many complex diseases of the kidney. Although scientists have made great progress in developing this animal as a model for numerous areas of biomedical research, genomic resources for the baboon, such as a quality annotated genome, are still lacking. To this end, we characterized the baboon kidney transcriptome using high-throughput cDNA sequencing (RNA-Seq) to identify genes, gene variants, single nucleotide polymorphisms (SNPs), insertion-deletion polymorphisms (InDels), cellular functions, and key pathways in the baboon kidney to provide a genomic resource for the baboon. Analysis of our sequencing data revealed 45,499 high-confidence SNPs and 29,813 InDels comparing baboon cDNA sequences with the human hg18 reference assembly and identified 35,900 cDNAs in the baboon kidney, including 35,150 transcripts representing 15,369 genes that are novel for the baboon. Gene ontology analysis of our sequencing dataset also identified numerous biological functions and canonical pathways that were significant in the baboon kidney, including a large number of metabolic pathways that support known functions of the kidney. The results presented in this study catalogue the transcribed mRNAs, noncoding RNAs, and hypothetical proteins in the baboon kidney and establish a genomic resource for scientists using the baboon as an experimental model.
Bryant Susan V
Background. Microarray analysis and 454 cDNA sequencing were used to investigate a centuries-old problem in regenerative biology: the basis of nerve-dependent limb regeneration in salamanders. Innervated (NR) and denervated (DL) forelimbs of Mexican axolotls were amputated and transcripts were sampled after 0, 5, and 14 days of regeneration. Results. Considerable similarity was observed between NR and DL transcriptional programs at 5 and 14 days post amputation (dpa). Genes with extracellular functions that are critical to wound healing were upregulated while muscle-specific genes were downregulated. Thus, many processes that are regulated during early limb regeneration do not depend upon nerve-derived factors. The majority of the transcriptional differences between NR and DL limbs were correlated with blastema formation; cell numbers increased in NR limbs after 5 dpa and this yielded distinct transcriptional signatures of cell proliferation in NR limbs at 14 dpa. These transcriptional signatures were not observed in DL limbs. Instead, gene expression changes within DL limbs suggest more diverse and protracted wound-healing responses. 454 cDNA sequencing complemented the microarray analysis by providing deeper sampling of transcriptional programs and associated biological processes. Assembly of new 454 cDNA sequences with existing expressed sequence tag (EST) contigs from the Ambystoma EST database more than doubled (3935 to 9411) the number of non-redundant human-A. mexicanum orthologous sequences. Conclusion. Many new candidate gene sequences were discovered, and these will greatly enable future studies of wound healing, epigenetics, genome stability, and nerve-dependent blastema formation and outgrowth using the axolotl model.
Background. An interesting field of research in genomics and proteomics is to compare the overlap between the transcriptome and the proteome. Recently, the tools to analyse gene and protein expression on a whole-genome scale have been improved, including the availability of new-generation sequencing instruments and high-throughput antibody-based methods to analyze the presence and localization of proteins. In this study, we used massive transcriptome sequencing (RNA-seq) to investigate the transcriptome of a human osteosarcoma cell line and compared the expression levels with in situ protein data obtained from antibody-based immunohistochemistry (IHC) and immunofluorescence microscopy (IF). Results. A large-scale analysis based on 2749 genes was performed, corresponding to approximately 13% of the protein-coding genes in the human genome. We found both RNA and protein for a large fraction of the analyzed genes, with 60% of the analyzed human genes detected by all three methods. Only 34 genes (1.2%) were not detected on the transcriptional or protein level with any method. Our data suggest that the majority of the human genes are expressed at detectable transcript or protein levels in this cell line. Since the reliability of antibodies depends on possible cross-reactivity, we compared the RNA and protein data using antibodies with different reliability scores based on various criteria, including Western blot analysis. Gene products detected in all three platforms generally have good antibody validation scores, while those detected only by antibodies, but not by RNA sequencing, generally involve more low-scoring antibodies. Conclusion. This suggests that some antibodies are staining the cells in an unspecific manner, and that assessment of transcript presence by RNA-seq can provide guidance for validation of the corresponding antibodies.
The marine dinoflagellate Cochlodinium polykrikoides is responsible for harmful algal blooms in aquatic environments and has spread into the world’s oceans. As a microeukaryote, it seems to have distinct genomic characteristics, such as gene structure and regulation. In the present study, we characterized heat shock proteins (HSP) 70 and 90 of C. polykrikoides and evaluated their transcriptional responses to environmental stresses. Both HSPs contained the conserved motif patterns, showing the highest homology with those of other dinoflagellates. Genomic analysis showed that CpHSP70 had no intron and was encoded in a tandem arrangement separated by intergenic spacers. However, CpHSP90 had one intron in the coding genomic regions, and no intergenic region was found. Phylogenetic analyses of the separate HSPs showed that CpHSP70 was closely related to that of the dinoflagellate Crypthecodinium cohnii and CpHSP90 to those of other Gymnodiniales in dinoflagellates. Gene expression analyses showed that both HSP genes were upregulated by separate treatments with the algicides CuSO4 and NaOCl; however, they displayed a downregulation pattern with PCB treatment. The transcription of CpHSP90 and CpHSP70 showed similar expression patterns under the same toxicant treatment, suggesting that both genes might have cooperative functions in toxicant-induced gene regulation in the dinoflagellate.
Camara, Pablo G; Rosenbloom, Daniel I S; Emmett, Kevin J; Levine, Arnold J; Rabadan, Raul
Meiotic recombination is a fundamental evolutionary process driving diversity in eukaryotes. In mammals, recombination is known to occur preferentially at specific genomic regions. Using topological data analysis (TDA), a branch of applied topology that extracts global features from large data sets, we developed an efficient method for mapping recombination at fine scales. When compared to standard linkage-based methods, TDA can deal with a larger number of SNPs and genomes without incurring prohibitive computational costs. We applied TDA to 1,000 Genomes Project data and constructed high-resolution whole-genome recombination maps of seven human populations. Our analysis shows that recombination is generally under-represented within transcription start sites. However, the binding sites of specific transcription factors are enriched for sites of recombination. These include transcription factors that regulate the expression of meiosis- and gametogenesis-specific genes, cell cycle progression, and differentiation blockage. Additionally, our analysis identifies an enrichment for sites of recombination at repeat-derived loci matched by piwi-interacting RNAs.
Rizzi, Nicoletta; Denegri, Marco; Chiodi, Ilaria; Corioni, Margherita; Valgardsdottir, Rut; Cobianchi, Fabio; Riva, Silvano; Biamonti, Giuseppe
Heat shock triggers the assembly of nuclear stress bodies that contain heat shock factor 1 and a subset of RNA processing factors. These structures are formed on the pericentromeric heterochromatic regions of specific human chromosomes, including chromosome 9. In this article we show that these heterochromatic domains are characterized by an epigenetic status typical of euchromatic regions. Similarly to transcriptionally competent portions of the genome, stress bodies are, in fact, enriched in acetylated histone H4. Acetylation peaks at 6 h of recovery from heat shock. Moreover, heterochromatin markers, such as HP1 and histone H3 methylated on lysine 9, are excluded from these nuclear districts. In addition, heat shock triggers the transient accumulation of RNA molecules, heterogeneous in size, containing the subclass of satellite III sequences found in the pericentromeric heterochromatin of chromosome 9. This is the first report of a transcriptional activation of a constitutive heterochromatic portion of the genome in response to stress stimuli. PMID:14617804
Chang, Han-Wen; Kulaeva, Olga I; Shaytan, Alexey K; Kibanov, Mikhail; Kuznedelov, Konstantin; Severinov, Konstantin V; Kirpichnikov, Mikhail P; Clark, David J; Studitsky, Vasily M
Maintenance of nucleosomal structure in the cell nuclei is essential for cell viability, regulation of gene expression and normal aging. Our previous data identified a key intermediate (a small intranucleosomal DNA loop, Ø-loop) that is likely required for nucleosome survival during transcription by RNA polymerase II (Pol II) through chromatin, and suggested that strong nucleosomal pausing guarantees efficient nucleosome survival. To evaluate these predictions, we analysed transcription through a nucleosome by different, structurally related RNA polymerases and mutant yeast Pol II having different histone-interacting surfaces that presumably stabilize the Ø-loop. The height of the nucleosomal barrier to transcription and efficiency of nucleosome survival correlate with the net negative charges of the histone-interacting surfaces. Molecular modeling and analysis of Pol II-nucleosome intermediates by DNase I footprinting suggest that efficient Ø-loop formation and nucleosome survival are mediated by electrostatic interactions between the largest subunit of Pol II and core histones.
Ma, Xiaodong; Ma, Jianchao; Fan, Di; Li, Chaofeng; Jiang, Yuanzhong; Luo, Keming
Higher plants have been shown to experience a juvenile vegetative phase, an adult vegetative phase, and a reproductive phase during their postembryonic development, and distinct lateral organ morphologies have been observed at the different developmental stages. Populus euphratica, commonly known as the desert poplar, develops heteromorphic leaves during its development. The TCP family genes encode a group of plant-specific transcription factors involved in several aspects of plant development. In particular, TCPs have been shown to influence leaf size and shape in many herbaceous plants. However, whether these functions are conserved in woody plants remains unknown. In the present study, we carried out genome-wide identification of TCP genes in P. euphratica and P. trichocarpa, and 33 and 36 genes encoding putative TCP proteins were found, respectively. Phylogenetic analysis of the poplar TCPs together with Arabidopsis TCPs indicated a biased expansion of the TCP gene family via segmental duplications. In addition, our results show a correlation between different expression patterns of several P. euphratica TCP genes and leaf shape variations, indicating their involvement in the regulation of leaf shape development.
Li, Min; Xu, Xiaohua; Liu, Yilun
The conserved RECQ5 DNA helicase is a tumor suppressor in mammalian cells. Defects in RECQ5 lead to the accumulation of spontaneous DNA double-stranded breaks (DSBs) during replication, despite the fact that these cells are proficient in DSB repair by homologous recombination (HR). The reason for this is unknown. Here, we demonstrate that these DSBs are linked to RNA polymerase II (RNAPII)-dependent transcription. In human RECQ5-depleted cells, active RNAPII accumulates on chromatin, and DNA breaks are associated with an RNAPII-dependent transcribed locus. Hence, transcription inhibition eliminates both active RNAPII and spontaneous DSB formation. In addition, the regulatory effect of RECQ5 on transcription and its interaction with RNAPII are enhanced in S-phase cells, supporting a role for RECQ5 in preventing transcription-associated DSBs during replication. Finally, we show that the SET2-RPB1 interaction (SRI) domain of human RECQ5 is important for suppressing spontaneous DSBs and the p53-dependent transcription stress response caused by the stalling of active RNAPII on DNA. Thus, our studies provide novel insights into a mechanism by which RECQ5 regulates the transcription machinery via its dynamic interaction with RNAPII, thereby preventing genome instability.
Alison R Frand
Although the molting cycle is a hallmark of insects and nematodes, neither the endocrine control of molting via size, stage, and nutritional inputs nor the enzymatic mechanism for synthesis and release of the exoskeleton is well understood. Here, we identify endocrine and enzymatic regulators of molting in C. elegans through a genome-wide RNA-interference screen. Products of the 159 genes discovered include annotated transcription factors, secreted peptides, transmembrane proteins, and extracellular matrix enzymes essential for molting. Fusions between several genes and green fluorescent protein show a pulse of expression before each molt in epithelial cells that synthesize the exoskeleton, indicating that the corresponding proteins are made in the correct time and place to regulate molting. We show further that inactivation of particular genes abrogates expression of the green fluorescent protein reporter genes, revealing regulatory networks that might couple the expression of genes essential for molting to endocrine cues. Many molting genes are conserved in parasitic nematodes responsible for human disease, and thus represent attractive targets for pesticide and pharmaceutical development.
Parmar, Manoj B; Wright, Jonathan M
A whole-genome duplication (WGD) early in the teleost fish lineage makes fish ideal organisms to study the fate of duplicated genes and underlying evolutionary trajectories that have led to the retention of ohnologous gene duplicates in fish genomes. Here, we compare the genomic organization and tissue-specific transcription of the ohnologous fabp7 and fabp10 genes in medaka, three-spined stickleback, and spotted green pufferfish to the well-studied duplicated fabp7 and fabp10 genes of zebrafish. Teleost fabp7 and fabp10 genes contain four exons interrupted by three introns. Polypeptide sequences of Fabp7 and Fabp10 show the highest sequence identity and similarity with their orthologs from vertebrates. Orthology was evident as the ohnologous Fabp7 and Fabp10 polypeptides of teleost fishes each formed distinct clades and clustered together with their orthologs from other vertebrates in a phylogenetic tree. Furthermore, ohnologous teleost fabp7 and fabp10 genes exhibit conserved gene synteny with human FABP7 and chicken FABP10, respectively, which provides compelling evidence that the duplicated fabp7 and fabp10 genes of teleost fishes most likely arose from the well-documented WGD. The tissue-specific distribution of fabp7a, fabp7b, fabp10a, and fabp10b transcripts provides evidence of diverged spatial transcriptional regulation between ohnologous gene duplicates of fabp7 and fabp10 in teleost fishes.
Fabrício R Lopes
Plant genomes are massively invaded by transposable elements (TEs), many of which are located near host genes and can thus impact gene expression. In flowering plants, TE expression can be activated (de-repressed) under certain stressful conditions, both biotic and abiotic, as well as by genome stress caused by hybridization. In this study, we examined the effects of these stress agents on TE expression in two diploid species of coffee, Coffea canephora and C. eugenioides, and their allotetraploid hybrid C. arabica. We also explored the relationship of TE repression mechanisms to host gene regulation via the effects of exonized TE sequences. Similar to what has been seen for other plants, overall TE expression levels are low in Coffea plant cultivars, consistent with the existence of effective TE repression mechanisms. TE expression patterns are highly dynamic across the species and conditions assayed here and are unrelated to their classification at the level of TE class or family. In contrast to previous results, cell culture conditions per se do not lead to the de-repression of TE expression in C. arabica. Results obtained here indicate that differing plant drought stress levels relate strongly to TE repression mechanisms. TEs tend to be expressed at significantly higher levels in non-irrigated samples for the drought-tolerant cultivars, but in drought-sensitive cultivars the opposite pattern was seen, with irrigated samples showing significantly higher TE expression. Thus, TE genome repression mechanisms may be finely tuned to the ideal growth and/or regulatory conditions of the specific plant cultivars in which they are active. Analysis of TE expression levels in cell culture conditions underscored the importance of nonsense-mediated mRNA decay (NMD) pathways in the repression of Coffea TEs. These same NMD mechanisms can also regulate plant host gene expression via the repression of genes that bear exonized TE sequences.
Tumors act systemically to sustain cancer progression, affecting the physiological processes in the host and triggering responses in circulating blood cells. In this study, we explored blood transcriptional patterns of patients with two subtypes of HER2-negative breast cancer that differ in prognosis and therapeutic outcome. Peripheral blood samples from seven healthy female donors and 29 women with breast cancer, including 14 triple-negative breast cancers (TNBC) and 15 hormone-dependent breast cancers, were evaluated by microarray. We also evaluated the stroma in primary tumors. Transcriptional analysis revealed distinct molecular signatures in the blood of HER2− breast cancer patients according to ER/PR status. Our data showed the implication of immune signaling in both breast cancer subtypes, with an enrichment of these processes in the blood of TNBC patients. We observed a significant alteration of “chemokine signaling,” “IL-8 signaling,” and “communication between innate and adaptive immune cells” pathways in the blood of TNBC patients, correlated with increased inflammation and necrosis in their primary tumors. Overall, our data indicate that the presence of triple-negative breast cancer is associated with an enrichment of altered systemic immune-related pathways, suggesting that immunotherapy could possibly act synergistically with chemotherapy to improve the clinical outcome of these patients.
Tang, Xiaohu; Lucas, Joseph E.; Chen, Julia Ling-Yu; LaMonte, Gregory; Wu, Jianli; Wang, Michael Changsheng; Koumenis, Constantinos; Chi, Jen-Tsan
Within solid tumor microenvironments, lactic acidosis and hypoxia each have powerful effects on cancer pathophysiology. However, the influence that these processes exert on each other is unknown. Here we report that a significant portion of the transcriptional response to hypoxia elicited in cancer cells is abolished by simultaneous exposure to lactic acidosis. In particular, lactic acidosis abolished stabilization of HIF-1α protein which occurs normally under hypoxic conditions. In contrast, lactic acidosis strongly synergized with hypoxia to activate the unfolded protein response (UPR) and an inflammatory response, displaying a strong similarity to ATF4-driven amino acid deprivation responses (AAR). In certain breast tumors and breast tumor cells examined, an integrative analysis of gene expression and array CGH data revealed DNA copy number alterations at the ATF4 locus, an important activator of the UPR/AAR pathway. In this setting, varying ATF4 levels influenced the survival of cells after exposure to hypoxia and lactic acidosis. Our findings reveal that the condition of lactic acidosis present in solid tumors inhibits canonical hypoxia responses and activates UPR and inflammation responses. Further, they suggest that ATF4 status may be a critical determinant of the ability of cancer cells to adapt to oxygen and acidity fluctuations in the tumor microenvironment, perhaps linking short-term transcriptional responses to long-term selection for copy number alterations in cancer cells. PMID:22135092
Andreev, Sergey; Eidelman, Yuri
Genome instability (GI) is thought to be an important step in cancer induction and progression. Radiation induced GI is usually defined as genome alterations in the progeny of irradiated cells. The aim of this report is to demonstrate an opportunity for integrative analysis of radiation induced GI on the basis of multiscale modelling. Integrative, systems level modelling is necessary to assess different pathways resulting in GI in which a variety of genetic and epigenetic processes are involved. The multilevel modelling includes the Monte Carlo based simulation of several key processes involved in GI: DNA double strand breaks (DSBs) generation in cells initially irradiated as well as in descendants of irradiated cells, damage transmission through mitosis. Taking the cell-cycle-dependent generation of DNA/chromosome breakage into account ensures an advantage in estimating the contribution of different DNA damage response pathways to GI, as to nonhomologous vs homologous recombination repair mechanisms, the role of DSBs at telomeres or interstitial chromosomal sites, etc. The preliminary estimates show that both telomeric and non-telomeric DSB interactions are involved in delayed effects of radiation although differentially for different cell types. The computational experiments provide the data on the wide spectrum of GI endpoints (dicentrics, micronuclei, nonclonal translocations, chromatid exchanges, chromosome fragments) similar to those obtained experimentally for various cell lines under various experimental conditions. The modelling based analysis of experimental data demonstrates that radiation induced GI may be viewed as processes of delayed DSB induction/interaction/transmission being a key for quantification of GI. On the other hand, this conclusion is not sufficient to understand GI as a whole because factors of DNA non-damaging origin can also induce GI. Additionally, new data on induced pluripotent stem cells reveal that GI is acquired in normal mature
Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia
Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.
Cho, Suhyung; Kim, Min-Sik; Jeong, Yujin; Lee, Bo-Rahm; Lee, Jung-Hyun; Kang, Sung Gyun; Cho, Byung-Kwan
In spite of their pivotal roles in transcriptional and post-transcriptional processes, the regulatory elements of archaeal genomes are not yet fully understood. Here, we determine the primary transcriptome of the H2-producing archaeon Thermococcus onnurineus NA1. We identified 1,082 purine-rich transcription initiation sites along with well-conserved TATA box, A-rich B recognition element (BRE), and promoter proximal element (PPE) motifs in promoter regions, a high pyrimidine nucleotide content (T/C) at the −1 position, and Shine-Dalgarno (SD) motifs (GGDGRD) in 5′ untranslated regions (5′ UTRs). Along with differential transcript levels, 117 leaderless genes and 86 non-coding RNAs (ncRNAs) were identified, representing diverse cellular functions and potential regulatory functions under the different growth conditions. Interestingly, we observed low GC content in ncRNAs for RNA-based regulation via unstructured forms or interaction with other cellular components. Further comparative analysis of T. onnurineus upstream regulatory sequences with those of closely related archaeal genomes demonstrated that transcription of orthologous genes is initiated by highly conserved promoter sequences; however, their upstream sequences for transcriptional and translational regulation are largely diverse. These results provide the genetic information of T. onnurineus for its future application in metabolic engineering. PMID:28216628
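The SD motif above is written in IUPAC degenerate notation, where R stands for A or G and D stands for A, G, or T. As a minimal sketch of how such a motif could be scanned for in a sequence, the Python below translates GGDGRD into a regular expression and lists match positions. The helper names and the example fragment are hypothetical and invented for illustration; the DNA alphabet (T rather than U) is an assumption, and this is not the procedure used in the study.

```python
import re

# Standard IUPAC nucleotide codes needed for the motif GGDGRD.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",   # purine: A or G
    "D": "[AGT]",  # any base except C
}

def motif_to_regex(motif):
    """Translate an IUPAC motif string into a regular expression."""
    return "".join(IUPAC[base] for base in motif)

def find_motif(sequence, motif):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead lets the scan report matches that overlap each other.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]

# Hypothetical 5' UTR fragment (DNA alphabet), invented for illustration only.
utr = "TTAAGGAGGTGATCATG"
sd_regex = motif_to_regex("GGDGRD")
hits = find_motif(utr, "GGDGRD")
```

Extending the IUPAC table with the remaining degenerate codes would let the same helper handle other motifs, at the cost of a longer lookup table.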
Chetal, Kashish; Janga, Sarath Chandra
Background. In prokaryotic organisms, a substantial fraction of adjacent genes are organized into operons: codirectionally organized genes that share a common promoter and terminator. Although several available operon databases provide information with varying levels of reliability, very few resources provide experimentally supported results. Therefore, we believe that the biological community could benefit from having a new operon prediction database with operons predicted using next-generation RNA-seq datasets. Description. We present operomeDB, a database which provides an ensemble of all the predicted operons for bacterial genomes using available RNA-sequencing datasets across a wide range of experimental conditions. Although several studies have recently confirmed that prokaryotic operon structure is dynamic, with significant alterations across environmental and experimental conditions, there are no comprehensive databases for studying such variations across prokaryotic transcriptomes. Currently our database contains nine bacterial organisms and 168 transcriptomes for which we predicted operons. The user interface is simple and easy to use in terms of visualization, downloading, and querying of data. In addition, because of its ability to load custom datasets, users can also compare their datasets with publicly available transcriptomic data of an organism. Conclusion. OperomeDB as a database should not only aid experimental groups working on transcriptome analysis of specific organisms but also enable studies related to computational and comparative operomics.
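The operon definition given above (adjacent genes on the same strand under a shared promoter and terminator) can be illustrated with a deliberately naive sketch that groups neighbouring codirectional genes separated by a short intergenic gap. This is not the operomeDB method, which relies on RNA-seq evidence; the function name, the tuple layout, the 50 bp threshold, and the gene coordinates are all assumptions made for illustration.

```python
def group_codirectional(genes, max_gap=50):
    """Group adjacent same-strand genes into candidate operons.

    genes: list of (name, start, end, strand) tuples sorted by start.
    max_gap: assumed maximum intergenic distance in bp (illustrative value).
    Returns lists of gene names for runs of two or more genes.
    """
    runs = []
    current = []
    for gene in genes:
        if current:
            _, _, prev_end, prev_strand = current[-1]
            _, start, _, strand = gene
            # Extend the run only for same-strand genes that lie close together.
            if strand == prev_strand and start - prev_end <= max_gap:
                current.append(gene)
                continue
            runs.append(current)
        current = [gene]
    if current:
        runs.append(current)
    return [[g[0] for g in run] for run in runs if len(run) > 1]

# Hypothetical gene coordinates, invented for illustration only.
genes = [
    ("a", 100, 400, "+"),
    ("b", 420, 900, "+"),
    ("c", 1500, 2000, "+"),
    ("d", 2010, 2500, "-"),
    ("e", 2520, 3000, "-"),
]
candidates = group_codirectional(genes)
```

A distance-and-strand heuristic like this cannot capture the condition-dependent operon structure the abstract describes, which is the gap that expression-based prediction is meant to address.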
Background. Alternative polyadenylation sites within a gene can lead to alternative transcript variants. Although bioinformatic analysis has been conducted to detect polyadenylation sites using nucleic acid sequences (EST/mRNA) in the public databases, one special type, the single-block EST, has received much less attention. This bias leaves considerable room to discover novel transcript variants. Results. In the present study, we identified novel transcript variants in the human genome by detecting intronic polyadenylation sites. Poly(A/T)-tailed ESTs were obtained from single-block ESTs and clustered into 10,844 groups representing 5,670 genes. Most sites were not found in other alternative splicing databases. To verify that these sites are from expressed transcripts, we analyzed the supporting EST number of each site, blasted representative ESTs against known mRNA sequences, traced terminal sequences from cDNA clones, and compared with the data of Affymetrix tiling arrays. These analyses confirmed about 84% (9,118/10,844) of the novel alternative transcripts, and 33% (3,575/10,844) of the transcripts, from 2,704 genes, were considered high-reliability. Additionally, RT-PCR confirmed 38% (10/26) of the predicted novel transcript variants. Conclusion. Our results provide evidence for novel transcript variants with intronic poly(A) sites. The expression of these novel variants was confirmed with computational and experimental tools. Our data provide a genome-wide resource for identification of novel human transcript variants with intronic polyadenylation sites, and offer a new view into the mystery of the human transcriptome.
Hou, Xiao-Jin; Li, Si-Bei; Liu, Sheng-Rui; Hu, Chun-Gen; Zhang, Jin-Zhi
MYB family genes are widely distributed in plants and comprise one of the largest transcription factor families involved in various developmental processes and defense responses of plants. To date, few MYB genes and little expression profiling have been reported for citrus. Here, we describe and classify 177 members of the sweet orange MYB gene (CsMYB) family in terms of their genomic gene structures and similarity to their putative Arabidopsis orthologs. According to these analyses, these CsMYBs were categorized into four groups (4R-MYB, 3R-MYB, 2R-MYB and 1R-MYB). Gene structure analysis revealed that 1R-MYB genes possess relatively more introns than 2R-MYB genes. Investigation of their chromosomal localizations revealed that these CsMYBs are distributed across nine chromosomes. Sweet orange includes a relatively small number of MYB genes compared with the 198 members in Arabidopsis, presumably due to a paralog reduction related to repetitive sequence insertion into the promoter and non-coding transcribed regions of the genes. Comparative studies of CsMYBs and Arabidopsis showed that CsMYBs had fewer gene duplication events. Expression analysis revealed that the MYB gene family has a wide expression profile in sweet orange development and plays important roles in development and stress responses. In addition, 337 new putative microsatellites with flanking sequences sufficient for primer design were also identified from the 177 CsMYBs. These results provide a useful reference for the selection of candidate MYB genes for cloning and further functional analysis for citrus.
Snijders, Antoine Maria
Almost all human cancers as well as developmental abnormalities are characterized by the presence of genetic alterations, most of which target a gene or a particular genomic locus resulting in altered gene expression and ultimately an altered phenotype. Different types of genetic alterations include
Mourier, Tobias; Willerslev, Eske
BACKGROUND: Eukaryotic genomes are scattered with retroelements that proliferate through retrotransposition. Although retroelements make up around 40 percent of the human genome, large regions are found to be completely devoid of retroelements. This has been hypothesised to be a result of genomic...
Ma, Alvin C; McNulty, Melissa S; Poshusta, Tanya L; Campbell, Jarryd M; Martínez-Gálvez, Gabriel; Argue, David P; Lee, Han B; Urban, Mark D; Bullard, Cassandra E; Blackburn, Patrick R; Man, Toni K; Clark, Karl J; Ekker, Stephen C
Transcription activator-like effectors (TALEs) are extremely effective, single-molecule DNA-targeting molecular cursors used for locus-specific genome science applications, including high-precision molecular medicine and other genome engineering applications. TALEs are used in genome engineering for locus-specific DNA editing and imaging, as artificial transcriptional activators and repressors, and for targeted epigenetic modification. TALEs as nucleases (TALENs) are effective editing tools and offer high binding specificity and fewer sequence constraints toward the targeted genome than other custom nuclease systems. One bottleneck of broader TALE use is reagent accessibility. For example, one commonly deployed method uses a multitube, 5-day assembly protocol. Here we describe FusX, a streamlined Golden Gate TALE assembly system that (1) is backward compatible with popular TALE backbones, (2) is functionalized as a single-tube 3-day TALE assembly process, (3) requires only commonly used basic molecular biology reagents, and (4) is cost-effective. More than 100 TALEN pairs have been successfully assembled using FusX, and 27 pairs were quantitatively tested in zebrafish, with each showing high somatic and germline activity. Furthermore, this assembly system is flexible and is compatible with standard molecular biology laboratory tools, but can be scaled with automated laboratory support. To demonstrate, we use a highly accessible and commercially available liquid-handling robot to rapidly and accurately assemble TALEs using the FusX TALE toolkit. Together, the FusX system accelerates TALE-based genomic science applications from basic science screening work for functional genomics testing and molecular medicine applications.
Kinyui Alice Lo
Full Text Available The g