WorldWideScience

Sample records for dna regulatory motifs

  1. MotifMark: Finding regulatory motifs in DNA sequences.

    Science.gov (United States)

    Hassanzadeh, Hamid Reza; Kolhe, Pushkar; Isbell, Charles L; Wang, May D

    2017-07-01

    The interaction between proteins and DNA is a key driving force in a significant number of biological processes such as transcriptional regulation, repair, recombination, splicing, and DNA modification. The identification of DNA-binding sites and the specificity of target proteins in binding to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity between proteins and DNA motifs. Despite their success, these technologies have their own limitations and fall short in precise characterization of motifs, and as a result, require further downstream analysis to extract useful and interpretable information from a haystack of noisy and inaccurate data. Here we propose MotifMark, a new algorithm based on graph theory and machine learning, that can find binding sites on candidate probes and rank their specificity in regard to the underlying transcription factor. We developed a pipeline to analyze experimental data derived from compact universal protein binding microarrays and benchmarked it against two of the most accurate motif search methods. Our results indicate that MotifMark can be a viable alternative technique for prediction of motif from protein binding microarrays and possibly other related high-throughput techniques.

  2. LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms.

    Science.gov (United States)

    Yang, Peng; Wu, Min; Guo, Jing; Kwoh, Chee Keong; Przytycka, Teresa M; Zheng, Jie

    2014-02-17

    As fundamental genomic elements, meiotic recombination hotspots play important roles in the life sciences, so uncovering their regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators of meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignores the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Recently, an algorithm called "LDsplit" has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit to SNPs inside an established 7-mer motif bound by PRDM9, we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots, each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot-associated SNPs. It is the first computational method that utilizes the genetic variation of

  3. MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data.

    Science.gov (United States)

    Ozaki, Haruka; Iwasaki, Wataru

    2016-08-01

    As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif. Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs. By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions. Copyright © 2016 Elsevier Ltd. All rights reserved.

  4. I-motif DNA structures are formed in the nuclei of human cells

    Science.gov (United States)

    Zeraati, Mahdi; Langley, David B.; Schofield, Peter; Moye, Aaron L.; Rouet, Romain; Hughes, William E.; Bryan, Tracy M.; Dinger, Marcel E.; Christ, Daniel

    2018-06-01

    Human genome function is underpinned by the primary storage of genetic information in canonical B-form DNA, with a second layer of DNA structure providing regulatory control. I-motif structures are thought to form in cytosine-rich regions of the genome and to have regulatory functions; however, in vivo evidence for the existence of such structures has so far remained elusive. Here we report the generation and characterization of an antibody fragment (iMab) that recognizes i-motif structures with high selectivity and affinity, enabling the detection of i-motifs in the nuclei of human cells. We demonstrate that the in vivo formation of such structures is cell-cycle and pH dependent. Furthermore, we provide evidence that i-motif structures are formed in regulatory regions of the human genome, including promoters and telomeric regions. Our results support the notion that i-motif structures provide key regulatory roles in the genome.

  5. DMINDA: an integrated web server for DNA motif identification and analyses.

    Science.gov (United States)

    Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

    2014-07-01

    DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Seed storage protein gene promoters contain conserved DNA motifs in Brassicaceae, Fabaceae and Poaceae

    Science.gov (United States)

    Fauteux, François; Strömvik, Martina V

    2009-01-01

    Background Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses) using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. 
Conclusion Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination of conserved motifs

  7. Seed storage protein gene promoters contain conserved DNA motifs in Brassicaceae, Fabaceae and Poaceae

    Directory of Open Access Journals (Sweden)

    Fauteux François

    2009-10-01

    Full Text Available Abstract Background Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses) using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins.
Conclusion Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination

  8. The limits of de novo DNA motif discovery.

    Directory of Open Access Journals (Sweden)

    David Simcha

    Full Text Available A major challenge in molecular biology is reverse-engineering the cis-regulatory logic that plays a major role in the control of gene expression. This program includes searching through DNA sequences to identify "motifs" that serve as the binding sites for transcription factors or, more generally, are predictive of gene expression across cellular conditions. Several approaches have been proposed for de novo motif discovery: searching sequences without prior knowledge of binding sites or nucleotide patterns. However, unbiased validation is not straightforward. We consider two approaches to unbiased validation of discovered motifs: testing the statistical significance of a motif using a DNA "background" sequence model to represent the null hypothesis, and measuring performance in predicting membership in gene clusters. We demonstrate that the background models typically used are "too null," resulting in overly optimistic assessments of significance, and argue that performance in predicting TF binding or expression patterns from DNA motifs should be assessed by held-out data, as in predictive learning. Applying this criterion to common motif discovery methods resulted in universally poor performance, although there is a marked improvement when motifs are statistically significant against real background sequences. Moreover, on synthetic data where "ground truth" is known, discriminative performance of all algorithms is far below the theoretical upper bound, with pronounced "over-fitting" in training. A key conclusion from this work is that the failure of de novo discovery approaches to accurately identify motifs is basically due to statistical intractability resulting from the fixed size of co-regulated gene clusters, and thus such failures do not necessarily provide evidence that unfound motifs are not active biologically. Consequently, the use of prior knowledge to enhance motif discovery is not just advantageous but necessary.
An implementation of

  9. Identification of coupling DNA motif pairs on long-range chromatin interactions in human K562 cells

    KAUST Repository

    Wong, Ka-Chun; Li, Yue; Peng, Chengbin

    2015-01-01

    Motivation: The protein-DNA interactions between transcription factors (TFs) and transcription factor binding sites (TFBSs, also known as DNA motifs) are critical activities in gene transcription. The identification of the DNA motifs is a vital task for downstream analysis. Unfortunately, the long-range coupling information between different DNA motifs is still lacking. To fill the void, as the first-of-its-kind study, we have identified the coupling DNA motif pairs on long-range chromatin interactions in human. Results: The coupling DNA motif pairs exhibit substantially higher DNase accessibility than the background sequences. Half of the DNA motifs involved are matched to the existing motif databases, although nearly all of them are enriched with at least one gene ontology term. Their motif instances are also found to be statistically enriched in promoter and enhancer regions. In particular, we introduce a novel measurement called motif pairing multiplicity, which is defined as the number of motifs that are paired with a given motif on chromatin interactions. Interestingly, we observe that motif pairing multiplicity is linked to several characteristics such as regulatory region type, motif sequence degeneracy, DNase accessibility and pairing genomic distance. Taken together, we believe the coupling DNA motif pairs identified in this study can shed light on the gene transcription mechanism under long-range chromatin interactions. © The Author 2015. Published by Oxford University Press.
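
    The definition of motif pairing multiplicity given above can be sketched in a few lines of Python. This is only an illustration of the stated definition, not the authors' pipeline: the function name, the input format (a list of motif-identifier pairs) and the motif labels are assumptions made for the example.

```python
from collections import defaultdict


def motif_pairing_multiplicity(pairs):
    """Count, for each motif, the number of distinct motifs it is paired with.

    `pairs` is an iterable of (motif_a, motif_b) tuples; the pairing is
    treated as unordered and repeated pairs are counted once.
    """
    partners = defaultdict(set)
    for a, b in pairs:
        partners[a].add(b)
        partners[b].add(a)
    return {motif: len(others) for motif, others in partners.items()}


# Hypothetical coupled motif pairs (labels are placeholders, not real motifs).
pairs = [("M1", "M2"), ("M1", "M3"), ("M2", "M3"), ("M1", "M2"), ("M4", "M1")]
multiplicity = motif_pairing_multiplicity(pairs)
```

    In this toy input, M1 is paired with three distinct motifs while M4 is paired with one; a motif paired with itself would count itself as a partner under this sketch.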

  10. Identification of coupling DNA motif pairs on long-range chromatin interactions in human K562 cells

    KAUST Repository

    Wong, Ka-Chun

    2015-09-27

    Motivation: The protein-DNA interactions between transcription factors (TFs) and transcription factor binding sites (TFBSs, also known as DNA motifs) are critical activities in gene transcription. The identification of the DNA motifs is a vital task for downstream analysis. Unfortunately, the long-range coupling information between different DNA motifs is still lacking. To fill the void, as the first-of-its-kind study, we have identified the coupling DNA motif pairs on long-range chromatin interactions in human. Results: The coupling DNA motif pairs exhibit substantially higher DNase accessibility than the background sequences. Half of the DNA motifs involved are matched to the existing motif databases, although nearly all of them are enriched with at least one gene ontology term. Their motif instances are also found to be statistically enriched in promoter and enhancer regions. In particular, we introduce a novel measurement called motif pairing multiplicity, which is defined as the number of motifs that are paired with a given motif on chromatin interactions. Interestingly, we observe that motif pairing multiplicity is linked to several characteristics such as regulatory region type, motif sequence degeneracy, DNase accessibility and pairing genomic distance. Taken together, we believe the coupling DNA motif pairs identified in this study can shed light on the gene transcription mechanism under long-range chromatin interactions. © The Author 2015. Published by Oxford University Press.

  11. RMOD: a tool for regulatory motif detection in signaling network.

    Directory of Open Access Journals (Sweden)

    Jinki Kim

    Full Text Available Regulatory motifs are patterns of activation and inhibition that appear repeatedly in various signaling networks and that show specific regulatory properties. However, the network structures of regulatory motifs are highly diverse and complex, rendering their identification difficult. Here, we present RMOD, a web-based system for the identification of regulatory motifs and their properties in signaling networks. RMOD finds various network structures of regulatory motifs by compressing the signaling network and detecting the compressed forms of regulatory motifs. To apply it to a large-scale signaling network, it adopts a new subgraph search algorithm using a novel data structure called path-tree, which is a tree structure composed of isomorphic graphs of query regulatory motifs. This algorithm was evaluated using various sizes of signaling networks generated from the integration of various human signaling pathways, and the evaluation showed that its speed and scalability outperform those of other algorithms. RMOD includes interactive analysis and auxiliary tools that make it possible to manipulate the whole process, from building the signaling network and query regulatory motifs to analyzing regulatory motifs with graphical illustration and summarized descriptions. As a result, RMOD provides an integrated view of the regulatory motifs and the mechanism underlying their regulatory motif activities within the signaling network. RMOD is freely accessible online at the following URL: http://pks.kaist.ac.kr/rmod.

  12. Positional bias of general and tissue-specific regulatory motifs in mouse gene promoters

    Directory of Open Access Journals (Sweden)

    Farré Domènec

    2007-12-01

    Full Text Available Abstract Background The arrangement of regulatory motifs in gene promoters, or promoter architecture, is the result of mutation and selection processes that have operated over many millions of years. In mammals, tissue-specific transcriptional regulation is related to the presence of specific protein-interacting DNA motifs in gene promoters. However, little is known about the relative location and spacing of these motifs. To fill this gap, we have performed a systematic search for motifs that show significant bias at specific promoter locations in a large collection of housekeeping and tissue-specific genes. Results We observe that promoters driving housekeeping gene expression are enriched in particular motifs with strong positional bias, such as YY1, which are of little relevance in promoters driving tissue-specific expression. We also identify a large number of motifs that show positional bias in genes expressed in a highly tissue-specific manner. They include well-known tissue-specific motifs, such as HNF1 and HNF4 motifs in liver, kidney and small intestine, or RFX motifs in testis, as well as many potentially novel regulatory motifs. Based on this analysis, we provide predictions for 559 tissue-specific motifs in mouse gene promoters. Conclusion The study shows that motif positional bias is an important feature of mammalian proximal promoters and that it affects both general and tissue-specific motifs. Motif positional constraints define very distinct promoter architectures depending on breadth of expression and type of tissue.

  13. iFORM: Incorporating Find Occurrence of Regulatory Motifs.

    Science.gov (United States)

    Ren, Chao; Chen, Hebing; Yang, Bite; Liu, Feng; Ouyang, Zhangyi; Bo, Xiaochen; Shu, Wenjie

    2016-01-01

    Accurately identifying the binding sites of transcription factors (TFs) is crucial to understanding the mechanisms of transcriptional regulation and human disease. We present incorporating Find Occurrence of Regulatory Motifs (iFORM), an easy-to-use and efficient tool for scanning DNA sequences with TF motifs described as position weight matrices (PWMs). Both performance assessment with a receiver operating characteristic (ROC) curve and a correlation-based approach demonstrated that iFORM achieves higher accuracy and sensitivity by integrating five classical motif discovery programs using Fisher's combined probability test. We have used iFORM to provide accurate results on a variety of data in the ENCODE Project and the NIH Roadmap Epigenomics Project, and the tool has demonstrated its utility in further elucidating individual roles of functional elements. Both the source and binary codes for iFORM can be freely accessed at https://github.com/wenjiegroup/iFORM. The identified TF binding sites across human cell and tissue types using iFORM have been deposited in the Gene Expression Omnibus under the accession ID GSE53962.
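
    The abstract names Fisher's combined probability test as the way the individual programs' results are integrated, but does not give further detail. The sketch below shows only the generic test, assuming independent p-values; the function name and the example p-values are hypothetical and are not taken from iFORM.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For an even
    number of degrees of freedom the survival function has the closed form
    exp(-X/2) * sum_{i<k} (X/2)^i / i!, so no external library is needed.
    """
    pvalues = list(pvalues)
    if not pvalues:
        raise ValueError("at least one p-value is required")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in the interval (0, 1]")

    k = len(pvalues)
    statistic = -2.0 * sum(math.log(p) for p in pvalues)
    half = statistic / 2.0

    term = 1.0   # (X/2)^0 / 0!
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical p-values reported for one site by two independent scanners.
statistic, combined = fisher_combined_pvalue([0.05, 0.05])
```

    With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the closed form; two p-values of 0.05 combine to roughly 0.0175.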

  14. Discovery of a Regulatory Motif for Human Satellite DNA Transcription in Response to BATF2 Overexpression.

    Science.gov (United States)

    Bai, Xuejia; Huang, Wenqiu; Zhang, Chenguang; Niu, Jing; Ding, Wei

    2016-03-01

    One of the basic leucine zipper transcription factors, BATF2, has been found to suppress cancer growth and migration. However, little is known about the genes downstream of BATF2. HeLa cells were stably transfected with BATF2, then chromatin immunoprecipitation-sequencing was employed to identify the DNA motifs responsive to BATF2. Comprehensive bioinformatics analyses indicated that the most significant motif discovered, TTCCATT[CT]GATTCCATTC[AG]AT, was primarily distributed among the chromosome centromere regions and mostly within human type II satellite DNA. Such motifs were able to prime the transcription of type II satellite DNA in a directional and asymmetrical manner. Consistently, satellite II transcription was up-regulated in BATF2-overexpressing cells. The present study provides insight into understanding the role of BATF2 in tumours and the importance of satellite DNA in the maintenance of genomic stability. Copyright© 2016 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.
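
    The motif reported above is written in a bracket notation that can be read directly as a regular expression. The following sketch, which is not part of the study, scans one strand of a sequence for it; the function name and the demo sequence are made up for illustration, and the reverse strand is not searched.

```python
import re

# Motif string as given in the abstract; [CT] and [AG] are two-base alternatives.
MOTIF = "TTCCATT[CT]GATTCCATTC[AG]AT"


def find_motif(sequence, pattern=MOTIF):
    """Return (start, matched_text) for every occurrence on the given strand.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]


# Synthetic demo sequence containing two variants of the motif.
demo = "ACGT" + "TTCCATTCGATTCCATTCAAT" + "GG" + "TTCCATTTGATTCCATTCGAT" + "A"
hits = find_motif(demo)
```

    Here `hits` holds the 0-based start positions 4 and 27 together with the matched 21-base instances.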

  15. Hybrid DNA i-motif: Aminoethylprolyl-PNA (pC5) enhance the stability of DNA (dC5) i-motif structure.

    Science.gov (United States)

    Gade, Chandrasekhar Reddy; Sharma, Nagendra K

    2017-12-15

    This report describes the synthesis of a C-rich sequence, the cytosine pentamer of aep-PNA, and biophysical studies of its formation of a hybrid DNA:aep-PNA i-motif structure with the DNA cytosine pentamer (dC5) under acidic pH conditions. The CD/UV/NMR/ESI-Mass studies strongly support the formation of a stable hybrid DNA i-motif structure with aep-PNA even near acidic conditions. Hence the aep-PNA C-rich cytosine sequence could be considered a potential DNA i-motif stabilizing agent under in vivo conditions. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. RegRNA: an integrated web server for identifying regulatory RNA motifs and elements

    OpenAIRE

    Huang, Hsi-Yuan; Chien, Chia-Hung; Jen, Kuan-Hua; Huang, Hsien-Da

    2006-01-01

    Numerous regulatory structural motifs have been identified as playing essential roles in transcriptional and post-transcriptional regulation of gene expression. RegRNA is an integrated web server for identifying the homologs of regulatory RNA motifs and elements against an input mRNA sequence. Both sequence homologs and structural homologs of regulatory RNA motifs can be recognized. The regulatory RNA motifs supported in RegRNA are categorized into several classes: (i) motifs in mRNA 5′-untra...

  17. Gene regulatory and signaling networks exhibit distinct topological distributions of motifs

    Science.gov (United States)

    Ferreira, Gustavo Rodrigues; Nakaya, Helder Imoto; Costa, Luciano da Fontoura

    2018-04-01

    The biological processes of cellular decision making and differentiation involve a plethora of signaling pathways and gene regulatory circuits. These networks in turn exhibit a multitude of motifs playing crucial parts in regulating network activity. Here we compare the topological placement of motifs in gene regulatory and signaling networks and observe that it suggests different evolutionary strategies in motif distribution for distinct cellular subnetworks.

  18. MotifMark: Finding Regulatory Motifs in DNA Sequences

    OpenAIRE

    Hassanzadeh, Hamid Reza; Kolhe, Pushkar; Isbell, Charles L.; Wang, May D.

    2017-01-01

    The interaction between proteins and DNA is a key driving force in a significant number of biological processes such as transcriptional regulation, repair, recombination, splicing, and DNA modification. The identification of DNA-binding sites and the specificity of target proteins in binding to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity be...

  19. Discovery of cell-type specific DNA motif grammar in cis-regulatory elements using random Forest.

    Science.gov (United States)

    Wang, Xin; Lin, Peijie; Ho, Joshua W K

    2018-01-19

    It has been observed that many transcription factors (TFs) can bind to different genomic loci depending on the cell type in which a TF is expressed, even though the individual TF usually binds to the same core motif in different cell types. How a TF can bind to the genome in such a highly cell-type specific manner is a critical research question. One hypothesis is that a TF requires co-binding of different TFs in different cell types. If this is the case, it may be possible to observe different combinations of TF motifs - a motif grammar - located at the TF binding sites in different cell types. In this study, we develop a bioinformatics method to systematically identify DNA motifs in TF binding sites across multiple cell types based on published ChIP-seq data, and address two questions: (1) can we build a machine learning classifier to predict cell-type specificity based on motif combinations alone, and (2) can we extract meaningful cell-type specific motif grammars from this classifier model. We present a Random Forest (RF) based approach to build a multi-class classifier to predict the cell-type specificity of a TF binding site given its motif content. We applied this RF classifier to two published ChIP-seq datasets of TFs (TCF7L2 and MAX) across multiple cell types. Using cross-validation, we show that motif combinations alone are indeed predictive of cell types. Furthermore, we present a rule mining approach to extract the most discriminatory rules in the RF classifier, thus allowing us to discover the underlying cell-type specific motif grammar. Our bioinformatics analysis supports the hypothesis that combinatorial TF motif patterns are cell-type specific.

  20. DNA motif elucidation using belief propagation.

    Science.gov (United States)

    Wong, Ka-Chun; Chan, Tak-Ming; Peng, Chengbin; Li, Yue; Zhang, Zhaolei

    2013-09-01

    Protein-binding microarray (PBM) is a high-throughput platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k=8∼10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagations. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness. In particular, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source codes are available at the authors' websites: e.g. http://www.cs.toronto.edu/∼wkc/kmerHMM.
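
    The preprocessing step described above, reducing probe-level signals to a median binding intensity per k-mer and ranking the k-mers, can be sketched as follows. This is a simplified illustration and not the kmerHMM implementation: the function names, the toy probes and the decision to count a k-mer once per probe are assumptions, and reverse complements are not merged.

```python
from collections import defaultdict
from statistics import median


def kmer_median_intensities(probes, k):
    """Map each k-mer to the median signal of the probes that contain it.

    `probes` is an iterable of (sequence, signal) pairs. Each k-mer is
    counted once per probe, even if it occurs several times in that probe.
    """
    signals = defaultdict(list)
    for sequence, signal in probes:
        kmers = {sequence[i:i + k] for i in range(len(sequence) - k + 1)}
        for kmer in kmers:
            signals[kmer].append(signal)
    return {kmer: median(values) for kmer, values in signals.items()}


def rank_kmers(medians):
    """Order k-mers by decreasing median intensity, ties broken alphabetically."""
    return sorted(medians, key=lambda kmer: (-medians[kmer], kmer))


# Toy probes with made-up signal intensities (real PBM probes are much longer).
probes = [("ACGTA", 10.0), ("CGTAC", 30.0), ("GTACG", 20.0)]
medians = kmer_median_intensities(probes, k=3)
ranked = rank_kmers(medians)
```

    In the toy data, TAC ranks first because both probes containing it carry the higher signals. The ranked list is the kind of input that a downstream alignment and model-training step would consume.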

  1. DNA motif elucidation using belief propagation

    KAUST Repository

    Wong, Ka-Chun; Chan, Tak-Ming; Peng, Chengbin; Li, Yue; Zhang, Zhaolei

    2013-01-01

    Protein-binding microarray (PBM) is a high-throughput platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k = 8∼10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagations. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness. In particular, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source codes are available at the authors' websites: e.g. http://www.cs.toronto.edu/∼wkc/kmerHMM. © 2013 The Author(s).

  2. DNA motif elucidation using belief propagation

    KAUST Repository

    Wong, Ka-Chun

    2013-06-29

    Protein-binding microarray (PBM) is a high-throughput platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k = 8 to 10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagations. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness. In particular, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source code are available at the authors' websites: e.g. http://www.cs.toronto.edu/~wkc/kmerHMM. 2013 The Author(s).

  3. Aggregation of topological motifs in the Escherichia coli transcriptional regulatory network

    Directory of Open Access Journals (Sweden)

    Barabási Albert-László

    2004-01-01

    Full Text Available Abstract Background: Transcriptional regulation of cellular functions is carried out through a complex network of interactions among transcription factors and the promoter regions of genes and operons regulated by them. To better understand the system-level function of such networks, simplification of their architecture was previously achieved by identifying the motifs present in the network, which are small, overrepresented, topologically distinct regulatory interaction patterns (subgraphs). However, the interaction of such motifs with each other, and their form of integration into the full network, has not been previously examined. Results: By studying the transcriptional regulatory network of the bacterium Escherichia coli, we demonstrate that the two previously identified motif types in the network (i.e., feed-forward loops and bi-fan motifs) do not exist in isolation, but rather aggregate into homologous motif clusters that largely overlap with known biological functions. Moreover, these clusters further coalesce into a supercluster, thus establishing distinct topological hierarchies that show global statistical properties similar to the whole network. Targeted removal of motif links disintegrates the network into small, isolated clusters, while random disruptions of an equal number of links do not cause such an effect. Conclusion: Individual motifs aggregate into homologous motif clusters and a supercluster forming the backbone of the E. coli transcriptional regulatory network and play a central role in defining its global topological organization.

  4. DNA methylation requires a DNMT1 ubiquitin interacting motif (UIM) and histone ubiquitination.

    Science.gov (United States)

    Qin, Weihua; Wolf, Patricia; Liu, Nan; Link, Stephanie; Smets, Martha; La Mastra, Federica; Forné, Ignasi; Pichler, Garwin; Hörl, David; Fellinger, Karin; Spada, Fabio; Bonapace, Ian Marc; Imhof, Axel; Harz, Hartmann; Leonhardt, Heinrich

    2015-08-01

    DNMT1 is recruited by PCNA and UHRF1 to maintain DNA methylation after replication. UHRF1 recognizes hemimethylated DNA substrates via the SRA domain, but also repressive H3K9me3 histone marks with its TTD. With systematic mutagenesis and functional assays, we could show that chromatin binding further involved UHRF1 PHD binding to unmodified H3R2. These complementation assays clearly demonstrated that the ubiquitin ligase activity of the UHRF1 RING domain is required for maintenance DNA methylation. Mass spectrometry of UHRF1-deficient cells revealed H3K18 as a novel ubiquitination target of UHRF1 in mammalian cells. With bioinformatics and mutational analyses, we identified a ubiquitin interacting motif (UIM) in the N-terminal regulatory domain of DNMT1 that binds to ubiquitinated H3 tails and is essential for DNA methylation in vivo. H3 ubiquitination and subsequent DNA methylation required UHRF1 PHD binding to H3R2. These results show the manifold regulatory mechanisms controlling DNMT1 activity that require the reading and writing of epigenetic marks by UHRF1 and illustrate the multifaceted interplay between DNA and histone modifications. The identification and functional characterization of the DNMT1 UIM suggests a novel regulatory principle and we speculate that histone H2AK119 ubiquitination might also lead to UIM-dependent recruitment of DNMT1 and DNA methylation beyond classic maintenance.

  5. PISMA: A Visual Representation of Motif Distribution in DNA Sequences

    Directory of Open Access Journals (Sweden)

    Rogelio Alcántara-Silva

    2017-03-01

    Full Text Available Background: Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypotheses, PISMA aims at identifying motifs on DNA sequences, counting and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code-like scheme, as a gene-map-like scheme, and as a transcript scheme. Results: We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. Availability and Implementation: PISMA is developed in Java; it runs on any type of hardware and on diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf.
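
    The kind of counting and bar-code-like display described above can be illustrated with a short sketch. This is not PISMA (which is written in Java): the function names and the text rendering are hypothetical stand-ins, assuming exact, case-insensitive matching on one strand, with overlapping occurrences counted.

```python
def motif_positions(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) match."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one base so overlapping occurrences are also found.
        start = seq.find(motif, start + 1)
    return positions


def barcode(seq, motif):
    """Render a text 'bar code': '|' where the motif starts, '.' elsewhere."""
    marks = ["."] * len(seq)
    for pos in motif_positions(seq, motif):
        marks[pos] = "|"
    return "".join(marks)


# Hypothetical toy sequence with three CpG sites.
cpg_sites = motif_positions("ACGCGTCG", "CG")
cpg_barcode = barcode("ACGCGTCG", "CG")
```

    A graphical tool would draw the same positions along a scaled axis; the string form is only meant to show how the positions map to marks.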

  6. Poly(A) motif prediction using spectral latent features from human DNA sequences

    KAUST Repository

    Xie, Bo; Jankovic, Boris R.; Bajic, Vladimir B.; Song, Le; Gao, Xin

    2013-01-01

    Motivation: Polyadenylation is the addition of a poly(A) tail to an RNA molecule. Identifying DNA sequence motifs that signal the addition of poly(A) tails is essential to improved genome annotation and better understanding of the regulatory mechanisms and stability of mRNA. Existing poly(A) motif predictors demonstrate that information extracted from the surrounding nucleotide sequences of candidate poly(A) motifs can differentiate true motifs from the false ones to a great extent. A variety of sophisticated features has been explored, including sequential, structural, statistical, thermodynamic and evolutionary properties. However, most of these methods involve extensive manual feature engineering, which can be time-consuming and can require in-depth domain knowledge. Results: We propose a novel machine-learning method for poly(A) motif prediction by marrying generative learning (hidden Markov models) and discriminative learning (support vector machines). Generative learning provides a rich palette on which the uncertainty and diversity of sequence information can be handled, while discriminative learning allows the performance of the classification task to be directly optimized. Here, we used hidden Markov models for fitting the DNA sequence dynamics, and developed an efficient spectral algorithm for extracting latent variable information from these models. These spectral latent features were then fed into support vector machines to fine-tune the classification performance. We evaluated our proposed method on a comprehensive human poly(A) dataset that consists of 14 740 samples from 12 of the most abundant variants of human poly(A) motifs. Compared with one of the previous state-of-the-art methods in the literature (the random forest model with expert-crafted features), our method reduces the average error rate, false-negative rate and false-positive rate by 26, 15 and 35%, respectively. Meanwhile, our method makes ~30% fewer error predictions relative to the other

  7. Poly(A) motif prediction using spectral latent features from human DNA sequences

    KAUST Repository

    Xie, Bo

    2013-06-21

    Motivation: Polyadenylation is the addition of a poly(A) tail to an RNA molecule. Identifying DNA sequence motifs that signal the addition of poly(A) tails is essential to improved genome annotation and better understanding of the regulatory mechanisms and stability of mRNA. Existing poly(A) motif predictors demonstrate that information extracted from the surrounding nucleotide sequences of candidate poly(A) motifs can differentiate true motifs from the false ones to a great extent. A variety of sophisticated features has been explored, including sequential, structural, statistical, thermodynamic and evolutionary properties. However, most of these methods involve extensive manual feature engineering, which can be time-consuming and can require in-depth domain knowledge. Results: We propose a novel machine-learning method for poly(A) motif prediction by marrying generative learning (hidden Markov models) and discriminative learning (support vector machines). Generative learning provides a rich palette on which the uncertainty and diversity of sequence information can be handled, while discriminative learning allows the performance of the classification task to be directly optimized. Here, we used hidden Markov models for fitting the DNA sequence dynamics, and developed an efficient spectral algorithm for extracting latent variable information from these models. These spectral latent features were then fed into support vector machines to fine-tune the classification performance. We evaluated our proposed method on a comprehensive human poly(A) dataset that consists of 14 740 samples from 12 of the most abundant variants of human poly(A) motifs. Compared with one of the previous state-of-the-art methods in the literature (the random forest model with expert-crafted features), our method reduces the average error rate, false-negative rate and false-positive rate by 26, 15 and 35%, respectively. Meanwhile, our method makes ~30% fewer error predictions relative to the other

  8. qPMS7: a fast algorithm for finding (ℓ, d)-motifs in DNA and protein sequences.

    Directory of Open Access Journals (Sweden)

    Hieu Dinh

    Full Text Available Detection of rare events happening in a set of DNA/protein sequences could lead to new biological discoveries. One kind of such rare events is the presence of patterns called motifs in DNA/protein sequences. Finding motifs is a challenging problem since the general version of motif search has been proven to be intractable. Motif discovery is an important problem in biology. For example, it is useful in the detection of transcription factor binding sites and transcriptional regulatory elements that are crucial in understanding gene function, human disease, drug design, etc. Many versions of the motif search problem have been proposed in the literature. One such is the (ℓ, d)-motif search (or Planted Motif Search (PMS)). A generalized version of the PMS problem, namely, Quorum Planted Motif Search (qPMS), is shown to accurately model motifs in real data. However, solving the qPMS problem is an extremely difficult task because a special case of it, the PMS Problem, is already NP-hard, which means that any algorithm solving it can be expected to take exponential time in the worst-case scenario. In this paper, we propose a novel algorithm named qPMS7 that tackles the qPMS problem on real data as well as challenging instances. Experimental results show that our Algorithm qPMS7 is on average 5 times faster than the state-of-the-art algorithm. The executable program of Algorithm qPMS7 is freely available on the web at http://pms.engr.uconn.edu/downloads/qPMS7.zip. Our online motif discovery tools that use Algorithm qPMS7 are freely available at http://pms.engr.uconn.edu or http://motifsearch.com.
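
    To make the problem statement concrete, the sketch below uses the usual definition of an (ℓ, d)-motif: a string of length ℓ that occurs, with at most d mismatches, in every input sequence (or in at least a quorum of them for qPMS). It is a naive exhaustive search over all candidate strings, shown only to illustrate the definition and why the cost grows exponentially with ℓ; it is not the qPMS7 algorithm, and the function names and the `min_seqs` parameter are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(candidate, seq, d):
    """True if some window of seq is within Hamming distance d of candidate."""
    length = len(candidate)
    return any(hamming(candidate, seq[i:i + length]) <= d
               for i in range(len(seq) - length + 1))


def planted_motifs(seqs, length, d, min_seqs=None, alphabet="ACGT"):
    """Exhaustively list candidates that occur (up to d mismatches) in at
    least min_seqs sequences; min_seqs=None means all sequences (plain PMS)."""
    if min_seqs is None:
        min_seqs = len(seqs)
    found = []
    # len(alphabet) ** length candidates: exponential in the motif length.
    for letters in product(alphabet, repeat=length):
        cand = "".join(letters)
        if sum(occurs_within(cand, s, d) for s in seqs) >= min_seqs:
            found.append(cand)
    return found


# Hypothetical toy input: ACGT planted with at most one mismatch per sequence.
toy_seqs = ["TTACGTTT", "GGACCTGG", "CCAGGTCC"]
motifs_d1 = planted_motifs(toy_seqs, length=4, d=1)
motifs_d0 = planted_motifs(toy_seqs, length=4, d=0)
```

    With d = 1 the planted string ACGT is recovered (possibly alongside other candidates), while with d = 0 no 4-mer is shared exactly by all three sequences.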

  9. Motif enrichment tool.

    Science.gov (United States)

    Blatti, Charles; Sinha, Saurabh

    2014-07-01

    The Motif Enrichment Tool (MET) provides an online interface that enables users to find major transcriptional regulators of their gene sets of interest. MET searches the appropriate regulatory region around each gene and identifies which transcription factor DNA-binding specificities (motifs) are statistically overrepresented. Motif enrichment analysis is currently available for many metazoan species including human, mouse, fruit fly, planaria and flowering plants. MET also leverages high-throughput experimental data such as ChIP-seq and DNase-seq from ENCODE and ModENCODE to identify the regulatory targets of a transcription factor with greater precision. The results from MET are produced in real time and are linked to a genome browser for easy follow-up analysis. Use of the web tool is free and open to all, and there is no login requirement. ADDRESS: http://veda.cs.uiuc.edu/MET/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
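
    The abstract does not say which statistic MET uses to call a motif "statistically overrepresented". One common choice for this kind of question is a one-sided hypergeometric test, sketched below purely as an illustration and as an assumption, not as MET's method. The function name and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, genes_with_motif, set_size, hits):
    """P(X >= hits) when set_size genes are drawn without replacement from
    total_genes genes, of which genes_with_motif carry the motif."""
    upper = min(genes_with_motif, set_size)
    numerator = sum(
        comb(genes_with_motif, i)
        * comb(total_genes - genes_with_motif, set_size - i)
        for i in range(hits, upper + 1)
    )
    return numerator / comb(total_genes, set_size)


# Hypothetical example: 10 genes in total, 5 with the motif, a gene set of 5,
# all of which carry the motif.
p_all_hits = hypergeom_enrichment_p(10, 5, 5, 5)
```

    A small p-value means the motif is found in the gene set more often than expected from its genome-wide frequency; in practice a multiple-testing correction across motifs would also be needed.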

  10. An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes.

    Science.gov (United States)

    Liu, Bingqiang; Zhang, Hanyuan; Zhou, Chuan; Li, Guojun; Fennell, Anne; Wang, Guanghui; Kang, Yu; Liu, Qi; Ma, Qin

    2016-08-09

    Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, has several difficulties for optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to Escherichia coli k12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP(3) consistently outperformed other popular motif finding tools. We have integrated MP(3) into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. The performance evaluation indicated that MP(3) is effective for predicting regulatory motifs in prokaryotic genomes. Its application may enhance

  11. A structural basis for the regulatory inactivation of DnaA.

    Science.gov (United States)

    Xu, Qingping; McMullan, Daniel; Abdubek, Polat; Astakhova, Tamara; Carlton, Dennis; Chen, Connie; Chiu, Hsiu-Ju; Clayton, Thomas; Das, Debanu; Deller, Marc C; Duan, Lian; Elsliger, Marc-Andre; Feuerhelm, Julie; Hale, Joanna; Han, Gye Won; Jaroszewski, Lukasz; Jin, Kevin K; Johnson, Hope A; Klock, Heath E; Knuth, Mark W; Kozbial, Piotr; Sri Krishna, S; Kumar, Abhinav; Marciano, David; Miller, Mitchell D; Morse, Andrew T; Nigoghossian, Edward; Nopakun, Amanda; Okach, Linda; Oommachen, Silvya; Paulsen, Jessica; Puckett, Christina; Reyes, Ron; Rife, Christopher L; Sefcovic, Natasha; Trame, Christine; van den Bedem, Henry; Weekes, Dana; Hodgson, Keith O; Wooley, John; Deacon, Ashley M; Godzik, Adam; Lesley, Scott A; Wilson, Ian A

    2009-01-16

    Regulatory inactivation of DnaA is dependent on Hda (homologous to DnaA), a protein homologous to the AAA+ (ATPases associated with diverse cellular activities) ATPase region of the replication initiator DnaA. When bound to the sliding clamp loaded onto duplex DNA, Hda can stimulate the transformation of active DnaA-ATP into inactive DnaA-ADP. The crystal structure of Hda from Shewanella amazonensis SB2B at 1.75 Å resolution reveals that Hda resembles typical AAA+ ATPases. The arrangement of the two subdomains in Hda (residues 1-174 and 175-241) differs dramatically from that of DnaA. A CDP molecule anchors the Hda domains in a conformation that promotes dimer formation. The Hda dimer adopts a novel oligomeric assembly for AAA+ proteins in which the arginine finger, crucial for ATP hydrolysis, is fully exposed and available to hydrolyze DnaA-ATP through a typical AAA+ type of mechanism. The sliding clamp binding motifs at the N-terminus of each Hda monomer are partially buried and combine to form an antiparallel beta-sheet at the dimer interface. The inaccessibility of the clamp binding motifs in the CDP-bound structure of Hda suggests that conformational changes are required for Hda to form a functional complex with the clamp. Thus, the CDP-bound Hda dimer likely represents an inactive form of Hda.

  12. DistAMo: A web-based tool to characterize DNA-motif distribution on bacterial chromosomes

    Directory of Open Access Journals (Sweden)

    Patrick Sobetzko

    2016-03-01

    Full Text Available Short DNA motifs are involved in a multitude of functions, for example chromosome segregation, DNA replication or mismatch repair. Distribution of such motifs is often not random and the specific chromosomal pattern relates to the respective motif function. Computational approaches which quantitatively assess such chromosomal motif patterns are necessary. Here we present a new computer tool, DistAMo (Distribution Analysis of DNA Motifs). The algorithm uses codon redundancy to calculate the relative abundance of short DNA motifs from single genes to entire chromosomes. Comparative genomics analyses of the GATC-motif distribution in γ-proteobacterial genomes using DistAMo revealed that (i) genes beside the replication origin are enriched in GATCs, (ii) genome-wide GATC distribution follows a distinct pattern and (iii) genes involved in DNA replication and repair are enriched in GATCs. These features are specific for bacterial chromosomes encoding a Dam methyltransferase. The new software is available as a stand-alone or as an easy-to-use web-based server version at http://www.computational.bio.uni-giessen.de/distamo.

  13. Systematic discovery of regulatory motifs in Fusarium graminearum by comparing four Fusarium genomes

    Directory of Open Access Journals (Sweden)

    Kistler Corby

    2010-03-01

    Full Text Available Abstract Background: Fusarium graminearum (Fg), a major fungal pathogen of cultivated cereals, is responsible for billions of dollars in agriculture losses. There is a growing interest in understanding the transcriptional regulation of this organism, especially the regulation of genes underlying its pathogenicity. The generation of whole genome sequence assemblies for Fg and three closely related Fusarium species provides a unique opportunity for such a study. Results: Applying comparative genomics approaches, we developed a computational pipeline to systematically discover evolutionarily conserved regulatory motifs in the promoter, downstream and the intronic regions of Fg genes, based on the multiple alignments of sequenced Fusarium genomes. Using this method, we discovered 73 candidate regulatory motifs in the promoter regions. Nearly 30% of these motifs are highly enriched in promoter regions of Fg genes that are associated with a specific functional category. Through comparison to Saccharomyces cerevisiae (Sc) and Schizosaccharomyces pombe (Sp), we observed conservation of transcription factors (TFs), their binding sites and the target genes regulated by these TFs related to pathways known to respond to stress conditions or phosphate metabolism. In addition, this study revealed 69 and 39 conserved motifs in the downstream regions and the intronic regions, respectively, of Fg genes. The top intronic motif is the splice donor site. For the downstream regions, we noticed an intriguing absence of the mammalian and Sc poly-adenylation signals among the list of conserved motifs. Conclusion: This study provides the first comprehensive list of candidate regulatory motifs in Fg, and underscores the power of comparative genomics in revealing functional elements among related genomes. The conservation of regulatory pathways among the Fusarium genomes and the two yeast species reveals their functional significance, and provides new insights in their

  14. DNA regulatory motif selection based on support vector machine ...

    African Journals Online (AJOL)

    ... machine (SVM) and its application in microarray experiment of Kashin-Beck disease. ... speed and amount of the corresponding mRNA in gene replication process. ... and revealed that some motifs may be related to the immune reactions.

  15. Comparative genomics of metabolic capacities of regulons controlled by cis-regulatory RNA motifs in bacteria.

    Science.gov (United States)

    Sun, Eric I; Leyn, Semen A; Kazanov, Marat D; Saier, Milton H; Novichkov, Pavel S; Rodionov, Dmitry A

    2013-09-02

    In silico comparative genomics approaches have been efficiently used for functional prediction and reconstruction of metabolic and regulatory networks. Riboswitches are metabolite-sensing structures often found in bacterial mRNA leaders controlling gene expression on transcriptional or translational levels. An increasing number of riboswitches and other cis-regulatory RNAs have been recently classified into numerous RNA families in the Rfam database. High conservation of these RNA motifs provides a unique advantage for their genomic identification and comparative analysis. A comparative genomics approach implemented in the RegPredict tool was used for reconstruction and functional annotation of regulons controlled by RNAs from 43 Rfam families in diverse taxonomic groups of Bacteria. The inferred regulons include ~5200 cis-regulatory RNAs and more than 12000 target genes in 255 microbial genomes. All predicted RNA-regulated genes were classified into specific and overall functional categories. Analysis of taxonomic distribution of these categories allowed us to establish major functional preferences for each analyzed cis-regulatory RNA motif family. Overall, most RNA motif regulons showed predictable functional content in accordance with their experimentally established effector ligands. Our results suggest that some RNA motifs (including thiamin pyrophosphate and cobalamin riboswitches that control the cofactor metabolism) are widespread and likely originated from the last common ancestor of all bacteria. However, many more analyzed RNA motifs are restricted to a narrow taxonomic group of bacteria and likely represent more recent evolutionary innovations. The reconstructed regulatory networks for major known RNA motifs substantially expand the existing knowledge of transcriptional regulation in bacteria. The inferred regulons can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and evolutionary analysis. The obtained genome

  16. Identification of putative regulatory motifs in the upstream regions of co-expressed functional groups of genes in Plasmodium falciparum

    Directory of Open Access Journals (Sweden)

    Joshi NV

    2009-01-01

    Full Text Available Abstract Background: Regulation of gene expression in Plasmodium falciparum (Pf) remains poorly understood. While over half the genes are estimated to be regulated at the transcriptional level, few regulatory motifs and transcription regulators have been found. Results: The study seeks to identify putative regulatory motifs in the upstream regions of 13 functional groups of genes expressed in the intraerythrocytic developmental cycle of Pf. Three motif-discovery programs were used for the purpose, and motifs were searched for only on the gene coding strand. Four motifs – the 'G-rich', the 'C-rich', the 'TGTG' and the 'CACA' motifs – were identified, and zero to all four of these occur in the 13 sets of upstream regions. The 'CACA motif' was absent in functional groups expressed during the ring to early trophozoite transition. For functional groups expressed in each transition, the motifs tended to be similar. Upstream motifs in some functional groups showed 'positional conservation' by occurring at similar positions relative to the translational start site (TLS); this increases their significance as regulatory motifs. In the ribonucleotide synthesis, mitochondrial, proteasome and organellar translation machinery genes, G-rich, C-rich, CACA and TGTG motifs, respectively, occur with striking positional conservation. In the organellar translation machinery group, G-rich motifs occur close to the TLS. The same motifs were sometimes identified for multiple functional groups; differences in location and abundance of the motifs appear to ensure different modes of action. Conclusion: The identification of positionally conserved over-represented upstream motifs throws light on putative regulatory elements for transcription in Pf.

  17. Phylogeny based discovery of regulatory elements

    Directory of Open Access Journals (Sweden)

    Cohen Barak A

    2006-05-01

    Full Text Available Abstract Background: Algorithms that locate evolutionarily conserved sequences have become powerful tools for finding functional DNA elements, including transcription factor binding sites; however, most methods do not take advantage of an explicit model for the constrained evolution of functional DNA sequences. Results: We developed a probabilistic framework that combines an HKY85 model, which assigns probabilities to different base substitutions between species, and weight matrix models of transcription factor binding sites, which describe the probabilities of observing particular nucleotides at specific positions in the binding site. The method incorporates the phylogenies of the species under consideration and takes into account the position specific variation of transcription factor binding sites. Using our framework we assessed the suitability of alignments of genomic sequences from commonly used species as substrates for comparative genomic approaches to regulatory motif finding. We then applied this technique to Saccharomyces cerevisiae and related species by examining all possible six base pair DNA sequences (hexamers) and identifying sequences that are conserved in a significant number of promoters. By combining similar conserved hexamers we reconstructed known cis-regulatory motifs and made predictions of previously unidentified motifs. We tested one prediction experimentally, finding it to be a regulatory element involved in the transcriptional response to glucose. Conclusion: The experimental validation of a regulatory element prediction missed by other large-scale motif finding studies demonstrates that our approach is a useful addition to the current suite of tools for finding regulatory motifs.
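
    The hexamer-counting idea described above can be illustrated with a deliberately simplified sketch. Note the substitution: instead of the HKY85 substitution model and weight matrices used in the study, this toy version calls a hexamer "conserved" only when it is identical in every aligned species at the same alignment offset. The function names and toy alignments are hypothetical.

```python
from collections import Counter


def conserved_kmers(alignment, k=6):
    """Return the k-mers that are identical, at the same alignment offset,
    in every sequence of one promoter alignment (gapped windows are skipped)."""
    ref = alignment[0]
    found = set()
    for i in range(len(ref) - k + 1):
        window = ref[i:i + k]
        if "-" not in window and all(s[i:i + k] == window for s in alignment[1:]):
            found.add(window)
    return found


def promoters_per_conserved_kmer(alignments, k=6):
    """Count, for each k-mer, the number of promoters in which it is conserved."""
    counts = Counter()
    for alignment in alignments:
        counts.update(conserved_kmers(alignment, k))
    return counts


# Hypothetical toy alignments: two promoters, two species each.
toy_alignments = [
    ["AAACGCGTTT", "CCACGCGTGG"],
    ["TACGCGTA", "GACGCGTC"],
]
toy_counts = promoters_per_conserved_kmer(toy_alignments)
```

    Hexamers with high counts would be the candidates for further testing; a model-based approach additionally weighs how surprising the conservation is given the phylogeny, which perfect identity cannot capture.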

  18. Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.

    Science.gov (United States)

    Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C

    2018-01-10

    Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status, and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each gene of 42 TS genes were selected from the COSMIC database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and enzyme methyltransferase DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies depicted that, upon binding of DNMT1 methylates to this consensus DNA motif (C residues of CpG islands), mutation was likely to occur. Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing

  19. A novel k-mer set memory (KSM) motif representation improves regulatory variant prediction.

    Science.gov (United States)

    Guo, Yuchun; Tian, Kevin; Zeng, Haoyang; Guo, Xiaoyun; Gifford, David Kenneth

    2018-04-13

    The representation and discovery of transcription factor (TF) sequence binding specificities is critical for understanding gene regulatory networks and interpreting the impact of disease-associated noncoding genetic variants. We present a novel TF binding motif representation, the k-mer set memory (KSM), which consists of a set of aligned k-mers that are overrepresented at TF binding sites, and a new method called KMAC for de novo discovery of KSMs. We find that KSMs more accurately predict in vivo binding sites than position weight matrix (PWM) models and other more complex motif models across a large set of ChIP-seq experiments. Furthermore, KSMs outperform PWMs and more complex motif models in predicting in vitro binding sites. KMAC also identifies correct motifs in more experiments than five state-of-the-art motif discovery methods. In addition, KSM-derived features outperform both PWM and deep learning model derived sequence features in predicting differential regulatory activities of expression quantitative trait loci (eQTL) alleles. Finally, we have applied KMAC to 1600 ENCODE TF ChIP-seq data sets and created a public resource of KSM and PWM motifs. We expect that the KSM representation and KMAC method will be valuable in characterizing TF binding specificities and in interpreting the effects of noncoding genetic variations. © 2018 Guo et al.; Published by Cold Spring Harbor Laboratory Press.

  20. Accurate quantification of microRNA via single strand displacement reaction on DNA origami motif.

    Directory of Open Access Journals (Sweden)

    Jie Zhu

    DNA origami is an emerging technology that assembles hundreds of staple strands and one single-strand DNA into certain nanopattern. It has been widely used in various fields including detection of biological molecules such as DNA, RNA and proteins. MicroRNAs (miRNAs) play important roles in post-transcriptional gene repression as well as many other biological processes such as cell growth and differentiation. Alterations of miRNAs' expression contribute to many human diseases. However, it is still a challenge to quantitatively detect miRNAs by origami technology. In this study, we developed a novel approach based on streptavidin and quantum dots binding complex (STV-QDs) labeled single strand displacement reaction on DNA origami to quantitatively detect the concentration of miRNAs. We illustrated a linear relationship between the concentration of an exemplary miRNA as miRNA-133 and the STV-QDs hybridization efficiency; the results demonstrated that it is an accurate nano-scale miRNA quantifier motif. In addition, both symmetrical rectangular motif and asymmetrical China-map motif were tested. With significant linearity in both motifs, our experiments suggested that DNA Origami motif with arbitrary shape can be utilized in this method. Since this DNA origami-based method we developed owns the unique advantages of simple, time-and-material-saving, potentially multi-targets testing in one motif and relatively accurate for certain impurity samples as counted directly by atomic force microscopy rather than fluorescence signal detection, it may be widely used in quantification of miRNAs.

  1. Accurate Quantification of microRNA via Single Strand Displacement Reaction on DNA Origami Motif

    Science.gov (United States)

    Lou, Jingyu; Li, Weidong; Li, Sheng; Zhu, Hongxin; Yang, Lun; Zhang, Aiping; He, Lin; Li, Can

    2013-01-01

    DNA origami is an emerging technology that assembles hundreds of staple strands and one single-strand DNA into certain nanopattern. It has been widely used in various fields including detection of biological molecules such as DNA, RNA and proteins. MicroRNAs (miRNAs) play important roles in post-transcriptional gene repression as well as many other biological processes such as cell growth and differentiation. Alterations of miRNAs' expression contribute to many human diseases. However, it is still a challenge to quantitatively detect miRNAs by origami technology. In this study, we developed a novel approach based on streptavidin and quantum dots binding complex (STV-QDs) labeled single strand displacement reaction on DNA origami to quantitatively detect the concentration of miRNAs. We illustrated a linear relationship between the concentration of an exemplary miRNA as miRNA-133 and the STV-QDs hybridization efficiency; the results demonstrated that it is an accurate nano-scale miRNA quantifier motif. In addition, both symmetrical rectangular motif and asymmetrical China-map motif were tested. With significant linearity in both motifs, our experiments suggested that DNA Origami motif with arbitrary shape can be utilized in this method. Since this DNA origami-based method we developed owns the unique advantages of simple, time-and-material-saving, potentially multi-targets testing in one motif and relatively accurate for certain impurity samples as counted directly by atomic force microscopy rather than fluorescence signal detection, it may be widely used in quantification of miRNAs. PMID:23990889

  2. Accurate quantification of microRNA via single strand displacement reaction on DNA origami motif.

    Science.gov (United States)

    Zhu, Jie; Feng, Xiaolu; Lou, Jingyu; Li, Weidong; Li, Sheng; Zhu, Hongxin; Yang, Lun; Zhang, Aiping; He, Lin; Li, Can

    2013-01-01

    DNA origami is an emerging technology that assembles hundreds of staple strands and one single-strand DNA into certain nanopattern. It has been widely used in various fields including detection of biological molecules such as DNA, RNA and proteins. MicroRNAs (miRNAs) play important roles in post-transcriptional gene repression as well as many other biological processes such as cell growth and differentiation. Alterations of miRNAs' expression contribute to many human diseases. However, it is still a challenge to quantitatively detect miRNAs by origami technology. In this study, we developed a novel approach based on streptavidin and quantum dots binding complex (STV-QDs) labeled single strand displacement reaction on DNA origami to quantitatively detect the concentration of miRNAs. We illustrated a linear relationship between the concentration of an exemplary miRNA as miRNA-133 and the STV-QDs hybridization efficiency; the results demonstrated that it is an accurate nano-scale miRNA quantifier motif. In addition, both symmetrical rectangular motif and asymmetrical China-map motif were tested. With significant linearity in both motifs, our experiments suggested that DNA Origami motif with arbitrary shape can be utilized in this method. Since this DNA origami-based method we developed owns the unique advantages of simple, time-and-material-saving, potentially multi-targets testing in one motif and relatively accurate for certain impurity samples as counted directly by atomic force microscopy rather than fluorescence signal detection, it may be widely used in quantification of miRNAs.

  3. Argo_CUDA: Exhaustive GPU based approach for motif discovery in large DNA datasets.

    Science.gov (United States)

    Vishnevsky, Oleg V; Bocharnikov, Andrey V; Kolchanov, Nikolay A

    2018-02-01

    The development of chromatin immunoprecipitation sequencing (ChIP-seq) technology has revolutionized the genetic analysis of the basic mechanisms underlying transcription regulation and led to the accumulation of information about a huge number of DNA sequences. Many web services are currently available for de novo motif discovery in datasets containing information about DNA/protein binding. The enormous diversity of motifs makes finding them challenging. To avoid these difficulties, researchers use different stochastic approaches. Unfortunately, the efficiency of motif discovery programs declines dramatically as the query set size increases. As a result, only a fraction of the top "peak" ChIP-Seq segments can be analyzed, or the area of analysis must be narrowed. Thus, motif discovery in massive datasets remains a challenging issue. The Argo_CUDA (Compute Unified Device Architecture) web service is designed to process massive DNA data. It is a program for the detection of degenerate oligonucleotide motifs of fixed length written in the 15-letter IUPAC code. Argo_CUDA is a full-exhaustive approach based on high-performance GPU technologies. Compared with the existing motif discovery web services, Argo_CUDA shows good prediction quality on simulated sets. The analysis of ChIP-Seq sequences revealed motifs which correspond to known transcription factor binding sites.
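
    The 15-letter IUPAC code mentioned in this abstract can be illustrated with a small matcher. The following is only a minimal CPU sketch of how a fixed-length degenerate motif is compared against DNA windows; it is not the Argo_CUDA implementation, and the names `IUPAC`, `matches`, `find_sites` and `sequences_with_motif` are hypothetical. It assumes upper-case input containing only A, C, G and T.

```python
# Illustrative sketch only: NOT the Argo_CUDA code, just the IUPAC matching idea.
# Each of the 15 IUPAC letters stands for a set of allowed bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def matches(motif: str, site: str) -> bool:
    """True if every base of `site` is allowed by the IUPAC letter at that position."""
    return len(motif) == len(site) and all(
        base in IUPAC[letter] for letter, base in zip(motif, site)
    )


def find_sites(motif: str, sequence: str) -> list[int]:
    """Start positions (0-based) of all windows of `sequence` matching `motif`."""
    k = len(motif)
    return [
        i for i in range(len(sequence) - k + 1)
        if matches(motif, sequence[i:i + k])
    ]


def sequences_with_motif(motif: str, sequences: list[str]) -> int:
    """Number of sequences containing at least one occurrence of `motif`."""
    return sum(1 for seq in sequences if find_sites(motif, seq))


if __name__ == "__main__":
    # Y = C or T, R = A or G
    print(find_sites("TGAYR", "ATGACGTGATA"))  # [1, 6]
```

    An exhaustive search would score every candidate motif of a given length this way, which is why the cost grows quickly with dataset size and why a GPU is attractive for the task.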

  4. Dragon polya spotter: Predictor of poly(A) motifs within human genomic DNA sequences

    KAUST Repository

    Kalkatawi, Manal M.

    2011-11-15

    Motivation: Recognition of poly(A) signals in mRNA is relatively straightforward due to the presence of an easily recognizable polyadenylic acid tail. However, the task of identifying poly(A) motifs in the primary genomic DNA sequence that correspond to poly(A) signals in mRNA is a far more challenging problem. Recognition of poly(A) signals is important for better gene annotation and understanding of the gene regulation mechanisms. In this work, we present one such poly(A) motif prediction method based on properties of human genomic DNA sequence surrounding a poly(A) motif. These properties include thermodynamic, physico-chemical and statistical characteristics. For predictions, we developed Artificial Neural Network and Random Forest models. These models are trained to recognize the 12 most common poly(A) motifs in human DNA. Our predictors are available as a free web-based tool accessible at http://cbrc.kaust.edu.sa/dps. Compared with other reported predictors, our models achieve higher sensitivity and specificity and furthermore provide a consistent level of accuracy for 12 poly(A) motif variants. © The Author(s) 2011. Published by Oxford University Press. All rights reserved.

  5. A Comparison Study for DNA Motif Modeling on Protein Binding Microarray

    KAUST Repository

    Wong, Ka-Chun; Li, Yue; Peng, Chengbin; Wong, Hau-San

    2015-01-01

    Transcription Factor Binding Sites (TFBSs) are relatively short (5-15 bp) and degenerate. Identifying them is a computationally challenging task. In particular, Protein Binding Microarray (PBM) is a high-throughput platform that can measure the DNA binding preference of a protein in a comprehensive and unbiased manner; for instance, a typical PBM experiment can measure binding signal intensities of a protein to all possible DNA k-mers (k=8-10). Since proteins can often bind to DNA with different binding intensities, one of the major challenges is to build motif models which can fully capture the quantitative binding affinity data. To learn DNA motif models from the non-convex objective function landscape, several optimization methods are compared and applied to the PBM motif model building problem. In particular, representative methods from different optimization paradigms have been chosen for modeling performance comparison on hundreds of PBM datasets. The results suggest that the multimodal optimization methods are very effective for capturing the binding preference information from PBM data. In particular, we observe a general performance improvement using di-nucleotide modeling over mono-nucleotide modeling. In addition, the models learned by the best-performing method are applied to two independent applications: PBM probe rotation testing and ChIP-Seq peak sequence prediction, demonstrating its biological applicability.

  6. A Comparison Study for DNA Motif Modeling on Protein Binding Microarray

    KAUST Repository

    Wong, Ka-Chun

    2015-06-11

    Transcription Factor Binding Sites (TFBSs) are relatively short (5-15 bp) and degenerate. Identifying them is a computationally challenging task. In particular, Protein Binding Microarray (PBM) is a high-throughput platform that can measure the DNA binding preference of a protein in a comprehensive and unbiased manner; for instance, a typical PBM experiment can measure binding signal intensities of a protein to all possible DNA k-mers (k=8-10). Since proteins can often bind to DNA with different binding intensities, one of the major challenges is to build motif models which can fully capture the quantitative binding affinity data. To learn DNA motif models from the non-convex objective function landscape, several optimization methods are compared and applied to the PBM motif model building problem. In particular, representative methods from different optimization paradigms have been chosen for modeling performance comparison on hundreds of PBM datasets. The results suggest that the multimodal optimization methods are very effective for capturing the binding preference information from PBM data. In particular, we observe a general performance improvement using di-nucleotide modeling over mono-nucleotide modeling. In addition, the models learned by the best-performing method are applied to two independent applications: PBM probe rotation testing and ChIP-Seq peak sequence prediction, demonstrating its biological applicability.

  7. POWRS: position-sensitive motif discovery.

    Directory of Open Access Journals (Sweden)

    Ian W Davis

    Transcription factors and the short, often degenerate DNA sequences they recognize are central regulators of gene expression, but their regulatory code is challenging to dissect experimentally. Thus, computational approaches have long been used to identify putative regulatory elements from the patterns in promoter sequences. Here we present a new algorithm "POWRS" (POsition-sensitive WoRd Set) for identifying regulatory sequence motifs, specifically developed to address two common shortcomings of existing algorithms. First, POWRS uses the position-specific enrichment of regulatory elements near transcription start sites to significantly increase sensitivity, while providing new information about the preferred localization of those elements. Second, POWRS forgoes position weight matrices for a discrete motif representation that appears more resistant to over-generalization. We apply this algorithm to discover sequences related to constitutive, high-level gene expression in the model plant Arabidopsis thaliana, and then experimentally validate the importance of those elements by systematically mutating two endogenous promoters and measuring the effect on gene expression levels. This provides a foundation for future efforts to rationally engineer gene expression in plants, a problem of great importance in developing biotech crop varieties. BSD-licensed Python code is available at http://grassrootsbio.com/papers/powrs/.

  8. DNA motif alignment by evolving a population of Markov chains.

    Science.gov (United States)

    Bi, Chengpeng

    2009-01-30

    Deciphering cis-regulatory elements or de novo motif-finding in genomes still remains elusive although much algorithmic effort has been expended. The Markov chain Monte Carlo (MCMC) method such as Gibbs motif samplers has been widely employed to solve the de novo motif-finding problem through sequence local alignment. Nonetheless, the MCMC-based motif samplers still suffer from local maxima like EM. Therefore, as a prerequisite for finding good local alignments, these motif algorithms are often independently run a multitude of times, but without information exchange between different chains. Hence it would be worth a new algorithm design enabling such information exchange. This paper presents a novel motif-finding algorithm by evolving a population of Markov chains with information exchange (PMC), each of which is initialized as a random alignment and run by the Metropolis-Hastings sampler (MHS). It is progressively updated through a series of local alignments stochastically sampled. Explicitly, the PMC motif algorithm performs stochastic sampling as specified by a population-based proposal distribution rather than individual ones, and adaptively evolves the population as a whole towards a global maximum. The alignment information exchange is accomplished by taking advantage of the pooled motif site distributions. A distinct method for running multiple independent Markov chains (IMC) without information exchange, or dubbed as the IMC motif algorithm, is also devised to compare with its PMC counterpart. Experimental studies demonstrate that the performance could be improved if pooled information were used to run a population of motif samplers. The new PMC algorithm was able to improve the convergence and outperformed other popular algorithms tested using simulated and biological motif sequences.

  9. Spatiotemporal network motif reveals the biological traits of developmental gene regulatory networks in Drosophila melanogaster

    Directory of Open Access Journals (Sweden)

    Kim Man-Sun

    2012-05-01

    Background: Network motifs provided a “conceptual tool” for understanding the functional principles of biological networks, but such motifs have primarily been used to consider static network structures. Static networks, however, cannot be used to reveal time- and region-specific traits of biological systems. To overcome this limitation, we proposed the concept of a “spatiotemporal network motif,” a spatiotemporal sequence of network motifs of sub-networks which are active only at specific time points and body parts. Results: On the basis of this concept, we analyzed the developmental gene regulatory network of the Drosophila melanogaster embryo. We identified spatiotemporal network motifs and investigated their distribution pattern in time and space. As a result, we found how key developmental processes are temporally and spatially regulated by the gene network. In particular, we found that nested feedback loops appeared frequently throughout the entire developmental process. From mathematical simulations, we found that mutual inhibition in the nested feedback loops contributes to the formation of spatial expression patterns. Conclusions: Taken together, the proposed concept and the simulations can be used to unravel the design principle of developmental gene regulatory networks.

  10. An enhanced computational platform for investigating the roles of regulatory RNA and for identifying functional RNA motifs

    OpenAIRE

    Chang, Tzu-Hao; Huang, Hsi-Yuan; Hsu, Justin Bo-Kai; Weng, Shun-Long; Horng, Jorng-Tzong; Huang, Hsien-Da

    2013-01-01

    Background: Functional RNA molecules participate in numerous biological processes, ranging from gene regulation to protein synthesis. Analysis of functional RNA motifs and elements in RNA sequences can yield useful information for deciphering RNA regulatory mechanisms. Our previous work, RegRNA, is widely used in the identification of regulatory motifs, and this work extends it by incorporating more comprehensive and updated data sources and analytical approaches into a new platform. Methods ...

  11. The Verrucomicrobia LexA-binding Motif: Insights into the Evolutionary Dynamics of the SOS Response

    Directory of Open Access Journals (Sweden)

    Ivan Erill

    2016-07-01

    The SOS response is the primary bacterial mechanism to address DNA damage, coordinating multiple cellular processes that include DNA repair, cell division and translesion synthesis. In contrast to other regulatory systems, the composition of the SOS genetic network and the binding motif of its transcriptional repressor, LexA, have been shown to vary greatly across bacterial clades, making it an ideal system to study the co-evolution of transcription factors and their regulons. Leveraging comparative genomics approaches and prior knowledge on the core SOS regulon, here we define the binding motif of the Verrucomicrobia, a recently described phylum of emerging interest due to its association with eukaryotic hosts. Site directed mutagenesis of the Verrucomicrobium spinosum recA promoter confirms that LexA binds a 14 bp palindromic motif with consensus sequence TGTTC-N4-GAACA. Computational analyses suggest that recognition of this novel motif is determined primarily by changes in base-contacting residues of the third alpha helix of the LexA helix-turn-helix DNA binding motif. In conjunction with comparative genomics analysis of the LexA regulon in the Verrucomicrobia phylum, electrophoretic shift assays reveal that LexA binds to operators in the promoter region of DNA repair genes and a mutagenesis cassette in this organism, and identify previously unreported components of the SOS response. The identification of tandem LexA-binding sites generating instances of other LexA-binding motifs in the lexA gene promoter of Verrucomicrobia species leads us to postulate a novel mechanism for LexA-binding motif evolution. This model, based on gene duplication, successfully addresses outstanding questions in the intricate co-evolution of the LexA protein, its binding motif and the regulatory network it controls.

  12. The Verrucomicrobia LexA-Binding Motif: Insights into the Evolutionary Dynamics of the SOS Response.

    Science.gov (United States)

    Erill, Ivan; Campoy, Susana; Kılıç, Sefa; Barbé, Jordi

    2016-01-01

    The SOS response is the primary bacterial mechanism to address DNA damage, coordinating multiple cellular processes that include DNA repair, cell division, and translesion synthesis. In contrast to other regulatory systems, the composition of the SOS genetic network and the binding motif of its transcriptional repressor, LexA, have been shown to vary greatly across bacterial clades, making it an ideal system to study the co-evolution of transcription factors and their regulons. Leveraging comparative genomics approaches and prior knowledge on the core SOS regulon, here we define the binding motif of the Verrucomicrobia, a recently described phylum of emerging interest due to its association with eukaryotic hosts. Site directed mutagenesis of the Verrucomicrobium spinosum recA promoter confirms that LexA binds a 14 bp palindromic motif with consensus sequence TGTTC-N4-GAACA. Computational analyses suggest that recognition of this novel motif is determined primarily by changes in base-contacting residues of the third alpha helix of the LexA helix-turn-helix DNA binding motif. In conjunction with comparative genomics analysis of the LexA regulon in the Verrucomicrobia phylum, electrophoretic shift assays reveal that LexA binds to operators in the promoter region of DNA repair genes and a mutagenesis cassette in this organism, and identify previously unreported components of the SOS response. The identification of tandem LexA-binding sites generating instances of other LexA-binding motifs in the lexA gene promoter of Verrucomicrobia species leads us to postulate a novel mechanism for LexA-binding motif evolution. This model, based on gene duplication, successfully addresses outstanding questions in the intricate co-evolution of the LexA protein, its binding motif and the regulatory network it controls.
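
    The consensus TGTTC-N4-GAACA quoted in this abstract is palindromic in the DNA sense: the right half-site is the reverse complement of the left one, and the two 5 bp halves plus a 4 bp spacer give the stated 14 bp. The sketch below checks that property and scans a sequence for exact consensus matches. It is a hedged illustration, not the authors' search procedure (which is not described in the abstract); the function names and the example sequence are made up for this purpose.

```python
import re

# Translation table for complementing upper-case DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

LEFT_HALF = "TGTTC"    # half-sites taken from the consensus in the abstract
RIGHT_HALF = "GAACA"
SPACER_LENGTH = 4      # the "N4" in TGTTC-N4-GAACA


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def half_sites_are_palindromic() -> bool:
    """True if the right half-site is the reverse complement of the left one."""
    return reverse_complement(LEFT_HALF) == RIGHT_HALF


def find_candidate_sites(sequence: str) -> list[tuple[int, str]]:
    """(start, site) for every exact consensus match, overlapping ones included."""
    pattern = re.compile(
        "(?=(%s[ACGT]{%d}%s))" % (LEFT_HALF, SPACER_LENGTH, RIGHT_HALF)
    )
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    print(half_sites_are_palindromic())                # True
    print(find_candidate_sites("AATGTTCACGTGAACATT"))  # [(2, 'TGTTCACGTGAACA')]
```

    Real binding sites usually deviate from the consensus at some positions, so an exact-match scan like this one would typically be replaced by a scored model such as a position weight matrix.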

  13. Systematic comparison of the response properties of protein and RNA mediated gene regulatory motifs.

    Science.gov (United States)

    Iyengar, Bharat Ravi; Pillai, Beena; Venkatesh, K V; Gadgil, Chetan J

    2017-05-30

    We present a framework enabling the dissection of the effects of motif structure (feedback or feedforward), the nature of the controller (RNA or protein), and the regulation mode (transcriptional, post-transcriptional or translational) on the response to a step change in the input. We have used a common model framework for gene expression where both motif structures have an activating input and repressing regulator, with the same set of parameters, to enable a comparison of the responses. We studied the global sensitivity of the system properties, such as steady-state gain, overshoot, peak time, and peak duration, to parameters. We find that, in all motifs, overshoot correlated negatively whereas peak duration varied concavely with peak time. Differences in the other system properties were found to be mainly dependent on the nature of the controller rather than the motif structure. Protein mediated motifs showed a higher degree of adaptation i.e. a tendency to return to baseline levels; in particular, feedforward motifs exhibited perfect adaptation. RNA mediated motifs had a mild regulatory effect; they also exhibited a lower peaking tendency and mean overshoot. Protein mediated feedforward motifs showed higher overshoot and lower peak time compared to the corresponding feedback motifs.

  14. Large-scale discovery of promoter motifs in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Thomas A Down

    2007-01-01

    A key step in understanding gene regulation is to identify the repertoire of transcription factor binding motifs (TFBMs) that form the building blocks of promoters and other regulatory elements. Identifying these experimentally is very laborious, and the number of TFBMs discovered remains relatively small, especially when compared with the hundreds of transcription factor genes predicted in metazoan genomes. We have used a recently developed statistical motif discovery approach, NestedMICA, to detect candidate TFBMs from a large set of Drosophila melanogaster promoter regions. Of the 120 motifs inferred in our initial analysis, 25 were statistically significant matches to previously reported motifs, while 87 appeared to be novel. Analysis of sequence conservation and motif positioning suggested that the great majority of these discovered motifs are predictive of functional elements in the genome. Many motifs showed associations with specific patterns of gene expression in the D. melanogaster embryo, and we were able to obtain confident annotation of expression patterns for 25 of our motifs, including eight of the novel motifs. The motifs are available through Tiffin, a new database of DNA sequence motifs. We have discovered many new motifs that are overrepresented in D. melanogaster promoter regions, and offer several independent lines of evidence that these are novel TFBMs. Our motif dictionary provides a solid foundation for further investigation of regulatory elements in Drosophila, and demonstrates techniques that should be applicable in other species. We suggest that further improvements in computational motif discovery should narrow the gap between the set of known motifs and the total number of transcription factors in metazoan genomes.

  15. Conserved XPB Core Structure and Motifs for DNA Unwinding: Implications for Pathway Selection of Transcription or Excision Repair

    Energy Technology Data Exchange (ETDEWEB)

    Fan, Li; Arval, Andrew S.; Cooper, Priscilla K.; Iwai, Shigenori; Hanaoka, Fumio; Tainer, John A.

    2005-04-01

    The human xeroderma pigmentosum group B (XPB) helicase is essential for transcription, nucleotide excision repair, and TFIIH functional assembly. Here, we determined crystal structures of an Archaeoglobus fulgidus XPB homolog (AfXPB) that characterize two RecA-like XPB helicase domains and reveal a DNA damage recognition domain (DRD), a unique RED motif, a flexible thumb motif (ThM), and implied conformational changes within a conserved functional core. RED motif mutations dramatically reduce helicase activity, and the DRD and ThM, which flank the RED motif, appear structurally as well as functionally analogous to the MutS mismatch recognition and DNA polymerase thumb domains. Substrate specificity is altered by DNA damage, such that AfXPB unwinds dsDNA with 3' extensions, but not blunt-ended dsDNA, unless it contains a lesion, as shown for CPD or (6-4) photoproducts. Together, these results provide an unexpected mechanism of DNA unwinding with implications for XPB damage verification in nucleotide excision repair.

  16. Molecular dynamics simulations of electrostatics and hydration distributions around RNA and DNA motifs

    Science.gov (United States)

    Marlowe, Ashley E.; Singh, Abhishek; Semichaevsky, Andrey V.; Yingling, Yaroslava G.

    2009-03-01

    Nucleic acid nanoparticles can self-assemble through the formation of complementary loop-loop interactions or stem-stem interactions. The presence and concentration of ions can significantly affect the self-assembly process and the stability of the nanostructure. In this presentation we use explicit molecular dynamics simulations to examine the variations in cationic distributions and hydration environment around DNA and RNA helices and loop-loop interactions. Our simulations show that the potassium and sodium ionic distributions are different around RNA and DNA motifs, which could be indicative of ion mediated relative stability of loop-loop complexes. Moreover, in RNA loop-loop motifs ions are consistently present and exchanged through a distinct electronegative channel. We will also show how we used the specific RNA loop-loop motif to design an RNA hexagonal nanoparticle.

  17. Comprehensive human transcription factor binding site map for combinatory binding motifs discovery.

    Directory of Open Access Journals (Sweden)

    Arnoldo J Müller-Molina

    To know the map between transcription factors (TFs) and their binding sites is essential to reverse engineer the regulation process. Only about 10%-20% of the transcription factor binding motifs (TFBMs) have been reported. This lack of data hinders understanding gene regulation. To address this drawback, we propose a computational method that exploits never used TF properties to discover the missing TFBMs and their sites in all human gene promoters. The method starts by predicting a dictionary of regulatory "DNA words." From this dictionary, it distills 4098 novel predictions. To disclose the crosstalk between motifs, an additional algorithm extracts TF combinatorial binding patterns creating a collection of TF regulatory syntactic rules. Using these rules, we narrowed down a list of 504 novel motifs that appear frequently in syntax patterns. We tested the predictions against 509 known motifs confirming that our system can reliably predict ab initio motifs with an accuracy of 81%, far higher than previous approaches. We found that on average, 90% of the discovered combinatorial binding patterns target at least 10 genes, suggesting that to control in an independent manner smaller gene sets, supplementary regulatory mechanisms are required. Additionally, we discovered that the new TFBMs and their combinatorial patterns convey biological meaning, targeting TFs and genes related to developmental functions. Thus, among all the possible available targets in the genome, the TFs tend to regulate other TFs and genes involved in developmental functions. We provide a comprehensive resource for regulation analysis that includes a dictionary of "DNA words," newly predicted motifs and their corresponding combinatorial patterns. Combinatorial patterns are a useful filter to discover TFBMs that play a major role in orchestrating other factors and thus, are likely to lock/unlock cellular functional clusters.

  18. Human telomeric DNA: G-quadruplex, i-motif and Watson–Crick double helix

    Science.gov (United States)

    Phan, Anh Tuân; Mergny, Jean-Louis

    2002-01-01

    Human telomeric DNA composed of (TTAGGG/CCCTAA)n repeats may form a classical Watson–Crick double helix. Each individual strand is also prone to quadruplex formation: the G-rich strand may adopt a G-quadruplex conformation involving G-quartets whereas the C-rich strand may fold into an i-motif based on intercalated C·C+ base pairs. Using an equimolar mixture of the telomeric oligonucleotides d[AGGG(TTAGGG)3] and d[(CCCTAA)3CCCT], we defined which structures existed and which would be the predominant species under a variety of experimental conditions. Under near-physiological conditions of pH, temperature and salt concentration, telomeric DNA was predominantly in a double-helix form. However, at lower pH values or higher temperatures, the G-quadruplex and/or the i-motif efficiently competed with the duplex. We also present kinetic and thermodynamic data for duplex association and for G-quadruplex/i-motif unfolding. PMID:12409451
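
    The two oligonucleotides named in this abstract are exact reverse complements of each other, which is what allows the equimolar mixture to form a Watson–Crick duplex in competition with the G-quadruplex and i-motif folds. A short check, offered only as an illustration of the sequence relationship (the variable and function names are not from the paper):

```python
# Translation table for complementing upper-case DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences spelled out from the abstract's notation.
g_rich_strand = "AGGG" + "TTAGGG" * 3   # d[AGGG(TTAGGG)3]
c_rich_strand = "CCCTAA" * 3 + "CCCT"   # d[(CCCTAA)3CCCT]

if __name__ == "__main__":
    print(len(g_rich_strand), len(c_rich_strand))               # 22 22
    print(reverse_complement("TTAGGG"))                         # CCCTAA
    print(reverse_complement(g_rich_strand) == c_rich_strand)   # True
```

    The check says nothing about which structure predominates; that depends on pH, temperature and salt, as the abstract reports.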

  19. Rtt107/Esc4 binds silent chromatin and DNA repair proteins using different BRCT motifs

    Directory of Open Access Journals (Sweden)

    Jockusch Rebecca A

    2006-11-01

    Background: By screening a plasmid library for proteins that could cause silencing when targeted to the HMR locus in Saccharomyces cerevisiae, we previously reported the identification of Rtt107/Esc4 based on its ability to establish silent chromatin. In this study we aimed to determine the mechanism of Rtt107/Esc4 targeted silencing and also learn more about its biological functions. Results: Targeted silencing by Rtt107/Esc4 was dependent on the SIR genes, which encode obligatory structural and enzymatic components of yeast silent chromatin. Based on its sequence, Rtt107/Esc4 was predicted to contain six BRCT motifs. This motif, originally identified in the human breast tumor suppressor gene BRCA1, is a protein interaction domain. The targeted silencing activity of Rtt107/Esc4 resided within the C-terminal two BRCT motifs, and this region of the protein bound to Sir3 in two-hybrid tests. Deletion of RTT107/ESC4 caused sensitivity to the DNA damaging agent MMS as well as to hydroxyurea. A two-hybrid screen showed that the N-terminal BRCT motifs of Rtt107/Esc4 bound to Slx4, a protein previously shown to be involved in DNA repair and required for viability in a strain lacking the DNA helicase Sgs1. Like SLX genes, RTT107/ESC4 interacted genetically with SGS1; esc4Δ sgs1Δ mutants were viable, but exhibited a slow-growth phenotype and also a synergistic DNA repair defect. Conclusion: Rtt107/Esc4 binds to the silencing protein Sir3 and the DNA repair protein Slx4 via different BRCT motifs, thus providing a bridge linking silent chromatin to DNA repair enzymes.

  20. A single thiazole orange molecule forms an exciplex in a DNA i-motif.

    Science.gov (United States)

    Xu, Baochang; Wu, Xiangyang; Yeow, Edwin K L; Shao, Fangwei

    2014-06-18

    A fluorescent exciplex of thiazole orange (TO) is formed in a single-dye conjugated DNA i-motif. The exciplex fluorescence exhibits a large Stokes shift, high quantum yield, robust response to pH oscillation and little structural disturbance to the DNA quadruplex, which can be used to monitor the folding of high-order DNA structures.

  1. A novel human AP endonuclease with conserved zinc-finger-like motifs involved in DNA strand break responses

    OpenAIRE

    Kanno, Shin-ichiro; Kuzuoka, Hiroyuki; Sasao, Shigeru; Hong, Zehui; Lan, Li; Nakajima, Satoshi; Yasui, Akira

    2007-01-01

    DNA damage causes genome instability and cell death, but many of the cellular responses to DNA damage still remain elusive. We here report a human protein, PALF (PNK and APTX-like FHA protein), with an FHA (forkhead-associated) domain and novel zinc-finger-like CYR (cysteine–tyrosine–arginine) motifs that are involved in responses to DNA damage. We found that the CYR motif is widely distributed among DNA repair proteins of higher eukaryotes, and that PALF, as well as a Drosophila protein with...

  2. A novel human AP endonuclease with conserved zinc-finger-like motifs involved in DNA strand break responses

    Science.gov (United States)

    Kanno, Shin-ichiro; Kuzuoka, Hiroyuki; Sasao, Shigeru; Hong, Zehui; Lan, Li; Nakajima, Satoshi; Yasui, Akira

    2007-01-01

    DNA damage causes genome instability and cell death, but many of the cellular responses to DNA damage still remain elusive. We here report a human protein, PALF (PNK and APTX-like FHA protein), with an FHA (forkhead-associated) domain and novel zinc-finger-like CYR (cysteine–tyrosine–arginine) motifs that are involved in responses to DNA damage. We found that the CYR motif is widely distributed among DNA repair proteins of higher eukaryotes, and that PALF, as well as a Drosophila protein with tandem CYR motifs, has endo- and exonuclease activities against abasic site and other types of base damage. PALF accumulates rapidly at single-strand breaks in a poly(ADP-ribose) polymerase 1 (PARP1)-dependent manner in human cells. Indeed, PALF interacts directly with PARP1 and is required for its activation and for cellular resistance to methyl-methane sulfonate. PALF also interacts directly with KU86, LIGASEIV and phosphorylated XRCC4 proteins and possesses endo/exonuclease activity at protruding DNA ends. Various treatments that produce double-strand breaks induce formation of PALF foci, which fully coincide with γH2AX foci. Thus, PALF and the CYR motif may play important roles in DNA repair of higher eukaryotes. PMID:17396150

  3. Novel essential residues of Hda for interaction with DnaA in the regulatory inactivation of DnaA: unique roles for Hda AAA Box VI and VII motifs.

    Science.gov (United States)

    Nakamura, Kenta; Katayama, Tsutomu

    2010-04-01

    Escherichia coli ATP-DnaA initiates chromosomal replication. For preventing extra-initiations, a complex of ADP-Hda and the DNA-loaded replicase clamp promotes DnaA-ATP hydrolysis, yielding inactive ADP-DnaA. However, the Hda-DnaA interaction mode remains unclear except that the Hda Box VII Arg finger (Arg-153) and DnaA sensor II Arg-334 within each AAA(+) domain are crucial for the DnaA-ATP hydrolysis. Here, we demonstrate that direct and functional interaction of ADP-Hda with DnaA requires the Hda residues Ser-152, Phe-118 and Asn-122 as well as Hda Arg-153 and DnaA Arg-334. Structural analyses suggest intermolecular interactions between Hda Ser-152 and DnaA Arg-334 and between Hda Phe-118 and the DnaA Walker B motif region, in addition to an intramolecular interaction between Hda Asn-122 and Arg-153. These interactions likely sustain a specific association of ADP-Hda and DnaA, promoting DnaA-ATP hydrolysis. Consistently, ATP-DnaA and ADP-DnaA interact with the ADP-Hda-DNA-clamp complex with similar affinities. Hda Phe-118 and Asn-122 are contained in the Box VI region, and their hydrophobic and electrostatic features are basically conserved in the corresponding residues of other AAA(+) proteins, suggesting a conserved role for Box VI. These findings indicate novel interaction mechanisms for Hda-DnaA as well as a potentially fundamental mechanism in AAA(+) protein interactions.

  4. Stanniocalcin 1 binds hemin through a partially conserved heme regulatory motif

    International Nuclear Information System (INIS)

    Westberg, Johan A.; Jiang, Ji; Andersson, Leif C.

    2011-01-01

    Highlights: → Stanniocalcin 1 (STC1) binds heme through novel heme binding motif. → Central iron atom of heme and cysteine-114 of STC1 are essential for binding. → STC1 binds Fe2+ and Fe3+ heme. → STC1 peptide prevents oxidative decay of heme. -- Abstract: Hemin (iron protoporphyrin IX) is a necessary component of many proteins, functioning either as a cofactor or an intracellular messenger. Hemoproteins have diverse functions, such as transportation of gases, gas detection, chemical catalysis and electron transfer. Stanniocalcin 1 (STC1) is a protein involved in respiratory responses of the cell but whose mechanism of action is still undetermined. We examined the ability of STC1 to bind hemin in both its reduced and oxidized states and located Cys114 as the axial ligand of the central iron atom of hemin. The amino acid sequence differs from the established (Cys-Pro) heme regulatory motif (HRM) and therefore presents a novel heme binding motif (Cys-Ser). A STC1 peptide containing the heme binding sequence was able to inhibit both spontaneous and H2O2-induced decay of hemin. Binding of hemin does not affect the mitochondrial localization of STC1.

  5. Improvement of the Immunogenicity of Porcine Circovirus Type 2 DNA Vaccine by Recombinant ORF2 Gene and CpG Motifs.

    Science.gov (United States)

    Li, Jun; Shi, Jian-Li; Wu, Xiao-Yan; Fu, Fang; Yu, Jiang; Yuan, Xiao-Yuan; Peng, Zhe; Cong, Xiao-Yan; Xu, Shao-Jian; Sun, Wen-Bo; Cheng, Kai-Hui; Du, Yi-Jun; Wu, Jia-Qiang; Wang, Jin-Bao; Huang, Bao-Hua

    2015-06-01

    Adjuvants remain important for boosting immunity and improving resistance in animals. To boost the immunogenicity of a porcine circovirus type 2 (PCV2) DNA vaccine, CpG motifs were inserted. In this study, the dose effect was examined, and the immunogenicity of PCV2 DNA vaccines containing the recombinant open reading frame 2 (ORF2) gene and CpG motifs was evaluated. Three-week-old Changbai piglets were inoculated intramuscularly with 200 μg, 400 μg, or 800 μg of DNA vaccine containing either 14 or 18 CpG motifs. Average gain and rectal temperature were recorded every day during the experiments. Blood was collected from the piglets every week after vaccination to detect changes in specific antibodies, interleukin-2, and immune cells. Tissues were collected for histopathology and polymerase chain reaction. The results indicated that, compared with the control piglets, all doses of both DNA vaccines induced PCV2-specific antibodies. A cellular immunity test showed that PCV2-specific lymphocytes proliferated and that the numbers of TH, TC, and CD3+ T cells rose in the blood of the DNA-vaccine-immunized groups. No distinct pathological damage or viremia occurred in pigs inoculated with the DNA vaccines, but there was some minor pathological damage in the control group. The results demonstrated that CpG motifs as an adjuvant could boost the humoral and cellular immunity of pigs to PCV2, especially cellular immunity. Of the two DNA vaccines constructed, the one containing 18 CpG motifs was more effective. This is the first report that CpG motifs inserted as an adjuvant into the PCV2 DNA vaccine can boost immunity.

  6. Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium.

    Science.gov (United States)

    Catania, Francesco; Lynch, Michael

    2010-05-04

    In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.

  7. 14-3-3 checkpoint regulatory proteins interact specifically with DNA repair protein human exonuclease 1 (hEXO1) via a semi-conserved motif

    DEFF Research Database (Denmark)

    Andersen, Sofie Dabros; Keijzers, Guido; Rampakakis, Emmanouil

    2012-01-01

    Human exonuclease 1 (hEXO1) acts directly in diverse DNA processing events, including replication, mismatch repair (MMR), and double strand break repair (DSBR), and it was also recently described to function as damage sensor and apoptosis inducer following DNA damage. In contrast, 14-3-3 proteins...... are specifically induced by replication inhibition leading to protein ubiquitination and degradation. We demonstrate direct and robust interaction between hEXO1 and six of the seven 14-3-3 isoforms in vitro, suggestive of a novel protein interaction network between DNA repair and cell cycle control. Binding...... and most likely a second unidentified binding motif. 14-3-3 associations do not appear to directly influence hEXO1 in vitro nuclease activity or in vitro DNA replication initiation. Moreover, specific phosphorylation variants, including hEXO1 S746A, are efficiently imported to the nucleus; to associate...

  8. The Q Motif Is Involved in DNA Binding but Not ATP Binding in ChlR1 Helicase.

    Directory of Open Access Journals (Sweden)

    Hao Ding

    Helicases are molecular motors that couple the energy of ATP hydrolysis to the unwinding of structured DNA or RNA and chromatin remodeling. The conversion of energy derived from ATP hydrolysis into unwinding and remodeling is coordinated by seven sequence motifs (I, Ia, II, III, IV, V, and VI). The Q motif, consisting of nine amino acids (GFXXPXPIQ) with an invariant glutamine (Q) residue, has been identified in some, but not all helicases. Compared to the seven well-recognized conserved helicase motifs, the role of the Q motif is less acknowledged. Mutations in the human ChlR1 (DDX11) gene are associated with a unique genetic disorder known as Warsaw Breakage Syndrome, which is characterized by cellular defects in genome maintenance. To examine the roles of the Q motif in ChlR1 helicase, we performed site directed mutagenesis of glutamine to alanine at residue 23 in the Q motif of ChlR1. ChlR1 recombinant protein was overexpressed and purified from HEK293T cells. ChlR1-Q23A mutant abolished the helicase activity of ChlR1 and displayed reduced DNA binding ability. The mutant showed impaired ATPase activity but normal ATP binding. A thermal shift assay revealed that ChlR1-Q23A has a melting point value similar to ChlR1-WT. Partial proteolysis mapping demonstrated that ChlR1-WT and Q23A have a similar globular structure, although some subtle conformational differences in these two proteins are evident. Finally, we found ChlR1 exists and functions as a monomer in solution, which is different from FANCJ, in which the Q motif is involved in protein dimerization. Taken together, our results suggest that the Q motif is involved in DNA binding but not ATP binding in ChlR1 helicase.

  9. Stanniocalcin 1 binds hemin through a partially conserved heme regulatory motif

    Energy Technology Data Exchange (ETDEWEB)

    Westberg, Johan A., E-mail: johan.westberg@helsinki.fi [Department of Pathology, Haartman Institute, University of Helsinki and HUSLAB, P.O. Box 21, Haartmaninkatu 3, FI-00014 Helsinki (Finland); Jiang, Ji, E-mail: ji.jiang@helsinki.fi [Department of Pathology, Haartman Institute, University of Helsinki and HUSLAB, P.O. Box 21, Haartmaninkatu 3, FI-00014 Helsinki (Finland); Andersson, Leif C., E-mail: leif.andersson@helsinki.fi [Department of Pathology, Haartman Institute, University of Helsinki and HUSLAB, P.O. Box 21, Haartmaninkatu 3, FI-00014 Helsinki (Finland)

    2011-06-03

    Highlights: → Stanniocalcin 1 (STC1) binds heme through novel heme binding motif. → Central iron atom of heme and cysteine-114 of STC1 are essential for binding. → STC1 binds Fe2+ and Fe3+ heme. → STC1 peptide prevents oxidative decay of heme. -- Abstract: Hemin (iron protoporphyrin IX) is a necessary component of many proteins, functioning either as a cofactor or an intracellular messenger. Hemoproteins have diverse functions, such as transportation of gases, gas detection, chemical catalysis and electron transfer. Stanniocalcin 1 (STC1) is a protein involved in respiratory responses of the cell but whose mechanism of action is still undetermined. We examined the ability of STC1 to bind hemin in both its reduced and oxidized states and located Cys114 as the axial ligand of the central iron atom of hemin. The amino acid sequence differs from the established (Cys-Pro) heme regulatory motif (HRM) and therefore presents a novel heme binding motif (Cys-Ser). A STC1 peptide containing the heme binding sequence was able to inhibit both spontaneous and H2O2-induced decay of hemin. Binding of hemin does not affect the mitochondrial localization of STC1.

  10. Direct AUC optimization of regulatory motifs.

    Science.gov (United States)

    Zhu, Lin; Zhang, Hong-Bo; Huang, De-Shuang

    2017-07-15

    The discovery of transcription factor binding site (TFBS) motifs is essential for untangling the complex mechanism of genetic variation under different developmental and environmental conditions. Among the large number of computational approaches for de novo identification of TFBS motifs, discriminative motif learning (DML) methods have proven promising for harnessing the discovery power of the large amount of accumulated high-throughput binding data. However, they have to sacrifice accuracy for speed and could fail to fully utilize the information of the input sequences. We propose a novel algorithm called CDAUC for optimizing DML-learned motifs based on the area under the receiver-operating characteristic curve (AUC) criterion, which has been widely used in the literature to evaluate the significance of extracted motifs. We show that when the considered AUC loss function is optimized in a coordinate-wise manner, the cost function of each resultant sub-problem is a piece-wise constant function, whose optimal value can be found exactly and efficiently. Further, a key step of each iteration of CDAUC can be efficiently solved as a computational geometry problem. Experimental results on real-world high-throughput datasets illustrate that CDAUC outperforms competing methods for refining DML motifs, while being one order of magnitude faster. Meanwhile, preliminary results also show that CDAUC may be useful for improving the interpretability of convolutional kernels generated by the emerging deep learning approaches for predicting TF sequence specificities. CDAUC is available at: https://drive.google.com/drive/folders/0BxOW5MtIZbJjNFpCeHlBVWJHeW8 . Contact: dshuang@tongji.edu.cn. Supplementary data are available at Bioinformatics online.
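
    The piecewise-constant property mentioned in the abstract above can be illustrated with a small sketch. This is not the CDAUC implementation: it assumes a plain linear scorer over hypothetical feature vectors, and the names `auc`, `score` and `best_coordinate_value` are invented for illustration. With all other weights fixed, the AUC changes only at values of one weight where some positive/negative pair ties, so testing one point per interval between those breakpoints finds the exact optimum for that coordinate.

```python
from itertools import product


def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p, n in product(pos_scores, neg_scores):
        wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos_scores) * len(neg_scores))


def score(w, x):
    """Linear score of feature vector x under weights w (illustrative scorer)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def best_coordinate_value(w, j, pos, neg):
    """Exact 1-D search over w[j] with every other weight held fixed.

    AUC as a function of w[j] is piecewise constant: it can change only
    where a positive/negative pair has equal scores (a breakpoint).
    """
    breakpoints = set()
    for p, n in product(pos, neg):
        d = p[j] - n[j]
        if d != 0:
            # Score difference of the pair without the contribution of w[j].
            rest = score(w, p) - score(w, n) - w[j] * d
            breakpoints.add(-rest / d)
    cuts = sorted(breakpoints)

    # One candidate per interval, plus the current value so AUC never drops.
    candidates = [w[j]]
    if cuts:
        candidates.append(cuts[0] - 1.0)
        candidates.extend((a + b) / 2 for a, b in zip(cuts, cuts[1:]))
        candidates.append(cuts[-1] + 1.0)

    def auc_at(value):
        trial = list(w)
        trial[j] = value
        return auc([score(trial, x) for x in pos], [score(trial, x) for x in neg])

    best = max(candidates, key=auc_at)
    return best, auc_at(best)


# Toy data (hypothetical): two positives and two negatives, two features each.
pos = [(1.0, 1.0), (0.2, 2.0)]
neg = [(0.5, 0.0), (0.4, 1.0)]
w = [1.0, 0.0]

auc_before = auc([score(w, x) for x in pos], [score(w, x) for x in neg])
new_w1, auc_after = best_coordinate_value(w, 1, pos, neg)
print(auc_before, auc_after)
```

    The sketch enumerates all pairs, which is quadratic in the number of sequences; the abstract reports that the actual method solves this step far more efficiently.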

  11. Motif finding in DNA sequences based on skipping nonconserved positions in background Markov chains.

    Science.gov (United States)

    Zhao, Xiaoyan; Sze, Sing-Hoi

    2011-05-01

    One strategy to identify transcription factor binding sites is through motif finding in upstream DNA sequences of potentially co-regulated genes. Despite extensive efforts, none of the existing algorithms perform very well. We consider a string representation that allows arbitrary ignored positions within the nonconserved portion of single motifs, and use O(2^l) Markov chains to model the background distributions of motifs of length l while skipping these positions within each Markov chain. By focusing initially on positions that have fixed nucleotides to define core occurrences, we develop an algorithm to identify motifs of moderate lengths. We compare the performance of our algorithm to other motif finding algorithms on a few benchmark data sets, and show that significant improvement in accuracy can be obtained when the sites are sufficiently conserved within a given sample, while comparable performance is obtained when the site conservation rate is low. A software program (PosMotif) and detailed results are available online at http://faculty.cse.tamu.edu/shsze/posmotif.
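
    As a minimal sketch of the string representation described above, a motif can be written with fixed nucleotides at conserved positions and a placeholder at ignored positions. The `.` placeholder, the function names and the example sequence are assumptions made for illustration; this is not the PosMotif program and it does not model the background Markov chains.

```python
def matches(pattern, window):
    """True if every fixed position of the pattern agrees with the window.

    Positions marked '.' are ignored (nonconserved) and match any nucleotide.
    """
    return all(p == "." or p == c for p, c in zip(pattern, window))


def find_occurrences(pattern, sequence):
    """Return the 0-based start of every window that matches the pattern."""
    k = len(pattern)
    return [
        i
        for i in range(len(sequence) - k + 1)
        if matches(pattern, sequence[i:i + k])
    ]


# Hypothetical motif: conserved core AC....GT with two ignored positions.
pattern = "AC..GT"
sequence = "TTACAAGTCCACGGGTAA"
hits = find_occurrences(pattern, sequence)
print(hits)
```

    Only the fixed positions define a core occurrence here, which mirrors the idea of focusing first on positions with fixed nucleotides.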

  12. Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium

    Directory of Open Access Journals (Sweden)

    Lynch Michael

    2010-05-01

    Background: In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. Results: By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Conclusions: Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.

  13. Do motifs reflect evolved function?--No convergent evolution of genetic regulatory network subgraph topologies.

    Science.gov (United States)

    Knabe, Johannes F; Nehaniv, Chrystopher L; Schilstra, Maria J

    2008-01-01

    Methods that analyse the topological structure of networks have recently become quite popular. Whether motifs (subgraph patterns that occur more often than in randomized networks) have specific functions as elementary computational circuits has been cause for debate. As the question is difficult to resolve with currently available biological data, we approach the issue using networks that abstractly model natural genetic regulatory networks (GRNs) which are evolved to show dynamical behaviors. Specifically one group of networks was evolved to be capable of exhibiting two different behaviors ("differentiation") in contrast to a group with a single target behavior. In both groups we find motif distribution differences within the groups to be larger than differences between them, indicating that evolutionary niches (target functions) do not necessarily mold network structure uniquely. These results show that variability operators can have a stronger influence on network topologies than selection pressures, especially when many topologies can create similar dynamics. Moreover, analysis of motif functional relevance by lesioning did not suggest that motifs were of greater importance to the functioning of the network than arbitrary subgraph patterns. Only when drastically restricting network size, so that one motif corresponds to a whole functionally evolved network, was preference for particular connection patterns found. This suggests that in non-restricted, bigger networks, entanglement with the rest of the network hinders topological subgraph analysis.
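
    The definition of a network motif used above (a subgraph pattern occurring more often than in randomized networks) can be sketched as follows. The example counts feed-forward loops in a small hypothetical directed graph and compares the count with random graphs that have the same nodes and number of edges. This null model is a deliberate simplification chosen for brevity (published motif analyses often preserve node degrees), and the graph and function names are invented for illustration.

```python
import random
from itertools import permutations


def count_ffl(edges):
    """Count feed-forward loops: a->b, b->c and a->c for distinct a, b, c."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    return sum(
        1
        for a, b, c in permutations(nodes, 3)
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set
    )


def random_graph(nodes, n_edges, rng):
    """Sample n_edges distinct directed edges without self-loops."""
    possible = [(u, v) for u in nodes for v in nodes if u != v]
    return rng.sample(possible, n_edges)


def motif_z_score(edges, n_random=200, seed=0):
    """Observed count, mean count in random graphs, and the z-score."""
    rng = random.Random(seed)
    nodes = sorted({node for edge in edges for node in edge})
    n_edges = len(set(edges))
    observed = count_ffl(edges)
    counts = [count_ffl(random_graph(nodes, n_edges, rng)) for _ in range(n_random)]
    mean = sum(counts) / n_random
    var = sum((c - mean) ** 2 for c in counts) / n_random
    z = (observed - mean) / var ** 0.5 if var > 0 else float("nan")
    return observed, mean, z


# Hypothetical regulatory network with two feed-forward loops.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("B", "D")]
observed, mean_random, z = motif_z_score(edges)
print(observed, mean_random, z)
```

    A large positive z-score would mark the pattern as a motif under this null model; the abstract's point is that such over-representation need not imply a specific evolved function.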

  14. Promzea: a pipeline for discovery of co-regulatory motifs in maize and other plant species and its application to the anthocyanin and phlobaphene biosynthetic pathways and the Maize Development Atlas.

    Science.gov (United States)

    Liseron-Monfils, Christophe; Lewis, Tim; Ashlock, Daniel; McNicholas, Paul D; Fauteux, François; Strömvik, Martina; Raizada, Manish N

    2013-03-15

    The discovery of genetic networks and cis-acting DNA motifs underlying their regulation is a major objective of transcriptome studies. The recent release of the maize genome (Zea mays L.) has facilitated in silico searches for regulatory motifs. Several algorithms exist to predict cis-acting elements, but none have been adapted for maize. A benchmark data set was used to evaluate the accuracy of three motif discovery programs: BioProspector, Weeder and MEME. Analysis showed that each motif discovery tool had limited accuracy and appeared to retrieve a distinct set of motifs. Therefore, using the benchmark, statistical filters were optimized to reduce the false discovery ratio, and then remaining motifs from all programs were combined to improve motif prediction. These principles were integrated into a user-friendly pipeline for motif discovery in maize called Promzea, available at http://www.promzea.org and on the Discovery Environment of the iPlant Collaborative website. Promzea was subsequently expanded to include rice and Arabidopsis. Within Promzea, a user enters cDNA sequences or gene IDs; corresponding upstream sequences are retrieved from the maize genome. Predicted motifs are filtered, combined and ranked. Promzea searches the chosen plant genome for genes containing each candidate motif, providing the user with the gene list and corresponding gene annotations. Promzea was validated in silico using a benchmark data set: the Promzea pipeline showed a 22% increase in nucleotide sensitivity compared to the best standalone program tool, Weeder, with equivalent nucleotide specificity. Promzea was also validated by its ability to retrieve the experimentally defined binding sites of transcription factors that regulate the maize anthocyanin and phlobaphene biosynthetic pathways. Promzea predicted additional promoter motifs, and genome-wide motif searches by Promzea identified 127 non-anthocyanin/phlobaphene genes that each contained all five predicted promoter

  15. Dragon polya spotter: Predictor of poly(A) motifs within human genomic DNA sequences

    KAUST Repository

    Kalkatawi, Manal M.; Rangkuti, Farania; Schramm, Michael C.; Jankovic, Boris R.; Kamau, Allan; Chowdhary, Rajesh; Archer, John A.C.; Bajic, Vladimir B.

    2011-01-01

    . These models are trained to recognize 12 most common poly(A) motifs in human DNA. Our predictors are available as a free web-based tool accessible at http://cbrc.kaust.edu.sa/dps. Compared with other reported predictors, our models achieve higher sensitivity

  16. DnaA protein DNA-binding domain binds to Hda protein to promote inter-AAA+ domain interaction involved in regulatory inactivation of DnaA.

    Science.gov (United States)

    Keyamura, Kenji; Katayama, Tsutomu

    2011-08-19

    Chromosomal replication is initiated from the replication origin oriC in Escherichia coli by the active ATP-bound form of DnaA protein. The regulatory inactivation of DnaA (RIDA) system, a complex of the ADP-bound Hda and the DNA-loaded replicase clamp, represses extra initiations by facilitating DnaA-bound ATP hydrolysis, yielding the inactive ADP-bound form of DnaA. However, the mechanisms involved in promoting the DnaA-Hda interaction have not been determined except for the involvement of an interaction between the AAA+ domains of the two. This study revealed that DnaA Leu-422 and Pro-423 residues within DnaA domain IV, including a typical DNA-binding HTH motif, are specifically required for RIDA-dependent ATP hydrolysis in vitro and that these residues support efficient interaction with the DNA-loaded clamp·Hda complex and with Hda in vitro. Consistently, substitutions of these residues caused accumulation of ATP-bound DnaA in vivo and oriC-dependent inhibition of cell growth. Leu-422 plays a more important role in these activities than Pro-423. By contrast, neither of these residues is crucial for DNA replication from oriC, although they are highly conserved in DnaA orthologues. Structural analysis of a DnaA·Hda complex model suggested that these residues make contact with residues in the vicinity of the Hda AAA+ sensor I that participates in formation of a nucleotide-interacting surface. Together, the results show that functional DnaA-Hda interactions require a second interaction site within DnaA domain IV in addition to the AAA+ domain and suggest that these interactions are crucial for the formation of RIDA complexes that are active for DnaA-ATP hydrolysis.

  17. DnaA Protein DNA-binding Domain Binds to Hda Protein to Promote Inter-AAA+ Domain Interaction Involved in Regulatory Inactivation of DnaA*

    Science.gov (United States)

    Keyamura, Kenji; Katayama, Tsutomu

    2011-01-01

    Chromosomal replication is initiated from the replication origin oriC in Escherichia coli by the active ATP-bound form of DnaA protein. The regulatory inactivation of DnaA (RIDA) system, a complex of the ADP-bound Hda and the DNA-loaded replicase clamp, represses extra initiations by facilitating DnaA-bound ATP hydrolysis, yielding the inactive ADP-bound form of DnaA. However, the mechanisms involved in promoting the DnaA-Hda interaction have not been determined except for the involvement of an interaction between the AAA+ domains of the two. This study revealed that DnaA Leu-422 and Pro-423 residues within DnaA domain IV, including a typical DNA-binding HTH motif, are specifically required for RIDA-dependent ATP hydrolysis in vitro and that these residues support efficient interaction with the DNA-loaded clamp·Hda complex and with Hda in vitro. Consistently, substitutions of these residues caused accumulation of ATP-bound DnaA in vivo and oriC-dependent inhibition of cell growth. Leu-422 plays a more important role in these activities than Pro-423. By contrast, neither of these residues is crucial for DNA replication from oriC, although they are highly conserved in DnaA orthologues. Structural analysis of a DnaA·Hda complex model suggested that these residues make contact with residues in the vicinity of the Hda AAA+ sensor I that participates in formation of a nucleotide-interacting surface. Together, the results show that functional DnaA-Hda interactions require a second interaction site within DnaA domain IV in addition to the AAA+ domain and suggest that these interactions are crucial for the formation of RIDA complexes that are active for DnaA-ATP hydrolysis. PMID:21708944

  18. The KYxxL motif in Rad17 protein is essential for the interaction with the 9–1–1 complex

    Energy Technology Data Exchange (ETDEWEB)

    Fukumoto, Yasunori, E-mail: fukumoto@faculty.chiba-u.jp [Laboratory of Molecular Cell Biology, Graduate School of Pharmaceutical Sciences, Chiba University, Chiba 260-8675 (Japan); Ikeuchi, Masayoshi; Nakayama, Yuji [Department of Biochemistry & Molecular Biology, Kyoto Pharmaceutical University, Kyoto 607-8414 (Japan); Yamaguchi, Naoto, E-mail: nyama@faculty.chiba-u.jp [Laboratory of Molecular Cell Biology, Graduate School of Pharmaceutical Sciences, Chiba University, Chiba 260-8675 (Japan)

    2016-09-02

    The ATR-dependent DNA damage checkpoint is the major DNA damage checkpoint against UV irradiation and DNA replication stress. The Rad17–RFC and Rad9–Rad1–Hus1 (9–1–1) complexes interact with each other to contribute to ATR signaling; however, the precise regulatory mechanism of the interaction has not been established. Here, we identified a conserved sequence motif, KYxxL, in the AAA+ domain of Rad17 protein, and demonstrated that this motif is essential for the interaction with the 9–1–1 complex. We also show that UV-induced Rad17 phosphorylation is increased in the Rad17 KYxxL mutants. These data indicate that the interaction with the 9–1–1 complex is not required for Rad17 protein to be an efficient substrate for the UV-induced phosphorylation. Our data also raise the possibility that the 9–1–1 complex plays a negative regulatory role in the Rad17 phosphorylation. We also show that the nucleotide-binding activity of Rad17 is required for its nuclear localization. - Highlights: • We have identified a conserved KYxxL motif in Rad17 protein. • The KYxxL motif is crucial for the interaction with the 9–1–1 complex. • The KYxxL motif is dispensable or inhibitory for UV-induced Rad17 phosphorylation. • Nucleotide binding of Rad17 is required for its nuclear localization.

  19. Tetrahelical structural family adopted by AGCGA-rich regulatory DNA regions

    Science.gov (United States)

    Kocman, Vojč; Plavec, Janez

    2017-05-01

    Here we describe AGCGA-quadruplexes, an unexpected addition to the well-known tetrahelical families, G-quadruplexes and i-motifs, that have been a focus of intense research due to their potential biological impact in G- and C-rich DNA regions, respectively. High-resolution structures determined by solution-state nuclear magnetic resonance (NMR) spectroscopy demonstrate that AGCGA-quadruplexes comprise four 5'-AGCGA-3' tracts and are stabilized by G-A and G-C base pairs forming GAGA- and GCGC-quartets, respectively. Residues in the core of the structure are connected with edge-type loops. Sequences of alternating 5'-AGCGA-3' and 5'-GGG-3' repeats could be expected to form G-quadruplexes, but are shown herein to form AGCGA-quadruplexes instead. Unique structural features of AGCGA-quadruplexes together with lower sensitivity to cation and pH variation imply their potential biological relevance in regulatory regions of genes responsible for basic cellular processes that are related to neurological disorders, cancer and abnormalities in bone and cartilage development.

  20. Motif analysis unveils the possible co-regulation of chloroplast genes and nuclear genes encoding chloroplast proteins.

    Science.gov (United States)

    Wang, Ying; Ding, Jun; Daniell, Henry; Hu, Haiyan; Li, Xiaoman

    2012-09-01

    Chloroplasts play critical roles in land plant cells. Despite their importance and the availability of at least 200 sequenced chloroplast genomes, the number of known DNA regulatory sequences in chloroplast genomes is limited. In this paper, we designed computational methods to systematically study putative DNA regulatory sequences in intergenic regions near chloroplast genes in seven plant species and in promoter sequences of nuclear genes in Arabidopsis and rice. We found that -35/-10 elements alone cannot explain the transcriptional regulation of chloroplast genes. We also concluded that there are unlikely to be motifs shared by the intergenic sequences of most chloroplast genes, indicating that these genes are regulated differently. Finally and surprisingly, we found that five conserved motifs, each of which occurs in no more than six chloroplast intergenic sequences, are significantly shared by promoters of nuclear genes encoding chloroplast proteins. By integrating information from gene function annotation, protein subcellular localization analyses, protein-protein interaction data, and gene expression data, we further showed support for the functionality of these conserved motifs. Our study implies the existence of unknown nuclear-encoded transcription factors that regulate both chloroplast genes and nuclear genes encoding chloroplast proteins, which sheds light on the transcriptional regulation of chloroplast genes.

  1. Sequence-specific DNA binding by MYC/MAX to low-affinity non-E-box motifs.

    Directory of Open Access Journals (Sweden)

    Michael Allevato

    Full Text Available The MYC oncoprotein regulates transcription of a large fraction of the genome as an obligatory heterodimer with the transcription factor MAX. The MYC:MAX heterodimer and MAX:MAX homodimer (hereafter MYC/MAX) bind Enhancer box (E-box) DNA elements (CANNTG) and have the greatest affinity for the canonical MYC E-box (CME) CACGTG. However, MYC:MAX also recognizes E-box variants and was reported to bind DNA in a "non-specific" fashion in vitro and in vivo. Here, in order to identify potential additional non-canonical binding sites for MYC/MAX, we employed high throughput in vitro protein-binding microarrays, along with electrophoretic mobility-shift assays and bioinformatic analyses of MYC-bound genomic loci in vivo. We identified all hexameric motifs preferentially bound by MYC/MAX in vitro, which include the low-affinity non-E-box sequence AACGTT, and found that the vast majority (87%) of MYC-bound genomic sites in a human B cell line contain at least one of the top 21 motifs bound by MYC:MAX in vitro. We further show that high MYC/MAX concentrations are needed for specific binding to the low-affinity sequence AACGTT in vitro and that elevated MYC levels in vivo more markedly increase the occupancy of AACGTT sites relative to CME sites, especially at distal intergenic and intragenic loci. Hence, MYC binds diverse DNA motifs with a broad range of affinities in a sequence-specific and dose-dependent manner, suggesting that MYC overexpression has more selective effects on the tumor transcriptome than previously thought.
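    The three sequence classes named in this abstract (the generic E-box CANNTG, the canonical MYC E-box CACGTG, and the non-E-box hexamer AACGTT) can be located in a sequence with a simple pattern scan. The following is a minimal sketch, not the authors' pipeline: the function names `find_sites` and `classify` and the toy sequence are illustrative assumptions. Because CANNTG is its own reverse-complement pattern, scanning one strand is sufficient for E-boxes.

```python
import re

# Generic E-box CANNTG (N = any base). The lookahead lets overlapping hits be reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_sites(seq, motif_regex=EBOX):
    """Return (0-based position, matched hexamer) for every hit, including overlaps."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in motif_regex.finditer(seq)]

def classify(hexamer):
    """Label a hexamer using only the categories named in the abstract."""
    hexamer = hexamer.upper()
    if hexamer == "CACGTG":
        return "canonical MYC E-box (CME)"
    if hexamer == "AACGTT":
        return "low-affinity non-E-box"
    if re.fullmatch(r"CA[ACGT]{2}TG", hexamer):
        return "E-box variant"
    return "other"

# Hypothetical toy sequence, made up for illustration only.
toy = "TTCACGTGAAACGTTGGCAGCTGTT"

for pos, site in find_sites(toy):
    print(pos, site, classify(site))
print(toy.find("AACGTT"), "AACGTT", classify("AACGTT"))
```

    A real analysis would score sites with the measured affinities rather than exact string matches; the scan above only shows how the motif definitions translate into patterns.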

  2. The RXL motif of the African cassava mosaic virus Rep protein is necessary for rereplication of yeast DNA and viral infection in plants

    Energy Technology Data Exchange (ETDEWEB)

    Hipp, Katharina; Rau, Peter; Schäfer, Benjamin [Institut für Biomaterialien und biomolekulare Systeme, Abteilung für Molekularbiologie und Virologie der Pflanzen, Universität Stuttgart, Pfaffenwaldring 57, D-70550 Stuttgart (Germany); Gronenborn, Bruno [Institut des Sciences du Végétal, CNRS, 91198 Gif-sur-Yvette (France); Jeske, Holger, E-mail: holger.jeske@bio.uni-stuttgart.de [Institut für Biomaterialien und biomolekulare Systeme, Abteilung für Molekularbiologie und Virologie der Pflanzen, Universität Stuttgart, Pfaffenwaldring 57, D-70550 Stuttgart (Germany)

    2014-08-15

    Geminiviruses, single-stranded DNA plant viruses, encode a replication-initiator protein (Rep) that is indispensable for virus replication. A potential cyclin interaction motif (RXL) in the sequence of African cassava mosaic virus Rep may be an alternative link to cell cycle controls to the known interaction with plant homologs of retinoblastoma protein (pRBR). Mutation of this motif abrogated rereplication in fission yeast induced by expression of wildtype Rep suggesting that Rep interacts via its RXL motif with one or several yeast proteins. The RXL motif is essential for viral infection of Nicotiana benthamiana plants, since mutation of this motif in infectious clones prevented any symptomatic infection. The cell-cycle link (Clink) protein of a nanovirus (faba bean necrotic yellows virus) was investigated that activates the cell cycle by binding via its LXCXE motif to pRBR. Expression of wildtype Clink and a Clink mutant deficient in pRBR-binding did not trigger rereplication in fission yeast. - Highlights: • A potential cyclin interaction motif is conserved in geminivirus Rep proteins. • In ACMV Rep, this motif (RXL) is essential for rereplication of fission yeast DNA. • Mutating RXL abrogated viral infection completely in Nicotiana benthamiana. • Expression of a nanovirus Clink protein in yeast did not induce rereplication. • Plant viruses may have evolved multiple routes to exploit host DNA synthesis.

  3. The RXL motif of the African cassava mosaic virus Rep protein is necessary for rereplication of yeast DNA and viral infection in plants

    International Nuclear Information System (INIS)

    Hipp, Katharina; Rau, Peter; Schäfer, Benjamin; Gronenborn, Bruno; Jeske, Holger

    2014-01-01

    Geminiviruses, single-stranded DNA plant viruses, encode a replication-initiator protein (Rep) that is indispensable for virus replication. A potential cyclin interaction motif (RXL) in the sequence of African cassava mosaic virus Rep may be an alternative link to cell cycle controls to the known interaction with plant homologs of retinoblastoma protein (pRBR). Mutation of this motif abrogated rereplication in fission yeast induced by expression of wildtype Rep suggesting that Rep interacts via its RXL motif with one or several yeast proteins. The RXL motif is essential for viral infection of Nicotiana benthamiana plants, since mutation of this motif in infectious clones prevented any symptomatic infection. The cell-cycle link (Clink) protein of a nanovirus (faba bean necrotic yellows virus) was investigated that activates the cell cycle by binding via its LXCXE motif to pRBR. Expression of wildtype Clink and a Clink mutant deficient in pRBR-binding did not trigger rereplication in fission yeast. - Highlights: • A potential cyclin interaction motif is conserved in geminivirus Rep proteins. • In ACMV Rep, this motif (RXL) is essential for rereplication of fission yeast DNA. • Mutating RXL abrogated viral infection completely in Nicotiana benthamiana. • Expression of a nanovirus Clink protein in yeast did not induce rereplication. • Plant viruses may have evolved multiple routes to exploit host DNA synthesis

  4. Selection against spurious promoter motifs correlates with translational efficiency across bacteria

    Energy Technology Data Exchange (ETDEWEB)

    Froula, Jeffrey L.; Francino, M. Pilar

    2007-05-01

    Because binding of RNAP to misplaced sites could compromise the efficiency of transcription, natural selection for the optimization of gene expression should regulate the distribution of DNA motifs capable of RNAP-binding across the genome. Here we analyze the distribution of the -10 promoter motifs that bind the σ⁷⁰ subunit of RNAP in 42 bacterial genomes. We show that selection on these motifs operates across the genome, maintaining an over-representation of -10 motifs in regulatory sequences while eliminating them from the nonfunctional and, in most cases, from the protein coding regions. In some genomes, however, -10 sites are over-represented in the coding sequences; these sites could induce pauses effecting regulatory roles throughout the length of a transcriptional unit. For nonfunctional sequences, the extent of motif under-representation varies across genomes in a manner that broadly correlates with the number of tRNA genes, a good indicator of translational speed and growth rate. This suggests that minimizing the time invested in gene transcription is an important selective pressure against spurious binding. However, selection against spurious binding is detectable in the reduced genomes of host-restricted bacteria that grow at slow rates, indicating that components of efficiency other than speed may also be important. Minimizing the number of RNAP molecules per cell required for transcription, and the corresponding energetic expense, may be most relevant in slow growers. These results indicate that genome-level properties affecting the efficiency of transcription and translation can respond in an integrated manner to optimize gene expression. The detection of selection against promoter motifs in nonfunctional regions also implies that no sequence may evolve free of selective constraints, at least in the relatively small and unstructured genomes of bacteria.

  5. On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

    Science.gov (United States)

    Tarpine, Ryan; Istrail, Sorin

    The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.

  6. Reversible Redox Activity by Ion-pH Dually Modulated Duplex Formation of i-Motif DNA with Complementary G-DNA

    Directory of Open Access Journals (Sweden)

    Soyoung Chang

    2018-04-01

    Full Text Available The unique biological features of supramolecular DNA have led to an increasing interest in biomedical applications such as biosensors. We have developed an i-motif and G-rich DNA conjugated single-walled carbon nanotube hybrid material, which shows reversible conformational switching upon external stimuli such as pH (5 and 8) and presence of ions (Li+ and K+). We observed reversible electrochemical redox activity upon external stimuli in a quick and robust manner. Given the ease and the robustness of this method, we believe that pH- and ion-driven reversible DNA structure transformations will be utilized for future applications for developing novel biosensors.

  7. Solution structure of a DNA mimicking motif of an RNA aptamer against transcription factor AML1 Runt domain.

    Science.gov (United States)

    Nomura, Yusuke; Tanaka, Yoichiro; Fukunaga, Jun-ichi; Fujiwara, Kazuya; Chiba, Manabu; Iibuchi, Hiroaki; Tanaka, Taku; Nakamura, Yoshikazu; Kawai, Gota; Kozu, Tomoko; Sakamoto, Taiichi

    2013-12-01

    AML1/RUNX1 is an essential transcription factor involved in the differentiation of hematopoietic cells. AML1 binds to the Runt-binding double-stranded DNA element (RDE) of target genes through its N-terminal Runt domain. In a previous study, we obtained RNA aptamers against the AML1 Runt domain by systematic evolution of ligands by exponential enrichment and revealed that RNA aptamers exhibit higher affinity for the Runt domain than that for RDE and possess the 5'-GCGMGNN-3' and 5'-N'N'CCAC-3' conserved motif (M: A or C; N and N' form Watson-Crick base pairs) that is important for Runt domain binding. In this study, to understand the structural basis of recognition of the Runt domain by the aptamer motif, the solution structure of a 22-mer RNA was determined using nuclear magnetic resonance. The motif contains the AH(+)-C mismatch and base triple and adopts an unusual backbone structure. Structural analysis of the aptamer motif indicated that the aptamer binds to the Runt domain by mimicking the RDE sequence and structure. Our data should enhance the understanding of the structural basis of DNA mimicry by RNA molecules.

  8. MIRA: An R package for DNA methylation-based inference of regulatory activity.

    Science.gov (United States)

    Lawson, John T; Tomazou, Eleni M; Bock, Christoph; Sheffield, Nathan C

    2018-03-01

    DNA methylation contains information about the regulatory state of the cell. MIRA aggregates genome-scale DNA methylation data into a DNA methylation profile for independent region sets with shared biological annotation. Using this profile, MIRA infers and scores the collective regulatory activity for each region set. MIRA facilitates regulatory analysis in situations where classical regulatory assays would be difficult and allows public sources of open chromatin and protein binding regions to be leveraged for novel insight into the regulatory state of DNA methylation datasets. R package available on Bioconductor: http://bioconductor.org/packages/release/bioc/html/MIRA.html. nsheffield@virginia.edu.

  9. DndEi Exhibits Helicase Activity Essential for DNA Phosphorothioate Modification and ATPase Activity Strongly Stimulated by DNA Substrate with a GAAC/GTTC Motif.

    Science.gov (United States)

    Zheng, Tao; Jiang, Pan; Cao, Bo; Cheng, Qiuxiang; Kong, Lingxin; Zheng, Xiaoqing; Hu, Qinghai; You, Delin

    2016-01-15

    Phosphorothioate (PT) modification of DNA, in which the non-bridging oxygen of the backbone phosphate group is replaced by sulfur, is governed by the DndA-E proteins in prokaryotes. To better understand the biochemical mechanism of PT modification, functional analysis of the recently found PT-modifying enzyme DndEi, which has an additional domain compared with canonical DndE, from Riemerella anatipestifer is performed in this study. The additional domain is identified as a DNA helicase, and functional deletion of this domain in vivo leads to PT modification deficiency, indicating an essential role of helicase activity in PT modification. Subsequent analysis reveals that the additional domain has an ATPase activity. Intriguingly, the ATPase activity is strongly stimulated by DNA substrate containing a GAAC/GTTC motif (i.e. the motif at which PT modifications occur in R. anatipestifer) when the additional domain and the other domain (homologous to canonical DndE) are co-expressed as a full-length DndEi. These results reveal that PT modification is a biochemical process with DNA strand separation and intense ATP hydrolysis. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  10. STUDYING THE INFLUENCE OF THE PYRENE INTERCALATOR TINA ON THE STABILITY OF DNA i-MOTIFS

    DEFF Research Database (Denmark)

    El-Sayed, Ahmed A.; Pedersen, Erik Bjerregaard; Khaireldin, Nahid A.

    2012-01-01

    Certain cytosine-rich (C-rich) DNA sequences can fold into secondary structures as four-stranded i-motifs with hemiprotonated base pairs. Here we synthesized C-rich TINA-intercalating oligonucleotides by inserting a nonnucleotide pyrene moiety between two C-rich regions. The stability of their i-...

  11. Using hexamers to predict cis-regulatory motifs in Drosophila

    Directory of Open Access Journals (Sweden)

    Kibler Dennis

    2005-10-01

    Full Text Available Abstract Background Cis-regulatory modules (CRMs) are short stretches of DNA that help regulate gene expression in higher eukaryotes. They have been found up to 1 megabase away from the genes they regulate and can be located upstream, downstream, and even within their target genes. Due to the difficulty of finding CRMs using biological and computational techniques, even well-studied regulatory systems may contain CRMs that have not yet been discovered. Results We present a simple, efficient method (HexDiff) based only on hexamer frequencies of known CRMs and non-CRM sequence to predict novel CRMs in regulatory systems. On a data set of 16 gap and pair-rule genes containing 52 known CRMs, predictions made by HexDiff had a higher correlation with the known CRMs than several existing CRM prediction algorithms: Ahab, Cluster Buster, MSCAN, MCAST, and LWF. After combining the results of the different algorithms, 10 putative CRMs were identified and are strong candidates for future study. The hexamers used by HexDiff to distinguish between CRMs and non-CRM sequence were also analyzed and were shown to be enriched in regulatory elements. Conclusion HexDiff provides an efficient and effective means for finding new CRMs based on known CRMs, rather than known binding sites.
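    The abstract states only that HexDiff relies on hexamer frequencies in known CRMs versus non-CRM sequence; it does not give the scoring rule. The sketch below illustrates the general idea of comparing hexamer frequencies between two sequence sets and should not be read as the HexDiff algorithm itself. The function names, the pseudocount, and the toy sequences are assumptions made for illustration.

```python
from collections import Counter

def hexamer_counts(seqs, k=6):
    """Count every overlapping k-mer (k=6 gives hexamers) across a list of sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        counts.update(s[i:i + k] for i in range(len(s) - k + 1))
    return counts

def enrichment_ratios(crm_seqs, background_seqs, k=6, pseudocount=1.0):
    """Ratio of smoothed k-mer frequency in CRMs to that in background sequence.

    A pseudocount is added to every possible k-mer so that k-mers absent
    from one set do not produce division by zero.
    """
    crm = hexamer_counts(crm_seqs, k)
    bg = hexamer_counts(background_seqs, k)
    crm_total = sum(crm.values()) + pseudocount * 4 ** k
    bg_total = sum(bg.values()) + pseudocount * 4 ** k
    ratios = {}
    for h in set(crm) | set(bg):
        f_crm = (crm[h] + pseudocount) / crm_total
        f_bg = (bg[h] + pseudocount) / bg_total
        ratios[h] = f_crm / f_bg
    return ratios

# Hypothetical toy inputs, far too small for any real inference.
ratios = enrichment_ratios(["ACGTACGTAC"], ["AAAAAAAAAA"])
for hexamer, r in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(hexamer, round(r, 3))
```

    Hexamers with ratios well above 1 are over-represented in the CRM set; a window-scoring step built on such ratios would be one way to flag candidate regions.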

  12. Identifying cis-regulatory modules by combining comparative and compositional analysis of DNA.

    Science.gov (United States)

    Pierstorff, Nora; Bergman, Casey M; Wiehe, Thomas

    2006-12-01

    Predicting cis-regulatory modules (CRMs) in higher eukaryotes is a challenging computational task. Commonly used methods to predict CRMs based on the signal of transcription factor binding sites (TFBS) are limited by prior information about transcription factor specificity. More general methods that bypass the reliance on TFBS models are needed for comprehensive CRM prediction. We have developed a method to predict CRMs called CisPlusFinder that identifies high density regions of perfect local ungapped sequences (PLUSs) based on multiple species conservation. By assuming that PLUSs contain core TFBS motifs that are locally overrepresented, the method attempts to capture the expected features of CRM structure and evolution. Applied to a benchmark dataset of CRMs involved in early Drosophila development, CisPlusFinder predicts more annotated CRMs than all other methods tested. Using the REDfly database, we find that some 'false positive' predictions in the benchmark dataset correspond to recently annotated CRMs. Our work demonstrates that CRM prediction methods that combine comparative genomic data with statistical properties of DNA may achieve reasonable performance when applied genome-wide in the absence of an a priori set of known TFBS motifs. The program CisPlusFinder can be downloaded at http://jakob.genetik.uni-koeln.de/bioinformatik/people/nora/nora.html. All software is licensed under the Lesser GNU Public License (LGPL).

  13. Novel and deviant Walker A ATP-binding motifs in bacteriophage large terminase-DNA packaging proteins

    International Nuclear Information System (INIS)

    Mitchell, Michael S.; Rao, Venigalla B.

    2004-01-01

    Bacteriophage terminases constitute a very interesting class of viral-coded multifunctional ATPase 'motors' that apparently drive directional translocation of DNA into an empty viral capsid. A common Walker A motif and other conserved signatures of a critical ATPase catalytic center are identified in the N-terminal half of numerous large terminase proteins. However, several terminases, including the well-characterized λ and SPP1 terminases, seem to lack the classic Walker A in the N-terminus. Using sequence alignment approaches, we discovered the presence of deviant Walker A motifs in these and many other phage terminases. One deviation, the presence of a lysine at the beginning of P-loop, may represent a 3D equivalent of the universally conserved lysine in the Walker A GKT/S signature. This and other novel putative Walker A motifs that first came to light through this study help define the ATPase centers of phage and viral terminases as well as elicit important insights into the molecular functioning of this fundamental motif in biological systems

  14. Principal component analysis for predicting transcription-factor binding motifs from array-derived data

    Directory of Open Access Journals (Sweden)

    Vincenti Matthew P

    2005-11-01

    Full Text Available Abstract Background The responses to interleukin 1 (IL-1) in human chondrocytes constitute a complex regulatory mechanism, where multiple transcription factors interact combinatorially to transcription-factor binding motifs (TFBMs). In order to select a critical set of TFBMs from genomic DNA information and an array-derived data, an efficient algorithm to solve a combinatorial optimization problem is required. Although computational approaches based on evolutionary algorithms are commonly employed, an analytical algorithm would be useful to predict TFBMs at nearly no computational cost and evaluate varying modelling conditions. Singular value decomposition (SVD) is a powerful method to derive primary components of a given matrix. Applying SVD to a promoter matrix defined from regulatory DNA sequences, we derived a novel method to predict the critical set of TFBMs. Results The promoter matrix was defined to establish a quantitative relationship between the IL-1-driven mRNA alteration and genomic DNA sequences of the IL-1 responsive genes. The matrix was decomposed with SVD, and the effects of 8 potential TFBMs (5'-CAGGC-3', 5'-CGCCC-3', 5'-CCGCC-3', 5'-ATGGG-3', 5'-GGGAA-3', 5'-CGTCC-3', 5'-AAAGG-3', and 5'-ACCCA-3') were predicted from a pool of 512 random DNA sequences. The prediction included matches to the core binding motifs of biologically known TFBMs such as AP2, SP1, EGR1, KROX, GC-BOX, ABI4, ETF, E2F, SRF, STAT, IK-1, PPARγ, STAF, ROAZ, and NFκB, and their significance was evaluated numerically using Monte Carlo simulation and genetic algorithm. Conclusion The described SVD-based prediction is an analytical method to provide a set of potential TFBMs involved in transcriptional regulation. The results would be useful to evaluate analytically a contribution of individual DNA sequences.

  15. Design of character-based DNA barcode motif for species identification: A computational approach and its validation in fishes.

    Science.gov (United States)

    Chakraborty, Mohua; Dhar, Bishal; Ghosh, Sankar Kumar

    2017-11-01

    The DNA barcodes are generally interpreted using distance-based and character-based methods. The former uses clustering of comparable groups, based on the relative genetic distance, while the latter is based on the presence or absence of discrete nucleotide substitutions. The distance-based approach has a limitation in defining a universal species boundary across the taxa as the rate of mtDNA evolution is not constant throughout the taxa. However, character-based approach more accurately defines this using a unique set of nucleotide characters. The character-based analysis of full-length barcode has some inherent limitations, like sequencing of the full-length barcode, use of a sparse-data matrix and lack of a uniform diagnostic position for each group. A short continuous stretch of a fragment can be used to resolve the limitations. Here, we observe that a 154-bp fragment, from the transversion-rich domain of 1367 COI barcode sequences can successfully delimit species in the three most diverse orders of freshwater fishes. This fragment is used to design species-specific barcode motifs for 109 species by the character-based method, which successfully identifies the correct species using a pattern-matching program. The motifs also correctly identify geographically isolated population of the Cypriniformes species. Further, this region is validated as a species-specific mini-barcode for freshwater fishes by successful PCR amplification and sequencing of the motif (154 bp) using the designed primers. We anticipate that use of such motifs will enhance the diagnostic power of DNA barcode, and the mini-barcode approach will greatly benefit the field-based system of rapid species identification. © 2017 John Wiley & Sons Ltd.

  16. Genome-wide conserved consensus transcription factor binding motifs are hyper-methylated

    Directory of Open Access Journals (Sweden)

    Down Thomas A

    2010-09-01

    Full Text Available Abstract Background DNA methylation can regulate gene expression by modulating the interaction between DNA and proteins or protein complexes. Conserved consensus motifs exist across the human genome ("predicted transcription factor binding sites": "predicted TFBS") but the large majority of these are proven by chromatin immunoprecipitation and high throughput sequencing (ChIP-seq) not to be biological transcription factor binding sites ("empirical TFBS"). We hypothesize that DNA methylation at conserved consensus motifs prevents promiscuous or disorderly transcription factor binding. Results Using genome-wide methylation maps of the human heart and sperm, we found that all conserved consensus motifs as well as the subset of those that reside outside CpG islands have an aggregate profile of hyper-methylation. In contrast, empirical TFBS with conserved consensus motifs have a profile of hypo-methylation. 40% of empirical TFBS with conserved consensus motifs resided in CpG islands whereas only 7% of all conserved consensus motifs were in CpG islands. Finally we further identified a minority subset of TF whose profiles are either hypo-methylated or neutral at their respective conserved consensus motifs implicating that these TF may be responsible for establishing or maintaining an un-methylated DNA state, or whose binding is not regulated by DNA methylation. Conclusions Our analysis supports the hypothesis that at least for a subset of TF, empirical binding to conserved consensus motifs genome-wide may be controlled by DNA methylation.

  17. Brickworx builds recurrent RNA and DNA structural motifs into medium- and low-resolution electron-density maps

    Energy Technology Data Exchange (ETDEWEB)

    Chojnowski, Grzegorz, E-mail: gchojnowski@genesilico.pl [International Institute of Molecular and Cell Biology, Trojdena 4, 02-109 Warsaw (Poland); Waleń, Tomasz [International Institute of Molecular and Cell Biology, Trojdena 4, 02-109 Warsaw (Poland); University of Warsaw, Banacha 2, 02-097 Warsaw (Poland); Piątkowski, Paweł; Potrzebowski, Wojciech [International Institute of Molecular and Cell Biology, Trojdena 4, 02-109 Warsaw (Poland); Bujnicki, Janusz M. [International Institute of Molecular and Cell Biology, Trojdena 4, 02-109 Warsaw (Poland); Adam Mickiewicz University, Umultowska 89, 61-614 Poznan (Poland)

    2015-03-01

    A computer program that builds crystal structure models of nucleic acid molecules is presented. Brickworx is a computer program that builds crystal structure models of nucleic acid molecules using recurrent motifs including double-stranded helices. In a first step, the program searches for electron-density peaks that may correspond to phosphate groups; it may also take into account phosphate-group positions provided by the user. Subsequently, comparing the three-dimensional patterns of the P atoms with a database of nucleic acid fragments, it finds the matching positions of the double-stranded helical motifs (A-RNA or B-DNA) in the unit cell. If the target structure is RNA, the helical fragments are further extended with recurrent RNA motifs from a fragment library that contains single-stranded segments. Finally, the matched motifs are merged and refined in real space to find the most likely conformations, including a fit of the sequence to the electron-density map. The Brickworx program is available for download and as a web server at http://iimcb.genesilico.pl/brickworx.

  18. Brickworx builds recurrent RNA and DNA structural motifs into medium- and low-resolution electron-density maps

    International Nuclear Information System (INIS)

    Chojnowski, Grzegorz; Waleń, Tomasz; Piątkowski, Paweł; Potrzebowski, Wojciech; Bujnicki, Janusz M.

    2015-01-01

    A computer program that builds crystal structure models of nucleic acid molecules is presented. Brickworx is a computer program that builds crystal structure models of nucleic acid molecules using recurrent motifs including double-stranded helices. In a first step, the program searches for electron-density peaks that may correspond to phosphate groups; it may also take into account phosphate-group positions provided by the user. Subsequently, comparing the three-dimensional patterns of the P atoms with a database of nucleic acid fragments, it finds the matching positions of the double-stranded helical motifs (A-RNA or B-DNA) in the unit cell. If the target structure is RNA, the helical fragments are further extended with recurrent RNA motifs from a fragment library that contains single-stranded segments. Finally, the matched motifs are merged and refined in real space to find the most likely conformations, including a fit of the sequence to the electron-density map. The Brickworx program is available for download and as a web server at http://iimcb.genesilico.pl/brickworx

  19. A speedup technique for (l, d)-motif finding algorithms

    Directory of Open Access Journals (Sweden)

    Dinh Hieu

    2011-03-01

    Full Text Available Abstract Background The discovery of patterns in DNA, RNA, and protein sequences has led to the solution of many vital biological problems. For instance, the identification of patterns in nucleic acid sequences has resulted in the determination of open reading frames, identification of promoter elements of genes, identification of intron/exon splicing sites, identification of SH RNAs, location of RNA degradation signals, identification of alternative splicing sites, etc. In protein sequences, patterns have proven to be extremely helpful in domain identification, location of protease cleavage sites, identification of signal peptides, protein interactions, determination of protein degradation elements, identification of protein trafficking elements, etc. Motifs are important patterns that are helpful in finding transcriptional regulatory elements, transcription factor binding sites, functional genomics, drug design, etc. As a result, numerous papers have been written to solve the motif search problem. Results Three versions of the motif search problem have been proposed in the literature: Simple Motif Search (SMS), (l, d)-motif search (or Planted Motif Search (PMS)), and Edit-distance-based Motif Search (EMS). In this paper we focus on PMS. Two kinds of algorithms can be found in the literature for solving the PMS problem: exact and approximate. An exact algorithm identifies the motifs always and an approximate algorithm may fail to identify some or all of the motifs. The exact version of PMS problem has been shown to be NP-hard. Exact algorithms proposed in the literature for PMS take time that is exponential in some of the underlying parameters. In this paper we propose a generic technique that can be used to speedup PMS algorithms. Conclusions We present a speedup technique that can be used on any PMS algorithm. We have tested our speedup technique on a number of algorithms. These experimental results show that our speedup technique is indeed very...
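    For readers unfamiliar with the (l, d)-motif (PMS) problem: given a set of sequences, the task is to find every string of length l that occurs in each sequence with at most d mismatches. The sketch below is the naive exact approach that enumerates all 4^l candidates, shown only to make the problem statement concrete. It is not the speedup technique of the paper, and the function names and toy sequences are illustrative assumptions.

```python
from itertools import product

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def occurs_within(motif, seq, d):
    """True if some window of seq is within Hamming distance d of motif."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))

def planted_motifs(seqs, l, d, alphabet="ACGT"):
    """Exact brute force: test all len(alphabet)**l candidates.

    Running time is exponential in l, which is why practical exact
    PMS algorithms need pruning and speedup techniques.
    """
    found = []
    for candidate in product(alphabet, repeat=l):
        motif = "".join(candidate)
        if all(occurs_within(motif, s, d) for s in seqs):
            found.append(motif)
    return found

# Hypothetical toy instance: "ACGT" is planted with at most one mismatch per sequence.
seqs = ["TTACGTTT", "GGACCTGG", "CCAGGTCC"]
print(planted_motifs(seqs, l=4, d=1))
```

    With d=0 the same call returns an empty list for these sequences, since no 4-mer is shared exactly by all three.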

  20. Barcoded DNA-tag reporters for multiplex cis-regulatory analysis.

    Directory of Open Access Journals (Sweden)

    Jongmin Nam

    Full Text Available Cis-regulatory DNA sequences causally mediate patterns of gene expression, but efficient experimental analysis of these control systems has remained challenging. Here we develop a new version of "barcoded" DNA-tag reporters, "Nanotags", that permit simultaneous quantitative analysis of up to 130 distinct cis-regulatory modules (CRMs). The activities of these reporters are measured in single experiments by the NanoString RNA counting method and other quantitative procedures. We demonstrate the efficiency of the Nanotag method by simultaneously measuring hourly temporal activities of 126 CRMs from 46 genes in the developing sea urchin embryo, otherwise a virtually impossible task. Nanotags are also used in gene perturbation experiments to reveal cis-regulatory responses of many CRMs at once. Nanotag methodology can be applied to many research areas, ranging from gene regulatory networks to functional and evolutionary genomics.

  1. DNA watermarks in non-coding regulatory sequences

    Directory of Open Access Journals (Sweden)

    Pyka Martin

    2009-07-01

    Full Text Available Abstract Background DNA watermarks can be applied to identify the unauthorized use of genetically modified organisms. It has been shown that coding regions can be used to encrypt information into living organisms by using the DNA-Crypt algorithm. Yet, if the sequence of interest presents a non-coding DNA sequence, either the function of a resulting functional RNA molecule or a regulatory sequence, such as a promoter, could be affected. For our studies we used the small cytoplasmic RNA 1 in yeast and the lac promoter region of Escherichia coli. Findings The lac promoter was deactivated by the integrated watermark. In addition, the RNA molecules displayed altered configurations after introducing a watermark, but surprisingly were functionally intact, which has been verified by analyzing the growth characteristics of both wild type and watermarked scR1 transformed yeast cells. In a third approach we introduced a second overlapping watermark into the lac promoter, which did not affect the promoter activity. Conclusion Even though the watermarked RNA and one of the watermarked promoters did not show any significant differences compared to the wild type RNA and wild type promoter region, respectively, it cannot be generalized that other RNA molecules or regulatory sequences behave accordingly. Therefore, we do not recommend integrating watermark sequences into regulatory regions.

  2. The identification of functional motifs in temporal gene expression analysis

    Directory of Open Access Journals (Sweden)

    Michael G. Surette

    2005-01-01

    Full Text Available The identification of transcription factor binding sites is essential to the understanding of the regulation of gene expression and the reconstruction of genetic regulatory networks. The in silico identification of cis-regulatory motifs is challenging due to sequence variability and lack of sufficient data to generate consensus motifs that are of quantitative or even qualitative predictive value. To determine functional motifs in gene expression, we propose a strategy to adopt false discovery rate (FDR) and estimate motif effects to evaluate combinatorial analysis of motif candidates and temporal gene expression data. The method decreases the number of predicted motifs, which can then be confirmed by genetic analysis. To assess the method we used simulated motif/expression data to evaluate parameters. We applied this approach to experimental data for a group of iron responsive genes in Salmonella typhimurium 14028S. The method identified known and potentially new ferric-uptake regulator (Fur) binding sites. In addition, we identified uncharacterized functional motif candidates that correlated with specific patterns of expression. SAS code for the simulation and analysis of gene expression data is available from the first author upon request.
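    The abstract does not say which FDR procedure was used. As a hedged illustration of how FDR control can prune a list of motif candidates, the sketch below implements the widely used Benjamini-Hochberg step-up rule; the choice of this procedure, the function name, and the example p-values are assumptions and are not taken from the paper.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    under the Benjamini-Hochberg step-up procedure at FDR level alpha."""
    m = len(pvalues)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Largest rank k whose p-value satisfies p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for four motif candidates.
keep = benjamini_hochberg([0.001, 0.04, 0.03, 0.5], alpha=0.05)
```

    In this toy input only the first candidate survives, which mirrors the abstract's point that the method decreases the number of predicted motifs.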

  3. Crystallization and preliminary X-ray diffraction analysis of motif N from Saccharomyces cerevisiae Dbf4

    International Nuclear Information System (INIS)

    Matthews, Lindsay A.; Duong, Andrew; Prasad, Ajai A.; Duncker, Bernard P.; Guarné, Alba

    2009-01-01

    The Cdc7–Dbf4 complex plays an instrumental role in the initiation of DNA replication and is a target of replication-checkpoint responses in Saccharomyces cerevisiae. Cdc7 is a conserved serine/threonine kinase whose activity depends on association with its regulatory subunit, Dbf4. A conserved sequence near the N-terminus of Dbf4 (motif N) is necessary for the interaction of Cdc7–Dbf4 with the checkpoint kinase Rad53. To understand the role of the Cdc7–Dbf4 complex in checkpoint responses, a fragment of Saccharomyces cerevisiae Dbf4 encompassing motif N was isolated, overproduced and crystallized. A complete native data set was collected at 100 K from crystals that diffracted X-rays to 2.75 Å resolution and structure determination is currently under way.

  4. N-termini of fungal CSL transcription factors are disordered, enriched in regulatory motifs and inhibit DNA binding in fission yeast.

    Directory of Open Access Journals (Sweden)

    Martin Převorovský

    Full Text Available CSL (CBF1/RBP-Jκ/Suppressor of Hairless/LAG-1) transcription factors are the effector components of the Notch receptor signalling pathway, which is critical for metazoan development. The metazoan CSL proteins (class M) can also function in a Notch-independent manner. Recently, two novel classes of CSL proteins, designated F1 and F2, have been identified in fungi. The role of the fungal CSL proteins is unclear, because the Notch pathway is not present in fungi. In fission yeast, the Cbf11 and Cbf12 CSL paralogs play antagonistic roles in cell adhesion and the coordination of cell and nuclear division. Unusually long N-terminal extensions are typical for fungal and invertebrate CSL family members. In this study, we investigate the functional significance of these extended N-termini of CSL proteins. We identify 15 novel CSL family members from 7 fungal species and conduct bioinformatic analyses of a combined dataset containing 34 fungal and 11 metazoan CSL protein sequences. We show that the long, non-conserved N-terminal tails of fungal CSL proteins are likely disordered and enriched in phosphorylation sites and PEST motifs. In a case study of Cbf12 (class F2), we provide experimental evidence that the protein is proteolytically processed and that the N-terminus inhibits the Cbf12-dependent DNA binding activity in an electrophoretic mobility shift assay. This study provides insight into the characteristics of the long N-terminal tails of fungal CSL proteins that may be crucial for controlling DNA-binding and CSL function. We propose that the regulation of DNA binding by Cbf12 via its N-terminal region represents an important means by which fission yeast strikes a balance between the class F1 and class F2 paralog activities. This mode of regulation might be shared with other CSL-positive fungi, some of which are relevant to human disease and biotechnology.

  5. Overlapping ETS and CRE Motifs (G/CCGGAAGTGACGTCA) Preferentially Bound by GABPα and CREB Proteins

    Science.gov (United States)

    Chatterjee, Raghunath; Zhao, Jianfei; He, Ximiao; Shlyakhtenko, Andrey; Mann, Ishminder; Waterfall, Joshua J.; Meltzer, Paul; Sathyanarayana, B. K.; FitzGerald, Peter C.; Vinson, Charles

    2012-01-01

    Previously, we identified 8-bps long DNA sequences (8-mers) that localize in human proximal promoters and grouped them into known transcription factor binding sites (TFBS). We now examine split 8-mers consisting of two 4-mers separated by 1-bp to 30-bps (X4-N1-30-X4) to identify pairs of TFBS that localize in proximal promoters at a precise distance. These include two overlapping TFBS: the ETS⇔ETS motif (C/GCCGGAAGCGGAA) and the ETS⇔CRE motif (C/GCGGAAGTGACGTCAC). The nucleotides in bold are part of both TFBS. Molecular modeling shows that the ETS⇔CRE motif can be bound simultaneously by both the ETS and the B-ZIP domains without protein-protein clashes. The electrophoretic mobility shift assay (EMSA) shows that the ETS protein GABPα and the B-ZIP protein CREB preferentially bind to the ETS⇔CRE motif only when the two TFBS overlap precisely. In contrast, the ETS domain of ETV5 and CREB interfere with each other for binding the ETS⇔CRE. The 11-mer (CGGAAGTGACG), the conserved part of the ETS⇔CRE motif, occurs 226 times in the human genome and 83% are in known regulatory regions. In vivo GABPα and CREB ChIP-seq peaks identified the ETS⇔CRE as the most enriched motif occurring in promoters of genes involved in mRNA processing, cellular catabolic processes, and stress response, suggesting that a specific class of genes is regulated by this composite motif. PMID:23050235
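    The split 8-mer layout described above (two 4-mers separated by a spacer of 1 to 30 bp, written X4-N1-30-X4) can be enumerated with a simple sliding window. This is a minimal sketch of the enumeration step only, with a hypothetical function name; it does not reproduce the promoter-localization statistics of the study.

```python
from collections import Counter


def split_kmer_counts(seq, k=4, min_gap=1, max_gap=30):
    """Count (left k-mer, gap length, right k-mer) triples in seq.

    A triple is recorded for every position where two k-mers are
    separated by between min_gap and max_gap arbitrary bases.
    """
    counts = Counter()
    n = len(seq)
    for i in range(n - k + 1):
        left = seq[i:i + k]
        for gap in range(min_gap, max_gap + 1):
            j = i + k + gap          # start of the right-hand k-mer
            if j + k > n:            # right k-mer would run off the end
                break
            counts[(left, gap, seq[j:j + k])] += 1
    return counts


# Toy sequence: one AAAA / GGGG pair separated by a single base.
example = split_kmer_counts("AAAACGGGG")
```

    Keeping the gap length in the key is what lets pairs that co-occur at one precise spacing stand out from pairs that merely co-occur.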

  6. Multiple regulatory systems coordinate DNA replication with cell growth in Bacillus subtilis.

    Science.gov (United States)

    Murray, Heath; Koh, Alan

    2014-10-01

    In many bacteria the rate of DNA replication is linked with cellular physiology to ensure that genome duplication is coordinated with growth. Nutrient-mediated growth rate control of DNA replication initiation has been appreciated for decades, however the mechanism(s) that connects these cell cycle activities has eluded understanding. In order to help address this fundamental question we have investigated regulation of DNA replication in the model organism Bacillus subtilis. Contrary to the prevailing view we find that changes in DnaA protein level are not sufficient to account for nutrient-mediated growth rate control of DNA replication initiation, although this regulation does require both DnaA and the endogenous replication origin. We go on to report connections between DNA replication and several essential cellular activities required for rapid bacterial growth, including respiration, central carbon metabolism, fatty acid synthesis, phospholipid synthesis, and protein synthesis. Unexpectedly, the results indicate that multiple regulatory systems are involved in coordinating DNA replication with cell physiology, with some of the regulatory systems targeting oriC while others act in a oriC-independent manner. We propose that distinct regulatory systems are utilized to control DNA replication in response to diverse physiological and chemical changes.

  7. Ancient mtDNA genetic variants modulate mtDNA transcription and replication.

    Directory of Open Access Journals (Sweden)

    Sarit Suissa

    2009-05-01

    Full Text Available Although the functional consequences of mitochondrial DNA (mtDNA) genetic backgrounds (haplotypes, haplogroups) have been demonstrated by both disease association studies and cell culture experiments, it is not clear which of the mutations within the haplogroup carry functional implications and which are "evolutionary silent hitchhikers". We set forth to study the functionality of haplogroup-defining mutations within the mtDNA transcription/replication regulatory region by in vitro transcription, hypothesizing that haplogroup-defining mutations occurring within regulatory motifs of mtDNA could affect these processes. We thus screened >2500 complete human mtDNAs representing all major populations worldwide for natural variation in experimentally established protein binding sites and regulatory regions comprising a total of 241 bp in each mtDNA. Our screen revealed 77/241 sites showing point mutations that could be divided into non-fixed (57/77, 74%) and haplogroup/sub-haplogroup-defining changes (i.e., population fixed changes, 20/77, 26%). The variant defining Caucasian haplogroup J (C295T) increased the binding of TFAM (Electro Mobility Shift Assay) and the capacity of in vitro L-strand transcription, especially of a shorter transcript that maps immediately upstream of conserved sequence block 1 (CSB1), a region associated with RNA priming of mtDNA replication. Consistent with this finding, cybrids (i.e., cells sharing the same nuclear genetic background but differing in their mtDNA backgrounds) harboring haplogroup J mtDNA had a >2 fold increase in mtDNA copy number, as compared to cybrids containing haplogroup H, with no apparent differences in steady state levels of mtDNA-encoded transcripts. Hence, a haplogroup J regulatory region mutation affects mtDNA replication or stability, which may partially account for the phenotypic impact of this haplogroup. Our analysis thus demonstrates, for the first time, the functional impact of particular mtDNA

  8. DNA mutation motifs in the genes associated with inherited diseases.

    Directory of Open Access Journals (Sweden)

    Michal Růžička

    Full Text Available Mutations in human genes can be responsible for inherited genetic disorders and cancer. Mutations can arise due to environmental factors or spontaneously. It has been shown that certain DNA sequences are more prone to mutate. These sites are termed hotspots and exhibit a higher mutation frequency than expected by chance. In contrast, DNA sequences with lower mutation frequencies than expected by chance are termed coldspots. Mutation hotspots are usually derived from a mutation spectrum, which reflects a particular population where an effect of a common ancestor plays a role. To detect coldspots/hotspots unaffected by population bias, we analysed the presence of germline mutations obtained from HGMD database in the 5-nucleotide segments repeatedly occurring in genes associated with common inherited disorders, in particular, the PAH, LDLR, CFTR, F8, and F9 genes. Statistically significant sequences (mutational motifs) rarely associated with mutations (coldspots) and frequently associated with mutations (hotspots) exhibited characteristic sequence patterns, e.g. coldspots contained a purine tract while hotspots showed alternating purine-pyrimidine bases, often with the presence of a CpG dinucleotide. Using molecular dynamics simulations and free energy calculations, we analysed the global bending properties of two selected coldspots and two hotspots with a G/T mismatch. We observed that the coldspots were inherently more flexible than the hotspots. We assume that this property might be critical for effective mismatch repair as DNA with a mutation recognized by MutSα protein is noticeably bent.

  9. Global MYCN transcription factor binding analysis in neuroblastoma reveals association with distinct E-box motifs and regions of DNA hypermethylation.

    LENUS (Irish Health Repository)

    Murphy, Derek M

    2009-01-01

    BACKGROUND: Neuroblastoma, a cancer derived from precursor cells of the sympathetic nervous system, is a major cause of childhood cancer related deaths. The single most important prognostic indicator of poor clinical outcome in this disease is genomic amplification of MYCN, a member of a family of oncogenic transcription factors. METHODOLOGY: We applied MYCN chromatin immunoprecipitation to microarrays (ChIP-chip) using MYCN amplified/non-amplified cell lines as well as a conditional knockdown cell line to determine the distribution of MYCN binding sites within all annotated promoter regions. CONCLUSION: Assessment of E-box usage within consistently positive MYCN binding sites revealed a predominance for the CATGTG motif (p<0.0016), with significant enrichment of additional motifs CATTTG, CATCTG, CAACTG in the MYCN amplified state. For cell lines over-expressing MYCN, gene ontology analysis revealed enrichment for the binding of MYCN at promoter regions of numerous molecular functional groups including DNA helicases and mRNA transcriptional regulation. In order to evaluate MYCN binding with respect to other genomic features, we determined the methylation status of all annotated CpG islands and promoter sequences using methylated DNA immunoprecipitation (MeDIP). The integration of MYCN ChIP-chip and MeDIP data revealed a highly significant positive correlation between MYCN binding and DNA hypermethylation. This association was also detected in regions of hemizygous loss, indicating that the observed association occurs on the same homologue. In summary, these findings suggest that MYCN binding occurs more commonly at CATGTG as opposed to the classic CACGTG E-box motif, and that disease associated over expression of MYCN leads to aberrant binding to additional weaker affinity E-box motifs in neuroblastoma.
The co-localization of MYCN binding and DNA hypermethylation further supports the dual role of MYCN, namely that of a classical transcription factor affecting the
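    E-box usage of the kind summarized in this abstract can be tallied by scanning bound regions for the generic E-box pattern CANNTG and counting each concrete hexamer. The following is a rough sketch under that assumption; the function name and the toy sequences are hypothetical and no ChIP-chip data are reproduced.

```python
import re
from collections import Counter

# Generic E-box CANNTG; the lookahead lets overlapping sites be reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def ebox_usage(seqs):
    """Count each concrete E-box hexamer (e.g. CATGTG, CACGTG) in seqs."""
    counts = Counter()
    for s in seqs:
        counts.update(m.group(1) for m in EBOX.finditer(s.upper()))
    return counts


# Toy input standing in for sequences under binding peaks.
usage = ebox_usage(["ttCATGTGaaCACGTG", "CATGTG"])
```

    Comparing such counts between bound regions and a background set is what would allow a statement like "CATGTG predominates over CACGTG" to be tested.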

  10. Identification, occurrence, and validation of DRE and ABRE Cis-regulatory motifs in the promoter regions of genes of Arabidopsis thaliana.

    Science.gov (United States)

    Mishra, Sonal; Shukla, Aparna; Upadhyay, Swati; Sanchita; Sharma, Pooja; Singh, Seema; Phukan, Ujjal J; Meena, Abha; Khan, Feroz; Tripathi, Vineeta; Shukla, Rakesh Kumar; Shrama, Ashok

    2014-04-01

    Plants possess a complex co-regulatory network which helps them to elicit a response under diverse adverse conditions. We used an in silico approach to identify the genes with both DRE and ABRE motifs in their promoter regions in Arabidopsis thaliana. Our results showed that Arabidopsis contains a set of 2,052 genes with ABRE and DRE motifs in their promoter regions. Approximately 72% or more of the total predicted 2,052 genes had a gap distance of less than 400 bp between DRE and ABRE motifs. For positional orientation of the DRE and ABRE motifs, we found that the DR form (one in direct and the other one in reverse orientation) was more prevalent than other forms. These predicted 2,052 genes include 155 transcription factors. Using microarray data from The Arabidopsis Information Resource (TAIR) database, we present 44 transcription factors out of 155 which are upregulated by more than twofold in response to osmotic stress and ABA treatment. Fifty-one transcripts from the one predicted above were validated using semiquantitative expression analysis to support the microarray data in TAIR. Taken together, we report a set of genes containing both DRE and ABRE motifs in their promoter regions in A. thaliana, which can be useful to understand the role of ABA under osmotic stress condition. © 2013 Institute of Botany, Chinese Academy of Sciences.

  11. Biomimetic trapping cocktail to screen reactive metabolites: use of an amino acid and DNA motif mixture as light/heavy isotope pairs differing in mass shift.

    Science.gov (United States)

    Hosaka, Shuto; Honda, Takuto; Lee, Seon Hwa; Oe, Tomoyuki

    2018-06-01

    Candidate drugs that can be metabolically transformed into reactive electrophilic products, such as epoxides, quinones, and nitroso compounds, are of special concern because subsequent covalent binding to bio-macromolecules can cause adverse drug reactions, such as allergic reactions, hepatotoxicity, and genotoxicity. Several strategies have been reported for screening reactive metabolites, such as a covalent binding assay with radioisotope-labeled drugs and a trapping method followed by LC-MS/MS analyses. Of these, a trapping method using glutathione is the most common, especially at the early stage of drug development. However, the cysteine of glutathione is not the only nucleophilic site in vivo; lysine, histidine, arginine, and DNA bases are also nucleophilic. Indeed, the glutathione trapping method tends to overlook several types of reactive metabolites, such as aldehydes, acylglucuronides, and nitroso compounds. Here, we introduce an alternate way for screening reactive metabolites as follows: A mixture of the light and heavy isotopes of simplified amino acid motifs and a DNA motif is used as a biomimetic trapping cocktail. This mixture consists of [²H₀]/[²H₃]-1-methylguanidine (arginine motif, Δ3 Da), [²H₀]/[²H₄]-2-mercaptoethanol (cysteine motif, Δ4 Da), [²H₀]/[²H₅]-4-methylimidazole (histidine motif, Δ5 Da), [²H₀]/[²H₉]-n-butylamine (lysine motif, Δ9 Da), and [¹³C₀,¹⁵N₀]/[¹³C₁,¹⁵N₂]-2'-deoxyguanosine (DNA motif, Δ3 Da). Mass tag triggered data-dependent acquisition is used to find the characteristic doublet peaks, followed by specific identification of the light isotope peak using MS/MS. Forty-two model drugs were examined using an in vitro microsome experiment to validate the strategy.

  12. Probing structural changes of self assembled i-motif DNA

    KAUST Repository

    Lee, Iljoon; Patil, Sachin; Fhayli, Karim; Alsaiari, Shahad K.; Khashab, Niveen M.

    2015-01-01

    We report an i-motif structural probing system based on Thioflavin T (ThT) as a fluorescent sensor. This probe can discriminate the structural changes of RET and Rb i-motif sequences according to pH change.

  13. Quantification of Chemical and Mechanical Effects on the Formation of the G-Quadruplex and i-Motif in Duplex DNA.

    Science.gov (United States)

    Selvam, Sangeetha; Mandal, Shankar; Mao, Hanbin

    2017-09-05

    The formation of biologically significant tetraplex DNA species, such as G-quadruplexes and i-motifs, is affected by chemical (ions and pH) and mechanical [superhelicity (σ) and molecular crowding] factors. Because of the extremely challenging experimental conditions, the relative importance of these factors on tetraplex folding is unknown. In this work, we quantitatively evaluated the chemical and mechanical effects on the population dynamics of DNA tetraplexes in the insulin-linked polymorphic region using magneto-optical tweezers. By mechanically unfolding individual tetraplexes, we found that ions and pH have the largest effects on the formation of the G-quadruplex and i-motif, respectively. Interestingly, superhelicity has the second largest effect followed by molecular crowding conditions. While chemical effects are specific to tetraplex species, mechanical factors have generic influences. The predominant effect of chemical factors can be attributed to the fact that they directly change the stability of a specific tetraplex, whereas the mechanical factors, superhelicity in particular, reduce the stability of the competing species by changing the kinetics of the melting and annealing of the duplex DNA template in a nonspecific manner. The substantial dependence of tetraplexes on superhelicity provides strong support that DNA tetraplexes can serve as topological sensors to modulate fundamental cellular processes such as transcription.

  14. Fixing the model for transcription: the DNA moves, not the polymerase.

    Science.gov (United States)

    Papantonis, Argyris; Cook, Peter R

    2011-01-01

    The traditional model for transcription sees active polymerases tracking along their templates. An alternative (controversial) model has active enzymes immobilized in "factories." Recent evidence supports the idea that the DNA moves, not the polymerase, and points to alternative explanations of how regulatory motifs like enhancers and silencers work.

  15. Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast.

    Science.gov (United States)

    Tsai, Zing Tsung-Yeh; Shiu, Shin-Han; Tsai, Huai-Kuang

    2015-08-01

    Transcription factor (TF) binding is determined by the presence of specific sequence motifs (SM) and chromatin accessibility, where the latter is influenced by both chromatin state (CS) and DNA structure (DS) properties. Although SM, CS, and DS have been used to predict TF binding sites, a predictive model that jointly considers CS and DS has not been developed to predict either TF-specific binding or general binding properties of TFs. Using budding yeast as model, we found that machine learning classifiers trained with either CS or DS features alone perform better in predicting TF-specific binding compared to SM-based classifiers. In addition, simultaneously considering CS and DS further improves the accuracy of the TF binding predictions, indicating the highly complementary nature of these two properties. The contributions of SM, CS, and DS features to binding site predictions differ greatly between TFs, allowing TF-specific predictions and potentially reflecting different TF binding mechanisms. In addition, a "TF-agnostic" predictive model based on three DNA "intrinsic properties" (in silico predicted nucleosome occupancy, major groove geometry, and dinucleotide free energy) that can be calculated from genomic sequences alone has performance that rivals the model incorporating experiment-derived data. This intrinsic property model allows prediction of binding regions not only across TFs, but also across DNA-binding domain families with distinct structural folds. Furthermore, these predicted binding regions can help identify TF binding sites that have a significant impact on target gene expression. Because the intrinsic property model allows prediction of binding regions across DNA-binding domain families, it is TF agnostic and likely describes general binding potential of TFs. Thus, our findings suggest that it is feasible to establish a TF agnostic model for identifying functional regulatory regions in potentially any sequenced genome.

  16. Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast.

    Directory of Open Access Journals (Sweden)

    Zing Tsung-Yeh Tsai

    2015-08-01

    Full Text Available Transcription factor (TF) binding is determined by the presence of specific sequence motifs (SM) and chromatin accessibility, where the latter is influenced by both chromatin state (CS) and DNA structure (DS) properties. Although SM, CS, and DS have been used to predict TF binding sites, a predictive model that jointly considers CS and DS has not been developed to predict either TF-specific binding or general binding properties of TFs. Using budding yeast as model, we found that machine learning classifiers trained with either CS or DS features alone perform better in predicting TF-specific binding compared to SM-based classifiers. In addition, simultaneously considering CS and DS further improves the accuracy of the TF binding predictions, indicating the highly complementary nature of these two properties. The contributions of SM, CS, and DS features to binding site predictions differ greatly between TFs, allowing TF-specific predictions and potentially reflecting different TF binding mechanisms. In addition, a "TF-agnostic" predictive model based on three DNA "intrinsic properties" (in silico predicted nucleosome occupancy, major groove geometry, and dinucleotide free energy) that can be calculated from genomic sequences alone has performance that rivals the model incorporating experiment-derived data. This intrinsic property model allows prediction of binding regions not only across TFs, but also across DNA-binding domain families with distinct structural folds. Furthermore, these predicted binding regions can help identify TF binding sites that have a significant impact on target gene expression. Because the intrinsic property model allows prediction of binding regions across DNA-binding domain families, it is TF agnostic and likely describes general binding potential of TFs. Thus, our findings suggest that it is feasible to establish a TF agnostic model for identifying functional regulatory regions in potentially any sequenced genome.

  17. Multiple regulatory systems coordinate DNA replication with cell growth in Bacillus subtilis.

    Directory of Open Access Journals (Sweden)

    Heath Murray

    2014-10-01

    Full Text Available In many bacteria the rate of DNA replication is linked with cellular physiology to ensure that genome duplication is coordinated with growth. Nutrient-mediated growth rate control of DNA replication initiation has been appreciated for decades, however the mechanism(s) that connects these cell cycle activities has eluded understanding. In order to help address this fundamental question we have investigated regulation of DNA replication in the model organism Bacillus subtilis. Contrary to the prevailing view we find that changes in DnaA protein level are not sufficient to account for nutrient-mediated growth rate control of DNA replication initiation, although this regulation does require both DnaA and the endogenous replication origin. We go on to report connections between DNA replication and several essential cellular activities required for rapid bacterial growth, including respiration, central carbon metabolism, fatty acid synthesis, phospholipid synthesis, and protein synthesis. Unexpectedly, the results indicate that multiple regulatory systems are involved in coordinating DNA replication with cell physiology, with some of the regulatory systems targeting oriC while others act in a oriC-independent manner. We propose that distinct regulatory systems are utilized to control DNA replication in response to diverse physiological and chemical changes.

  18. Multiple Regulatory Systems Coordinate DNA Replication with Cell Growth in Bacillus subtilis

    Science.gov (United States)

    Murray, Heath; Koh, Alan

    2014-01-01

    In many bacteria the rate of DNA replication is linked with cellular physiology to ensure that genome duplication is coordinated with growth. Nutrient-mediated growth rate control of DNA replication initiation has been appreciated for decades, however the mechanism(s) that connects these cell cycle activities has eluded understanding. In order to help address this fundamental question we have investigated regulation of DNA replication in the model organism Bacillus subtilis. Contrary to the prevailing view we find that changes in DnaA protein level are not sufficient to account for nutrient-mediated growth rate control of DNA replication initiation, although this regulation does require both DnaA and the endogenous replication origin. We go on to report connections between DNA replication and several essential cellular activities required for rapid bacterial growth, including respiration, central carbon metabolism, fatty acid synthesis, phospholipid synthesis, and protein synthesis. Unexpectedly, the results indicate that multiple regulatory systems are involved in coordinating DNA replication with cell physiology, with some of the regulatory systems targeting oriC while others act in a oriC-independent manner. We propose that distinct regulatory systems are utilized to control DNA replication in response to diverse physiological and chemical changes. PMID:25340815

  19. The Runt domain of AML1 (RUNX1) binds a sequence-conserved RNA motif that mimics a DNA element.

    Science.gov (United States)

    Fukunaga, Junichi; Nomura, Yusuke; Tanaka, Yoichiro; Amano, Ryo; Tanaka, Taku; Nakamura, Yoshikazu; Kawai, Gota; Sakamoto, Taiichi; Kozu, Tomoko

    2013-07-01

    AML1 (RUNX1) is a key transcription factor for hematopoiesis that binds to the Runt-binding double-stranded DNA element (RDE) of target genes through its N-terminal Runt domain. Aberrations in the AML1 gene are frequently found in human leukemia. To better understand AML1 and its potential utility for diagnosis and therapy, we obtained RNA aptamers that bind specifically to the AML1 Runt domain. Enzymatic probing and NMR analyses revealed that Apt1-S, which is a truncated variant of one of the aptamers, has a CACG tetraloop and two stem regions separated by an internal loop. All the isolated aptamers were found to contain the conserved sequence motif 5'-NNCCAC-3' and 5'-GCGMGN'N'-3' (M:A or C; N and N' form Watson-Crick base pairs). The motif contains one AC mismatch and one base bulged out. Mutational analysis of Apt1-S showed that three guanines of the motif are important for Runt binding as are the three guanines of RDE, which are directly recognized by three arginine residues of the Runt domain. Mutational analyses of the Runt domain revealed that the amino acid residues used for Apt1-S binding were similar to those used for RDE binding. Furthermore, the aptamer competed with RDE for binding to the Runt domain in vitro. These results demonstrated that the Runt domain of the AML1 protein binds to the motif of the aptamer that mimics DNA. Our findings should provide new insights into RNA function and utility in both basic and applied sciences.

  20. Regulation of TCF ETS-domain transcription factors by helix-loop-helix motifs.

    Science.gov (United States)

    Stinson, Julie; Inoue, Toshiaki; Yates, Paula; Clancy, Anne; Norton, John D; Sharrocks, Andrew D

    2003-08-15

    DNA binding by the ternary complex factor (TCF) subfamily of ETS-domain transcription factors is tightly regulated by intramolecular and intermolecular interactions. The helix-loop-helix (HLH)-containing Id proteins are trans-acting negative regulators of DNA binding by the TCFs. In the TCF, SAP-2/Net/ERP, intramolecular inhibition of DNA binding is promoted by the cis-acting NID region that also contains an HLH-like motif. The NID also acts as a transcriptional repression domain. Here, we have studied the role of HLH motifs in regulating DNA binding and transcription by the TCF protein SAP-1 and how Cdk-mediated phosphorylation affects the inhibitory activity of the Id proteins towards the TCFs. We demonstrate that the NID region of SAP-1 is an autoinhibitory motif that acts to inhibit DNA binding and also functions as a transcription repression domain. This region can be functionally replaced by fusion of Id proteins to SAP-1, whereby the Id moiety then acts to repress DNA binding in cis. Phosphorylation of the Ids by cyclin-Cdk complexes results in reduction in protein-protein interactions between the Ids and TCFs and relief of their DNA-binding inhibitory activity. In revealing distinct mechanisms through which HLH motifs modulate the activity of TCFs, our results therefore provide further insight into the role of HLH motifs in regulating TCF function and how the inhibitory properties of the trans-acting Id HLH proteins are themselves regulated by phosphorylation.

  1. Characterization of Putative cis-Regulatory Elements in Genes Preferentially Expressed in Arabidopsis Male Meiocytes

    Directory of Open Access Journals (Sweden)

    Junhua Li

    2014-01-01

    Meiosis is essential for plant reproduction because it is the process during which homologous chromosome pairing, synapsis, and meiotic recombination occur. The meiotic transcriptome is difficult to investigate because of the size of meiocytes and the confines of anther lobes. The recent development of isolation techniques has enabled the characterization of transcriptional profiles in male meiocytes of Arabidopsis. Gene expression in male meiocytes shows unique features. The direct interaction of transcription factors (TFs) with DNA regulatory sequences forms the basis for the specificity of transcriptional regulation. Here, we identified putative cis-regulatory elements (CREs) associated with male meiocyte-expressed genes using in silico tools. The upstream regions (1 kb) of the top 50 genes preferentially expressed in Arabidopsis meiocytes possessed conserved motifs. These motifs are putative binding sites of TFs, some of which share common functions, such as roles in cell division. In combination with cell-type-specific analysis, our findings could be a substantial aid for the identification and experimental verification of the protein-DNA interactions for the specific TFs that drive gene expression in meiocytes.

  2. Expression of 5 S rRNA genes linked to 35 S rDNA in plants, their epigenetic modification and regulatory element divergence

    Directory of Open Access Journals (Sweden)

    Garcia Sònia

    2012-06-01

    Background: In plants, the 5 S rRNA genes usually occur as separate tandems (S-type arrangement) or, less commonly, linked to 35 S rDNA units (L-type). The activity of linked genes remains unknown so far. We studied the homogeneity and expression of 5 S genes in several species from family Asteraceae known to contain linked 35 S-5 S units. Additionally, their methylation status was determined using bisulfite sequencing. Fluorescence in situ hybridization was applied to reveal the sub-nuclear positions of rDNA arrays. Results: We found that homogenization of L-type units went to completion in most (4/6) but not all species. Two species contained major L-type and minor S-type units (termed Ls-type). The linked genes dominate 5 S rDNA expression while the separate tandems do not seem to be expressed. Members of tribe Anthemideae evolved functional variants of the polymerase III promoter in which a residing C-box element differs from the canonical angiosperm motif by as much as 30%. On this basis, a more relaxed consensus sequence of a plant C-box (5'-RGSWTGGGTG-3') is proposed. The 5 S paralogs display heavy DNA methylation, similarly to their unlinked counterparts. FISH revealed the close association of 35 S-5 S arrays with nucleolar periphery indicating that transcription of 5 S genes may occur in this territory. Conclusions: We show that the unusual linked arrangement of 5 S genes, occurring in several plant species, is fully compatible with their expression and functionality. This extraordinary 5 S gene dynamics is manifested at different levels, such as variation in intrachromosomal positions, unit structure, epigenetic modification and considerable divergence of regulatory motifs.
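
    The relaxed C-box consensus quoted in this abstract is written in IUPAC ambiguity codes (R = A or G, S = C or G, W = A or T). As a minimal sketch of how such a consensus could be searched for in a sequence, the Python below expands it into a regular expression. The helper names and the test sequence are hypothetical and are not taken from the paper.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def find_sites(consensus: str, sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for every match, overlaps included."""
    pattern = consensus_to_regex(consensus)
    # A lookahead lets the scan report matches that overlap each other.
    lookahead = re.compile(f"(?=({pattern.pattern}))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence.upper())]


C_BOX = "RGSWTGGGTG"  # relaxed plant C-box consensus quoted in the abstract

# Hypothetical test sequence, not taken from any real 5 S rDNA unit.
demo = "TTAGGATGGGTGCCGGCTTGGGTGAA"
print(find_sites(C_BOX, demo))  # [(2, 'AGGATGGGTG'), (14, 'GGCTTGGGTG')]
```

    This scans the forward strand only; a real promoter search would probably also scan the reverse complement.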

  3. The adeno-associated virus major regulatory protein Rep78-c-Jun-DNA motif complex modulates AP-1 activity

    International Nuclear Information System (INIS)

    Prasad, C. Krishna; Meyers, Craig; Zhan Dejin; You Hong; Chiriva-Internati, Maurizio; Mehta, Jawahar L.; Liu Yong; Hermonat, Paul L.

    2003-01-01

    Multiple epidemiologic studies show that adeno-associated virus (AAV) is negatively associated with cervical cancer (CX CA), a cancer which is positively associated with human papillomavirus (HPV) infection. Mechanisms for this correlation may be by Rep78's (AAV's major regulatory protein) ability to bind the HPV-16 p97 promoter DNA and inhibit transcription, to bind and interfere with the functions of the E7 oncoprotein of HPV-16, and to bind a variety of HPV-important cellular transcription factors such as Sp1 and TBP. c-Jun is another important cellular factor intimately linked to the HPV life cycle, as well as keratinocyte differentiation and skin development. Skin is the natural host tissue for both HPV and AAV. In this article it is demonstrated that Rep78 directly interacts with c-Jun, both in vitro and in vivo, as analyzed by Western blot, yeast two-hybrid cDNA, and electrophoretic mobility shift-supershift assay (EMSA supershift). Addition of anti-Rep78 antibodies inhibited the EMSA supershift. Investigating the biological implications of this interaction, Rep78 inhibited the c-Jun-dependent c-jun promoter in transient and stable chloramphenicol acetyl-transferase (CAT) assays. Rep78 also inhibited c-Jun-augmented c-jun promoter as well as the HPV-16 p97 promoter activity (also c-Jun regulated) in in vitro transcription assays in T47D nuclear extracts. Finally, the Rep78-c-Jun interaction mapped to the amino-half of Rep78. The ability of Rep78 to interact with c-Jun and down-regulate AP-1-dependent transcription suggests one more mechanism by which AAV may modulate the HPV life cycle and the carcinogenesis process.

  4. Computational analyses of synergism in small molecular network motifs.

    Directory of Open Access Journals (Sweden)

    Yili Zhang

    2014-03-01

    Cellular functions and responses to stimuli are controlled by complex regulatory networks that comprise a large diversity of molecular components and their interactions. However, achieving an intuitive understanding of the dynamical properties and responses to stimuli of these networks is hampered by their large scale and complexity. To address this issue, analyses of regulatory networks often focus on reduced models that depict distinct, reoccurring connectivity patterns referred to as motifs. Previous modeling studies have begun to characterize the dynamics of small motifs, and to describe ways in which variations in parameters affect their responses to stimuli. The present study investigates how variations in pairs of parameters affect responses in a series of ten common network motifs, identifying concurrent variations that act synergistically (or antagonistically) to alter the responses of the motifs to stimuli. Synergism (or antagonism) was quantified using degrees of nonlinear blending and additive synergism. Simulations identified concurrent variations that maximized synergism, and examined the ways in which it was affected by stimulus protocols and the architecture of a motif. Only a subset of architectures exhibited synergism following paired changes in parameters. The approach was then applied to a model describing interlocked feedback loops governing the synthesis of the CREB1 and CREB2 transcription factors. The effects of motifs on synergism for this biologically realistic model were consistent with those for the abstract models of single motifs. These results have implications for the rational design of combination drug therapies with the potential for synergistic interactions.

  5. Conserved amino acid motifs from the novel Piv/MooV family of transposases and site-specific recombinases are required for catalysis of DNA inversion by Piv.

    Science.gov (United States)

    Tobiason, D M; Buchner, J M; Thiel, W H; Gernert, K M; Karls, A C

    2001-02-01

    Piv, a site-specific invertase from Moraxella lacunata, exhibits amino acid homology with the transposases of the IS110/IS492 family of insertion elements. The functions of conserved amino acid motifs that define this novel family of both transposases and site-specific recombinases (Piv/MooV family) were examined by mutagenesis of fully conserved amino acids within each motif in Piv. All Piv mutants altered in conserved residues were defective for in vivo inversion of the M. lacunata invertible DNA segment, but competent for in vivo binding to Piv DNA recognition sequences. Although the primary amino acid sequences of the Piv/MooV recombinases do not contain a conserved DDE motif, which defines the retroviral integrase/transposase (IN/Tnps) family, the predicted secondary structural elements of Piv align well with those of the IN/Tnps for which crystal structures have been determined. Molecular modelling of Piv based on these alignments predicts that E59, conserved as either E or D in the Piv/MooV family, forms a catalytic pocket with the conserved D9 and D101 residues. Analysis of Piv E59G confirms a role for E59 in catalysis of inversion. These results suggest that Piv and the related IS110/IS492 transposases mediate DNA recombination by a common mechanism involving a catalytic DED or DDD motif.

  6. Sasquatch: predicting the impact of regulatory SNPs on transcription factor binding from cell- and tissue-specific DNase footprints

    OpenAIRE

    Schwessinger, R; Suciu, MC; McGowan, SJ; Telenius, J; Taylor, S; Higgs, DR; Hughes, JR

    2017-01-01

    In the era of genome-wide association studies (GWAS) and personalized medicine, predicting the impact of single nucleotide polymorphisms (SNPs) in regulatory elements is an important goal. Current approaches to determine the potential of regulatory SNPs depend on inadequate knowledge of cell-specific DNA binding motifs. Here, we present Sasquatch, a new computational approach that uses DNase footprint data to estimate and visualize the effects of noncoding variants on transcription factor bin...

  7. A 6-Nucleotide Regulatory Motif within the AbcR Small RNAs of Brucella abortus Mediates Host-Pathogen Interactions.

    Science.gov (United States)

    Sheehan, Lauren M; Caswell, Clayton C

    2017-06-06

    In Brucella abortus, two small RNAs (sRNAs), AbcR1 and AbcR2, are responsible for regulating transcripts encoding ABC-type transport systems. AbcR1 and AbcR2 are required for Brucella virulence, as a double chromosomal deletion of both sRNAs results in attenuation in mice. Although these sRNAs are responsible for targeting transcripts for degradation, the mechanism utilized by the AbcR sRNAs to regulate mRNA in Brucella has not been described. Here, two motifs (M1 and M2) were identified in AbcR1 and AbcR2, and complementary motif sequences were defined in AbcR-regulated transcripts. Site-directed mutagenesis of M1 or M2 or of both M1 and M2 in the sRNAs revealed transcripts to be targeted by one or both motifs. Electrophoretic mobility shift assays revealed direct, concentration-dependent binding of both AbcR sRNAs to a target mRNA sequence. These experiments genetically and biochemically characterized two indispensable motifs within the AbcR sRNAs that bind to and regulate transcripts. Additionally, cellular and animal models of infection demonstrated that only M2 in the AbcR sRNAs is required for Brucella virulence. Furthermore, one of the M2-regulated targets, BAB2_0612, was found to be critical for the virulence of B. abortus in a mouse model of infection. Although these sRNAs are highly conserved among Alphaproteobacteria, the present report displays how gene regulation mediated by the AbcR sRNAs has diverged to meet the intricate regulatory requirements of each particular organism and its unique biological niche. IMPORTANCE Small RNAs (sRNAs) are important components of bacterial regulation, allowing organisms to quickly adapt to changes in their environments. The AbcR sRNAs are highly conserved throughout the Alphaproteobacteria and negatively regulate myriad transcripts, many encoding ABC-type transport systems. In Brucella abortus, AbcR1 and AbcR2 are functionally redundant, as only a double abcR1 abcR2 (abcR1/2) deletion results in attenuation in

  8. XcisClique: analysis of regulatory bicliques

    Directory of Open Access Journals (Sweden)

    Grene Ruth

    2006-04-01

    Background: Modeling of cis-elements or regulatory motifs in promoter (upstream) regions of genes is a challenging computational problem. In this work, a set of regulatory motifs simultaneously present in the promoters of a set of genes is modeled as a biclique in a suitably defined bipartite graph. A biologically meaningful co-occurrence of multiple cis-elements in a gene promoter is assessed by the combined analysis of genomic and gene expression data. Greater statistical significance is associated with a set of genes that shares a common set of regulatory motifs, while simultaneously exhibiting highly correlated gene expression under given experimental conditions. Methods: XcisClique, the system developed in this work, is a comprehensive infrastructure that associates annotated genome and gene expression data, models known cis-elements as regular expressions, identifies maximal bicliques in a bipartite gene-motif graph, and ranks bicliques based on their computed statistical significance. Significance is a function of the probability of occurrence of those motifs in a biclique (a hypergeometric distribution), and of the new sum of absolute values statistic (SAV) that uses Spearman correlations of gene expression vectors. SAV is a statistic well-suited for this purpose as described in the discussion. Results: XcisClique identifies new motif and gene combinations that might indicate as yet unidentified involvement of sets of genes in biological functions and processes. It currently supports Arabidopsis thaliana and can be adapted to other organisms, assuming the existence of annotated genomic sequences, suitable gene expression data, and identified regulatory motifs. A subset of XcisClique functionalities, including the motif visualization component MotifSee, source code, and supplementary material are available at https://bioinformatics.cs.vt.edu/xcisclique/.
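
    The two ingredients of the significance score named in this abstract can be illustrated with a short, hedged Python sketch. The hypergeometric tail below is the standard formula. The abstract does not give the exact definition of SAV, so `sum_abs_spearman` is only an assumed reading (the sum of absolute pairwise Spearman correlations); all function names and the toy expression data are hypothetical.

```python
from itertools import combinations
from math import comb


def hypergeom_tail(population: int, successes: int, draws: int, observed: int) -> float:
    """P(X >= observed) when `draws` genes are sampled without replacement from
    `population` genes, of which `successes` carry the motif set."""
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / comb(population, draws)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    norm_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def sum_abs_spearman(expression):
    """ASSUMED reading of SAV: sum of |Spearman rho| over all gene pairs.
    `expression` maps a gene name to its expression vector."""
    return sum(
        abs(spearman(expression[a], expression[b]))
        for a, b in combinations(sorted(expression), 2)
    )


# Toy data (hypothetical): 3 genes measured under 4 conditions.
toy = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 9], "g3": [4, 3, 2, 1]}
print(hypergeom_tail(10, 4, 3, 3))
print(sum_abs_spearman(toy))
```

    Taking absolute values means that strongly anti-correlated genes contribute as much as strongly correlated ones, which is one plausible reason for a statistic of this form.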

  9. Characterization of noncoding regulatory DNA in the human genome.

    Science.gov (United States)

    Elkon, Ran; Agami, Reuven

    2017-08-08

    Genetic variants associated with common diseases are usually located in noncoding parts of the human genome. Delineation of the full repertoire of functional noncoding elements, together with efficient methods for probing their biological roles, is therefore of crucial importance. Over the past decade, DNA accessibility and various epigenetic modifications have been associated with regulatory functions. Mapping these features across the genome has enabled researchers to begin to document the full complement of putative regulatory elements. High-throughput reporter assays to probe the functions of regulatory regions have also been developed but these methods separate putative regulatory elements from the chromosome so that any effects of chromatin context and long-range regulatory interactions are lost. Definitive assignment of function(s) to putative cis-regulatory elements requires perturbation of these elements. Genome-editing technologies are now transforming our ability to perturb regulatory elements across entire genomes. Interpretation of high-throughput genetic screens that incorporate genome editors might enable the construction of an unbiased map of functional noncoding elements in the human genome.

  10. Fragile DNA Motifs Trigger Mutagenesis at Distant Chromosomal Loci in Saccharomyces cerevisiae

    Science.gov (United States)

    Saini, Natalie; Zhang, Yu; Nishida, Yuri; Sheng, Ziwei; Choudhury, Shilpa; Mieczkowski, Piotr; Lobachev, Kirill S.

    2013-01-01

    DNA sequences capable of adopting non-canonical secondary structures have been associated with gross-chromosomal rearrangements in humans and model organisms. Previously, we have shown that long inverted repeats that form hairpin and cruciform structures and triplex-forming GAA/TTC repeats induce the formation of double-strand breaks which trigger genome instability in yeast. In this study, we demonstrate that breakage at both inverted repeats and GAA/TTC repeats is augmented by defects in DNA replication. Increased fragility is associated with increased mutation levels in the reporter genes located as far as 8 kb from both sides of the repeats. The increase in mutations was dependent on the presence of inverted or GAA/TTC repeats and activity of the translesion polymerase Polζ. Mutagenesis induced by inverted repeats also required Sae2 which opens hairpin-capped breaks and initiates end resection. The amount of breakage at the repeats is an important determinant of mutations as a perfect palindromic sequence with inherently increased fragility was also found to elevate mutation rates even in replication-proficient strains. We hypothesize that the underlying mechanism for mutagenesis induced by fragile motifs involves the formation of long single-stranded regions in the broken chromosome, invasion of the undamaged sister chromatid for repair, and faulty DNA synthesis employing Polζ. These data demonstrate that repeat-mediated breaks pose a dual threat to eukaryotic genome integrity by inducing chromosomal aberrations as well as mutations in flanking genes. PMID:23785298

  11. Distance-dependent duplex DNA destabilization proximal to G-quadruplex/i-motif sequences

    Science.gov (United States)

    König, Sebastian L. B.; Huppert, Julian L.; Sigel, Roland K. O.; Evans, Amanda C.

    2013-01-01

    G-quadruplexes and i-motifs are complementary examples of non-canonical nucleic acid substructure conformations. G-quadruplex thermodynamic stability has been extensively studied for a variety of base sequences, but the degree of duplex destabilization that adjacent quadruplex structure formation can cause has yet to be fully addressed. Stable in vivo formation of these alternative nucleic acid structures is likely to be highly dependent on whether sufficient spacing exists between neighbouring duplex- and quadruplex-/i-motif-forming regions to accommodate quadruplexes or i-motifs without disrupting duplex stability. Prediction of putative G-quadruplex-forming regions is likely to be assisted by further understanding of what distance (number of base pairs) is required for duplexes to remain stable as quadruplexes or i-motifs form. Using oligonucleotide constructs derived from precedented G-quadruplexes and i-motif-forming bcl-2 P1 promoter region, initial biophysical stability studies indicate that the formation of G-quadruplex and i-motif conformations does destabilize proximal duplex regions. The undermining effect that quadruplex formation can have on duplex stability is mitigated with increased distance from the duplex region: a spacing of five base pairs or more is sufficient to maintain duplex stability proximal to predicted quadruplex/i-motif-forming regions. PMID:23771141
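
    For readers unfamiliar with how putative G-quadruplex-forming regions are commonly predicted, the sketch below uses the widely cited sequence rule of four runs of at least three guanines separated by loops of 1 to 7 nucleotides. This rule is an assumption brought in for illustration; the abstract does not state which predictor the authors used, and the test sequence is hypothetical.

```python
import re

# Common folding rule (assumed here, not quoted from the abstract):
# G{3+} loop{1-7} G{3+} loop{1-7} G{3+} loop{1-7} G{3+}
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def find_putative_g4(sequence: str) -> list[tuple[int, int, str]]:
    """Return (start, end, matched text) for non-overlapping forward-strand hits.
    Coordinates are 0-based, end-exclusive."""
    return [
        (m.start(), m.end(), m.group(0))
        for m in G4_PATTERN.finditer(sequence.upper())
    ]


# Hypothetical test sequence built from telomere-like GGGTTA repeats.
demo = "AAGGGTTAGGGTTAGGGTTAGGGCC"
print(find_putative_g4(demo))  # [(2, 23, 'GGGTTAGGGTTAGGGTTAGGG')]
```

    The i-motif-forming strand is the C-rich complement, so the same scan could be run on the reverse complement to flag candidate i-motif regions.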

  12. The Arabidopsis GAGA-Binding Factor BASIC PENTACYSTEINE6 Recruits the POLYCOMB-REPRESSIVE COMPLEX1 Component LIKE HETEROCHROMATIN PROTEIN1 to GAGA DNA Motifs.

    Science.gov (United States)

    Hecker, Andreas; Brand, Luise H; Peter, Sébastien; Simoncello, Nathalie; Kilian, Joachim; Harter, Klaus; Gaudin, Valérie; Wanke, Dierk

    2015-07-01

    Polycomb-repressive complexes (PRCs) play key roles in development by repressing a large number of genes involved in various functions. Much, however, remains to be discovered about PRC-silencing mechanisms as well as their targeting to specific genomic regions. Besides other mechanisms, GAGA-binding factors in animals can guide PRC members in a sequence-specific manner to Polycomb-responsive DNA elements. Here, we show that the Arabidopsis (Arabidopsis thaliana) GAGA-motif binding factor protein basic pentacysteine6 (BPC6) interacts with like heterochromatin protein1 (LHP1), a PRC1 component, and associates with vernalization2 (VRN2), a PRC2 component, in vivo. By using a modified DNA-protein interaction enzyme-linked immunosorbent assay, we could show that BPC6 was required and sufficient to recruit LHP1 to GAGA motif-containing DNA probes in vitro. We also found that LHP1 interacts with VRN2 and, therefore, can function as a possible scaffold between BPC6 and VRN2. The lhp1-4 bpc4 bpc6 triple mutant displayed a pleiotropic phenotype, extreme dwarfism and early flowering, which disclosed synergistic functions of LHP1 and group II plant BPC members. Transcriptome analyses supported this synergy and suggested a possible function in the concerted repression of homeotic genes, probably through histone H3 lysine-27 trimethylation. Hence, our findings suggest striking similarities between animal and plant GAGA-binding factors in the recruitment of PRC1 and PRC2 components to Polycomb-responsive DNA element-like GAGA motifs, which must have evolved through convergent evolution. © 2015 American Society of Plant Biologists. All Rights Reserved.

  13. Cations form sequence selective motifs within DNA grooves via a combination of cation-pi and ion-dipole/hydrogen bond interactions.

    Science.gov (United States)

    Stewart, Mikaela; Dunlap, Tori; Dourlain, Elizabeth; Grant, Bryce; McFail-Isom, Lori

    2013-01-01

    The fine conformational subtleties of DNA structure modulate many fundamental cellular processes including gene activation/repression, cellular division, and DNA repair. Most of these cellular processes rely on the conformational heterogeneity of specific DNA sequences. Factors including those structural characteristics inherent in the particular base sequence as well as those induced through interaction with solvent components combine to produce fine DNA structural variation including helical flexibility and conformation. Cation-pi interactions between solvent cations or their first hydration shell waters and the faces of DNA bases form sequence selectively and contribute to DNA structural heterogeneity. In this paper, we detect and characterize the binding patterns found in cation-pi interactions between solvent cations and DNA bases in a set of high resolution x-ray crystal structures. Specifically, we found that monovalent cations (Tl⁺) and the polarized first hydration shell waters of divalent cations (Mg²⁺, Ca²⁺) form cation-pi interactions with DNA bases stabilizing unstacked conformations. When these cation-pi interactions are combined with electrostatic interactions a pattern of specific binding motifs is formed within the grooves.

  14. Properties of non-coding DNA and identification of putative cis-regulatory elements in Theileria parva

    Directory of Open Access Journals (Sweden)

    Guo Xiang

    2008-12-01

    regulatory motifs in other species. These results suggest that these two motifs are likely to represent transcription factor binding sites in Theileria. Conclusion Theileria genomes are highly compact, with selection seemingly favoring short introns and intergenic regions. Three over-represented sequence motifs were independently identified in intergenic regions of both Theileria species, and the evidence suggests that at least two of them play a role in transcriptional control in T. parva. These are prime candidates for experimental validation of transcription factor binding sites in this single-celled eukaryotic parasite. Sequences similar to two of these Theileria motifs are conserved in Plasmodium hinting at the possibility of common regulatory machinery across the phylum Apicomplexa.

  15. Core regulatory network motif underlies the ocellar complex patterning in Drosophila melanogaster

    Science.gov (United States)

    Aguilar-Hidalgo, D.; Lemos, M. C.; Córdoba, A.

    2015-03-01

    During organogenesis, developmental programs governed by Gene Regulatory Networks (GRN) define the functionality, size and shape of the different constituents of living organisms. Robustness, thus, is an essential characteristic that GRNs need to fulfill in order to maintain viability and reproducibility in a species. In the present work we analyze the robustness of the patterning for the ocellar complex formation in the fly Drosophila melanogaster. We have systematically pruned the GRN that drives the development of this visual system to obtain the minimum pathway able to satisfy this pattern. We found that the mechanism underlying the patterning obeys the dynamics of a 3-node network motif with a double negative feedback loop fed by a morphogenetic gradient that triggers the inhibition in a French flag problem fashion. A Boolean modeling of the GRN confirms robustness in the patterning mechanism showing the same result for different network complexity levels. Interestingly, the network provides a steady state solution in the interocellar part of the patterning and an oscillatory regime in the ocelli. This theoretical result predicts that the ocellar pattern may underlie oscillatory dynamics in its genetic regulation.
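
    To make the idea of Boolean modeling concrete, here is a toy Python sketch of a synchronous Boolean network with a constant morphogen input `m` and two mutually repressing nodes `a` and `b`. The update rules are hypothetical and are not the authors' ocellar network. The sketch only shows how steady states and oscillatory attractors can both arise from a double negative feedback loop, depending on the starting state.

```python
def step(state):
    """One synchronous update of a hypothetical 3-node motif.
    m: morphogen input (held constant); a and b repress each other."""
    m, a, b = state
    return (m, int(m and not b), int(not a))


def attractor(state):
    """Iterate from `state` until a state repeats; return the repeating cycle.
    A cycle of length 1 is a steady state, a longer cycle is an oscillation."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return seen[seen.index(state):]


print(attractor((1, 1, 0)))  # [(1, 1, 0)]: steady state
print(attractor((1, 0, 0)))  # [(1, 0, 0), (1, 1, 1)]: period-2 oscillation
print(attractor((0, 1, 1)))  # [(0, 0, 1)]: steady state without morphogen
```

    Oscillations found under synchronous updating can be an artifact of the update scheme, so a careful analysis would also check asynchronous updates.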

  16. DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences.

    Science.gov (United States)

    Quang, Daniel; Xie, Xiaohui

    2016-06-20

    Modeling the properties and functions of DNA sequences is an important, but challenging task in the broad field of genomics. This task is particularly difficult for non-coding DNA, the vast majority of which is still poorly understood in terms of function. A powerful predictive model for the function of non-coding DNA can have enormous benefit for both basic science and translational research because over 98% of the human genome is non-coding and 93% of disease-associated variants lie in these regions. To address this need, we propose DanQ, a novel hybrid convolutional and bi-directional long short-term memory recurrent neural network framework for predicting non-coding function de novo from sequence. In the DanQ model, the convolution layer captures regulatory motifs, while the recurrent layer captures long-term dependencies between the motifs in order to learn a regulatory 'grammar' to improve predictions. DanQ improves considerably upon other models across several metrics. For some regulatory markers, DanQ can achieve over a 50% relative improvement in the area under the precision-recall curve metric compared to related models. We have made the source code available at the github repository http://github.com/uci-cbcl/DanQ. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
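
    The statement that a convolution layer "captures regulatory motifs" can be illustrated with a small pure-Python sketch: a one-hot encoded sequence is scanned with a single filter, followed by a ReLU. The kernel below is hand-written to reward the hypothetical motif TGAC; it is not a learned DanQ weight, and the recurrent (BLSTM) part of the model is omitted entirely.

```python
ALPHABET = "ACGT"


def one_hot(sequence):
    """Encode DNA as rows of 4 channels (A, C, G, T); other letters become all zeros."""
    return [
        [1.0 if base == letter else 0.0 for letter in ALPHABET]
        for base in sequence.upper()
    ]


def conv1d_scan(encoded, kernel):
    """Valid (unpadded) cross-correlation of a one-hot sequence with one
    width-k, 4-channel kernel, followed by a ReLU activation."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        total = sum(
            encoded[start + i][c] * kernel[i][c]
            for i in range(width)
            for c in range(4)
        )
        scores.append(max(0.0, total))
    return scores


# Hypothetical hand-written filter: +1 for the motif base, -1 otherwise.
# Columns are A, C, G, T; rows are motif positions.
kernel = [
    [-1, -1, -1, 1],  # T
    [-1, -1, 1, -1],  # G
    [1, -1, -1, -1],  # A
    [-1, 1, -1, -1],  # C
]

seq = "AATGACGT"
scores = conv1d_scan(one_hot(seq), kernel)
best = max(range(len(scores)), key=scores.__getitem__)
print(scores, best)  # [0.0, 0.0, 4.0, 0.0, 0.0] 2
```

    In a trained network the filter weights are learned, so each filter behaves roughly like a position weight matrix discovered from the data.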

  17. Nsite, NsiteH and NsiteM Computer Tools for Studying Transcription Regulatory Elements

    KAUST Repository

    Shahmuradov, Ilham

    2015-07-02

    Summary: Gene transcription is mostly conducted through interactions of various transcription factors and their binding sites on DNA (regulatory elements, REs). Today, we are still far from understanding the real regulatory content of promoter regions. Computer methods for identification of REs remain a widely used tool for studying and understanding transcriptional regulation mechanisms. The Nsite, NsiteH and NsiteM programs perform searches for statistically significant (non-random) motifs of known human, animal and plant one-box and composite REs in a single genomic sequence, in a pair of aligned homologous sequences and in a set of functionally related sequences, respectively.

  18. Systematic identification of cis-regulatory sequences active in mouse and human embryonic stem cells.

    Directory of Open Access Journals (Sweden)

    Marica Grskovic

    2007-08-01

    Understanding the transcriptional regulation of pluripotent cells is of fundamental interest and will greatly inform efforts aimed at directing differentiation of embryonic stem (ES) cells or reprogramming somatic cells. We first analyzed the transcriptional profiles of mouse ES cells and primordial germ cells and identified genes upregulated in pluripotent cells both in vitro and in vivo. These genes are enriched for roles in transcription, chromatin remodeling, cell cycle, and DNA repair. We developed a novel computational algorithm, CompMoby, which combines analyses of sequences both aligned and non-aligned between different genomes with a probabilistic segmentation model to systematically predict short DNA motifs that regulate gene expression. CompMoby was used to identify conserved overrepresented motifs in genes upregulated in pluripotent cells. We show that the motifs are preferentially active in undifferentiated mouse ES and embryonic germ cells in a sequence-specific manner, and that they can act as enhancers in the context of an endogenous promoter. Importantly, the activity of the motifs is conserved in human ES cells. We further show that the transcription factor NF-Y specifically binds to one of the motifs, is differentially expressed during ES cell differentiation, and is required for ES cell proliferation. This study provides novel insights into the transcriptional regulatory networks of pluripotent cells. Our results suggest that this systematic approach can be broadly applied to understanding transcriptional networks in mammalian species.

  19. Cell Type-Specific Chromatin Signatures Underline Regulatory DNA Elements in Human Induced Pluripotent Stem Cells and Somatic Cells.

    Science.gov (United States)

    Zhao, Ming-Tao; Shao, Ning-Yi; Hu, Shijun; Ma, Ning; Srinivasan, Rajini; Jahanbani, Fereshteh; Lee, Jaecheol; Zhang, Sophia L; Snyder, Michael P; Wu, Joseph C

    2017-11-10

    Regulatory DNA elements in the human genome play important roles in determining the transcriptional abundance and spatiotemporal gene expression during embryonic heart development and somatic cell reprogramming. It is not well known how chromatin marks in regulatory DNA elements are modulated to establish cell type-specific gene expression in the human heart. We aimed to decipher the cell type-specific epigenetic signatures in regulatory DNA elements and how they modulate heart-specific gene expression. We profiled genome-wide transcriptional activity and a variety of epigenetic marks in the regulatory DNA elements using massive RNA-seq (n=12) and ChIP-seq (chromatin immunoprecipitation combined with high-throughput sequencing; n=84) in human endothelial cells (CD31+CD144+), cardiac progenitor cells (Sca-1+), fibroblasts (DDR2+), and their respective induced pluripotent stem cells. We uncovered 2 classes of regulatory DNA elements: class I was identified with ubiquitous enhancer (H3K4me1) and promoter (H3K4me3) marks in all cell types, whereas class II was enriched with H3K4me1 and H3K4me3 in a cell type-specific manner. Both class I and class II regulatory elements exhibited stimulatory roles in nearby gene expression in a given cell type. However, class I promoters displayed more dominant regulatory effects on transcriptional abundance regardless of distal enhancers. Transcription factor network analysis indicated that human induced pluripotent stem cells and somatic cells from the heart selected their preferential regulatory elements to maintain cell type-specific gene expression. In addition, we validated the function of these enhancer elements in transgenic mouse embryos and human cells and identified a few enhancers that could possibly regulate the cardiac-specific gene expression. Given that a large number of genetic variants associated with human diseases are located in regulatory DNA elements, our study provides valuable resources for deciphering

  20. RNA motif search with data-driven element ordering.

    Science.gov (United States)

    Rampášek, Ladislav; Jimenez, Randi M; Lupták, Andrej; Vinař, Tomáš; Brejová, Broňa

    2016-05-18

    In this paper, we study the problem of RNA motif search in long genomic sequences. This approach uses a combination of sequence and structure constraints to uncover new distant homologs of known functional RNAs. The problem is NP-hard and is traditionally solved by backtracking algorithms. We have designed a new algorithm for RNA motif search and implemented a new motif search tool RNArobo. The tool enhances the RNAbob descriptor language, allowing insertions in helices, which enables better characterization of ribozymes and aptamers. A typical RNA motif consists of multiple elements and the running time of the algorithm is highly dependent on their ordering. By approaching the element ordering problem in a principled way, we demonstrate more than 100-fold speedup of the search for complex motifs compared to previously published tools. We have developed a new method for RNA motif search that allows for a significant speedup of the search of complex motifs that include pseudoknots. Such speed improvements are crucial at a time when the rate of DNA sequencing outpaces growth in computing. RNArobo is available at http://compbio.fmph.uniba.sk/rnarobo.

  1. Convergent evolution and mimicry of protein linear motifs in host-pathogen interactions.

    Science.gov (United States)

    Chemes, Lucía Beatriz; de Prat-Gay, Gonzalo; Sánchez, Ignacio Enrique

    2015-06-01

    Pathogen linear motif mimics are highly evolvable elements that facilitate rewiring of host protein interaction networks. Host linear motifs and pathogen mimics differ in sequence, leading to thermodynamic and structural differences in the resulting protein-protein interactions. Moreover, the functional output of a mimic depends on the motif and domain repertoire of the pathogen protein. Regulatory evolution mediated by linear motifs can be understood by measuring evolutionary rates, quantifying positive and negative selection and performing phylogenetic reconstructions of linear motif natural history. Convergent evolution of linear motif mimics is widespread among unrelated proteins from viral, prokaryotic and eukaryotic pathogens and can also take place within individual protein phylogenies. Statistics, biochemistry and laboratory models of infection link pathogen linear motifs to phenotypic traits such as tropism, virulence and oncogenicity. In vitro evolution experiments and analysis of natural sequences suggest that changes in linear motif composition underlie pathogen adaptation to a changing environment. Copyright © 2015 Elsevier Ltd. All rights reserved.

  2. Regulatory motifs for CREB-binding protein and Nfe2l2 transcription factors in the upstream enhancer of the mitochondrial uncoupling protein 1 gene.

    Science.gov (United States)

    Rim, Jong S; Kozak, Leslie P

    2002-09-13

    Thermogenesis against cold exposure in mammals occurs in brown adipose tissue (BAT) through mitochondrial uncoupling protein (UCP1). Expression of the Ucp1 gene is unique in brown adipocytes and is regulated tightly. The 5'-flanking region of the mouse Ucp1 gene contains cis-acting elements including PPRE, TRE, and four half-site cAMP-responsive elements (CRE) with BAT-specific enhancer elements. In the course of analyzing how these half-site CREs are involved in Ucp1 expression, we found that a DNA regulatory element for NF-E2 overlaps CRE2. Electrophoretic mobility shift assays and competition assays with the CRE2 element indicate that nuclear proteins from BAT, inguinal fat, and retroperitoneal fat tissue interact with the CRE2 motif (CGTCA) in a specific manner. A supershift assay using an antibody against the CRE-binding protein (CREB) shows specific affinity to the complex from CRE2 and nuclear extract of BAT. Additionally, Western blot analysis for phospho-CREB/ATF1 shows an increase in phosphorylation of CREB/ATF1 in HIB-1B cells after norepinephrine treatment. Transient transfection assay using luciferase reporter constructs also indicates that the two half-site CREs are involved in transcriptional regulation of Ucp1 in response to norepinephrine and cAMP. We also show that a second DNA regulatory element for NF-E2 is located upstream of the CRE2 region. This element, which is found in a similar location in the 5'-flanking region of the human and rodent Ucp1 genes, shows specific binding to rat and human NF-E2 by electrophoretic mobility shift assay with nuclear extracts from brown fat. Co-transfections with an Nfe2l2 expression vector and a luciferase reporter construct of the Ucp1 enhancer region provide additional evidence that Nfe2l2 is involved in the regulation of Ucp1 by cAMP-mediated signaling.

  3. A feature-based approach to modeling protein-DNA interactions.

    Directory of Open Access Journals (Sweden)

    Eilon Sharon

    Transcription factor (TF) binding to its DNA target site is a fundamental regulatory interaction. The most common model used to represent TF binding specificities is a position specific scoring matrix (PSSM), which assumes independence between binding positions. However, in many cases, this simplifying assumption does not hold. Here, we present feature motif models (FMMs), a novel probabilistic method for modeling TF-DNA interactions, based on log-linear models. Our approach uses sequence features to represent TF binding specificities, where each feature may span multiple positions. We develop the mathematical formulation of our model and devise an algorithm for learning its structural features from binding site data. We also developed a discriminative motif finder, which discovers de novo FMMs that are enriched in target sets of sequences compared to background sets. We evaluate our approach on synthetic data and on the widely used TF chromatin immunoprecipitation (ChIP) dataset of Harbison et al. We then apply our algorithm to high-throughput TF ChIP data from mouse and human, reveal sequence features that are present in the binding specificities of mouse and human TFs, and show that FMMs explain TF binding significantly better than PSSMs. Our FMM learning and motif finder software are available at http://genie.weizmann.ac.il/.
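
    To make the independence assumption mentioned above concrete, the following is a minimal sketch of a PSSM built from a handful of aligned sites. The example sites, the uniform background of 0.25, the pseudocount, and all function names are illustrative assumptions; none of them come from the paper, and this is not the FMM method itself.

```python
import math

# Hypothetical aligned binding sites (illustrative data, not from the paper).
SITES = ["TGACTCA", "TGAGTCA", "TGACTCA", "TTACTCA", "TGACTAA"]
ALPHABET = "ACGT"
BACKGROUND = 0.25  # assumed uniform background frequency


def build_pssm(sites, pseudocount=1.0):
    """Return one dict per position mapping base -> log2 odds score."""
    length = len(sites[0])
    pssm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        total = len(column) + pseudocount * len(ALPHABET)
        scores = {}
        for base in ALPHABET:
            freq = (column.count(base) + pseudocount) / total
            scores[base] = math.log2(freq / BACKGROUND)
        pssm.append(scores)
    return pssm


def score(pssm, seq):
    """Sum per-position scores; this additivity is the independence assumption."""
    return sum(col[base] for col, base in zip(pssm, seq))


def best_hit(pssm, sequence):
    """Slide the matrix along the sequence; return (offset, score) of the best window."""
    width = len(pssm)
    windows = [(i, score(pssm, sequence[i:i + width]))
               for i in range(len(sequence) - width + 1)]
    return max(windows, key=lambda hit: hit[1])
```

    Because the total score is a plain sum over columns, a model of this kind cannot reward or penalize specific combinations of bases at different positions, which is the limitation that features spanning multiple positions are meant to address.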

  4. CompariMotif: quick and easy comparisons of sequence motifs.

    Science.gov (United States)

    Edwards, Richard J; Davey, Norman E; Shields, Denis C

    2008-05-15

    CompariMotif is a novel tool for making motif-motif comparisons, identifying and describing similarities between regular expression motifs. CompariMotif can identify a number of different relationships between motifs, including exact matches, variants of degenerate motifs and complex overlapping motifs. Motif relationships are scored using shared information content, allowing the best matches to be easily identified in large comparisons. Many input and search options are available, enabling a list of motifs to be compared to itself (to identify recurring motifs) or to datasets of known motifs. CompariMotif can be run online at http://bioware.ucd.ie/ and is freely available for academic use as a set of open source Python modules under a GNU General Public License from http://bioinformatics.ucd.ie/shields/software/comparimotif/

  5. Mechanistically Distinct Pathways of Divergent Regulatory DNA Creation Contribute to Evolution of Human-Specific Genomic Regulatory Networks Driving Phenotypic Divergence of Homo sapiens.

    Science.gov (United States)

    Glinsky, Gennadi V

    2016-09-19

    Thousands of candidate human-specific regulatory sequences (HSRS) have been identified, supporting the hypothesis that unique to human phenotypes result from human-specific alterations of genomic regulatory networks. Collectively, a compendium of multiple diverse families of HSRS that are functionally and structurally divergent from Great Apes could be defined as the backbone of human-specific genomic regulatory networks. Here, the conservation patterns analysis of 18,364 candidate HSRS was carried out requiring that 100% of bases must remap during the alignments of human, chimpanzee, and bonobo sequences. A total of 5,535 candidate HSRS were identified that are: (i) highly conserved in Great Apes; (ii) evolved by the exaptation of highly conserved ancestral DNA; (iii) defined by either the acceleration of mutation rates on the human lineage or the functional divergence from non-human primates. The exaptation of highly conserved ancestral DNA pathway seems mechanistically distinct from the evolution of regulatory DNA segments driven by the species-specific expansion of transposable elements. Genome-wide proximity placement analysis of HSRS revealed that a small fraction of topologically associating domains (TADs) contain more than half of HSRS from four distinct families. TADs that are enriched for HSRS and termed rapidly evolving in humans TADs (revTADs) comprise 0.8-10.3% of 3,127 TADs in the hESC genome. RevTADs manifest distinct correlation patterns between placements of human accelerated regions, human-specific transcription factor-binding sites, and recombination rates. 
There is a significant enrichment within revTAD boundaries of hESC-enhancers, primate-specific CTCF-binding sites, human-specific RNAPII-binding sites, hCONDELs, and H3K4me3 peaks with human-specific enrichment at TSS in prefrontal cortex neurons (P sapiens is driven by the evolution of human-specific genomic regulatory networks via at least two mechanistically distinct pathways of creation of

  6. Efficient motif finding algorithms for large-alphabet inputs

    Directory of Open Access Journals (Sweden)

    Pavlovic Vladimir

    2010-10-01

    Background: We consider the problem of identifying motifs, recurring or conserved patterns, in biological sequence data sets. To solve this task, we present a new deterministic algorithm for finding patterns that are embedded as exact or inexact instances in all or most of the input strings. Results: The proposed algorithm (1) improves search efficiency compared to existing algorithms, and (2) scales well with the size of alphabet. On a synthetic planted DNA motif finding problem our algorithm is over 10× more efficient than MITRA, PMSPrune, and RISOTTO for long motifs. Improvements are orders of magnitude higher in the same setting with large alphabets. On benchmark TF-binding site problems (FNP, CRP, LexA) we observed reduction in running time of over 12×, with high detection accuracy. The algorithm was also successful in rapidly identifying protein motifs in Lipocalin, Zinc metallopeptidase, and supersecondary structure motifs for Cadherin and Immunoglobulin families. Conclusions: Our algorithm reduces computational complexity of the current motif finding algorithms and demonstrates strong running time improvements over existing exact algorithms, especially in important and difficult cases of large-alphabet sequences.

  7. Native characterization of nucleic acid motif thermodynamics via non-covalent catalysis

    Science.gov (United States)

    Wang, Chunyan; Bae, Jin H.; Zhang, David Yu

    2016-01-01

    DNA hybridization thermodynamics is critical for accurate design of oligonucleotides for biotechnology and nanotechnology applications, but parameters currently in use are inaccurately extrapolated based on limited quantitative understanding of thermal behaviours. Here, we present a method to measure the ΔG° of DNA motifs at temperatures and buffer conditions of interest, with significantly better accuracy (6- to 14-fold lower s.e.) than prior methods. The equilibrium constant of a reaction with thermodynamics closely approximating that of a desired motif is numerically calculated from directly observed reactant and product equilibrium concentrations; a DNA catalyst is designed to accelerate equilibration. We measured the ΔG° of terminal fluorophores, single-nucleotide dangles and multinucleotide dangles, in temperatures ranging from 10 to 45 °C. PMID:26782977
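
    The step from measured equilibrium concentrations to ΔG° rests on the standard relation ΔG° = -RT ln K. A minimal sketch follows, assuming a hypothetical reaction with two reactants and two products and concentrations expressed relative to a 1 M standard state; the function names and example numbers are illustrative and not taken from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def equilibrium_constant(reactants, products):
    """K as the product of product concentrations over the product of reactant concentrations."""
    return math.prod(products) / math.prod(reactants)


def delta_g_standard(k_eq, temp_celsius):
    """Standard free energy change in kcal/mol from K at the given temperature."""
    temp_kelvin = temp_celsius + 273.15
    return -R_KCAL * temp_kelvin * math.log(k_eq)


# Hypothetical equilibrium concentrations in mol/L (illustrative only).
k = equilibrium_constant(reactants=[2e-9, 2e-9], products=[8e-9, 8e-9])
dg = delta_g_standard(k, temp_celsius=25.0)
```

    With equal numbers of reactant and product species, as assumed here, the concentration units cancel in K; for other stoichiometries the choice of standard state matters.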

  8. Discovery of candidate KEN-box motifs using cell cycle keyword enrichment combined with native disorder prediction and motif conservation.

    Science.gov (United States)

    Michael, Sushama; Travé, Gilles; Ramu, Chenna; Chica, Claudia; Gibson, Toby J

    2008-02-15

    KEN-box-mediated target selection is one of the mechanisms used in the proteasomal destruction of mitotic cell cycle proteins via the APC/C complex. While annotating the Eukaryotic Linear Motif resource (ELM, http://elm.eu.org/), we found that KEN motifs were significantly enriched in human protein entries with cell cycle keywords in the UniProt/Swiss-Prot database, implying that KEN-boxes might be more common than reported. Matches to short linear motifs in protein database searches are not, per se, significant. KEN-box enrichment with cell cycle Gene Ontology terms suggests that collectively these motifs are functional but does not prove that any given instance is so. Candidates were surveyed for native disorder prediction using GlobPlot and IUPred and for motif conservation in homologues. Among >25 strong new candidates, the most notable are human HIPK2, CHFR, CDC27, Dab2, Upf2, kinesin Eg5, DNA Topoisomerase 1 and yeast Cdc5 and Swi5. A similar number of weaker candidates were present. These proteins have yet to be tested for APC/C targeted destruction, providing potential new avenues of research.

  9. A Conserved Metal Binding Motif in the Bacillus subtilis Competence Protein ComFA Enhances Transformation.

    Science.gov (United States)

    Chilton, Scott S; Falbel, Tanya G; Hromada, Susan; Burton, Briana M

    2017-08-01

    Genetic competence is a process in which cells are able to take up DNA from their environment, resulting in horizontal gene transfer, a major mechanism for generating diversity in bacteria. Many bacteria carry homologs of the central DNA uptake machinery that has been well characterized in Bacillus subtilis. It has been postulated that the B. subtilis competence helicase ComFA belongs to the DEAD box family of helicases/translocases. Here, we made a series of mutants to analyze conserved amino acid motifs in several regions of B. subtilis ComFA. First, we confirmed that ComFA activity requires amino acid residues conserved among the DEAD box helicases, and second, we show that a zinc finger-like motif consisting of four cysteines is required for efficient transformation. Each cysteine in the motif is important, and mutation of at least two of the cysteines dramatically reduces transformation efficiency. Further, combining multiple cysteine mutations with the helicase mutations shows an additive phenotype. Our results suggest that the helicase and metal binding functions are two distinct activities important for ComFA function during transformation. IMPORTANCE: ComFA is a highly conserved protein that has a role in DNA uptake during natural competence, a mechanism for horizontal gene transfer observed in many bacteria. Investigation of the details of the DNA uptake mechanism is important for understanding the ways in which bacteria gain new traits from their environment, such as drug resistance. To dissect the role of ComFA in the DNA uptake machinery, we introduced point mutations into several motifs in the protein sequence. We demonstrate that several amino acid motifs conserved among ComFA proteins are important for efficient transformation. This report is the first to demonstrate the functional requirement of an amino-terminal cysteine motif in ComFA. Copyright © 2017 American Society for Microbiology.

  10. Improved i-motif thermal stability by insertion of anthraquinone monomers

    DEFF Research Database (Denmark)

    Gouda, Alaa S; Amine, Mahasen S.; Pedersen, Erik Bjerregaard

    2017-01-01

    In order to gain insight into how to improve thermal stability of i-motifs when used in the context of biomedical and nanotechnological applications, novel anthraquinone-modified i-motifs were synthesized by insertion of 1,8-, 1,4-, 1,5- and 2,6-disubstituted anthraquinone monomers into the TAA...... loops of a 22mer cytosine-rich human telomeric DNA sequence. The influence of the four anthraquinone linkers on the i-motif thermal stability was investigated at 295 nm and pH 5.5. Anthraquinone monomers modulate the i-motif stability in a position-depending manner and the modulation also depends...... unlocked nucleic acid monomers or twisted intercalating nucleic acid. The 2,6-disubstituted anthraquinone linker replacing T10 enabled a significant increase of i-motif thermal melting by 8.2 °C. A substantial increase of 5.0 °C in i-motif thermal melting was recorded when both A6 and T16 were modified...

  11. Identification of a cis-regulatory element by transient analysis of co-ordinately regulated genes

    Directory of Open Access Journals (Sweden)

    Allan Andrew C

    2008-07-01

    Background: Transcription factors (TFs) co-ordinately regulate target genes that are dispersed throughout the genome. This co-ordinate regulation is achieved, in part, through the interaction of transcription factors with conserved cis-regulatory motifs that are in close proximity to the target genes. While much is known about the families of transcription factors that regulate gene expression in plants, there are few well characterised cis-regulatory motifs. In Arabidopsis, over-expression of the MYB transcription factor PAP1 (PRODUCTION OF ANTHOCYANIN PIGMENT 1) leads to transgenic plants with elevated anthocyanin levels due to the co-ordinated up-regulation of genes in the anthocyanin biosynthetic pathway. In addition to the anthocyanin biosynthetic genes, there are a number of un-associated genes that also change in expression level. This may be a direct or indirect consequence of the over-expression of PAP1. Results: Oligo array analysis of PAP1 over-expression Arabidopsis plants identified genes co-ordinately up-regulated in response to the elevated expression of this transcription factor. Transient assays on the promoter regions of 33 of these up-regulated genes identified eight promoter fragments that were transactivated by PAP1. Bioinformatic analysis on these promoters revealed a common cis-regulatory motif that we showed is required for PAP1 dependent transactivation. Conclusion: Co-ordinated gene regulation by individual transcription factors is a complex collection of both direct and indirect effects. Transient transactivation assays provide a rapid method to distinguish direct target genes from indirect target genes. Bioinformatic analysis of the promoters of these direct target genes is able to locate motifs that are common to this sub-set of promoters, which is impossible to identify with the larger set of direct and indirect target genes. While this type of analysis does not prove a direct interaction between protein and DNA

  12. Novel structural features drive DNA binding properties of Cmr, a CRP family protein in TB complex mycobacteria.

    Science.gov (United States)

    Ranganathan, Sridevi; Cheung, Jonah; Cassidy, Michael; Ginter, Christopher; Pata, Janice D; McDonough, Kathleen A

    2018-01-09

    Mycobacterium tuberculosis (Mtb) encodes two CRP/FNR family transcription factors (TF) that contribute to virulence, Cmr (Rv1675c) and CRPMt (Rv3676). Prior studies identified distinct chromosomal binding profiles for each TF despite their recognizing overlapping DNA motifs. The present study shows that Cmr binding specificity is determined by discriminator nucleotides at motif positions 4 and 13. X-ray crystallography and targeted mutational analyses identified an arginine-rich loop that expands Cmr's DNA interactions beyond the classical helix-turn-helix contacts common to all CRP/FNR family members and facilitates binding to imperfect DNA sequences. Cmr binding to DNA results in a pronounced asymmetric bending of the DNA and its high level of cooperativity is consistent with DNA-facilitated dimerization. A unique N-terminal extension inserts between the DNA binding and dimerization domains, partially occluding the site where the canonical cAMP binding pocket is found. However, an unstructured region of this N-terminus may help modulate Cmr activity in response to cellular signals. Cmr's multiple levels of DNA interaction likely enhance its ability to integrate diverse gene regulatory signals, while its novel structural features establish Cmr as an atypical CRP/FNR family member. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Sequence-specific DNA binding activity of the cross-brace zinc finger motif of the piggyBac transposase

    Science.gov (United States)

    Morellet, Nelly; Li, Xianghong; Wieninger, Silke A; Taylor, Jennifer L; Bischerour, Julien; Moriau, Séverine; Lescop, Ewen; Bardiaux, Benjamin; Mathy, Nathalie; Assrir, Nadine; Bétermier, Mireille; Nilges, Michael; Hickman, Alison B; Dyda, Fred; Craig, Nancy L; Guittet, Eric

    2018-01-01

    The piggyBac transposase (PB) is distinguished by its activity and utility in genome engineering, especially in humans where it has highly promising therapeutic potential. Little is known, however, about the structure–function relationships of the different domains of PB. Here, we demonstrate in vitro and in vivo that its C-terminal Cysteine-Rich Domain (CRD) is essential for DNA breakage, joining and transposition and that it binds to specific DNA sequences in the left and right transposon ends, and to an additional unexpectedly internal site at the left end. Using NMR, we show that the CRD adopts the specific fold of the cross-brace zinc finger protein family. We determine the interaction interfaces between the CRD and its target, the 5′-TGCGT-3′/3′-ACGCA-5′ motifs found in the left, left internal and right transposon ends, and use NMR results to propose docking models for the complex, which are consistent with our site-directed mutagenesis data. Our results provide support for a model of the PB/DNA interactions in the context of the transpososome, which will be useful for the rational design of PB mutants with increased activity. PMID:29385532

  14. A Nuclear Ribosomal DNA Phylogeny of Acer Inferred with Maximum Likelihood, Splits Graphs, and Motif Analysis of 606 Sequences

    Directory of Open Access Journals (Sweden)

    Guido W. Grimm

    2006-01-01

    The multi-copy internal transcribed spacer (ITS) region of nuclear ribosomal DNA is widely used to infer phylogenetic relationships among closely related taxa. Here we use maximum likelihood (ML) and splits graph analyses to extract phylogenetic information from ~ 600 mostly cloned ITS sequences, representing 81 species and subspecies of Acer, and both species of its sister Dipteronia. Additional analyses compared sequence motifs in Acer and several hundred Anacardiaceae, Burseraceae, Meliaceae, Rutaceae, and Sapindaceae ITS sequences in GenBank. We also assessed the effects of using smaller data sets of consensus sequences with ambiguity coding (accounting for within-species variation) instead of the full (partly redundant) original sequences. Neighbor-nets and bipartition networks were used to visualize conflict among character state patterns. Species clusters observed in the trees and networks largely agree with morphology-based classifications; of de Jong’s (1994) 16 sections, nine are supported in neighbor-net and bipartition networks, and ten by sequence motifs and the ML tree; of his 19 series, 14 are supported in networks, motifs, and the ML tree. Most nodes had higher bootstrap support with matrices of 105 or 40 consensus sequences than with the original matrix. Within-taxon ITS divergence did not differ between diploid and polyploid Acer, and there was little evidence of differentiated parental ITS haplotypes, suggesting that concerted evolution in Acer acts rapidly.

  15. A Nuclear Ribosomal DNA Phylogeny of Acer Inferred with Maximum Likelihood, Splits Graphs, and Motif Analysis of 606 Sequences

    Science.gov (United States)

    Grimm, Guido W.; Renner, Susanne S.; Stamatakis, Alexandros; Hemleben, Vera

    2007-01-01

    The multi-copy internal transcribed spacer (ITS) region of nuclear ribosomal DNA is widely used to infer phylogenetic relationships among closely related taxa. Here we use maximum likelihood (ML) and splits graph analyses to extract phylogenetic information from ~ 600 mostly cloned ITS sequences, representing 81 species and subspecies of Acer, and both species of its sister Dipteronia. Additional analyses compared sequence motifs in Acer and several hundred Anacardiaceae, Burseraceae, Meliaceae, Rutaceae, and Sapindaceae ITS sequences in GenBank. We also assessed the effects of using smaller data sets of consensus sequences with ambiguity coding (accounting for within-species variation) instead of the full (partly redundant) original sequences. Neighbor-nets and bipartition networks were used to visualize conflict among character state patterns. Species clusters observed in the trees and networks largely agree with morphology-based classifications; of de Jong’s (1994) 16 sections, nine are supported in neighbor-net and bipartition networks, and ten by sequence motifs and the ML tree; of his 19 series, 14 are supported in networks, motifs, and the ML tree. Most nodes had higher bootstrap support with matrices of 105 or 40 consensus sequences than with the original matrix. Within-taxon ITS divergence did not differ between diploid and polyploid Acer, and there was little evidence of differentiated parental ITS haplotypes, suggesting that concerted evolution in Acer acts rapidly. PMID:19455198

  16. New scoring schema for finding motifs in DNA Sequences

    Directory of Open Access Journals (Sweden)

    Nowzari-Dalini Abbas

    2009-03-01

    Background: Pattern discovery in DNA sequences is one of the most fundamental problems in molecular biology with important applications in finding regulatory signals and transcription factor binding sites. An important task in this problem is to search (or predict) known binding sites in a new DNA sequence. For this reason, all subsequences of the given DNA sequence are scored based on a scoring function and the prediction is done by selecting the best score. Most of the available tools for known binding site prediction are designed by assuming no dependency between binding site base positions. Recently Tomovic and Oakeley investigated the statistical basis for either a claim of dependence or independence, to determine whether such a claim is generally true, and they presented a scoring function for binding site prediction based on the dependency between binding site base positions. Our primary objective is to investigate the scoring functions which can be used in known binding site prediction based on the assumption of dependence or independence in binding site base positions. Results: We propose a new scoring function based on the dependency between all binding site base positions. This scoring function uses joint information content and mutual information as a measure of dependency between positions in transcription factor binding site. Our method for modeling dependencies is simply an extension of position independency methods. We evaluate our new scoring function on the real data sets extracted from JASPAR and TRANSFAC databases, and compare the obtained results with two other well known scoring functions. Conclusion: The results demonstrate that the new approach improves known binding site discovery and show that the joint information content and mutual information provide a better and more general criterion to investigate the relationships between positions in the TFBS. Our scoring function is formulated by simple
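
    As an illustration of mutual information as a dependency measure between two binding site positions, the sketch below computes it from a set of aligned sites. The function name and the toy alignments are assumptions made for illustration; this is the textbook definition of mutual information, not the scoring function proposed in the paper.

```python
import math
from collections import Counter


def mutual_information(sites, i, j):
    """Mutual information (in bits) between columns i and j of aligned sites."""
    n = len(sites)
    col_i = Counter(site[i] for site in sites)
    col_j = Counter(site[j] for site in sites)
    pairs = Counter((site[i], site[j]) for site in sites)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        p_a = col_i[a] / n
        p_b = col_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy alignments (hypothetical): fully coupled columns vs. independent columns.
coupled = ["AT", "AT", "CG", "CG"]
independent = ["AA", "AC", "CA", "CC"]
```

    In the coupled toy alignment the base at the first position fully determines the second, giving 1 bit; in the independent one the value is 0. Estimates from small site collections are biased upward, so real use would need a correction or a significance test.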

  17. SSTRAP: A computational model for genomic motif discovery ...

    African Journals Online (AJOL)

    Computational methods can potentially provide high-quality prediction of biological molecules such as DNA binding sites and Transcription factors and therefore reduce the time needed for experimental verification and challenges associated with experimental methods. These biological molecules or motifs have significant ...

  18. G-quadruplex DNA sequences are evolutionarily conserved and associated with distinct genomic features in Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    John A Capra

    2010-07-01

    G-quadruplex DNA is a four-stranded DNA structure formed by non-Watson-Crick base pairing between stacked sets of four guanines. Many possible functions have been proposed for this structure, but its in vivo role in the cell is still largely unresolved. We carried out a genome-wide survey of the evolutionary conservation of regions with the potential to form G-quadruplex DNA structures (G4 DNA motifs) across seven yeast species. We found that G4 DNA motifs were significantly more conserved than expected by chance, and the nucleotide-level conservation patterns suggested that the motif conservation was the result of the formation of G4 DNA structures. We characterized the association of conserved and non-conserved G4 DNA motifs in Saccharomyces cerevisiae with more than 40 known genome features and gene classes. Our comprehensive, integrated evolutionary and functional analysis confirmed the previously observed associations of G4 DNA motifs with promoter regions and the rDNA, and it identified several previously unrecognized associations of G4 DNA motifs with genomic features, such as mitotic and meiotic double-strand break sites (DSBs). Conserved G4 DNA motifs maintained strong associations with promoters and the rDNA, but not with DSBs. We also performed the first analysis of G4 DNA motifs in the mitochondria, and surprisingly found a tenfold higher concentration of the motifs in the AT-rich yeast mitochondrial DNA than in nuclear DNA. The evolutionary conservation of the G4 DNA motif and its association with specific genome features supports the hypothesis that G4 DNA has in vivo functions that are under evolutionary constraint.
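
    A genome-wide scan for sequences with G-quadruplex-forming potential is often sketched as a regular expression search. The pattern below (four runs of at least three guanines separated by loops of 1 to 7 bases) is a commonly used convention and is an assumption here; the abstract does not state the exact motif definition used in the survey, and the function names are illustrative.

```python
import re

# Assumed pattern: four G-runs (>= 3 G) separated by loops of 1-7 bases.
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def find_g4_motifs(sequence):
    """Return (start, end, match) for each non-overlapping hit on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


def reverse_complement(sequence):
    """Reverse complement, so the opposite strand can be scanned with the same pattern."""
    return sequence.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


hits = find_g4_motifs("ATGGGTTGGGTTGGGTTGGGAT")
```

    Scanning both a sequence and its reverse complement covers C-rich stretches whose complementary strand carries the G-runs. Coordinates from the reverse-complement scan refer to the reversed sequence and would need to be mapped back.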

  19. An approach to evaluate the topological significance of motifs and other patterns in regulatory networks

    Directory of Open Access Journals (Sweden)

    Wingender Edgar

    2009-05-01

    that enables to evaluate the topological significance of various connected patterns in a regulatory network. Applying this method onto transcriptional networks of three largely distinct organisms we could prove that it is highly suitable to identify most important pattern instances, but that neither motifs nor any pattern in general appear to play a particularly important role per se. From the results obtained so far, we conclude that the pairwise disconnectivity index will most likely prove useful as well in identifying other (higher-order) pattern instances in transcriptional and other networks.

  20. Specific interaction of the nonstructural protein NS1 of minute virus of mice (MVM) with [ACCA](2) motifs in the centre of the right-end MVM DNA palindrome induces hairpin-primed viral DNA replication.

    Science.gov (United States)

    Willwand, Kurt; Moroianu, Adela; Hörlein, Rita; Stremmel, Wolfgang; Rommelaere, Jean

    2002-07-01

    The linear single-stranded DNA genome of minute virus of mice (MVM) is replicated via a double-stranded replicative form (RF) intermediate DNA. Amplification of viral RF DNA requires the structural transition of the right-end palindrome from a linear duplex into a double-hairpin structure, which serves for the repriming of unidirectional DNA synthesis. This conformational transition was found previously to be induced by the MVM nonstructural protein NS1. Elimination of the cognate NS1-binding sites, [ACCA](2), from the central region of the right-end palindrome next to the axis of symmetry was shown to markedly reduce the efficiency of hairpin-primed DNA replication, as measured in a reconstituted in vitro replication system. Thus, [ACCA](2) sequence motifs are essential as NS1-binding elements in the context of the structural transition of the right-end MVM palindrome.

  1. cDNA cloning, genomic organization and expression analysis during somatic embryogenesis of the translationally controlled tumor protein (TCTP) gene from Japanese larch (Larix leptolepis).

    Science.gov (United States)

    Zhang, Li-Feng; Li, Wan-Feng; Han, Su-Ying; Yang, Wen-Hua; Qi, Li-Wang

    2013-10-15

    A full-length cDNA and genomic sequences of a translationally controlled tumor protein (TCTP) gene were isolated from Japanese larch (Larix leptolepis) and designated LaTCTP. The length of the cDNA was 1,043 bp and contained a 504 bp open reading frame that encodes a predicted protein of 167 amino acids, characterized by two signature sequences of the TCTP protein family. Analysis of the LaTCTP gene structure indicated four introns and five exons, and it is the largest of all currently known TCTP genes in plants. The 5'-flanking promoter region of LaTCTP was cloned using an improved TAIL-PCR technique. In this region we identified many important potential cis-acting elements, such as a Box-W1 (fungal elicitor responsive element), a CAT-box (cis-acting regulatory element related to meristem expression), a CGTCA-motif (cis-acting regulatory element involved in MeJA-responsiveness), a GT1-motif (light responsive element), a Skn-1-motif (cis-acting regulatory element required for endosperm expression) and a TGA-element (auxin-responsive element), suggesting that expression of LaTCTP is highly regulated. Expression analysis demonstrated ubiquitous localization of LaTCTP mRNA in the roots, stems and needles, high mRNA levels in the embryonal-suspensor mass (ESM), browning embryogenic cultures and mature somatic embryos, and low levels of mRNA at day five during somatic embryogenesis. We suggest that LaTCTP might participate in the regulation of somatic embryo development. These results provide a theoretical basis for understanding the molecular regulatory mechanism of LaTCTP and lay the foundation for artificial regulation of somatic embryogenesis. © 2013.

  2. Mitochondrial and Y chromosome haplotype motifs as diagnostic markers of Jewish ancestry: a reconsideration.

    Directory of Open Access Journals (Sweden)

    Sergio Tofanelli

    2014-11-01

    Several authors have proposed haplotype motifs based on site variants at the mitochondrial genome (mtDNA) and the non-recombining portion of the Y chromosome (NRY) to trace the genealogies of Jewish people. Here, we analyzed their main approaches and tested the feasibility of adopting motifs as ancestry markers through construction of a large database of mtDNA and NRY haplotypes from public genetic genealogical repositories. We verified the reliability of Jewish ancestry prediction based on the Cohen and Levite Modal Haplotypes in their classical 6 STR marker format or in the extended 12 STR format, as well as four founder mtDNA lineages (HVS-I segments) accounting for about 40% of the current population of Ashkenazi Jews. For this purpose we compared haplotype composition in individuals of self-reported Jewish ancestry with the rest of European, African or Middle Eastern samples, to test for non-random association of ethno-geographic groups and haplotypes. Overall, NRY and mtDNA based motifs, previously reported to differentiate between groups, were found to be more represented in Jewish compared to non-Jewish groups. However, this seems to stem from common ancestors of Jewish lineages being rather recent with respect to ancestors of non-Jewish lineages with the same haplotype signatures. Moreover, the polyphyly of haplotypes which contain the proposed motifs and the misuse of constant mutation rates heavily affected previous attempts to correctly date the origin of common ancestries. Accordingly, our results stress the limitations of using the above haplotype motifs as reliable Jewish ancestry predictors and show their inadequacy for forensic or genealogical purposes.

  3. Annotating RNA motifs in sequences and alignments.

    Science.gov (United States)

    Gardner, Paul P; Eldai, Hisham

    2015-01-01

    RNA performs a diverse array of important functions across all cellular life. These functions include important roles in translation, building translational machinery and maturing messenger RNA. More recent discoveries include the miRNAs and bacterial sRNAs that regulate gene expression, the thermosensors, riboswitches and other cis-regulatory elements that help prokaryotes sense their environment and eukaryotic piRNAs that suppress transposition. However, there can be a long period between the initial discovery of an RNA and determining its function. We present a bioinformatic approach to characterize RNA motifs, which are critical components of many RNA structure-function relationships. These motifs can, in some instances, provide researchers with functional hypotheses for uncharacterized RNAs. Moreover, we introduce a new profile-based database of RNA motifs--RMfam--and illustrate some applications for investigating the evolution and functional characterization of RNA. All the data and scripts associated with this work are available from: https://github.com/ppgardne/RMfam. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. DNA residence time is a regulatory factor of transcription repression

    Science.gov (United States)

    Clauß, Karen; Popp, Achim P.; Schulze, Lena; Hettich, Johannes; Reisser, Matthias; Escoter Torres, Laura; Uhlenhaut, N. Henriette

    2017-01-01

    Transcription comprises a highly regulated sequence of intrinsically stochastic processes, resulting in bursts of transcription intermitted by quiescence. In transcription activation or repression, a transcription factor binds dynamically to DNA, with a residence time unique to each factor. Whether the DNA residence time is important in the transcription process is unclear. Here, we designed a series of transcription repressors differing in their DNA residence time by utilizing the modular DNA binding domain of transcription activator-like effectors (TALEs) and varying the number of nucleotide-recognizing repeat domains. We characterized the DNA residence times of our repressors in living cells using single molecule tracking. The residence times depended non-linearly on the number of repeat domains and differed by more than a factor of six. The factors provoked a residence time-dependent decrease in transcript level of the glucocorticoid receptor-activated gene SGK1. Down regulation of transcription was due to a lower burst frequency in the presence of long binding repressors and is in accordance with a model of competitive inhibition of endogenous activator binding. Our single molecule experiments reveal transcription factor DNA residence time as a regulatory factor controlling transcription repression and establish TALE-DNA binding domains as tools for the temporal dissection of transcription regulation. PMID:28977492

  5. Discovery and validation of information theory-based transcription factor and cofactor binding site motifs.

    Science.gov (United States)

    Lu, Ruipeng; Mucaki, Eliseos J; Rogan, Peter K

    2017-03-17

    Data from ChIP-seq experiments can derive the genome-wide binding specificities of transcription factors (TFs) and other regulatory proteins. We analyzed 765 ENCODE ChIP-seq peak datasets of 207 human TFs with a novel motif discovery pipeline based on recursive, thresholded entropy minimization. This approach, while obviating the need to compensate for skewed nucleotide composition, distinguishes true binding motifs from noise, quantifies the strengths of individual binding sites based on computed affinity and detects adjacent cofactor binding sites that coordinate with the targets of primary, immunoprecipitated TFs. We obtained contiguous and bipartite information theory-based position weight matrices (iPWMs) for 93 sequence-specific TFs, discovered 23 cofactor motifs for 127 TFs and revealed six high-confidence novel motifs. The reliability and accuracy of these iPWMs were determined via four independent validation methods, including the detection of experimentally proven binding sites, explanation of effects of characterized SNPs, comparison with previously published motifs and statistical analyses. We also predict previously unreported TF coregulatory interactions (e.g. TF complexes). These iPWMs constitute a powerful tool for predicting the effects of sequence variants in known binding sites, performing mutation analysis on regulatory SNPs and predicting previously unrecognized binding sites and target genes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
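    The information-theoretic view of a binding motif mentioned above can be illustrated with a small calculation. The following is a minimal sketch, not the authors' pipeline: it assumes a uniform nucleotide background, applies no small-sample correction, and uses a hypothetical set of aligned sites. The function name `information_content` and the example sites are illustrative assumptions.

```python
import math
from collections import Counter


def information_content(sites, alphabet="ACGT"):
    """Return per-position information (bits) for equal-length aligned sites.

    Assumes a uniform background, so each position can carry at most
    log2(len(alphabet)) bits; no small-sample correction is applied.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    max_bits = math.log2(len(alphabet))
    bits = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        entropy = 0.0
        for base in alphabet:
            p = counts[base] / len(sites)
            if p > 0:
                entropy -= p * math.log2(p)
        # Information is the reduction in uncertainty relative to the background.
        bits.append(max_bits - entropy)
    return bits


# Hypothetical aligned sites, used only to exercise the function.
example_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGAGTCA"]
per_position_bits = information_content(example_sites)
total_bits = sum(per_position_bits)
```

    Fully conserved positions contribute 2 bits each, while a position split evenly between two bases contributes 1 bit, so summing over positions gives a single score for the motif.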

  6. MotifNet: a web-server for network motif analysis.

    Science.gov (United States)

    Smoly, Ilan Y; Lerman, Eugene; Ziv-Ukelson, Michal; Yeger-Lotem, Esti

    2017-06-15

    Network motifs are small topological patterns that recur in a network significantly more often than expected by chance. Their identification emerged as a powerful approach for uncovering the design principles underlying complex networks. However, available tools for network motif analysis typically require download and execution of computationally intensive software on a local computer. We present MotifNet, the first open-access web-server for network motif analysis. MotifNet allows researchers to analyze integrated networks, where nodes and edges may be labeled, and to search for motifs of up to eight nodes. The output motifs are presented graphically and the user can interactively filter them by their significance, number of instances, node and edge labels, and node identities, and view their instances. MotifNet also allows the user to distinguish between motifs that are centered on specific nodes and motifs that recur in distinct parts of the network. MotifNet is freely available at http://netbio.bgu.ac.il/motifnet. The website was implemented using ReactJs and supports all major browsers. The server interface was implemented in Python with data stored on a MySQL database. Contact: estiyl@bgu.ac.il or michaluz@cs.bgu.ac.il. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  7. Predicting tissue specific cis-regulatory modules in the human genome using pairs of co-occurring motifs

    Directory of Open Access Journals (Sweden)

    Girgis Hani Z

    2012-02-01

    Background: Researchers seeking to unlock the genetic basis of human physiology and diseases have been studying gene transcription regulation. The temporal and spatial patterns of gene expression are controlled by mainly non-coding elements known as cis-regulatory modules (CRMs) and epigenetic factors. CRMs modulating related genes share the regulatory signature which consists of transcription factor (TF) binding sites (TFBSs). Identifying such CRMs is a challenging problem due to the prohibitive number of sequence sets that need to be analyzed. Results: We formulated the challenge as a supervised classification problem even though experimentally validated CRMs were not required. Our efforts resulted in a software system named CrmMiner. The system mines for CRMs in the vicinity of related genes. CrmMiner requires two sets of sequences: a mixed set and a control set. Sequences in the vicinity of the related genes comprise the mixed set, whereas the control set includes random genomic sequences. CrmMiner assumes that a large percentage of the mixed set is made of background sequences that do not include CRMs. The system identifies pairs of closely located motifs representing vertebrate TFBSs that are enriched in the training mixed set consisting of 50% of the gene loci. In addition, CrmMiner selects a group of the enriched pairs to represent the tissue-specific regulatory signature. The mixed and the control sets are searched for candidate sequences that include any of the selected pairs. Next, an optimal Bayesian classifier is used to distinguish candidates found in the mixed set from their control counterparts. Our study proposes 62 tissue-specific regulatory signatures and putative CRMs for different human tissues and cell types. These signatures consist of assortments of ubiquitously expressed TFs and tissue-specific TFs. Under controlled settings, CrmMiner identified known CRMs in noisy sets up to 1:25 signal-to-noise ratio. CrmMiner was

  8. Discovering Motifs in Biological Sequences Using the Micron Automata Processor.

    Science.gov (United States)

    Roy, Indranil; Aluru, Srinivas

    2016-01-01

    Finding approximately conserved sequences, called motifs, across multiple DNA or protein sequences is an important problem in computational biology. In this paper, we consider the (l, d) motif search problem of identifying one or more motifs of length l present in at least q of the n given sequences, with each occurrence differing from the motif in at most d substitutions. The problem is known to be NP-complete, and the largest solved instance reported to date is (26,11). We propose a novel algorithm for the (l, d) motif search problem using streaming execution over a large set of non-deterministic finite automata (NFA). This solution is designed to take advantage of the Micron Automata Processor, a new technology close to deployment that can simultaneously execute multiple NFA in parallel. We demonstrate the capability for solving much larger instances of the (l, d) motif search problem using the resources available within a single automata processor board, by estimating run-times for problem instances (39,18) and (40,17). The paper serves as a useful guide to solving problems using this new accelerator technology.
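    The (l, d) motif search problem defined above can be made concrete with an exhaustive search. The sketch below is a plain brute-force enumeration, not the NFA-based algorithm of the paper, and it is only practical for very small l because it tests all 4^l candidates. The function names and the toy sequences are illustrative assumptions.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, seq, d):
    """True if some window of seq differs from motif in at most d substitutions."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))


def ld_motif_search(seqs, l, d, q, alphabet="ACGT"):
    """Return every length-l motif present (within d substitutions) in >= q sequences.

    Brute force over all len(alphabet)**l candidates; exponential in l.
    """
    found = []
    for candidate in product(alphabet, repeat=l):
        motif = "".join(candidate)
        support = sum(occurs_within(motif, s, d) for s in seqs)
        if support >= q:
            found.append(motif)
    return found


# Toy input: each sequence holds a variant of ACGT with at most one substitution.
toy_seqs = ["TTACGTTT", "GGACGAGG", "CCTCGTCC"]
hits_d1 = ld_motif_search(toy_seqs, l=4, d=1, q=3)
hits_d0 = ld_motif_search(toy_seqs, l=4, d=0, q=3)
```

    With d = 1 the planted motif ACGT is recovered because every sequence contains a window within one substitution of it; with d = 0 no exact 4-mer is shared by all three sequences, so the result is empty. The exponential candidate space is what motivates specialized hardware and algorithms for instances such as (26,11).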

  9. Protein associations in DnaA-ATP hydrolysis mediated by the Hda-replicase clamp complex.

    Science.gov (United States)

    Su'etsugu, Masayuki; Shimuta, Toh-Ru; Ishida, Takuma; Kawakami, Hironori; Katayama, Tsutomu

    2005-02-25

    In Escherichia coli, the activity of ATP-bound DnaA protein in initiating chromosomal replication is negatively controlled in a replication-coordinated manner. The RIDA (regulatory inactivation of DnaA) system promotes DnaA-ATP hydrolysis to produce the inactivated form DnaA-ADP in a manner depending on the Hda protein and the DNA-loaded form of the beta-sliding clamp, a subunit of the replicase holoenzyme. A highly functional form of Hda was purified and shown to form a homodimer in solution, and two Hda dimers were found to associate with a single clamp molecule. Purified mutant Hda proteins were used in a staged in vitro RIDA system followed by a pull-down assay to show that Hda-clamp binding is a prerequisite for DnaA-ATP hydrolysis and that binding is mediated by an Hda N-terminal motif. Arg(168) in the AAA(+) Box VII motif of Hda plays a role in stable homodimer formation and in DnaA-ATP hydrolysis, but not in clamp binding. Furthermore, the DnaA N-terminal domain is required for the functional interaction of DnaA with the Hda-clamp complex. Single cells contain approximately 50 Hda dimers, consistent with the results of in vitro experiments. These findings and the features of AAA(+) proteins, including DnaA, suggest the following model. DnaA-ATP is hydrolyzed at a binding interface between the AAA(+) domains of DnaA and Hda; the DnaA N-terminal domain supports this interaction; and the interaction of DnaA-ATP with the Hda-clamp complex occurs in a catalytic mode.

  10. Identification of putative cis-regulatory elements in Cryptosporidium parvum by de novo pattern finding

    Directory of Open Access Journals (Sweden)

    Kissinger Jessica C

    2007-01-01

    Background: Cryptosporidium parvum is a unicellular eukaryote in the phylum Apicomplexa. It is an obligate intracellular parasite that causes diarrhea and is a significant AIDS-related pathogen. Cryptosporidium parvum is not amenable to long-term laboratory cultivation or classical molecular genetic analysis. The parasite exhibits a complex life cycle and a broad host range, and fundamental mechanisms of gene regulation remain unknown. We have used data from the recently sequenced genome of this organism to uncover clues about gene regulation in C. parvum. We have applied two pattern-finding algorithms, MEME and AlignACE, to identify conserved, over-represented motifs in the 5' upstream regions of genes in C. parvum. To support our findings, we have established comparative real-time PCR expression profiles for the groups of genes examined computationally. Results: We find that groups of genes that share a function or belong to a common pathway share upstream motifs. Different motifs are conserved upstream of different groups of genes. Comparative real-time PCR studies show co-expression of genes within each group (in sub-sets) during the life cycle of the parasite, suggesting co-regulation of these genes may be driven by the use of conserved upstream motifs. Conclusion: This is one of the first attempts to characterize cis-regulatory elements in the absence of any previously characterized elements and with very limited expression data (seven genes only). Using de novo pattern finding algorithms, we have identified specific DNA motifs that are conserved upstream of genes belonging to the same metabolic pathway or gene family. We have demonstrated the co-expression of these genes (often in subsets) using comparative real-time PCR experiments, thus establishing evidence for these conserved motifs as putative cis-regulatory elements. Given the lack of prior information concerning expression patterns and organization of promoters in C. parvum we

  11. Codon based co-occurrence network motifs in human mitochondria

    Directory of Open Access Journals (Sweden)

    Pramod Shinde

    2017-10-01

    The nucleotide polymorphism in human mitochondrial genome (mtDNA) tolled by codon position bias plays an indispensable role in human population dispersion and expansion. Herein, we constructed genome-wide nucleotide co-occurrence networks using a massive dataset consisting of five different geographical regions and around 3000 samples for each region. We developed a powerful network model to describe complex mitochondrial evolutionary patterns between codon and non-codon positions. It was interesting to report a different evolution of Asian genomes than those of the rest, which is divulged by network motifs. We found evidence that mtDNA undergoes substantial amounts of adaptive evolution, a finding which was supported by a number of previous studies. The dominance of higher order motifs indicated the importance of long-range nucleotide co-occurrence in genomic diversity. Most notably, codon motifs apparently underpinned the preferences among codon positions for co-evolution which is probably highly biased during the origin of the genetic code. Our analyses manifested that codon position co-evolution is very well conserved across human sub-populations and independently maintained within human sub-populations, implying the selective role of evolutionary processes on codon position co-evolution. Ergo, this study provided a framework to investigate cooperative genomic interactions which are critical in underlying complex mitochondrial evolution.

  12. Transcriptome landscape of Lactococcus lactis reveals many novel RNAs including a small regulatory RNA involved in carbon uptake and metabolism.

    Science.gov (United States)

    van der Meulen, Sjoerd B; de Jong, Anne; Kok, Jan

    2016-01-01

    RNA sequencing has revolutionized genome-wide transcriptome analyses, and the identification of non-coding regulatory RNAs in bacteria has thus increased concurrently. Here we reveal the transcriptome map of the lactic acid bacterial paradigm Lactococcus lactis MG1363 by employing differential RNA sequencing (dRNA-seq) and a combination of manual and automated transcriptome mining. This resulted in a high-resolution genome annotation of L. lactis and the identification of 60 cis-encoded antisense RNAs (asRNAs), 186 trans-encoded putative regulatory RNAs (sRNAs) and 134 novel small ORFs. Based on the putative targets of asRNAs, a novel classification is proposed. Several transcription factor DNA binding motifs were identified in the promoter sequences of (a)sRNAs, providing insight in the interplay between lactococcal regulatory RNAs and transcription factors. The presence and lengths of 14 putative sRNAs were experimentally confirmed by differential Northern hybridization, including the abundant RNA 6S that is differentially expressed depending on the available carbon source. For another sRNA, LLMGnc_147, functional analysis revealed that it is involved in carbon uptake and metabolism. L. lactis contains 13% leaderless mRNAs (lmRNAs) that, from an analysis of overrepresentation in GO classes, seem predominantly involved in nucleotide metabolism and DNA/RNA binding. Moreover, an A-rich sequence motif immediately following the start codon was uncovered, which could provide novel insight in the translation of lmRNAs. Altogether, this first experimental genome-wide assessment of the transcriptome landscape of L. lactis and subsequent sRNA studies provide an extensive basis for the investigation of regulatory RNAs in L. lactis and related lactococcal species.

  13. Crystal Structure of the Dimeric Oct6 (Pou3f1) POU Domain Bound to Palindromic MORE DNA

    Energy Technology Data Exchange (ETDEWEB)

    R Jauch; S Choo; C Ng; P Kolatkar

    2011-12-31

    POU domains (named after their identification in Pit1, Oct1, unc86) are found in around 15 transcription factors encoded in mammalian genomes, many of which feature prominently as key regulators at developmental bifurcations. For example, the POU III class Octamer binding protein 6 (Oct6) is expressed in embryonic stem cells and during neural development and drives the differentiation of myelinated cells in the central and peripheral nervous system. Defects in oct6 expression levels are linked to neurological disorders such as schizophrenia. POU proteins contain a bi-partite DNA binding domain that assembles on various DNA motifs with differentially configured subdomains. Intriguingly, alternative configurations of POU domains on different DNA sites were shown to affect the subsequent recruitment of transcriptional coactivators. Namely, binding of Oct1 to a Palindromic Oct-factor Recognition Element (PORE) was shown to facilitate the recruitment of the OBF1 coactivator whereas More of PORE (MORE) bound Oct1 does not. Moreover, Pit1 was shown to recruit the corepressor N-CoR only when bound to a variant MORE motif with a 2 bp half-site spacing. Therefore, POU proteins are seen as a paradigm for DNA induced allosteric effects on transcription factors modulating their regulatory potential. However, a big unresolved conundrum for the POU class and for most if not all other transcription factor classes is how highly similar proteins regulate different sets of genes causing fundamentally different biological responses. Ultimately, there must be subtle features enabling those factors to engage in contrasting molecular interactions in the cell. Thus, the dissection of the molecular details of the transcription-DNA recognition in general, and the formation of multimeric regulatory complexes, in particular, is highly desirable. To contribute to these efforts they solved the 2.05 Å crystal structure of Oct6 bound as a symmetrical homodimer to palindromic MORE DNA.

  14. BayesMotif: de novo protein sorting motif discovery from impure datasets.

    Science.gov (United States)

    Hu, Jianjun; Zhang, Fan

    2010-01-18

    Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals are amino acid sub-sequences usually located at the N-terminals or C-terminals of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms. We formulated the protein sorting motif discovery problem as a classification problem and proposed a Bayesian classifier based algorithm (BayesMotif) for de novo identification of a common type of protein sorting motifs in which a highly conserved anchor is present along with a less conserved motif region. A false positive removal procedure is developed to iteratively remove sequences that are unlikely to contain true motifs so that the algorithm can identify motifs from impure input sequences. Experiments on both implanted motif datasets and real-world datasets showed that the enhanced BayesMotif algorithm can identify anchored sorting motifs from pure or impure protein sequence datasets. They also show that the false positive removal procedure can help to identify true motifs even when only 20% of the input sequences contain true motif instances. We proposed BayesMotif, a novel Bayesian classification based algorithm for de novo discovery of a special category of anchored protein sorting motifs from impure datasets. Compared to conventional motif discovery algorithms such as MEME, our algorithm can find less-conserved motifs with short highly conserved anchors. Our algorithm also has the advantage of easy incorporation of additional meta-sequence features such as hydrophobicity or charge of the motifs which may help to overcome the limitations of

  15. Microbial expression of proteins containing long repetitive Arg-Gly-Asp cell adhesive motifs created by overlap elongation PCR

    International Nuclear Information System (INIS)

    Kurihara, Hiroyuki; Shinkai, Masashige; Nagamune, Teruyuki

    2004-01-01

    We developed a novel method for creating repetitive DNA libraries using overlap elongation PCR, and prepared a DNA library encoding repetitive Arg-Gly-Asp (RGD) cell adhesive motifs. We obtained various length DNAs encoding repetitive RGD from a short monomer DNA (18 bp) after a thermal cyclic reaction without a DNA template for amplification, and isolated DNAs encoding 2, 21, and 43 repeats of the RGD motif. We cloned these DNAs into a protein expression vector and overexpressed them as thioredoxin fusion proteins: RGD2, RGD21, and RGD43, respectively. The solubility of RGD43 in water was low and it formed a fibrous precipitate in water. Scanning electron microscopy revealed that RGD43 formed a branched 3D-network structure in the solid state. To evaluate the function of the cell adhesive motifs in RGD43, mouse fibroblast cells were cultivated on the RGD43 scaffold. The fibroblast cells adhered to the RGD43 scaffold and extended long filopodia

  16. DNA polymerase preference determines PCR priming efficiency.

    Science.gov (United States)

    Pan, Wenjing; Byrne-Steele, Miranda; Wang, Chunlin; Lu, Stanley; Clemmons, Scott; Zahorchak, Robert J; Han, Jian

    2014-01-30

    Polymerase chain reaction (PCR) is one of the most important developments in modern biotechnology. However, PCR is known to introduce biases, especially during multiplex reactions. Recent studies have implicated the DNA polymerase as the primary source of bias, particularly initiation of polymerization on the template strand. In our study, amplification from a synthetic library containing a 12 nucleotide random portion was used to provide an in-depth characterization of DNA polymerase priming bias. The synthetic library was amplified with three commercially available DNA polymerases using an anchored primer with a random 3' hexamer end. After normalization, the next generation sequencing (NGS) results of the amplified libraries were directly compared to the unamplified synthetic library. Here, high throughput sequencing was used to systematically demonstrate and characterize DNA polymerase priming bias. We demonstrate that certain sequence motifs are preferred over others as primers where the six nucleotide sequences at the 3' end of the primer, as well as the sequences four base pairs downstream of the priming site, may influence priming efficiencies. DNA polymerases in the same family from two different commercial vendors prefer similar motifs, while another commercially available enzyme from a different DNA polymerase family prefers different motifs. Furthermore, the preferred priming motifs are GC-rich. The DNA polymerase preference for certain sequence motifs was verified by amplification from single-primer templates. We incorporated the observed DNA polymerase preference into a primer-design program that guides the placement of the primer to an optimal location on the template. DNA polymerase priming bias was characterized using a synthetic library amplification system and NGS. The characterization of DNA polymerase priming bias was then utilized to guide the primer-design process and demonstrate varying amplification efficiencies among three commercially

  17. Mutation analysis of the human CYP3A4 gene 5' regulatory region: population screening using non-radioactive SSCP.

    Science.gov (United States)

    Hamzeiy, Hossein; Vahdati-Mashhadian, Nasser; Edwards, Helen J; Goldfarb, Peter S

    2002-03-20

    Human CYP3A4 is the major cytochrome P450 isoenzyme in adult human liver and is known to metabolise many xenobiotic and endogenous compounds. There is substantial inter-individual variation in the hepatic levels of CYP3A4. Although polymorphic mutations have been reported in the 5' regulatory region of the CYP3A4 gene, those that have been investigated so far do not appear to have any effect on gene expression. To determine whether other mutations exist in this region of the gene, we have performed a new population screen on a panel of 101 human DNA samples. A 1140 bp section of the 5' proximal regulatory region of the CYP3A4 gene, containing numerous regulatory motifs, was amplified from genomic DNA as three overlapping segments. The 300 bp distal enhancer region at -7.9 kb containing additional regulatory motifs was also amplified. Mutation analysis of the resulting PCR products was carried out using non-radioactive single strand conformation polymorphism (SSCP) and confirmatory sequencing of both DNA strands in those samples showing extra SSCP bands. In addition to detection of the previously reported CYP3A4*1B allele in nine subjects, three novel alleles were found: CYP3A4*1E (having a T-->A transversion at -369 in one subject), CYP3A4*1F (having a C-->G transversion at -747 in 17 subjects) and CYP3A4*15B containing a nine-nucleotide insertion between -845 and -844 linked to an A-->G transition at -392 and a G-->A transition in exon 6 (position 485 in the cDNA) in one subject. All the novel alleles were heterozygous. No mutations were found in the upstream distal enhancer region. Our results clearly indicate that this rapid and simple SSCP approach can reveal mutant alleles in drug metabolising enzyme genes. Detection and determination of the frequency of novel alleles in CYP3A4 will assist investigation of the relationship between genotype, xenobiotic metabolism and toxicity in the CYP3A family of isoenzymes.

  18. Distinct configurations of protein complexes and biochemical pathways revealed by epistatic interaction network motifs

    LENUS (Irish Health Repository)

    Casey, Fergal

    2011-08-22

    Background: Gene and protein interactions are commonly represented as networks, with the genes or proteins comprising the nodes and the relationship between them as edges. Motifs, or small local configurations of edges and nodes that arise repeatedly, can be used to simplify the interpretation of networks. Results: We examined triplet motifs in a network of quantitative epistatic genetic relationships, and found a non-random distribution of particular motif classes. Individual motif classes were found to be associated with different functional properties, suggestive of an underlying biological significance. These associations were apparent not only for motif classes, but for individual positions within the motifs. As expected, NNN (all negative) motifs were strongly associated with previously reported genetic (i.e. synthetic lethal) interactions, while PPP (all positive) motifs were associated with protein complexes. The two other motif classes (NNP: a positive interaction spanned by two negative interactions, and NPP: a negative spanned by two positives) showed very distinct functional associations, with physical interactions dominating for the former but alternative enrichments, typical of biochemical pathways, dominating for the latter. Conclusion: We present a model showing how NNP motifs can be used to recognize supportive relationships between protein complexes, while NPP motifs often identify opposing or regulatory behaviour between a gene and an associated pathway. The ability to use motifs to point toward underlying biological organizational themes is likely to be increasingly important as more extensive epistasis mapping projects in higher organisms begin.
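    The four triplet motif classes described above (NNN, NNP, NPP, PPP) can be counted directly from a signed interaction network. The following is a minimal sketch under stated assumptions: edges are stored in a dictionary keyed by unordered gene pairs with a sign of 'N' or 'P', the gene names are hypothetical, and no test for non-random distribution is attempted. It is not the authors' implementation.

```python
from itertools import combinations


def classify_triplets(edges):
    """Count fully connected triplets by their edge-sign class.

    edges maps frozenset({gene_a, gene_b}) to 'N' (negative) or 'P' (positive).
    Only triplets where all three pairs have an edge are counted.
    """
    nodes = sorted({gene for pair in edges for gene in pair})
    counts = {"NNN": 0, "NNP": 0, "NPP": 0, "PPP": 0}
    for a, b, c in combinations(nodes, 3):
        pairs = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if all(pair in edges for pair in pairs):
            # Sorting the three signs yields one of the four class labels.
            label = "".join(sorted(edges[pair] for pair in pairs))
            counts[label] += 1
    return counts


# Hypothetical signed network on four genes; the A-D pair has no measured edge.
example_edges = {
    frozenset(("A", "B")): "N",
    frozenset(("A", "C")): "N",
    frozenset(("B", "C")): "P",
    frozenset(("B", "D")): "P",
    frozenset(("C", "D")): "P",
}
triplet_counts = classify_triplets(example_edges)
```

    In this toy network the triplet A-B-C falls in the NNP class and B-C-D in the PPP class, while the two triplets involving the unmeasured A-D pair are skipped. Comparing such counts against those from sign-shuffled networks would be one way to assess whether a class is over-represented.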

  19. The AT-Hook motif as a versatile minor groove anchor for promoting DNA binding of transcription factor fragments

    OpenAIRE

    Rodríguez, Jéssica; Mosquera, Jesús; Couceiro, Jose R.; Vázquez, M. Eugenio; Mascareñas, José L.

    2015-01-01

    We report the development of chimeric DNA binding peptides comprising a DNA binding fragment of natural transcription factors (the basic region of a bZIP protein or a monomeric zinc finger module) and an AT-Hook peptide motif. The resulting peptide conjugates display high DNA affinity and excellent sequence selectivity. Furthermore, the AT-Hook motif also favors the cell internalization of the conjugates. DOI: 10.1039/c5sc01415h

  20. The Regulatory Factor ZFHX3 Modifies Circadian Function in SCN via an AT Motif-Driven Axis

    Science.gov (United States)

    Parsons, Michael J.; Brancaccio, Marco; Sethi, Siddharth; Maywood, Elizabeth S.; Satija, Rahul; Edwards, Jessica K.; Jagannath, Aarti; Couch, Yvonne; Finelli, Mattéa J.; Smyllie, Nicola J.; Esapa, Christopher; Butler, Rachel; Barnard, Alun R.; Chesham, Johanna E.; Saito, Shoko; Joynson, Greg; Wells, Sara; Foster, Russell G.; Oliver, Peter L.; Simon, Michelle M.; Mallon, Ann-Marie; Hastings, Michael H.; Nolan, Patrick M.

    2015-01-01

    Summary We identified a dominant missense mutation in the SCN transcription factor Zfhx3, termed short circuit (Zfhx3Sci), which accelerates circadian locomotor rhythms in mice. ZFHX3 regulates transcription via direct interaction with predicted AT motifs in target genes. The mutant protein has a decreased ability to activate consensus AT motifs in vitro. Using RNA sequencing, we found minimal effects on core clock genes in Zfhx3Sci/+ SCN, whereas the expression of neuropeptides critical for SCN intercellular signaling was significantly disturbed. Moreover, mutant ZFHX3 had a decreased ability to activate AT motifs in the promoters of these neuropeptide genes. Lentiviral transduction of SCN slices showed that the ZFHX3-mediated activation of AT motifs is circadian, with decreased amplitude and robustness of these oscillations in Zfhx3Sci/+ SCN slices. In conclusion, by cloning Zfhx3Sci, we have uncovered a circadian transcriptional axis that determines the period and robustness of behavioral and SCN molecular rhythms. PMID:26232227

  1. Insights into the Pathogenesis of Anaplastic Large-Cell Lymphoma through Genome-wide DNA Methylation Profiling

    Directory of Open Access Journals (Sweden)

    Melanie R. Hassler

    2016-10-01

    Aberrant DNA methylation patterns in malignant cells allow insight into tumor evolution and development and can be used for disease classification. Here, we describe the genome-wide DNA methylation signatures of NPM-ALK-positive (ALK+) and NPM-ALK-negative (ALK−) anaplastic large-cell lymphoma (ALCL). We find that ALK+ and ALK− ALCL share common DNA methylation changes for genes involved in T cell differentiation and immune response, including TCR and CTLA-4, without an ALK-specific impact on tumor DNA methylation in gene promoters. Furthermore, we uncover a close relationship between global ALCL DNA methylation patterns and those in distinct thymic developmental stages and observe tumor-specific DNA hypomethylation in regulatory regions that are enriched for conserved transcription factor binding motifs such as AP1. Our results indicate similarity between ALCL tumor cells and thymic T cell subsets and a direct relationship between ALCL oncogenic signaling and DNA methylation through transcription factor induction and occupancy.

  2. The crystal structure of the Sox4 HMG domain-DNA complex suggests a mechanism for positional interdependence in DNA recognition.

    Science.gov (United States)

    Jauch, Ralf; Ng, Calista K L; Narasimhan, Kamesh; Kolatkar, Prasanna R

    2012-04-01

    It has recently been proposed that the sequence preferences of DNA-binding TFs (transcription factors) can be well described by models that include the positional interdependence of the nucleotides of the target sites. Such binding models allow for multiple motifs to be invoked, such as principal and secondary motifs differing at two or more nucleotide positions. However, the structural mechanisms underlying the accommodation of such variant motifs by TFs remain elusive. In the present study we examine the crystal structure of the HMG (high-mobility group) domain of Sox4 [Sry (sex-determining region on the Y chromosome)-related HMG box 4] bound to DNA. By comparing this structure with previously solved structures of Sox17 and Sox2, we observed subtle conformational differences at the DNA-binding interface. Furthermore, using quantitative electrophoretic mobility-shift assays we validated the positional interdependence of two nucleotides and the presence of a secondary Sox motif in the affinity landscape of Sox4. These results suggest that a concerted rearrangement of two interface amino acids enables Sox4 to accommodate primary and secondary motifs. The structural adaptations lead to altered dinucleotide preferences that mutually reinforce each other. These analyses underline the complexity of the DNA recognition by TFs and provide an experimental validation for the conceptual framework of positional interdependence and secondary binding motifs.

  3. UvrD in Deinococcus radiodurans is optimized for processing G-quadruplex DNA

    International Nuclear Information System (INIS)

    Das, Anubrata; Misra, H.S.

    2015-01-01

    Deinococcus radiodurans R1 is a radiation-resistant Gram-positive bacterium capable of tolerating very high doses of DNA-damaging agents such as gamma radiation (D10 ∼ 12 kGy), desiccation (∼ 5% relative humidity), UVC radiation (D10 ∼ 800 J/m²) and hydrogen peroxide (40 mM). It achieves this by using a complex regulatory mechanism and novel proteins. Recent bioinformatic analysis showed several stretches of guanine runs in the D. radiodurans genome, which could form G-quartets. The role of G-quartets in regulatory processes is well documented in various organisms. The presence of G-quartets in D. radiodurans means that there are regulatory or structural proteins that bind to these elements. Several proteins are known to bind G-quartets. Finding the proteins that bind G4 DNA is difficult, as no specific motifs are available for binding these elements. Also, most of the proteins known to bind G-quadruplex DNA are eukaryotic. To overcome these challenges we defined a set of known G-quadruplex binding proteins and used a Smith-Waterman algorithm with our own scoring matrix to find homologs of G-quadruplex binding proteins in D. radiodurans. Using bioinformatics analysis, we showed that UvrD (DR1775) of D. radiodurans has the ability to bind and translocate along G-quadruplex DNA, a novel feature in prokaryotes. The translocase activity of DR1775 is ATP specific and this ATPase activity is attenuated by ssDNA. Data supporting UvrD of D. radiodurans as a G-quadruplex DNA metabolizing protein will be presented. (author)
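
    The Smith-Waterman search mentioned in this abstract is a local-alignment dynamic program whose behaviour is set by the substitution scores and the gap penalty. The sketch below is illustrative only: the authors' scoring matrix is not given in the record, so `toy_score` (match +3, mismatch −3) and the linear gap penalty of −2 are assumptions, and the function returns only the best score rather than the alignment itself.

```python
def smith_waterman(a, b, score, gap=-2):
    """Return the best local alignment score between sequences a and b.

    score: function (x, y) -> int, the substitution score for residues x, y.
    gap:   linear gap penalty (a negative number); an assumption of this sketch.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            H[i][j] = max(
                0,                                        # start a new alignment
                H[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # match or mismatch
                H[i - 1][j] + gap,                        # gap in b
                H[i][j - 1] + gap,                        # gap in a
            )
            best = max(best, H[i][j])
    return best

def toy_score(x, y):
    """Hypothetical scoring: +3 for identical residues, -3 otherwise."""
    return 3 if x == y else -3

# The shared core "ACGT" scores 4 * 3 = 12; the flanks never match.
print(smith_waterman("XXACGTXX", "YYACGTYY", toy_score))  # → 12
```

    Swapping `toy_score` for a lookup into a custom residue-by-residue table is the point at which a tailored scoring matrix, such as the one the abstract describes, would enter.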

  4. Functional and structural analysis of the DNA sequence conferring glucocorticoid inducibility to the mouse mammary tumor virus gene

    International Nuclear Information System (INIS)

    Skroch, P.

    1987-05-01

    In the first part of my thesis I show that the DNA element conferring glucocorticoid inducibility to the Mouse Mammary Tumor Virus (HRE) has enhancer properties. It activates a heterologous promoter, that of the β-globin gene, independently of distance, position and orientation. These properties, however, have to be regarded in relation to the remaining regulatory elements of the activated gene, as the recombinants between HRE and the TK gene have demonstrated. In the second part of my thesis I investigated the biological significance of certain sequence motifs of the HRE that are notable for their interaction with trans-acting factors or for sequence homologies with other regulatory DNA elements. I could confirm the generally postulated modular structure of enhancers for the HRE and relate the individual subdomains to the function of the element. (orig.) [de

  5. Engagement of Components of DNA-Break Repair Complex and NFκB in Hsp70A1A Transcription Upregulation by Heat Shock.

    Science.gov (United States)

    Hazra, Joyita; Mukherjee, Pooja; Ali, Asif; Poddar, Soumita; Pal, Mahadeb

    2017-01-01

    An involvement of components of DNA-break repair (DBR) complex including DNA-dependent protein kinase (DNA-PK) and poly-ADP-ribose polymerase 1 (PARP-1) in transcription regulation in response to distinct cellular signalling has been revealed by different laboratories. Here, we explored the involvement of DNA-PK and PARP-1 in the heat shock induced transcription of Hsp70A1A. We find that inhibition of both the catalytic subunit of DNA-PK (DNA-PKc), and Ku70, a regulatory subunit of DNA-PK holo-enzyme compromises transcription of Hsp70A1A under heat shock treatment. In immunoprecipitation based experiments we find that Ku70 or DNA-PK holoenzyme associates with NFκB. This NFκB associated complex also carries PARP-1. Downregulation of both NFκB and PARP-1 compromises Hsp70A1A transcription induced by heat shock treatment. Alteration of three bases by site directed mutagenesis within the consensus κB sequence motif identified on the promoter affected inducibility of Hsp70A1A transcription by heat shock treatment. These results suggest that NFκB engaged with the κB motif on the promoter cooperates in Hsp70A1A activation under heat shock in human cells as part of a DBR complex including DNA-PK and PARP-1.

  6. Arabidopsis DNA methyltransferase AtDNMT2 associates with histone deacetylase AtHD2s activity

    International Nuclear Information System (INIS)

    Song, Yuan; Wu, Keqiang; Dhaubhadel, Sangeeta; An, Lizhe; Tian, Lining

    2010-01-01

    DNA methyltransferase2 (DNMT2) is often regarded as enigmatic because it contains highly conserved DNA methyltransferase motifs but lacks DNA methylation catalytic capability. Here we show that Arabidopsis DNA methyltransferase2 (AtDNMT2) is localized in the nucleus and associates with histone deacetylation. Bimolecular fluorescence complementation and pull-down assays show that AtDNMT2 interacts with type-2 histone deacetylases (AtHD2s), a histone deacetylase family unique to plants. By analyzing the expression of an AtDNMT2:β-glucuronidase (GUS) fusion protein, we demonstrate that AtDNMT2 can repress gene expression at the transcriptional level. Meanwhile, the expression of the AtDNMT2 gene is altered in athd2c mutant plants. We propose that AtDNMT2 may be involved in histone deacetylation activity and the plant epigenetic regulatory network.

  7. Arabidopsis DNA methyltransferase AtDNMT2 associates with histone deacetylase AtHD2s activity

    Energy Technology Data Exchange (ETDEWEB)

    Song, Yuan [Key Laboratory of Arid and Grassland Agroecology, Ministry of Education, School of Life Science, Lanzhou University, Lanzhou 730000 (China); Southern Crop Protection and Food Research Centre, Agriculture and Agri-Food Canada, 1391 Sandford Street, London, ON, Canada N5V4T3 (Canada); Wu, Keqiang [Institute of Plant Biology, National Taiwan University, Taipei 106, Taiwan (China); Dhaubhadel, Sangeeta [Southern Crop Protection and Food Research Centre, Agriculture and Agri-Food Canada, 1391 Sandford Street, London, ON, Canada N5V4T3 (Canada); An, Lizhe, E-mail: lizhean@lzu.edu.cn [Key Laboratory of Arid and Grassland Agroecology, Ministry of Education, School of Life Science, Lanzhou University, Lanzhou 730000 (China); Tian, Lining, E-mail: tianl@agr.gc.ca [Southern Crop Protection and Food Research Centre, Agriculture and Agri-Food Canada, 1391 Sandford Street, London, ON, Canada N5V4T3 (Canada)

    2010-05-28

    DNA methyltransferase2 (DNMT2) is often regarded as enigmatic because it contains highly conserved DNA methyltransferase motifs but lacks DNA methylation catalytic capability. Here we show that Arabidopsis DNA methyltransferase2 (AtDNMT2) is localized in the nucleus and associates with histone deacetylation. Bimolecular fluorescence complementation and pull-down assays show that AtDNMT2 interacts with type-2 histone deacetylases (AtHD2s), a histone deacetylase family unique to plants. By analyzing the expression of an AtDNMT2:β-glucuronidase (GUS) fusion protein, we demonstrate that AtDNMT2 can repress gene expression at the transcriptional level. Meanwhile, the expression of the AtDNMT2 gene is altered in athd2c mutant plants. We propose that AtDNMT2 may be involved in histone deacetylation activity and the plant epigenetic regulatory network.

  8. Architecture of the 99 bp DNA-six-protein regulatory complex of the lambda att site.

    Science.gov (United States)

    Sun, Xingmin; Mierke, Dale F; Biswas, Tapan; Lee, Sang Yeol; Landy, Arthur; Radman-Livaja, Marta

    2006-11-17

    The highly directional and tightly regulated recombination reaction used to site-specifically excise the bacteriophage lambda chromosome out of its E. coli host chromosome requires the binding of six sequence-specific proteins to a 99 bp segment of the phage att site. To gain structural insights into this recombination pathway, we measured 27 FRET distances between eight points on the 99 bp regulatory DNA bound with all six proteins. Triangulation of these distances using a metric matrix distance-geometry algorithm provided coordinates for these eight points. The resulting path for the protein-bound regulatory DNA, which fits well with the genetics, biochemistry, and X-ray crystal structures describing the individual proteins and their interactions with DNA, provides a new structural perspective into the molecular mechanism and regulation of the recombination reaction and illustrates a design by which different families of higher-order complexes can be assembled from different numbers and combinations of the same few proteins.

  9. G = MAT: linking transcription factor expression and DNA binding data.

    Science.gov (United States)

    Tretyakov, Konstantin; Laur, Sven; Vilo, Jaak

    2011-01-31

    Transcription factors are proteins that bind to motifs on the DNA and thus affect gene expression regulation. The qualitative description of the corresponding processes is therefore important for a better understanding of essential biological mechanisms. However, wet lab experiments targeted at the discovery of the regulatory interplay between transcription factors and binding sites are expensive. We propose a new, purely computational method for finding putative associations between transcription factors and motifs. This method is based on a linear model that combines sequence information with expression data. We present various methods for model parameter estimation and show, via experiments on simulated data, that these methods are reliable. Finally, we examine the performance of this model on biological data and conclude that it can indeed be used to discover meaningful associations. The developed software is available as a web tool and Scilab source code at http://biit.cs.ut.ee/gmat/.
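
    The abstract describes a linear model combining sequence information with expression data, and the title suggests a matrix product. The sketch below assumes one reading of that title, which the record itself does not spell out: G (genes × conditions) is approximated by M (genes × motifs) times A (motifs × transcription factors) times T (transcription factors × conditions), with A holding the motif-factor associations. All matrices and function names here are hypothetical toy values; parameter estimation, which the paper addresses, is not attempted.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def predict_expression(M, A, T):
    """Predicted gene expression under the assumed model G = M A T."""
    return matmul(matmul(M, A), T)

def squared_error(G, G_hat):
    """Sum of squared differences between observed and predicted expression."""
    return sum((g - h) ** 2
               for row_g, row_h in zip(G, G_hat)
               for g, h in zip(row_g, row_h))

# Toy data: 3 genes, 2 motifs, 2 transcription factors, 2 conditions.
M = [[1, 0], [0, 1], [1, 1]]        # motif presence per gene
T = [[1, 2], [3, 1]]                # factor expression per condition
A_true = [[1, 0], [0, -1]]          # factor 1 activates via motif 1, factor 2 represses via motif 2
A_null = [[0, 0], [0, 0]]           # no associations at all

G = predict_expression(M, A_true, T)
print(G)
print(squared_error(G, predict_expression(M, A_null, T)))
```

    Comparing the squared error of candidate association matrices, as in the last line, is the quantity a least-squares fit of A would minimise under these assumptions.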

  10. G = MAT: Linking Transcription Factor Expression and DNA Binding Data

    Science.gov (United States)

    Tretyakov, Konstantin; Laur, Sven; Vilo, Jaak

    2011-01-01

    Transcription factors are proteins that bind to motifs on the DNA and thus affect gene expression regulation. The qualitative description of the corresponding processes is therefore important for a better understanding of essential biological mechanisms. However, wet lab experiments targeted at the discovery of the regulatory interplay between transcription factors and binding sites are expensive. We propose a new, purely computational method for finding putative associations between transcription factors and motifs. This method is based on a linear model that combines sequence information with expression data. We present various methods for model parameter estimation and show, via experiments on simulated data, that these methods are reliable. Finally, we examine the performance of this model on biological data and conclude that it can indeed be used to discover meaningful associations. The developed software is available as a web tool and Scilab source code at http://biit.cs.ut.ee/gmat/. PMID:21297945

  11. Hierarchical structure and modules in the Escherichia coli transcriptional regulatory network revealed by a new top-down approach

    Directory of Open Access Journals (Sweden)

    Buer Jan

    2004-12-01

    Abstract Background Cellular functions are coordinately carried out by groups of genes forming functional modules. Identifying such modules in the transcriptional regulatory network (TRN) of organisms is important for understanding the structure and function of these fundamental cellular networks and essential for the emerging modular biology. So far, the global connectivity structure of the TRN has not been well studied and consequently has not been applied to the identification of functional modules. Moreover, network motifs such as the feed forward loop have recently been proposed to be basic building blocks of the TRN. However, their relationship to functional modules is not clear. Results In this work we proposed a top-down approach to identify modules in the TRN of E. coli. By studying the global connectivity structure of the regulatory network, we first revealed a five-layer hierarchical structure in which all the regulatory relationships are downward. Based on this regulatory hierarchy, we developed a new method to decompose the regulatory network into functional modules and to identify global regulators governing multiple modules. As a result, 10 global regulators and 39 modules were identified and shown to have well defined functions. We then investigated the distribution and composition of the two basic network motifs (feed forward loop and bi-fan motif) in the hierarchical structure of the TRN. We found that most of these network motifs include global regulators, indicating that these motifs are not basic building blocks of modules since modules should not contain global regulators. Conclusion The transcriptional regulatory network of E. coli possesses a multi-layer hierarchical modular structure without feedback regulation at transcription level. This hierarchical structure builds the basis for a new and simple decomposition method which is suitable for the identification of functional modules and global regulators in the transcriptional regulatory network of E
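
    A hierarchy in which every regulatory relationship points downward can be built for any network without feedback by peeling it from the bottom. The sketch below shows one simple way to do this and is not claimed to be the authors' procedure: the function `assign_layers`, the input format and the gene names are hypothetical, and self-regulation is ignored by assumption.

```python
def assign_layers(regulates):
    """Assign each node a layer so that regulators sit above all their targets.

    regulates: dict mapping regulator -> set of target names.
    Returns a dict node -> layer number, where layer 1 is the bottom.
    Raises ValueError if the network contains a feedback loop.
    """
    nodes = set(regulates) | {t for ts in regulates.values() for t in ts}
    # Drop self-loops (autoregulation) so they do not block the peeling.
    targets = {n: {t for t in regulates.get(n, ()) if t != n} for n in nodes}
    layer = {}
    remaining = set(nodes)
    level = 1
    while remaining:
        # Nodes whose targets have all been placed in lower layers already.
        current = {n for n in remaining if targets[n] <= set(layer)}
        if not current:
            raise ValueError("network contains a feedback loop")
        for n in current:
            layer[n] = level
        remaining -= current
        level += 1
    return layer

# Hypothetical toy network: tfA -> {tfB, geneY}, tfB -> {geneX}.
toy_network = {"tfA": {"tfB", "geneY"}, "tfB": {"geneX"}}
print(assign_layers(toy_network))
```

    In this toy case geneX and geneY land in layer 1, tfB in layer 2 and tfA in layer 3, so every edge runs from a higher layer to a lower one.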

  12. A deeper look into transcription regulatory code by preferred pair distance templates for transcription factor binding sites

    KAUST Repository

    Kulakovskiy, Ivan V.

    2011-08-18

    Motivation: Modern experimental methods provide substantial information on protein-DNA recognition. Studying arrangements of transcription factor binding sites (TFBSs) of interacting transcription factors (TFs) advances understanding of the transcription regulatory code. Results: We constructed binding motifs for TFs forming a complex with HIF-1α at the erythropoietin 3'-enhancer. Corresponding TFBSs were predicted in the segments around transcription start sites (TSSs) of all human genes. Using the genome-wide set of regulatory regions, we observed several strongly preferred distances between the hypoxia-responsive element (HRE) and binding sites of a particular cofactor protein. The set of preferred distances was termed a preferred pair distance template (PPDT). The PPDT depended strongly on the TF and on the orientation of its binding sites relative to the HRE. The PPDT evaluated from the genome-wide set of regulatory sequences was used to detect significant PPDT-consistent binding site pairs in regulatory regions of hypoxia-responsive genes. We believe PPDT can help to reveal the layout of eukaryotic regulatory segments. © The Author 2011. Published by Oxford University Press. All rights reserved.
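
    The starting point for a preferred pair distance template is a tally of how often each HRE-to-cofactor distance occurs across regulatory regions. The sketch below covers only that tally and is an assumption-laden illustration: the function `pair_distance_counts`, the region names, the site positions and the 50 bp window are hypothetical, and the orientation handling and significance testing described in the abstract are left out.

```python
from collections import Counter

def pair_distance_counts(hre_sites, cofactor_sites, max_dist=50):
    """Count signed distances between HRE and cofactor binding sites.

    hre_sites, cofactor_sites: dict mapping region name -> list of positions.
    A positive distance means the cofactor site lies downstream of the HRE.
    Pairs further apart than max_dist are ignored.
    """
    counts = Counter()
    for region, hre_positions in hre_sites.items():
        for h in hre_positions:
            for c in cofactor_sites.get(region, ()):
                d = c - h
                if d != 0 and abs(d) <= max_dist:
                    counts[d] += 1
    return counts

# Hypothetical predicted sites in three regulatory regions.
hre = {"geneA": [100], "geneB": [40], "geneC": [10]}
cofactor = {"geneA": [112, 300], "geneB": [52], "geneC": [5]}

counts = pair_distance_counts(hre, cofactor)
print(counts.most_common(1))  # → [(12, 2)]
```

    Distances that recur far more often than expected by chance in such a tally would be the candidates for a template; deciding what counts as "more often than expected" needs a background model not shown here.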

  13. Parallel motif extraction from very long sequences

    KAUST Repository

    Sahli, Majed

    2013-01-01

    Motifs are frequent patterns used to identify biological functionality in genomic sequences, periodicity in time series, or user trends in web logs. In contrast to a lot of existing work that focuses on collections of many short sequences, modern applications require mining of motifs in one very long sequence (i.e., in the order of several gigabytes). For this case, there exist statistical approaches that are fast but inaccurate; or combinatorial methods that are sound and complete. Unfortunately, existing combinatorial methods are serial and very slow. Consequently, they are limited to very short sequences (i.e., a few megabytes), small alphabets (typically 4 symbols for DNA sequences), and restricted types of motifs. This paper presents ACME, a combinatorial method for extracting motifs from a single very long sequence. ACME arranges the search space in contiguous blocks that take advantage of the cache hierarchy in modern architectures, and achieves almost an order of magnitude performance gain in serial execution. It also decomposes the search space in a smart way that allows scalability to thousands of processors with more than 90% speedup. ACME is the only method that: (i) scales to gigabyte-long sequences; (ii) handles large alphabets; (iii) supports interesting types of motifs with minimal additional cost; and (iv) is optimized for a variety of architectures such as multi-core systems, clusters in the cloud, and supercomputers. ACME reduces the extraction time for an exact-length query from 4 hours to 7 minutes on a typical workstation; handles 3 orders of magnitude longer sequences; and scales up to 16,384 cores on a supercomputer. Copyright is held by the owner/author(s).

  14. The regulatory effects of low-dose ionizing radiation on Ikaros-autotaxin interaction

    Energy Technology Data Exchange (ETDEWEB)

    Kang, Hana; Cho, Seong Jun; Kim, Sung Jin; Nam, Seon Young; Yang, Kwang Hee [KHNP Radiation Health Institute, Korea Hydro and Nuclear Power Co, Seoul (Korea, Republic of)

    2016-11-15

    Ikaros, a transcription factor containing a zinc-finger motif, is known as a critical regulator of hematopoiesis in the immune system. Ikaros protein modulates the transcription of target genes by binding to the regulatory elements of the gene promoters. However, the regulatory function of Ikaros in cellular compartments other than the nucleus remains to be determined. This study explored the radiation-induced modulatory function of Ikaros in the cytoplasm. The results showed that Ikaros protein lost its DNA binding ability after LDIR (low-dose ionizing radiation) exposure. Cell fractionation and Western blot analysis showed that Ikaros protein was translocated from the nucleus into the cytoplasm by LDIR. This was confirmed by immunofluorescence assay. Through in vitro protein-binding screening, we identified Autotaxin as a novel protein that potentially interacts with Ikaros. Co-immunoprecipitation assay revealed that Ikaros and Autotaxin are able to bind each other. Autotaxin is a crucial enzyme generating lysophosphatidic acid (LPA), a phospholipid mediator, which has potential regulatory effects on immune cell growth and motility. Our results indicate that LDIR potentially regulates the immune system via protein-protein interaction of Ikaros and Autotaxin.

  15. Sequence-specific high mobility group box factors recognize 10-12-base pair minor groove motifs

    DEFF Research Database (Denmark)

    van Beest, M; Dooijes, D; van De Wetering, M

    2000-01-01

    Sequence-specific high mobility group (HMG) box factors bind and bend DNA via interactions in the minor groove. Three-dimensional NMR analyses have provided the structural basis for this interaction. The cognate HMG domain DNA motif is generally believed to span 6-8 bases. However, alignment...

  16. Mutational analysis of the RecJ exonuclease of Escherichia coli: identification of phosphoesterase motifs.

    Science.gov (United States)

    Sutera, V A; Han, E S; Rajman, L A; Lovett, S T

    1999-10-01

    The recJ gene, identified in Escherichia coli, encodes a Mg(2+)-dependent 5'-to-3' exonuclease with high specificity for single-strand DNA. Genetic and biochemical experiments implicate RecJ exonuclease in homologous recombination, base excision, and methyl-directed mismatch repair. Genes encoding proteins with strong similarities to RecJ have been found in every eubacterial genome sequenced to date, with the exception of Mycoplasma and Mycobacterium tuberculosis. Multiple genes encoding proteins similar to RecJ are found in some eubacteria, including Bacillus and Helicobacter, and in the archaea. Among this divergent set of sequences, seven conserved motifs emerge. We demonstrate here that amino acids within six of these motifs are essential for both the biochemical and genetic functions of E. coli RecJ. These motifs may define interactions with Mg(2+) ions or substrate DNA. A large family of proteins more distantly related to RecJ is present in archaea, eubacteria, and eukaryotes, including a hypothetical protein in the MgPa adhesin operon of Mycoplasma, a domain of putative polyA polymerases in Synechocystis and Aquifex, PRUNE of Drosophila, and an exopolyphosphatase (PPX1) of Saccharomyces cerevisiae. Because these six RecJ motifs are shared between exonucleases and exopolyphosphatases, they may constitute an ancient phosphoesterase domain now found in all kingdoms of life.

  17. The BsaHI restriction-modification system: Cloning, sequencing and analysis of conserved motifs

    Directory of Open Access Journals (Sweden)

    Roberts Richard J

    2008-05-01

    Abstract Background Restriction and modification enzymes typically recognise short DNA sequences of between two and eight bases in length. Understanding the mechanism of this recognition represents a significant challenge that we begin to address for the BsaHI restriction-modification system, which recognises the six-base sequence GRCGYC. Results The DNA sequences of the genes for the BsaHI methyltransferase, bsaHIM, and restriction endonuclease, bsaHIR, have been determined (GenBank accession #EU386360), cloned and expressed in E. coli. Both the restriction endonuclease and methyltransferase enzymes share significant similarity with a group of 6 other enzymes comprising the restriction-modification systems HgiDI and HgiGI and the putative HindVP, NlaCORFDP, NpuORFC228P and SplZORFNP restriction-modification systems. A sequence alignment of these homologues shows that their amino acid sequences are largely conserved and highlights several motifs of interest. We target one such conserved motif, reading SPERRFD, at the C-terminal end of the bsaHIR gene. A mutational analysis of these amino acids indicates that the motif is crucial for enzymatic activity. Sequence alignment of the methyltransferase gene reveals a short motif within the target recognition domain that is conserved among enzymes recognising the same sequences. Thus, this motif may be used as a diagnostic tool to define the recognition sequences of the cytosine C5 methyltransferases. Conclusion We have cloned and sequenced the BsaHI restriction and modification enzymes. We have identified a region of the R.BsaHI enzyme that is crucial for its activity. Analysis of the amino acid sequence of the BsaHI methyltransferase enzyme led us to propose two new motifs that can be used in the diagnosis of the recognition sequence of the cytosine C5-methyltransferases.
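
    The recognition sequence GRCGYC uses IUPAC ambiguity codes, where R stands for a purine (A or G) and Y for a pyrimidine (C or T), so it covers four concrete hexamers. The sketch below expands such a pattern into a regular expression and scans a sequence for matches; the function names and the test sequence are hypothetical and are not taken from the paper.

```python
import re

# Standard IUPAC nucleotide ambiguity codes as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def site_regex(pattern):
    """Translate an IUPAC recognition sequence into a regex string."""
    return "".join(IUPAC[ch] for ch in pattern.upper())

def find_sites(pattern, dna):
    """Return 0-based start positions of all matches, overlaps included."""
    # A lookahead lets overlapping occurrences be reported as well.
    rx = re.compile("(?=" + site_regex(pattern) + ")")
    return [m.start() for m in rx.finditer(dna.upper())]

print(site_regex("GRCGYC"))                        # → G[AG]CG[CT]C
print(find_sites("GRCGYC", "TTGACGTCAAGGCGCCTT"))  # → [2, 10]
```

    Because the reverse complement of GRCGYC is again GRCGYC, scanning one strand is enough to locate every site for this particular pattern; a non-palindromic pattern would need a second pass on the reverse complement.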

  18. Sasquatch: predicting the impact of regulatory SNPs on transcription factor binding from cell- and tissue-specific DNase footprints.

    Science.gov (United States)

    Schwessinger, Ron; Suciu, Maria C; McGowan, Simon J; Telenius, Jelena; Taylor, Stephen; Higgs, Doug R; Hughes, Jim R

    2017-10-01

    In the era of genome-wide association studies (GWAS) and personalized medicine, predicting the impact of single nucleotide polymorphisms (SNPs) in regulatory elements is an important goal. Current approaches to determine the potential of regulatory SNPs depend on inadequate knowledge of cell-specific DNA binding motifs. Here, we present Sasquatch, a new computational approach that uses DNase footprint data to estimate and visualize the effects of noncoding variants on transcription factor binding. Sasquatch performs a comprehensive k-mer-based analysis of DNase footprints to determine any k-mer's potential for protein binding in a specific cell type and how this may be changed by sequence variants. Therefore, Sasquatch uses an unbiased approach, independent of known transcription factor binding sites and motifs. Sasquatch only requires a single DNase-seq data set per cell type, from any genotype, and produces consistent predictions from data generated by different experimental procedures and at different sequence depths. Here we demonstrate the effectiveness of Sasquatch using previously validated functional SNPs and benchmark its performance against existing approaches. Sasquatch is available as a versatile webtool incorporating publicly available data, including the human ENCODE collection. Thus, Sasquatch provides a powerful tool and repository for prioritizing likely regulatory SNPs in the noncoding genome. © 2017 Schwessinger et al.; Published by Cold Spring Harbor Laboratory Press.

  19. G = MAT: linking transcription factor expression and DNA binding data.

    Directory of Open Access Journals (Sweden)

    Konstantin Tretyakov

    Transcription factors are proteins that bind to motifs on the DNA and thus affect gene expression regulation. The qualitative description of the corresponding processes is therefore important for a better understanding of essential biological mechanisms. However, wet lab experiments targeted at the discovery of the regulatory interplay between transcription factors and binding sites are expensive. We propose a new, purely computational method for finding putative associations between transcription factors and motifs. This method is based on a linear model that combines sequence information with expression data. We present various methods for model parameter estimation and show, via experiments on simulated data, that these methods are reliable. Finally, we examine the performance of this model on biological data and conclude that it can indeed be used to discover meaningful associations. The developed software is available as a web tool and Scilab source code at http://biit.cs.ut.ee/gmat/.

  20. Manipulation of EphB2 regulatory motifs and SH2 binding sites switches MAPK signaling and biological activity.

    Science.gov (United States)

    Tong, Jiefei; Elowe, Sabine; Nash, Piers; Pawson, Tony

    2003-02-21

    Signaling by the Eph family of receptor tyrosine kinases (RTKs) is complex, because they can interact with a variety of intracellular targets, and can potentially induce distinct responses in different cell types. In NG108 neuronal cells, activated EphB2 recruits p120RasGAP, in a fashion that is associated with down-regulation of the Ras-Erk mitogen-activated kinase (MAPK) pathway and neurite retraction. To pursue the role of the Ras-MAPK pathway in EphB2-mediated growth cone collapse, and to explore the biochemical and biological functions of Eph receptors, we sought to re-engineer the signaling properties of EphB2 by manipulating its regulatory motifs and SH2 binding sites. An EphB2 mutant that retained juxtamembrane (JM) RasGAP binding sites but incorporated a Grb2 binding motif at an alternate RasGAP binding site within the kinase domain had little effect on basal Erk MAPK activation. In contrast, elimination of all RasGAP binding sites, accompanied by the addition of a Grb2 binding site within the kinase domain, led to an increase in phospho-Erk levels in NG108 cells following ephrin-B1 stimulation. Functional assays indicated a correlation between neurite retraction and the ability of the EphB2 mutants to down-regulate Ras-Erk MAPK signaling. These data suggest that EphB2 can be designed to repress, stabilize, or activate the Ras-Erk MAPK pathway by the manipulation of RasGAP and Grb2 SH2 domain binding sites and support the notion that Erk MAPK regulation plays a significant role in axon guidance. The behavior of EphB2 variants with mutations in the JM region and kinase domains suggests an intricate pattern of regulation and target recognition by Eph receptors.

  1. Poxvirus uracil-DNA glycosylase-An unusual member of the family I uracil-DNA glycosylases: Poxvirus Uracil-DNA Glycosylase

    Energy Technology Data Exchange (ETDEWEB)

    Schormann, Norbert [Department of Medicine, University of Alabama at Birmingham, Birmingham Alabama 35294; Zhukovskaya, Natalia [Department of Microbiology, School of Dental Medicine, University of Pennsylvania, Philadelphia Pennsylvania 19104; Bedwell, Gregory [Department of Microbiology, University of Alabama at Birmingham, Birmingham Alabama 35294; Nuth, Manunya [Department of Microbiology, School of Dental Medicine, University of Pennsylvania, Philadelphia Pennsylvania 19104; Gillilan, Richard [MacCHESS (Macromolecular Diffraction Facility at CHESS) Cornell University, Ithaca New York 14853; Prevelige, Peter E. [Department of Microbiology, University of Alabama at Birmingham, Birmingham Alabama 35294; Ricciardi, Robert P. [Department of Microbiology, School of Dental Medicine, University of Pennsylvania, Philadelphia Pennsylvania 19104; Abramson Cancer Center, School of Medicine, University of Pennsylvania, Philadelphia Pennsylvania 19104; Banerjee, Surajit [Department of Chemistry and Chemical Biology, Cornell University, and NE-CAT Argonne Illinois 60439; Chattopadhyay, Debasish [Department of Medicine, University of Alabama at Birmingham, Birmingham Alabama 35294

    2016-11-02

    Uracil-DNA glycosylases are ubiquitous enzymes, which play a key role in repairing damage in DNA and in maintaining genomic integrity by catalyzing the first step in the base excision repair pathway. Within the superfamily of uracil-DNA glycosylases, family I enzymes or UNGs are specific for recognizing and removing uracil from DNA. These enzymes feature conserved structural folds and active site residues, and use common motifs for DNA binding, uracil recognition and catalysis. Within this family the enzymes of poxviruses are unique and most remarkable in terms of amino acid sequences, characteristic motifs and, more importantly, their novel non-enzymatic function in DNA replication. UNG of vaccinia virus, also known as D4, is the most extensively characterized UNG of the poxvirus family. D4 forms an unusual heterodimeric processivity factor by attaching to a poxvirus-specific protein A20, which also binds to the DNA polymerase E9 and recruits other proteins necessary for replication. D4 is thus integrated in the DNA polymerase complex, and its DNA-binding and DNA scanning abilities couple DNA processivity and DNA base excision repair at the replication fork. The adaptations necessary for taking on the new function are reflected in the amino acid sequence and the three-dimensional structure of D4. We provide an overview of the current state of the knowledge on the structure-function relationship of D4.

  2. A saturation screen for cis-acting regulatory DNA in the Hox genes of Ciona intestinalis

    Energy Technology Data Exchange (ETDEWEB)

    Keys, David N.; Lee, Byung-in; Di Gregorio, Anna; Harafuji, Naoe; Detter, Chris; Wang, Mei; Kahsai, Orsalem; Ahn, Sylvia; Arellano, Andre; Zhang, Quin; Trong, Stephan; Doyle, Sharon A.; Satoh, Noriyuki; Satou, Yutaka; Saiga, Hidetoshi; Christian, Allen; Rokhsar, Dan; Hawkins, Trevor L.; Levine, Mike; Richardson, Paul

    2005-01-05

    A screen for the systematic identification of cis-regulatory elements within large (>100 kb) genomic domains containing Hox genes was performed by using the basal chordate Ciona intestinalis. Randomly generated DNA fragments from bacterial artificial chromosomes containing two clusters of Hox genes were inserted into a vector upstream of a minimal promoter and lacZ reporter gene. A total of 222 resultant fusion genes were separately electroporated into fertilized eggs, and their regulatory activities were monitored in larvae. In sum, 21 separable cis-regulatory elements were found. These include eight Hox linked domains that drive expression in nested anterior-posterior domains of ectodermally derived tissues. In addition to vertebrate-like CNS regulation, the discovery of cis-regulatory domains that drive epidermal transcription suggests that C. intestinalis has arthropod-like Hox patterning in the epidermis.

  3. Defining the plasticity of transcription factor binding sites by Deconstructing DNA consensus sequences: the PhoP-binding sites among gamma/enterobacteria.

    Directory of Open Access Journals (Sweden)

    Oscar Harari

    2010-07-01

    Transcriptional regulators recognize specific DNA sequences. Because these sequences are embedded in the background of genomic DNA, it is hard to identify the key cis-regulatory elements that determine disparate patterns of gene expression. The detection of the intra- and inter-species differences among these sequences is crucial for understanding the molecular basis of both differential gene expression and evolution. Here, we address this problem by investigating the target promoters controlled by the DNA-binding PhoP protein, which governs virulence and Mg(2+) homeostasis in several bacterial species. PhoP is particularly interesting; it is highly conserved in different gamma/enterobacteria, regulating not only ancestral genes but also governing the expression of dozens of horizontally acquired genes that differ from species to species. Our approach consists of decomposing the DNA binding site sequences for a given regulator into families of motifs (i.e., termed submotifs) using a machine learning method inspired by the "Divide & Conquer" strategy. By partitioning a motif into sub-patterns, computational advantages for classification were produced, resulting in the discovery of new members of a regulon, and alleviating the problem of distinguishing functional sites in chromatin immunoprecipitation and DNA microarray genome-wide analysis. Moreover, we found that certain partitions were useful in revealing biological properties of binding site sequences, including modular gains and losses of PhoP binding sites through evolutionary turnover events, as well as conservation in distant species. The high conservation of PhoP submotifs within gamma/enterobacteria, as well as the regulatory protein that recognizes them, suggests that the major cause of divergence between related species is not due to the binding sites, as was previously suggested for other regulators. Instead, the divergence may be attributed to the fast evolution of orthologous target

  4. A Simple Decision Rule for Recognition of Poly(A) Tail Signal Motifs in Human Genome

    KAUST Repository

    AbouEisha, Hassan M.

    2015-05-12

    Background: Numerous attempts have been made to predict motifs in genomic sequences that correspond to poly(A) tail signals. A large portion of this effort has been directed to a plethora of nonlinear classification methods. Even when such approaches yield good discriminant results, identifying dominant features of regulatory mechanisms nevertheless remains a challenge. In this work, we look at decision rules that may help in identifying such features. Findings: We present a simple decision rule for classification of candidate poly(A) tail signal motifs in human genomic sequence, obtained by evaluating features during the construction of gradient boosted trees. We found that values of a single feature, based on the frequency of adenine in the genomic sequence surrounding the candidate signal and the number of consecutive adenine molecules in a well-defined region immediately following the motif, display good discriminative potential in classification of poly(A) tail motifs for samples covered by the rule. Conclusions: The resulting simple rule can be used as an efficient filter in the construction of more complex poly(A) tail motif classification algorithms.
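The kind of rule described in this abstract can be illustrated with a short sketch. It is not the authors' rule: the abstract does not give the window sizes, the thresholds, or how the two adenine measurements are combined, so `flank`, `downstream`, `min_a_freq`, `min_a_run`, and the AND combination below are all hypothetical choices made only for illustration.

```python
def adenine_features(seq, motif_start, motif_len=6, flank=50, downstream=20):
    """Return (adenine frequency around the candidate, longest A-run just after it).

    All window sizes are hypothetical; the source abstract does not state them.
    """
    seq = seq.upper()
    motif_end = motif_start + motif_len
    left = max(0, motif_start - flank)
    # Sequence surrounding the candidate signal, excluding the signal itself.
    window = seq[left:motif_start] + seq[motif_end:motif_end + flank]
    a_freq = window.count("A") / len(window) if window else 0.0
    # Longest run of consecutive adenines in the region right after the motif.
    longest = run = 0
    for base in seq[motif_end:motif_end + downstream]:
        run = run + 1 if base == "A" else 0
        longest = max(longest, run)
    return a_freq, longest


def passes_rule(seq, motif_start, min_a_freq=0.30, min_a_run=3):
    """Hypothetical filter: keep candidates that are A-rich and followed by an A-run."""
    a_freq, longest = adenine_features(seq, motif_start)
    return a_freq >= min_a_freq and longest >= min_a_run


if __name__ == "__main__":
    candidate = "GCGC" + "AATAAA" + "AAAATTTT"
    print(adenine_features(candidate, 4), passes_rule(candidate, 4))
```

A filter of this shape is cheap to evaluate, which is the reason the abstract suggests using such a rule ahead of a more complex classifier.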

  5. Nanomechanical DNA origami pH sensors.

    Science.gov (United States)

    Kuzuya, Akinori; Watanabe, Ryosuke; Yamanaka, Yusei; Tamaki, Takuya; Kaino, Masafumi; Ohya, Yuichi

    2014-10-16

    Single-molecule pH sensors have been developed by utilizing molecular imaging of pH-responsive shape transition of nanomechanical DNA origami devices with atomic force microscopy (AFM). Short DNA fragments that can form i-motifs were introduced to nanomechanical DNA origami devices with pliers-like shape (DNA Origami Pliers), which consist of two levers 170 nm long and 20 nm wide connected at a Holliday-junction fulcrum. DNA Origami Pliers can be observed in three distinct forms: cross, antiparallel, and parallel. The cross form is the dominant species when no additional interaction is introduced to DNA Origami Pliers. Introduction of nine pairs of 12-mer sequence (5'-AACCCCAACCCC-3'), which dimerize into i-motif quadruplexes upon protonation of cytosine, drives transition of DNA Origami Pliers from open cross form into closed parallel form under acidic conditions. Such pH-dependent transition was clearly imaged on mica in molecular resolution by AFM, showing potential application of the system to single-molecular pH sensors.

  6. Feedback loops and reciprocal regulation: recurring motifs in the systems biology of the cell cycle

    OpenAIRE

    Ferrell, James E.

    2013-01-01

    The study of eukaryotic cell cycle regulation over the last several decades has led to a remarkably detailed understanding of the complex regulatory system that drives this fundamental process. This allows us to now look for recurring motifs in the regulatory system. Among these are negative feedback loops, which underpin checkpoints and generate cell cycle oscillations; positive feedback loops, which promote oscillations and make cell cycle transitions switch-like and unidirectional; and rec...

  7. Memetic algorithms for de novo motif-finding in biomedical sequences.

    Science.gov (United States)

    Bi, Chengpeng

    2012-09-01

    The objectives of this study are to design and implement a new memetic algorithm for de novo motif discovery, which is then applied to detect important signals hidden in various biomedical molecular sequences. In this paper, memetic algorithms are developed and tested in de novo motif-finding problems. Several strategies are employed in the algorithm design, not only to explore the multiple sequence local alignment space efficiently but also to uncover the molecular signals effectively. As a result, there are a number of key features in the implementation of the memetic motif-finding algorithm (MaMotif), including a chromosome replacement operator, a chromosome alteration-aware local search operator, a truncated local search strategy, and a stochastic operation of local search imposed on individual learning. To test the new algorithm, we compare MaMotif with a few other similar algorithms using simulated and experimental data including genomic DNA, primary microRNA sequences (let-7 family), and transmembrane protein sequences. The new memetic motif-finding algorithm is successfully implemented in C++, and exhaustively tested with various simulated and real biological sequences. In the simulation, MaMotif is the most time-efficient algorithm compared with the others: it runs 2 times faster than the expectation maximization (EM) method and 16 times faster than the genetic algorithm-based EM hybrid. In both simulated and experimental testing, results show that the new algorithm compares favorably with, or is superior to, other algorithms. Notably, MaMotif is able to successfully discover the transcription factors' binding sites in the chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) data, correctly uncover the RNA splicing signals in gene expression, and precisely find the highly conserved helix motif in the transmembrane protein sequences, as well as rightly detect the palindromic segments in the primary micro

  8. Role of specific cations and water entropy on the stability of branched DNA motif structures.

    Science.gov (United States)

    Pascal, Tod A; Goddard, William A; Maiti, Prabal K; Vaidehi, Nagarajan

    2012-10-11

    DNA three-way junctions (TWJs) are important intermediates in various cellular processes and are the simplest of a family of branched nucleic acids being considered as scaffolds for biomolecular nanotechnology. Branched nucleic acids are stabilized by divalent cations such as Mg(2+), presumably due to condensation and neutralization of the negatively charged DNA backbone. However, electrostatic screening effects point to more complex solvation dynamics and a large role of interfacial waters in thermodynamic stability. Here, we report extensive computer simulations in explicit water and salt on a model TWJ and use free energy calculations to quantify the role of ionic character and strength on stability. We find that enthalpic stabilization of the first and second hydration shells by Mg(2+) accounts for 1/3 and all of the free energy gain in 50% and pure MgCl(2) solutions, respectively. The more distorted DNA molecule is actually destabilized in pure MgCl(2) compared to pure NaCl. Notably, the first shell, interfacial waters have very low translational and rotational entropy (i.e., mobility) compared to the bulk, an entropic loss that is overcompensated by increased enthalpy from additional electrostatic interactions with Mg(2+). In contrast, the second hydration shell has anomalously high entropy as it is trapped between an immobile and bulklike layer. The nonmonotonic entropic signature and long-range perturbations of the hydration shells to Mg(2+) may have implications in the molecular recognition of these motifs. For example, we find that low salt stabilizes the parallel configuration of the three-way junction, whereas at normal salt we find antiparallel configurations deduced from the NMR. We use the 2PT analysis to follow the thermodynamics of this transition and find that the free energy barrier is dominated by entropic effects that result from the decreased surface area of the antiparallel form which has a smaller number of low entropy waters in the first

  9. GPUmotif: an ultra-fast and energy-efficient motif analysis program using graphics processing units.

    Science.gov (United States)

    Zandevakili, Pooya; Hu, Ming; Qin, Zhaohui

    2012-01-01

    Computational detection of TF binding patterns has become an indispensable tool in functional genomics research. With the rapid advance of new sequencing technologies, large amounts of protein-DNA interaction data have been produced. Analyzing this data can provide substantial insight into the mechanisms of transcriptional regulation. However, the massive amount of sequence data presents daunting challenges. In our previous work, we have developed a novel algorithm called Hybrid Motif Sampler (HMS) that enables more scalable and accurate motif analysis. Despite much improvement, HMS is still time-consuming due to the requirement to calculate matching probabilities position-by-position. Using the NVIDIA CUDA toolkit, we developed a graphics processing unit (GPU)-accelerated motif analysis program named GPUmotif. We proposed a "fragmentation" technique to hide data transfer time between memories. Performance comparison studies showed that commonly-used model-based motif scan and de novo motif finding procedures such as HMS can be dramatically accelerated when running GPUmotif on NVIDIA graphics cards. As a result, energy consumption can also be greatly reduced when running motif analysis using GPUmotif. The GPUmotif program is freely available at http://sourceforge.net/projects/gpumotif/
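The position-by-position cost mentioned in this abstract can be made concrete with a minimal sketch of a model-based motif scan. This is a generic position weight matrix (PWM) log-odds scan written for illustration, not the HMS or GPUmotif code; the toy matrix, the uniform background of 0.25, and the function name `scan_pwm` are assumptions. The input is assumed to contain only uppercase A, C, G, and T.

```python
import math


def scan_pwm(seq, pwm, background=0.25):
    """Log-odds score of the PWM at every start position of seq.

    pwm is a list with one dict per motif column, mapping each base to its
    probability. The score at each start position depends only on the bases
    under the motif window, so the positions can be scored independently.
    """
    width = len(pwm)
    scores = []
    for start in range(len(seq) - width + 1):
        score = 0.0
        for offset in range(width):
            base = seq[start + offset]
            score += math.log2(pwm[offset][base] / background)
        scores.append(score)
    return scores


if __name__ == "__main__":
    # Toy two-column matrix that prefers "AC" (hypothetical values).
    toy_pwm = [
        {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
        {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    ]
    print(scan_pwm("GACT", toy_pwm))
```

The work grows with sequence length times motif width, and each start position is scored without reference to the others, which is the property that makes this kind of scan a natural fit for GPU parallelism.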

  10. GPUmotif: an ultra-fast and energy-efficient motif analysis program using graphics processing units.

    Directory of Open Access Journals (Sweden)

    Pooya Zandevakili

    Computational detection of TF binding patterns has become an indispensable tool in functional genomics research. With the rapid advance of new sequencing technologies, large amounts of protein-DNA interaction data have been produced. Analyzing this data can provide substantial insight into the mechanisms of transcriptional regulation. However, the massive amount of sequence data presents daunting challenges. In our previous work, we have developed a novel algorithm called Hybrid Motif Sampler (HMS) that enables more scalable and accurate motif analysis. Despite much improvement, HMS is still time-consuming due to the requirement to calculate matching probabilities position-by-position. Using the NVIDIA CUDA toolkit, we developed a graphics processing unit (GPU)-accelerated motif analysis program named GPUmotif. We proposed a "fragmentation" technique to hide data transfer time between memories. Performance comparison studies showed that commonly-used model-based motif scan and de novo motif finding procedures such as HMS can be dramatically accelerated when running GPUmotif on NVIDIA graphics cards. As a result, energy consumption can also be greatly reduced when running motif analysis using GPUmotif. The GPUmotif program is freely available at http://sourceforge.net/projects/gpumotif/

  11. Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas

    Science.gov (United States)

    Petrov, Anton I.; Zirbel, Craig L.; Leontis, Neocles B.

    2013-01-01

    The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access. PMID:23970545

  12. Automatic compilation from high-level biologically-oriented programming language to genetic regulatory networks.

    Science.gov (United States)

    Beal, Jacob; Lu, Ting; Weiss, Ron

    2011-01-01

    The field of synthetic biology promises to revolutionize our ability to engineer biological systems, providing important benefits for a variety of applications. Recent advances in DNA synthesis and automated DNA assembly technologies suggest that it is now possible to construct synthetic systems of significant complexity. However, while a variety of novel genetic devices and small engineered gene networks have been successfully demonstrated, the regulatory complexity of synthetic systems that have been reported recently has somewhat plateaued due to a variety of factors, including the complexity of biology itself and the lag in our ability to design and optimize sophisticated biological circuitry. To address the gap between DNA synthesis and circuit design capabilities, we present a platform that enables synthetic biologists to express desired behavior using a convenient high-level biologically-oriented programming language, Proto. The high level specification is compiled, using a regulatory motif based mechanism, to a gene network, optimized, and then converted to a computational simulation for numerical verification. Through several example programs we illustrate the automated process of biological system design with our platform, and show that our compiler optimizations can yield significant reductions in the number of genes (~ 50%) and latency of the optimized engineered gene networks. Our platform provides a convenient and accessible tool for the automated design of sophisticated synthetic biological systems, bridging an important gap between DNA synthesis and circuit design capabilities. Our platform is user-friendly and features biologically relevant compiler optimizations, providing an important foundation for the development of sophisticated biological systems.

  13. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data.

    Science.gov (United States)

    Tran, Ngoc Tam L; Huang, Chun-Hsi

    2014-02-20

    ChIP-Seq (chromatin immunoprecipitation sequencing) has an advantage for motif finding because ChIP-Seq experiments narrow the search down to binding site locations. Recent motif finding tools facilitate motif detection by providing user-friendly Web interfaces. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommend that users apply multiple motif finding Web tools that implement different algorithms to obtain significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provide suggestions for the future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data.

  14. The nitrogen responsive transcriptome in potato (Solanum tuberosum L.) reveals significant gene regulatory motifs.

    Science.gov (United States)

    Gálvez, José Héctor; Tai, Helen H; Lagüe, Martin; Zebarth, Bernie J; Strömvik, Martina V

    2016-05-19

    Nitrogen (N) is the most important nutrient for the growth of potato (Solanum tuberosum L.). Foliar gene expression in potato plants with and without N supplementation at 180 kg N ha(-1) was compared at mid-season. Genes with consistent differences in foliar expression due to N supplementation over three cultivars and two developmental time points were examined. In total, thirty genes were found to be over-expressed and nine genes were found to be under-expressed with supplemented N. Functional relationships between over-expressed genes were found. The main metabolic pathway represented among differentially expressed genes was amino acid metabolism. The 1000 bp upstream flanking regions of the differentially expressed genes were analysed and nine overrepresented motifs were found using three motif discovery algorithms (Seeder, Weeder and MEME). These results point to coordinated gene regulation at the transcriptional level controlling steady state potato responses to N sufficiency.

  15. A Novel Protein Interaction between Nucleotide Binding Domain of Hsp70 and p53 Motif

    Directory of Open Access Journals (Sweden)

    Asita Elengoe

    2015-01-01

    Currently, protein interaction of Homo sapiens nucleotide binding domain (NBD) of heat shock 70 kDa protein (PDB: 1HJO) with p53 motif remains to be elucidated. The NBD-p53 motif complex enhances the p53 stabilization, thereby increasing the tumor suppression activity in cancer treatment. Therefore, we identified the interaction between NBD and p53 using STRING version 9.1 program. Then, we modeled the three-dimensional structure of p53 motif through homology modeling and determined the binding affinity and stability of NBD-p53 motif complex structure via molecular docking and dynamics (MD) simulation. Human DNA binding domain of p53 motif (SCMGGMNR) retrieved from UniProt (UniProtKB: P04637) was docked with the NBD protein, using the Autodock version 4.2 program. The binding energy and intermolecular energy for the NBD-p53 motif complex were −0.44 kcal/mol and −9.90 kcal/mol, respectively. Moreover, RMSD, RMSF, hydrogen bonds, salt bridge, and secondary structure analyses revealed that the NBD protein had a strong bond with p53 motif and the protein-ligand complex was stable. Thus, the current data would be highly encouraging for designing Hsp70 structure based drug in cancer therapy.

  16. Structure of the Cpf1 endonuclease R-loop complex after target DNA cleavage

    DEFF Research Database (Denmark)

    Stella, Stefano; Alcón, Pablo; Montoya, Guillermo

    2017-01-01

    involved in DNA unwinding to form a CRISPR RNA (crRNA)-DNA hybrid and a displaced DNA strand. The protospacer adjacent motif (PAM) is recognized by the PAM-interacting domain. The loop-lysine helix-loop motif in this domain contains three conserved lysine residues that are inserted in a dentate manner...... and the crRNA-DNA hybrid, avoiding DNA re-annealing. Mutations in key residues reveal a mechanism linking the PAM and DNA nuclease sites. Analysis of the Cpf1 structures proposes a singular working model of RNA-guided DNA cleavage, suggesting new avenues for redesign of Cpf1....

  17. Nanomechanical DNA Origami pH Sensors

    Directory of Open Access Journals (Sweden)

    Akinori Kuzuya

    2014-10-01

    Single-molecule pH sensors have been developed by utilizing molecular imaging of pH-responsive shape transition of nanomechanical DNA origami devices with atomic force microscopy (AFM). Short DNA fragments that can form i-motifs were introduced to nanomechanical DNA origami devices with pliers-like shape (DNA Origami Pliers), which consist of two levers 170 nm long and 20 nm wide connected at a Holliday-junction fulcrum. DNA Origami Pliers can be observed in three distinct forms: cross, antiparallel, and parallel. The cross form is the dominant species when no additional interaction is introduced to DNA Origami Pliers. Introduction of nine pairs of 12-mer sequence (5'-AACCCCAACCCC-3'), which dimerize into i-motif quadruplexes upon protonation of cytosine, drives transition of DNA Origami Pliers from open cross form into closed parallel form under acidic conditions. Such pH-dependent transition was clearly imaged on mica in molecular resolution by AFM, showing potential application of the system to single-molecular pH sensors.

  18. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    Science.gov (United States)

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  19. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence.

    Directory of Open Access Journals (Sweden)

    Kacy L Gordon

    2015-05-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.

  20. Interactions between the R2R3-MYB transcription factor, AtMYB61, and target DNA binding sites.

    Directory of Open Access Journals (Sweden)

    Michael B Prouse

    Despite the prominent roles played by R2R3-MYB transcription factors in the regulation of plant gene expression, little is known about the details of how these proteins interact with their DNA targets. For example, while Arabidopsis thaliana R2R3-MYB protein AtMYB61 is known to alter transcript abundance of a specific set of target genes, little is known about the specific DNA sequences to which AtMYB61 binds. To address this gap in knowledge, DNA sequences bound by AtMYB61 were identified using cyclic amplification and selection of targets (CASTing). The DNA targets identified using this approach corresponded to AC elements, sequences enriched in adenosine and cytosine nucleotides. The preferred target sequence that bound with the greatest affinity to AtMYB61 recombinant protein was ACCTAC, the AC-I element. Mutational analyses based on the AC-I element showed that ACC nucleotides in the AC-I element served as the core recognition motif, critical for AtMYB61 binding. Molecular modelling predicted interactions between AtMYB61 amino acid residues and corresponding nucleotides in the DNA targets. The affinity between AtMYB61 and specific target DNA sequences did not correlate with AtMYB61-driven transcriptional activation with each of the target sequences. CASTing-selected motifs were found in the regulatory regions of genes previously shown to be regulated by AtMYB61. Taken together, these findings are consistent with the hypothesis that AtMYB61 regulates transcription from specific cis-acting AC elements in vivo. The results shed light on the specifics of DNA binding by an important family of plant-specific transcriptional regulators.
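Locating a short element such as the AC-I sequence (ACCTAC) in a regulatory region can be sketched as an exact-match scan over both strands. This is an illustrative sketch only, not the procedure used in the study: the function names and the example sequence are hypothetical, and an exact string match ignores the binding-affinity differences the abstract describes.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase or lowercase A/C/G/T string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq, motif="ACCTAC"):
    """Return sorted (start, strand) hits of the motif on either strand.

    A '-' hit means the reverse complement of the motif occurs at that
    forward-strand start position. Overlapping hits are reported.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical promoter fragment with one site on each strand.
    print(find_motif("TTACCTACGGGTAGGTAA"))
```

Scanning both strands matters because a cis-acting element can function in either orientation; for a palindromic motif this sketch would report each site twice, once per strand.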

  1. Srs2 mediates PCNA-SUMO-dependent inhibition of DNA repair synthesis

    International Nuclear Information System (INIS)

    Burkovics, Peter; Sebesta, Marek; Kolesar, Peter; Sisakova, Alexandra; Marini, Victoria; Plault, Nicolas; Szukacsov, Valeria; Pinter, Lajos; Haracska, Lajos; Robert, Thomas; Gangloff, Serge; Krejci, Lumir

    2013-01-01

    Completion of DNA replication needs to be ensured even when challenged with fork progression problems or DNA damage. PCNA and its modifications constitute a molecular switch to control distinct repair pathways. In yeast, SUMOylated PCNA (S-PCNA) recruits Srs2 to sites of replication where Srs2 can disrupt Rad51 filaments and prevent homologous recombination (HR). We report here an unexpected additional mechanism by which S-PCNA and Srs2 block the synthesis-dependent extension of a recombination intermediate, thus limiting its potentially hazardous resolution in association with a cross-over. This new Srs2 activity requires the SUMO interaction motif at its C-terminus, but neither its translocase activity nor its interaction with Rad51. Srs2 binding to S-PCNA dissociates Polδ and Polη from the repair synthesis machinery, thus revealing a novel regulatory mechanism controlling spontaneous genome rearrangements. Our results suggest that cycling cells use the Siz1-dependent SUMOylation of PCNA to limit the extension of repair synthesis during template switch or HR and attenuate reciprocal DNA strand exchanges to maintain genome stability. (authors)

  2. Next-Generation Sequencing of Genomic DNA Fragments Bound to a Transcription Factor in Vitro Reveals Its Regulatory Potential

    Directory of Open Access Journals (Sweden)

    Yukio Kurihara

    2014-12-01

    Several transcription factors (TFs) coordinate to regulate expression of specific genes at the transcriptional level. In Arabidopsis thaliana it is estimated that approximately 10% of all genes encode TFs or TF-like proteins. It is important to identify target genes that are directly regulated by TFs in order to understand the complete picture of a plant’s transcriptome profile. Here, we investigate the role of the LONG HYPOCOTYL5 (HY5) transcription factor that acts as a regulator of photomorphogenesis. We used an in vitro genomic DNA binding assay coupled with immunoprecipitation and next-generation sequencing (gDB-seq) instead of the in vivo chromatin immunoprecipitation (ChIP)-based methods. The results demonstrate that the HY5-binding motif predicted here was similar to the motif reported previously and that in vitro HY5-binding loci largely overlapped with the HY5-targeted candidate genes identified in previous ChIP-chip analysis. By combining these results with microarray analysis, we identified hundreds of HY5-binding genes that were differentially expressed in hy5. We also observed delayed induction of some transcripts of HY5-binding genes in hy5 mutants in response to blue-light exposure after dark treatment. Thus, an in vitro gDNA-binding assay coupled with sequencing is a convenient and powerful method to bridge the gap between identifying TF binding potential and establishing function.

  3. DNA regulatory motif selection based on support vector machine ...

    African Journals Online (AJOL)

    Administrator

    2011-10-19

    Oct 19, 2011 ... ... gene expression values of controls and ... $x_i$, $y_i$ ... $y_i = 1$, $y_i = -1$ ... $g_i = \{x_{i1}, x_{i2}, \ldots, x_{im}, y_i\}$ ... $x = x_1, x_2, \ldots, x_k$, $x_i = x_{i1}, x_{i2}, \ldots, x_{im}$.

  4. Biophysical characterization of the basic cluster in the transcription repression domain of human MeCP2 with AT-rich DNA.

    Science.gov (United States)

    Mushtaq, Ameeq Ul; Lee, Yejin; Hwang, Eunha; Bang, Jeong Kyu; Hong, Eunmi; Byun, Youngjoo; Song, Ji-Joon; Jeon, Young Ho

    2018-01-01

    MeCP2 is a chromatin-associated protein that is highly expressed in the brain and associated with Rett syndrome (RTT). MeCP2 contains AT-hook motifs that can bind AT-rich DNA, suggesting a role in chromatin binding. Here, we report the identification and characterization of another AT-rich DNA binding motif (residues 295 to 313) from the C-terminal transcription repression domain of MeCP2 by nuclear magnetic resonance (NMR) and isothermal calorimetry (ITC). This motif shows a micromolar affinity for AT-rich DNA, and it binds to the minor groove of DNA like AT-hook motifs. Together with previous studies, our results provide insight into a critical role of this motif in chromatin structure and function. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Super-transient scaling in time-delay autonomous Boolean network motifs

    Energy Technology Data Exchange (ETDEWEB)

    D' Huys, Otti, E-mail: otti.dhuys@phy.duke.edu; Haynes, Nicholas D. [Department of Physics, Duke University, Durham, North Carolina 27708 (United States); Lohmann, Johannes [Department of Physics, Duke University, Durham, North Carolina 27708 (United States); Institut für Theoretische Physik, Technische Universität Berlin, Hardenbergstraße 36, 10623 Berlin (Germany); Gauthier, Daniel J. [Department of Physics, Duke University, Durham, North Carolina 27708 (United States); Department of Physics, The Ohio State University, Columbus, Ohio 43210 (United States)

    2016-09-15

    Autonomous Boolean networks are commonly used to model the dynamics of gene regulatory networks and allow for the prediction of stable dynamical attractors. However, most models do not account for time delays along the network links and noise, which are crucial features of real biological systems. Concentrating on two paradigmatic motifs, the toggle switch and the repressilator, we develop an experimental testbed that explicitly includes both inter-node time delays and noise using digital logic elements on field-programmable gate arrays. We observe transients that last millions to billions of characteristic time scales and scale exponentially with the amount of time delays between nodes, a phenomenon known as super-transient scaling. We develop a hybrid model that includes time delays along network links and allows for stochastic variation in the delays. Using this model, we explain the observed super-transient scaling of both motifs and recreate the experimentally measured transient distributions.

  6. Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks

    Science.gov (United States)

    Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E.; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A.; Kellis, Manolis

    2012-01-01

    Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level. PMID:22456606

  7. Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks.

    Science.gov (United States)

    Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A; Kellis, Manolis

    2012-07-01

    Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein-protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level.

  8. The DNA-recognition mode shared by archaeal feast/famine-regulatory proteins revealed by the DNA-binding specificities of TvFL3, FL10, FL11 and Ss-LrpB

    Science.gov (United States)

    Yokoyama, Katsushi; Nogami, Hideki; Kabasawa, Mamiko; Ebihara, Sonomi; Shimowasa, Ai; Hashimoto, Keiko; Kawashima, Tsuyoshi; Ishijima, Sanae A.; Suzuki, Masashi

    2009-01-01

    The DNA-binding mode of archaeal feast/famine-regulatory proteins (FFRPs), i.e. paralogs of the Escherichia coli leucine-responsive regulatory protein (Lrp), was studied. Using the method of systematic evolution of ligands by exponential enrichment (SELEX), optimal DNA duplexes for interacting with TvFL3, FL10, FL11 and Ss-LrpB were identified as TACGA[AAT/ATT]TCGTA, GTTCGA[AAT/ATT]TCGAAC, CCGAAA[AAT/ATT]TTTCGG and TTGCAA[AAT/ATT]TTGCAA, respectively, all fitting into the form abcdeWWWedcba. Here W is A or T, and e.g. a and a are bases complementary to each other. Apparent equilibrium binding constants of the FFRPs and various DNA duplexes were determined, thereby confirming the DNA-binding specificities of the FFRPs. It is likely that these FFRPs recognize DNA in essentially the same way, since their DNA-binding specificities were all explained by the same pattern of relationship between amino-acid positions and base positions to form chemical interactions. As predicted from this relationship, when Gly36 of TvFL3 was replaced by Thr, the b base in the optimal DNA duplex changed from A to T, and, when Thr36 of FL10 was replaced by Ser, the b base changed from T to G/A. DNA-binding characteristics of other archaeal FFRPs, Ptr1, Ptr2, Ss-Lrp and LysM, are also consistent with the relationship. PMID:19468044
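    The form abcdeWWWedcba describes two arms that are reverse complements of each other around a three-base A/T centre. The sketch below checks that property for the duplexes listed in the abstract. It is a minimal sketch under stated assumptions: the function name is hypothetical, and the arm length is not fixed at five because three of the listed duplexes have six-base arms.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def fits_palindromic_form(site: str, centre_len: int = 3) -> bool:
    """Check whether a site has reverse-complementary arms around an A/T-only centre."""
    site = site.upper()
    arm_len, remainder = divmod(len(site) - centre_len, 2)
    if remainder or arm_len < 1:
        return False  # arms must be of equal, non-zero length
    left = site[:arm_len]
    centre = site[arm_len:arm_len + centre_len]
    right = site[arm_len + centre_len:]
    if any(base not in "AT" for base in centre):
        return False  # every W position must be A or T
    # The right arm must be the reverse complement of the left arm.
    return right == left.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Optimal duplexes quoted in the abstract, expanded for both centre variants.
    arms = [("TACGA", "TCGTA"), ("GTTCGA", "TCGAAC"),
            ("CCGAAA", "TTTCGG"), ("TTGCAA", "TTGCAA")]
    for left, right in arms:
        for centre in ("AAT", "ATT"):
            site = left + centre + right
            print(site, fits_palindromic_form(site))
```

    Each of the eight expanded sequences passes the check, which is what "all fitting into the form" amounts to at the sequence level.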

  9. Space-related pharma-motifs for fast search of protein binding motifs and polypharmacological targets.

    Science.gov (United States)

    Chiu, Yi-Yuan; Lin, Chun-Yu; Lin, Chih-Ta; Hsu, Kai-Cheng; Chang, Li-Zen; Yang, Jinn-Moon

    2012-01-01

    Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for treating complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) is an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. The SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and for drug discovery.

  10. A DNA-binding-site landscape and regulatory network analysis for NAC transcription factors in Arabidopsis thaliana

    DEFF Research Database (Denmark)

    Lindemose, Søren; Jensen, Michael Krogh; de Velde, Jan Van

    2014-01-01

    Target gene identification for transcription factors is a prerequisite for the systems wide understanding of organismal behaviour. NAM-ATAF1/2-CUC2 (NAC) transcription factors are amongst the largest transcription factor families in plants, yet limited data exist from unbiased approaches to resolve the DNA-binding preferences of individual members. Here, we present a TF-target gene identification workflow based on the integration of novel protein binding microarray data with gene expression and multi-species promoter sequence conservation to identify the DNA-binding specificities and the gene regulatory networks of 12 NAC transcription factors. Our data offer specific single-base resolution fingerprints for most TFs studied and indicate that NAC DNA-binding specificities might be predicted from their DNA-binding domain's sequence. The developed methodology, including the application...

  11. Motif-role-fingerprints: the building-blocks of motifs, clustering-coefficients and transitivities in directed networks.

    Directory of Open Access Journals (Sweden)

    Mark D McDonnell

    Complex networks are frequently characterized by metrics for which particular subgraphs are counted. One statistic from this category, which we refer to as motif-role fingerprints, differs from global subgraph counts in that the number of subgraphs in which each node participates is counted. As with global subgraph counts, it can be important to distinguish between motif-role fingerprints that are 'structural' (induced subgraphs) and 'functional' (partial subgraphs). Here we show mathematically that a vector of all functional motif-role fingerprints can readily be obtained from an arbitrary directed adjacency matrix, and then converted to structural motif-role fingerprints by multiplying that vector by a specific invertible conversion matrix. This result demonstrates that a unique structural motif-role fingerprint exists for any given functional motif-role fingerprint. We demonstrate a similar result for the cases of functional and structural motif-fingerprints without node roles, and global subgraph counts that form the basis of standard motif analysis. We also explicitly highlight that motif-role fingerprints are elemental to several popular metrics for quantifying the subgraph structure of directed complex networks, including motif distributions, directed clustering coefficient, and transitivity. The relationships between each of these metrics and motif-role fingerprints also suggest new subtypes of directed clustering coefficients and transitivities. Our results have potential utility in analyzing directed synaptic networks constructed from neuronal connectome data, such as in terms of centrality. Other potential applications include anomaly detection in networks, identification of similar networks and identification of similar nodes within networks. Matlab code for calculating all stated metrics following calculation of functional motif-role fingerprints is provided as S1 Matlab File.
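    To make the idea of a per-node subgraph count concrete, the sketch below computes one very simple node-level statistic from a directed adjacency matrix: how many directed 3-cycles each node takes part in, read off the diagonal of the cubed matrix. This is an illustration only, not the paper's fingerprint vector or its conversion matrix; the function names and the example graph are hypothetical, and the count assumes a 0/1 matrix with no self-loops. It is a 'functional' count in the abstract's sense, because extra edges among the three nodes do not change it.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def three_cycle_participation(adj):
    """Per-node count of directed 3-cycles (i -> j -> k -> i).

    adj[i][j] == 1 means an edge from node i to node j. With no self-loops,
    entry (i, i) of adj^3 is the number of directed 3-cycles through node i.
    """
    cubed = matmul(matmul(adj, adj), adj)
    return [cubed[i][i] for i in range(len(adj))]


if __name__ == "__main__":
    # Hypothetical 4-node graph: cycle 0 -> 1 -> 2 -> 0, plus an edge 2 -> 3.
    adj = [
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 0],
    ]
    print(three_cycle_participation(adj))
```

    Adding a reciprocal edge 1 -> 0 to the example leaves the counts unchanged, which is the sense in which this count ignores additional edges within the subgraph.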

  12. DNA nanostructure-directed assembly of metal nanoparticle superlattices

    Science.gov (United States)

    Julin, Sofia; Nummelin, Sami; Kostiainen, Mauri A.; Linko, Veikko

    2018-05-01

    Structural DNA nanotechnology provides unique, well-controlled, versatile, and highly addressable motifs and templates for assembling materials at the nanoscale. These methods to build from the bottom-up using DNA as a construction material are based on programmable and fully predictable Watson-Crick base pairing. Researchers have adopted these techniques to an increasing extent for creating numerous DNA nanostructures for a variety of uses ranging from nanoelectronics to drug-delivery applications. Recently, an increasing effort has been put into attaching nanoparticles (the size range of 1-20 nm) to the accurate DNA motifs and into creating metallic nanostructures (typically 20-100 nm) using designer DNA nanoshapes as molds or stencils. By combining nanoparticles with the superior addressability of DNA-based scaffolds, it is possible to form well-ordered materials with intriguing and completely new optical, plasmonic, electronic, and magnetic properties. This focused review discusses the DNA structure-directed nanoparticle assemblies covering the wide range of different one-, two-, and three-dimensional systems.

  13. Spectrometric study of the folding process of i-motif-forming DNA sequences upstream of the c-kit transcription initiation site

    International Nuclear Information System (INIS)

    Bucek, Pavel; Gargallo, Raimundo; Kudrev, Andrei

    2010-01-01

    The c-kit oncogene shows a cytosine-rich DNA region upstream of the transcription initiation site which forms an i-motif structure at slightly acidic pH values (Bucek et al. ). In the present study, the pH-induced formation of i-motif-forming sequences 5'-CCC CTC CCT CGC GCC CGC CCG-3' (ckitC1, native), 5'-CCC TTC CCT TGT GCC CGC CCG-3' (ckitC2) and 5'-CCCTT CCC TTTTT CCC T CCC T-3' (ckitC3) was studied by spectroscopic techniques, such as UV molecular absorption and circular dichroism (CD), in tandem with two multivariate data analysis methods, the hard modelling-based matrix method and the soft modelling-based MCR-ALS approach. Use of the hard chemical modelling enabled us to propose the equilibrium model, which describes spectral changes as functions of solution acidity. Additionally, the intrinsic protonation constant, $K_{\mathrm{in}}$, and the cooperativity parameters, $\omega_c$ and $\omega_a$, were calculated from the fitting procedure of the coupled CD and molecular absorption spectra. In the case of ckitC2 and ckitC3, the hard model correctly reproduced the spectral variations observed experimentally. The results indicated that folding was accompanied by a cooperative process, i.e. the enhancement of protonated structure stability upon protonation. In contrast, unfolding was accompanied by an anticooperative process. Finally, folding of the native sequence, ckitC1, seemed to follow a more complex mechanism.

  14. The N-Terminus of the Floral Arabidopsis TGA Transcription Factor PERIANTHIA Mediates Redox-Sensitive DNA-Binding.

    Directory of Open Access Journals (Sweden)

    Nora Gutsche

    The Arabidopsis TGA transcription factor (TF) PERIANTHIA (PAN) regulates the formation of the floral organ primordia as revealed by the pan mutant forming an abnormal pentamerous arrangement of the outer three floral whorls. The Arabidopsis TGA bZIP TF family comprises 10 members, of which PAN and TGA9/10 control flower developmental processes and TGA1/2/5/6 participate in stress-responses. For the TGA1 protein it was shown that several cysteines can be redox-dependently modified. TGA proteins interact in the nucleus with land plant-specific glutaredoxins, which may alter their activities posttranslationally. Here, we investigated the DNA-binding of PAN to the AAGAAT motif under different redox-conditions. The AAGAAT motif is localized in the second intron of the floral homeotic regulator AGAMOUS (AG), which controls stamen and carpel development as well as floral determinacy. Whereas PAN protein binds to this regulatory cis-element under reducing conditions, the interaction is strongly reduced under oxidizing conditions in EMSA studies. The redox-sensitive DNA-binding is mediated via a special PAN N-terminus, which is not present in other Arabidopsis TGA TFs and comprises five cysteines. Two N-terminal PAN cysteines, Cys68 and Cys87, were shown to form a disulfide bridge and Cys340, localized in a C-terminal putative transactivation domain, can be S-glutathionylated. Comparative land plant analyses revealed that the AAGAAT motif exists in asterid and rosid plant species. TGA TFs with N-terminal extensions of variable length were identified in all analyzed seed plants. However, a PAN-like N-terminus exists only in the rosids and exclusively Brassicaceae homologs comprise four to five of the PAN N-terminal cysteines. Redox-dependent modifications of TGA cysteines are known to regulate the activity of stress-related TGA TFs. Here, we show that the N-terminal PAN cysteines participate in a redox-dependent control of the PAN interaction with a highly

  15. Evidence for roles of the Escherichia coli Hda protein beyond regulatory inactivation of DnaA.

    Science.gov (United States)

    Baxter, Jamie C; Sutton, Mark D

    2012-08-01

    The ATP-bound form of the Escherichia coli DnaA protein binds 'DnaA boxes' present in the origin of replication (oriC) and operator sites of several genes, including dnaA, to co-ordinate their transcription with initiation of replication. The Hda protein, together with the β sliding clamp, stimulates the ATPase activity of DnaA via a process termed regulatory inactivation of DnaA (RIDA), to regulate the activity of DnaA in DNA replication. Here, we used the mutant dnaN159 strain, which expresses the β159 clamp protein, to gain insight into how the actions of Hda are co-ordinated with replication. Elevated expression of Hda impeded growth of the dnaN159 strain in a Pol II- and Pol IV-dependent manner, suggesting a role for Hda managing the actions of these Pols. In a wild-type strain, elevated levels of Hda conferred sensitivity to nitrofurazone, and suppressed the frequency of -1 frameshift mutations characteristic of Pol IV, while loss of hda conferred cold sensitivity. Using the dnaN159 strain, we identified 24 novel hda alleles, four of which supported E. coli viability despite their RIDA defect. Taken together, these findings suggest that although one or more Hda functions are essential for cell viability, RIDA may be dispensable. © 2012 Blackwell Publishing Ltd.

  16. Metamotifs - a generative model for building families of nucleotide position weight matrices

    Directory of Open Access Journals (Sweden)

    Down Thomas A

    2010-06-01

    Background: Development of high-throughput methods for measuring DNA interactions of transcription factors together with computational advances in short motif inference algorithms is expanding our understanding of transcription factor binding site motifs. The consequential growth of sequence motif data sets makes it important to systematically group and categorise regulatory motifs. It has been shown that there are familial tendencies in DNA sequence motifs that are predictive of the family of factors that binds them. Further development of methods that detect and describe familial motif trends has the potential to help in measuring the similarity of novel computational motif predictions to previously known data and sensitively detecting regulatory motifs similar to previously known ones from novel sequence. Results: We propose a probabilistic model for position weight matrix (PWM) sequence motif families. The model, which we call the 'metamotif', describes recurring familial patterns in a set of motifs. The metamotif framework models variation within a family of sequence motifs. It allows for simultaneous estimation of a series of independent metamotifs from input position weight matrix (PWM) motif data and does not assume that all input motif columns contribute to a familial pattern. We describe an algorithm for inferring metamotifs from weight matrix data. We then demonstrate the use of the model in two practical tasks: in the Bayesian NestedMICA model inference algorithm as a PWM prior to enhance motif inference sensitivity, and in a motif classification task where motifs are labelled according to their interacting DNA binding domain. Conclusions: We show that metamotifs can be used as PWM priors in the NestedMICA motif inference algorithm to dramatically increase the sensitivity to infer motifs. Metamotifs were also successfully applied to a motif classification problem where sequence motif features were used to predict the family of

  17. Lnc2Meth: a manually curated database of regulatory relationships between long non-coding RNAs and DNA methylation associated with human disease.

    Science.gov (United States)

    Zhi, Hui; Li, Xin; Wang, Peng; Gao, Yue; Gao, Baoqing; Zhou, Dianshuang; Zhang, Yan; Guo, Maoni; Yue, Ming; Shen, Weitao; Ning, Shangwei; Jin, Lianhong; Li, Xia

    2018-01-04

    Lnc2Meth (http://www.bio-bigdata.com/Lnc2Meth/), an interactive resource to identify regulatory relationships between human long non-coding RNAs (lncRNAs) and DNA methylation, is not only a manually curated collection and annotation of experimentally supported lncRNAs-DNA methylation associations but also a platform that effectively integrates tools for calculating and identifying the differentially methylated lncRNAs and protein-coding genes (PCGs) in diverse human diseases. The resource provides: (i) advanced search possibilities, e.g. retrieval of the database by searching the lncRNA symbol of interest, DNA methylation patterns, regulatory mechanisms and disease types; (ii) abundant computationally calculated DNA methylation array profiles for the lncRNAs and PCGs; (iii) the prognostic values for each hit transcript calculated from the patients' clinical data; (iv) a genome browser to display the DNA methylation landscape of the lncRNA transcripts for a specific type of disease; (v) tools to re-annotate probes to lncRNA loci and identify the differential methylation patterns for lncRNAs and PCGs with user-supplied external datasets; (vi) an R package (LncDM) to complete the differentially methylated lncRNAs identification and visualization with local computers. Lnc2Meth provides a timely and valuable resource that can be applied to significantly expand our understanding of the regulatory relationships between lncRNAs and DNA methylation in various human diseases. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. Binary self-assembly of highly symmetric DNA nanocages via sticky-end engineering

    Institute of Scientific and Technical Information of China (English)

    Xiao-Rong Wu; Chen-Wei Wu; Fei Ding; Cheng Tian; Wen Jiang; Cheng-De Mao; Chuan Zhang

    2017-01-01

    Discrete and symmetric three-dimensional (3D) DNA nanocages have been regarded as excellent candidates for various applications, such as guest component encapsulation and organization (e.g. dye molecules, proteins, inorganic nanoparticles, etc.) to construct new materials and devices. To date, a large variety of DNA nanocages has been synthesized through assembling small individual DNA motifs into predesigned structures in a bottom-up fashion. Most of them rely on assembly using multiple copies of a single type of motif, and a few sophisticated nanostructures have been engineered by co-assembling multiple types of DNA tiles simultaneously. However, the availability of complex DNA nanocages is still limited. Herein, we demonstrate that highly symmetric DNA nanocages consisting of binary DNA pointstar motifs can be easily assembled by deliberately engineering the sticky-end interaction between the component building blocks. As such, DNA nanocages with new geometries, including elongated tetrahedron (E-TET), rhombic dodecahedron (R-DOD), and rhombic triacontahedron (R-TRI), are successfully synthesized. Moreover, their design principle, assembly process, and structural features are revealed by polyacrylamide gel electrophoresis (PAGE), atomic force microscope (AFM) imaging, and cryogenic transmission electron microscope imaging (cryo-TEM) associated with single particle reconstruction.

  19. Structural Diversity in Conserved Regions Like the DRY-Motif among Viral 7TM Receptors-A Consequence of Evolutionary Pressure?

    DEFF Research Database (Denmark)

    Mølleskov-Jensen, Ann-Sofie; Sparre-Ulrich, Alexander Hovard; Davis-Poynter, Nicholas

    2012-01-01

    Several herpes- and poxviruses have captured chemokine receptors from their hosts and modified these to their own benefit. The human and viral chemokine receptors belong to class A 7 transmembrane (TM) receptors which are characterized by several structural motifs like the DRY-motif in TM3 and the C-terminal tail. In the DRY-motif, the arginine residue serves important purposes by being directly involved in G protein coupling. Interestingly, among the viral receptors there is a greater diversity in the DRY-motif compared to their endogenous receptor homologs. The C-terminal receptor tail constitutes another regulatory region that through a number of phosphorylation sites is involved in signaling, desensitization, and internalization. Also this region is more variable among virus-encoded 7TM receptors compared to human class A receptors. In this review we will focus on these two structural...

  20. Nencki Genomics Database--Ensembl funcgen enhanced with intersections, user data and genome-wide TFBS motifs.

    Science.gov (United States)

    Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal

    2013-01-01

    We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface.

  1. A minimal murine Msx-1 gene promoter. Organization of its cis-regulatory motifs and their role in transcriptional activation in cells in culture and in transgenic mice.

    Science.gov (United States)

    Takahashi, T; Guron, C; Shetty, S; Matsui, H; Raghow, R

    1997-09-05

    To dissect the cis-regulatory elements of the murine Msx-1 promoter, which lacks a conventional TATA element, a putative Msx-1 promoter DNA fragment (from -1282 to +106 base pairs (bp)) or its congeners containing site-specific alterations were fused to luciferase reporter and introduced into NIH3T3 and C2C12 cells, and the expression of luciferase was assessed in transient expression assays. The functional consequences of the sequential 5' deletions of the promoter revealed that multiple positive and negative regulatory elements participate in regulating transcription of the Msx-1 gene. Surprisingly, however, the optimal expression of Msx-1 promoter in either NIH3T3 or C2C12 cells required only 165 bp of the upstream sequence to warrant detailed examination of its structure. Therefore, the functional consequences of site-specific deletions and point mutations of the cis-acting elements of the minimal Msx-1 promoter were systematically examined. Concomitantly, potential transcriptional factor(s) interacting with the cis-acting elements of the minimal promoter were also studied by gel electrophoretic mobility shift assays and DNase I footprinting. Combined analyses of the minimal promoter by DNase I footprinting, electrophoretic mobility shift assays, and super shift assays with specific antibodies revealed that 5'-flanking regions from -161 to -154 and from -26 to -13 of the Msx-1 promoter contain an authentic E box (proximal E box), capable of binding a protein immunologically related to the upstream stimulating factor 1 (USF-1) and a GC-rich sequence motif which can bind to Sp1 (proximal Sp1), respectively. Additionally, we observed that the promoter activation was seriously hampered if the proximal E box was removed or mutated, and the promoter activity was eliminated completely if the proximal Sp1 site was similarly altered. Absolute dependence of the Msx-1 minimal promoter on Sp1 could be demonstrated by transient expression assays in the Sp1-deficient

  2. Disparate requirements for the Walker A and B ATPase motifs of human RAD51D in homologous recombination

    Energy Technology Data Exchange (ETDEWEB)

    Wiese, Claudia; Hinz, John M.; Tebbs, Robert S.; Nham, Peter B.; Urbin, Salustra S.; Collins, David W.; Thompson, Larry H.; Schild, David

    2006-04-21

    In vertebrates, homologous recombinational repair (HRR) requires RAD51 and five RAD51 paralogs (XRCC2, XRCC3, RAD51B, RAD51C, and RAD51D) that all contain conserved Walker A and B ATPase motifs. In human RAD51D we examined the requirement for these motifs in interactions with XRCC2 and RAD51C, and for survival of cells in response to DNA interstrand crosslinks. Ectopic expression of wild type human RAD51D or mutants having a non-functional A or B motif was used to test for complementation of a rad51d knockout hamster CHO cell line. Although A-motif mutants complement very efficiently, B-motif mutants do not. Consistent with these results, experiments using the yeast two- and three-hybrid systems show that the interactions between RAD51D and its XRCC2 and RAD51C partners also require a functional RAD51D B motif, but not motif A. Similarly, hamster Xrcc2 is unable to bind to the non-complementing human RAD51D B-motif mutants in co-immunoprecipitation assays. We conclude that a functional Walker B motif, but not A motif, is necessary for RAD51D's interactions with other paralogs and for efficient HRR. We present a model in which ATPase sites are formed in a bipartite manner between RAD51D and other RAD51 paralogs.

  3. Assessment of algorithms for inferring positional weight matrix motifs of transcription factor binding sites using protein binding microarray data.

    Directory of Open Access Journals (Sweden)

    Yaron Orenstein

    The new technology of protein binding microarrays (PBMs) allows simultaneous measurement of the binding intensities of a transcription factor to tens of thousands of synthetic double-stranded DNA probes, covering all possible 10-mers. A key computational challenge is inferring the binding motif from these data. We present a systematic comparison of four methods developed specifically for reconstructing a binding site motif represented as a positional weight matrix from PBM data. The reconstructed motifs were evaluated in terms of three criteria: concordance with reference motifs from the literature and ability to predict in vivo and in vitro bindings. The evaluation encompassed over 200 transcription factors and some 300 assays. The results show a tradeoff between how the methods perform according to the different criteria, and a dichotomy of method types. Algorithms that construct motifs with low information content predict PBM probe ranking more faithfully, while methods that produce highly informative motifs match reference motifs better. Interestingly, in predicting high-affinity binding, all methods give far poorer results for in vivo assays compared to in vitro assays.
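
    As an illustration of the terms used in this abstract, the following minimal sketch builds a positional weight matrix from a handful of aligned sites, computes its information content, and scores a probe by its best-matching window. It is not any of the four methods compared in the paper. The function names, the pseudocount of 1, the uniform 0.25 background, and the example sites are all assumptions made for illustration, and sequences are assumed to be uppercase A/C/G/T only.

```python
import math

BASES = "ACGT"

def column_probabilities(sites, pseudocount=1.0):
    """Per-position base probabilities from equal-length aligned sites."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for pos in range(width):
        counts = {b: pseudocount for b in BASES}  # pseudocount avoids log(0)
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append({b: counts[b] / total for b in BASES})
    return matrix

def information_content(matrix, background=0.25):
    """Total information content in bits, relative to a uniform background."""
    return sum(p * math.log2(p / background)
               for column in matrix for p in column.values())

def best_window(matrix, probe, background=0.25):
    """Return (log-odds score, offset) of the best-scoring window in the probe."""
    width = len(matrix)
    best = None
    for i in range(len(probe) - width + 1):
        score = sum(math.log2(matrix[j][probe[i + j]] / background)
                    for j in range(width))
        if best is None or score > best[0]:
            best = (score, i)
    return best

# Hypothetical aligned binding sites and probe, for illustration only.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pwm = column_probabilities(sites)
score, offset = best_window(pwm, "CCTGACTCAGG")
```

    A matrix with sharply peaked columns has high information content; flattening the columns (for example with a larger pseudocount) lowers it, which is the axis along which the abstract contrasts the two types of method.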

  4. Nucleotide-mimetic synthetic ligands for DNA-recognizing enzymes: One-step purification of Pfu DNA polymerase.

    Science.gov (United States)

    Melissis, S; Labrou, N E; Clonis, Y D

    2006-07-28

    The commercial availability of DNA polymerases has revolutionized molecular biotechnology and certain sectors of the bio-industry. Therefore, the development of affinity adsorbents for purification of DNA polymerases is of academic interest and practical importance. In the present study we describe the design, synthesis and evaluation of a combinatorial library of novel affinity ligands for the purification of DNA polymerases (Pols). Pyrococcus furiosus DNA polymerase (Pfu Pol) was employed as a proof-of-principle example. Affinity ligand design was based on mimicking the natural interactions between deoxynucleoside-triphosphates (dNTPs) and the B-motif, a conserved structural moiety found in Pol-I and Pol-II family of enzymes. Solid-phase 'structure-guided' combinatorial chemistry was used to construct a library of 26 variants of the B-motif-binding 'lead' ligand X-Trz-Y (X is a purine derivative and Y is an aliphatic/aromatic sulphonate or phosphonate derivative) using 1,3,5-triazine (Trz) as the scaffold for assembly. The 'lead' ligand showed complementarity against a Lys and a Tyr residue of the polymerase B-motif. The ligand library was screened for its ability to bind and purify Pfu Pol from Escherichia coli extract. One immobilized ligand (oABSAd), bearing 9-aminoethyladenine (AEAd) and sulfanilic acid (oABS) linked on the triazine scaffold, displayed the highest purifying ability and binding capacity (0.55 mg Pfu Pol/g wet gel). Adsorption equilibrium studies with this affinity ligand and Pfu Pol determined a dissociation constant (K(D)) of 83 nM for the respective complex. The oABSAd affinity adsorbent was exploited in the development of a facile Pfu Pol purification protocol, affording homogeneous enzyme (>99% purity) in a single chromatography step. Quality control tests showed that Pfu Pol purified on the B-motif-complementing ligand is free of nucleic acids and contaminating nuclease activities and is therefore suitable for experimental use.

  5. Transcriptional control of the tissue-specific, developmentally regulated osteocalcin gene requires a binding motif for the Msx family of homeodomain proteins.

    Science.gov (United States)

    Hoffmann, H M; Catron, K M; van Wijnen, A J; McCabe, L R; Lian, J B; Stein, G S; Stein, J L

    1994-12-20

    The OC box of the rat osteocalcin promoter (nt -99 to -76) is the principal proximal regulatory element contributing to both tissue-specific and developmental control of osteocalcin gene expression. The central motif of the OC box includes a perfect consensus DNA binding site for certain homeodomain proteins. Homeodomain proteins are transcription factors that direct proper development by regulating specific temporal and spatial patterns of gene expression. We therefore addressed the role of the homeodomain binding motif in the activity of the OC promoter. In this study, by the combined application of mutagenesis and site-specific protein recognition analysis, we examined interactions of ROS 17/2.8 osteosarcoma cell nuclear proteins and purified Msx-1 homeodomain protein with the OC box. We detected a series of related specific protein-DNA interactions, a subset of which were inhibited by antibodies directed against the Msx-1 homeodomain but which also recognize the Msx-2 homeodomain. Our results show that the sequence requirements for binding the Msx-1 or Msx-2 homeodomain closely parallel those necessary for osteocalcin gene promoter activity in vivo. This functional relationship was demonstrated by transient expression in ROS 17/2.8 osteosarcoma cells of a series of osteocalcin promoter (nt -1097 to +24)-reporter gene constructs containing mutations within and flanking the homeodomain binding site of the OC box. Northern blot analysis of several bone-related cell types showed that all of the cells expressed msx-1, whereas msx-2 expression was restricted to cells transcribing osteocalcin. Taken together, our results suggest a role for Msx-1 and -2 or related homeodomain proteins in transcription of the osteocalcin gene.

  6. BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements.

    Science.gov (United States)

    De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

    2015-12-01

    The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays. BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller. Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
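
    The idea of enumerating words over the IUPAC alphabet and checking them against promoter sequences can be sketched as follows. This is an illustrative, alignment-free toy and not the BLSSpeller implementation: it is written in Python rather than Java, it does not use MapReduce, and it replaces the branch length score with a plain count of sequences containing a match. All function names and the example sequences are assumptions.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def matches(word, kmer):
    """True if the concrete k-mer is an instance of the degenerate IUPAC word."""
    return len(word) == len(kmer) and all(
        base in IUPAC[code] for code, base in zip(word, kmer))

def occurs_in(word, seq):
    """True if any window of seq matches the IUPAC word."""
    k = len(word)
    return any(matches(word, seq[i:i + k]) for i in range(len(seq) - k + 1))

def conserved_words(seqs, k, alphabet="ACGTN", min_seqs=None):
    """Enumerate every word of length k over `alphabet`; keep those matching
    at least min_seqs sequences (a simple stand-in for a conservation score)."""
    if min_seqs is None:
        min_seqs = len(seqs)
    hits = {}
    for letters in product(alphabet, repeat=k):
        word = "".join(letters)
        n = sum(occurs_in(word, s) for s in seqs)
        if n >= min_seqs:
            hits[word] = n
    return hits

# Hypothetical orthologous promoter fragments, for illustration only.
promoters = ["AACACGTGTT", "GGCACGTGAA", "TTCATGTGCC"]
found = conserved_words(promoters, 3, alphabet="ACGT")
```

    The enumeration grows as |alphabet|^k, so an exhaustive search over longer words quickly becomes expensive on a single machine, which is consistent with the abstract's stated motivation for a distributed programming model.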

  7. Efficient sequential and parallel algorithms for planted motif search.

    Science.gov (United States)

    Nicolae, Marius; Rajasekaran, Sanguthevar

    2014-01-31

    Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (l,d)-motif search or Planted Motif Search (PMS). In PMS we are given two integers l and d and n biological sequences. We want to find all sequences of length l that appear in each of the input sequences with at most d mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. Despite ample research in the area, a considerable performance gap exists because many state of the art algorithms have large runtimes even for moderately challenging instances. This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (l,d) instances (25,10) and (26,11). PMS8 is also efficient on instances with larger l and d such as (50,21). We include a comparison of PMS8 with several state of the art algorithms on multiple problem instances. This paper also presents necessary and sufficient conditions for 3 l-mers to have a common d-neighbor. The program is freely available at http://engr.uconn.edu/~man09004/PMS8/. We present PMS8, an efficient exact algorithm for Planted Motif Search. PMS8 introduces novel ideas for generating common neighborhoods. We have also implemented a parallel version for this algorithm. PMS8 can solve instances not solved by any previous algorithms.
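
    The (l,d)-motif search problem defined in this abstract can be stated directly in code. The sketch below is a naive exact baseline, not PMS8: it takes every l-mer of the first sequence, generates its full d-neighborhood, and keeps the candidates that occur within d mismatches in every other sequence. The function names and example sequences are assumptions, and the approach is only practical for small l and d.

```python
from itertools import combinations, product

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def neighbors(kmer, d, alphabet="ACGT"):
    """All strings within Hamming distance d of kmer."""
    result = set()
    for r in range(d + 1):
        for positions in combinations(range(len(kmer)), r):
            for letters in product(alphabet, repeat=r):
                candidate = list(kmer)
                for pos, letter in zip(positions, letters):
                    candidate[pos] = letter
                result.add("".join(candidate))
    return result

def appears(motif, seq, d):
    """True if some window of seq is within d mismatches of motif."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))

def planted_motif_search(seqs, l, d):
    """Return every l-mer that appears in all sequences with at most d mismatches."""
    first, rest = seqs[0], seqs[1:]
    candidates = set()
    for i in range(len(first) - l + 1):
        candidates |= neighbors(first[i:i + l], d)
    return sorted(m for m in candidates if all(appears(m, s, d) for s in rest))

# Hypothetical input: ACGTA is planted with at most one mismatch per sequence.
seqs = ["GGACGTACCT", "TTACCTAGGA", "CACGAACTTT"]
motifs = planted_motif_search(seqs, 5, 1)
```

    Any valid motif must lie in the d-neighborhood of some l-mer of the first sequence, so restricting candidates this way loses no solutions while avoiding a scan of all 4^l strings.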

  8. Target motifs affecting natural immunity by a constitutive CRISPR-Cas system in Escherichia coli.

    Directory of Open Access Journals (Sweden)

    Cristóbal Almendros

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated (cas) genes constitute the CRISPR-Cas systems of various bacteria and archaea and produce degradation of invading nucleic acids containing sequences (protospacers) that are complementary to repeat-intervening spacers. It has been demonstrated that the base sequence identity of a protospacer with the cognate spacer and the presence of a protospacer adjacent motif (PAM) influence CRISPR-mediated interference efficiency. By using an original transformation assay with plasmids targeted by a resident spacer, here we show that natural CRISPR-mediated immunity against invading DNA occurs in wild type Escherichia coli. Unexpectedly, the strongest activity is observed with protospacer adjoining nucleotides (interference motifs) that differ from the PAM both in sequence and location. Hence, our results document for the first time native CRISPR activity in E. coli and demonstrate that positions next to the PAM in invading DNA influence their recognition and degradation by these prokaryotic immune systems.

  9. A role for circadian evening elements in cold-regulated gene expression in Arabidopsis.

    Science.gov (United States)

    Mikkelsen, Michael D; Thomashow, Michael F

    2009-10-01

    The plant transcriptome is dramatically altered in response to low temperature. The cis-acting DNA regulatory elements and trans-acting factors that regulate the majority of cold-regulated genes are unknown. Previous bioinformatic analysis has indicated that the promoters of cold-induced genes are enriched in the Evening Element (EE), AAAATATCT, a DNA regulatory element that has a role in circadian-regulated gene expression. Here we tested the role of EE and EE-like (EEL) elements in cold-induced expression of two Arabidopsis genes, CONSTANS-like 1 (COL1; At5g54470) and a gene encoding a 27-kDa protein of unknown function that we designated COLD-REGULATED GENE 27 (COR27; At5g42900). Mutational analysis indicated that the EE/EEL elements were required for cold induction of COL1 and COR27, and that their action was amplified through coupling with ABA response element (ABRE)-like (ABREL) motifs. An artificial promoter consisting solely of four EE motifs interspersed with three ABREL motifs was sufficient to impart cold-induced gene expression. Both COL1 and COR27 were found to be regulated by the circadian clock at warm growth temperatures and cold-induction of COR27 was gated by the clock. These results suggest that cold- and clock-regulated gene expression are integrated through regulatory proteins that bind to EE and EEL elements supported by transcription factors acting at ABREL sequences. Bioinformatic analysis indicated that the coupling of EE and EEL motifs with ABREL motifs is highly enriched in cold-induced genes and thus may constitute a DNA regulatory element pair with a significant role in configuring the low-temperature transcriptome.
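
    A promoter can be checked for the Evening Element named in this abstract with a simple exact scan of both strands, sketched below. Only the EE sequence AAAATATCT comes from the abstract; the function names and the example promoter are assumptions, and the EE-like and ABRE-like variants are not modeled because their sequences are not given here.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_motif(seq, motif="AAAATATCT"):
    """Return sorted (0-based start, strand) pairs for every exact occurrence
    of the motif on the forward (+) or reverse (-) strand."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)

# Hypothetical promoter fragment with one EE on each strand.
promoter = "GGC" + "AAAATATCT" + "TTGCA" + "AGATATTTT" + "CC"
hits = find_motif(promoter)
```

    Positions on the reverse strand are reported as the start of the reverse-complemented pattern in forward-strand coordinates. A palindromic motif would be reported once per strand at the same position, which does not arise for AAAATATCT.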

  10. KIRMES: kernel-based identification of regulatory modules in euchromatic sequences.

    Science.gov (United States)

    Schultheiss, Sebastian J; Busch, Wolfgang; Lohmann, Jan U; Kohlbacher, Oliver; Rätsch, Gunnar

    2009-08-15

    Understanding transcriptional regulation is one of the main challenges in computational biology. An important problem is the identification of transcription factor (TF) binding sites in promoter regions of potential TF target genes. It is typically approached by position weight matrix-based motif identification algorithms using Gibbs sampling, or heuristics to extend seed oligos. Such algorithms succeed in identifying single, relatively well-conserved binding sites, but tend to fail when it comes to the identification of combinations of several degenerate binding sites, as those often found in cis-regulatory modules. We propose a new algorithm that combines the benefits of existing motif finding with the ones of support vector machines (SVMs) to find degenerate motifs in order to improve the modeling of regulatory modules. In experiments on microarray data from Arabidopsis thaliana, we were able to show that the newly developed strategy significantly improves the recognition of TF targets. The python source code (open source-licensed under GPL), the data for the experiments and a Galaxy-based web service are available at http://www.fml.mpg.de/raetsch/suppl/kirmes/.

  11. PlantCARE, a plant cis-acting regulatory element database

    OpenAIRE

    Rombauts, Stephane; Déhais, Patrice; Van Montagu, Marc; Rouzé, Pierre

    1999-01-01

    PlantCARE is a database of plant cis-acting regulatory elements, enhancers and repressors. Besides the transcription motifs found on a sequence, it also offers a link to the EMBL entry that contains the full gene sequence as well as a description of the conditions in which a motif becomes functional. The information on these sites is given by matrices, consensus and individual site sequences on particular genes, depending on the available information. PlantCARE is a relational database avail...

  12. YMDD motif mutations in chronic hepatitis B antiviral treatment naïve patients: a multi-center study

    Directory of Open Access Journals (Sweden)

    You-Wen Tan

    OBJECTIVE: This study aimed to determine the natural prevalence of variants of tyrosine-methionine-aspartic acid-aspartic acid (YMDD) motif in patients with chronic hepatitis B (CHB), and to explore its relation with demographic and clinical features, hepatitis B virus (HBV) genotypes, and HBV DNA levels. METHODS: A total of 1,042 antiviral treatment naïve CHB patients (including with lamivudine [LAM]) in the past year were recruited from outpatient and inpatient departments of six centers from December 2008 to June 2010. YMDD variants were analyzed using the HBV drug resistance line probe assay (Inno-Lipa HBV-DR). HBV genotypes were detected with polymerase chain reaction (PCR) microcosmic nucleic acid cross-ELISA, and HBV deoxyribonucleic acid (DNA) was quantitated with real-time PCR. All serum samples underwent tests for HBV, HCV, and HDV with ELISA. RESULTS: YMDD variants were detected in 23.3% (243/1042) of CHB patients. YMDD mutation was accompanied by L180M mutation in 154 (76.9%) patients. Both wild-type HBV and YMDD variant HBV were present in 231 of 243 patients. Interestingly, 12 patients had only YIDD and/or YVDD variants without wild YMDD motif. In addition, 27.2% (98/359) of HBeAg-positive patients had YMDD mutations, which was higher than that in HBeAg-negative patients (21.2%, 145/683). The incidence of YMDD varied among patients with different HBV genotypes, but the difference was not significant. Moreover, the incidence of YMDD in patients with high HBV DNA level was significantly higher than that in those with low HBV DNA level. CONCLUSION: Mutation of YMDD motif was detectable at a high rate in CHB patients in this study. The incidence of YMDD may be correlated with HBeAg and HBV DNA level.

  13. Structure-based domain assignment in Leishmania infantum EndoG: characterization of a pH-dependent regulatory switch and a C-terminal extension that largely dictates DNA substrate preferences.

    Science.gov (United States)

    Oliva, Cristina; Sánchez-Murcia, Pedro A; Rico, Eva; Bravo, Ana; Menéndez, Margarita; Gago, Federico; Jiménez-Ruiz, Antonio

    2017-09-06

    Mitochondrial endonuclease G from Leishmania infantum (LiEndoG) participates in the degradation of double-stranded DNA (dsDNA) during parasite cell death and is catalytically inactive at a pH of 8.0 or above. The presence, in the primary sequence, of an acidic amino acid-rich insertion exclusive to trypanosomatids and its spatial position in a homology-built model of LiEndoG led us to postulate that this peptide stretch might act as a pH sensor for self-inhibition. We found that a LiEndoG variant lacking residues 145-180 is indeed far more active than its wild-type counterpart at pH values >7.0. In addition, we discovered that (i) LiEndoG exists as a homodimer, (ii) replacement of Ser211 in the active-site SRGH motif with the canonical aspartate from the DRGH motif of other nucleases leads to a catalytically deficient enzyme, (iii) the activity of the S211D variant can be restored upon the concomitant replacement of Ala247 with Arg and (iv) a C-terminal extension is responsible for the observed preferential cleavage of single-stranded DNA (ssDNA) and ssDNA-dsDNA junctions. Taken together, our results support the view that LiEndoG is a multidomain molecular machine whose nuclease activity can be subtly modulated or even abrogated through architectural changes brought about by environmental conditions and interaction with other binding partners. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. RecO protein initiates DNA recombination and strand annealing through two alternative DNA binding mechanisms.

    Science.gov (United States)

    Ryzhikov, Mikhail; Gupta, Richa; Glickman, Michael; Korolev, Sergey

    2014-10-17

    Recombination mediator proteins (RMPs) are important for genome stability in all organisms. Several RMPs support two alternative reactions: initiation of homologous recombination and DNA annealing. We examined mechanisms of RMPs in both reactions with Mycobacterium smegmatis RecO (MsRecO) and demonstrated that MsRecO interacts with ssDNA by two distinct mechanisms. Zinc stimulates MsRecO binding to ssDNA during annealing, whereas the recombination function is zinc-independent and is regulated by interaction with MsRecR. Thus, different structural motifs or conformations of MsRecO are responsible for interaction with ssDNA during annealing and recombination. Neither annealing nor recombinase loading depends on MsRecO interaction with the conserved C-terminal tail of single-stranded (ss) DNA-binding protein (SSB), which is known to bind Escherichia coli RecO. However, similarly to E. coli proteins, MsRecO and MsRecOR do not dismiss SSB from ssDNA, suggesting that RMPs form a complex with SSB-ssDNA even in the absence of binding to the major protein interaction motif. We propose that alternative conformations of such complexes define the mechanism by which RMPs initiate the repair of stalled replication and support two different functions during recombinational repair of DNA breaks. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.

  15. Identification of the Raptor-binding motif on Arabidopsis S6 kinase and its use as a TOR signaling suppressor.

    Science.gov (United States)

    Son, Ora; Kim, Sunghan; Hur, Yoon-Sun; Cheon, Choong-Ill

    2016-03-25

    TOR (target of rapamycin) kinase signaling plays a central role as a regulator of growth and proliferation in all eukaryotic cells and its key signaling components and effectors are also conserved in plants. Unlike the mammalian and yeast counterparts, however, we found through yeast two-hybrid analysis that multiple regions of the Arabidopsis Raptor (regulatory associated protein of TOR) are required for binding to its substrate. We also identified that a 44-amino acid region at the N-terminal end of Arabidopsis ribosomal S6 kinase 1 (AtS6K1) specifically interacted with AtRaptor1, indicating that this region may contain a functional equivalent of the TOS (TOR-Signaling) motif present in the mammalian TOR substrates. Transient over-expression of this 44-amino acid fragment in Arabidopsis protoplasts resulted in a significant decrease in rDNA transcription, demonstrating the feasibility of developing a new plant-specific TOR signaling inhibitor based upon perturbation of the Raptor-substrate interaction. Copyright © 2016 Elsevier Inc. All rights reserved.

  16. Disparate requirements for the Walker A and B ATPase motifs of human RAD51D in homologous recombination.

    Science.gov (United States)

    Wiese, Claudia; Hinz, John M; Tebbs, Robert S; Nham, Peter B; Urbin, Salustra S; Collins, David W; Thompson, Larry H; Schild, David

    2006-01-01

    In vertebrates, homologous recombinational repair (HRR) requires RAD51 and five RAD51 paralogs (XRCC2, XRCC3, RAD51B, RAD51C and RAD51D) that all contain conserved Walker A and B ATPase motifs. In human RAD51D we examined the requirement for these motifs in interactions with XRCC2 and RAD51C, and for survival of cells in response to DNA interstrand crosslinks (ICLs). Ectopic expression of wild-type human RAD51D or mutants having a non-functional A or B motif was used to test for complementation of a rad51d knockout hamster CHO cell line. Although A-motif mutants complement very efficiently, B-motif mutants do not. Consistent with these results, experiments using the yeast two- and three-hybrid systems show that the interactions between RAD51D and its XRCC2 and RAD51C partners also require a functional RAD51D B motif, but not motif A. Similarly, hamster Xrcc2 is unable to bind to the non-complementing human RAD51D B-motif mutants in co-immunoprecipitation assays. We conclude that a functional Walker B motif, but not A motif, is necessary for RAD51D's interactions with other paralogs and for efficient HRR. We present a model in which ATPase sites are formed in a bipartite manner between RAD51D and other RAD51 paralogs.

  17. TFII-I regulates target genes in the PI-3K and TGF-β signaling pathways through a novel DNA binding motif.

    Science.gov (United States)

    Segura-Puimedon, Maria; Borralleras, Cristina; Pérez-Jurado, Luis A; Campuzano, Victoria

    2013-09-25

    General transcription factor (TFII-I) is a multi-functional protein involved in the transcriptional regulation of critical developmental genes, encoded by the GTF2I gene located on chromosome 7q11.23. Haploinsufficiency at GTF2I has been shown to play a major role in the neurodevelopmental features of Williams-Beuren syndrome (WBS). Identification of genes regulated by TFII-I is thus critical to detect molecular determinants of WBS as well as to identify potential new targets for specific pharmacological interventions, which are currently absent. We performed a microarray screening for transcriptional targets of TFII-I in cortex and embryonic cells from Gtf2i mutant and wild-type mice. Candidate genes with altered expression were verified using real-time PCR. A novel motif shared by deregulated genes was found and chromatin immunoprecipitation assays in embryonic fibroblasts were used to document in vitro TFII-I binding to this motif in the promoter regions of deregulated genes. Interestingly, the PI3K and TGFβ signaling pathways were over-represented among TFII-I-modulated genes. In this study we have found a highly conserved DNA element, common to a set of genes regulated by TFII-I, and identified and validated novel in vivo neuronal targets of this protein affecting the PI3K and TGFβ signaling pathways. Overall, our data further contribute to unravel the complexity and variability of the different genetic programs orchestrated by TFII-I. © 2013 Elsevier B.V. All rights reserved.

  18. Isolation and characterisation of the cDNA encoding a glycosylated accessory protein of pea chloroplast DNA polymerase.

    OpenAIRE

    Gaikwad, A; Tewari, K K; Kumar, D; Chen, W; Mukherjee, S K

    1999-01-01

    The cDNA encoding p43, a DNA binding protein from pea chloroplasts (ct) that binds to cognate DNA polymerase and stimulates the polymerase activity, has been cloned and characterised. The characteristic sequence motifs of hydroxyproline-rich glycoproteins (HRGP) are present in the cDNA corresponding to the N-terminal domain of the mature p43. The protein was found to be highly O-arabinosylated. Chemically deglycosylated p43 (i.e. p29) retains its binding to both DNA and pea ct-DNA polymeras...

  19. Crystal Structure of a Putative HTH-Type Transcriptional Regulator yxaF from Bacillus subtilis

    International Nuclear Information System (INIS)

    Seetharaman, J.; Kumaran, D.; Bonanno, J.; Burley, S.; Swaminathan, S.

    2006-01-01

    The New York Structural GenomiX Research Consortium (NYSGXRC) has selected the protein coded by the yxaF gene from Bacillus subtilis as a target for structure determination. The yxaF protein has 191 residues with a molecular mass of 21 kDa and had no sequence homology to any structure in the Protein Data Bank (PDB) at the time of target selection. We aimed to elucidate the three-dimensional structure for the putative protein yxaF to better understand the relationship between protein sequence, structure, and function. This protein is annotated as a putative helix-turn-helix (HTH) type transcriptional regulator. Many transcriptional regulators like TetR and QacR use a structurally well-defined DNA-binding HTH motif to recognize the target DNA sequences. DNA-HTH motif interactions have been extensively studied. As the HTH motif is structurally conserved in many regulatory proteins, these DNA-protein complexes show some similarity in DNA recognition patterns. Many such regulatory proteins have a ligand-binding domain in addition to the DNA-binding domain. Structural studies on ligand-binding regulatory proteins provide a wealth of information on ligand-, and possibly drug-, binding mechanisms. Understanding the ligand-binding mechanism may help overcome problems with drug resistance, which represent increasing challenges in medicine. The protein encoded by yxaF, hereafter called T1414, shows a fold similar to the QacR repressor and the TetR/CamR repressor and possesses putative DNA- and ligand-binding domains. Here, we report the crystal structure of T1414 and compare it with structurally similar drug- and DNA-binding proteins.

  20. A regenerated electrochemical biosensor for label-free detection of glucose and urea based on conformational switch of i-motif oligonucleotide probe

    Energy Technology Data Exchange (ETDEWEB)

    Gao, Zhong Feng; Chen, Dong Mei [Key Laboratory of Eco-environments in Three Gorges Reservoir Region (Ministry of Education), School of Chemistry and Chemical Engineering, Southwest University, Chongqing 400715 (China); Lei, Jing Lei [School of Chemistry and Chemical Engineering, Chongqing University, Chongqing 400044 (China); Luo, Hong Qun, E-mail: luohq@swu.edu.cn [Key Laboratory of Eco-environments in Three Gorges Reservoir Region (Ministry of Education), School of Chemistry and Chemical Engineering, Southwest University, Chongqing 400715 (China); Li, Nian Bing, E-mail: linb@swu.edu.cn [Key Laboratory of Eco-environments in Three Gorges Reservoir Region (Ministry of Education), School of Chemistry and Chemical Engineering, Southwest University, Chongqing 400715 (China)

    2015-10-15

    Improving the reproducibility of the electrochemical signal has remained a great challenge over the past decades. In this work, an i-motif oligonucleotide probe-based electrochemical DNA (E-DNA) sensor is introduced for the first time as a regenerated sensing platform, which enhances the reproducibility of the electrochemical signal, for label-free detection of glucose and urea. The addition of glucose or urea activates the glucose oxidase-catalyzed or urease-catalyzed reaction, inducing or destroying the formation of the i-motif oligonucleotide probe. The conformational switch of the oligonucleotide probe can be recorded by electrochemical impedance spectroscopy. Thus, the difference in electron transfer resistance is used for the quantitative determination of glucose and urea. We further demonstrate that the E-DNA sensor exhibits high selectivity, excellent stability, and remarkable regenerated ability. The human serum analysis indicates that this simple and regenerated strategy holds promising potential in future biosensing applications. - Highlights: • Conformational switch of i-motif is used for the detection of glucose and urea. • The sensor can be regenerated. • The proposed method is successfully applied in real sample assay. • Our method is label-free and inexpensive.

  1. Modulation of i-motif thermodynamic stability by the introduction of UNA (unlocked nucleic acid) monomers

    DEFF Research Database (Denmark)

    Pasternak, Anna; Wengel, Jesper

    2011-01-01

    The influence of acyclic RNA derivatives, UNA (unlocked nucleic acid) monomers, on i-DNA thermodynamic stability has been investigated. The 22 nt human telomeric fragment was chosen as the model sequence for stability studies. UNA monomers modulate i-motif stability in a position-depending manner...

  2. DNA nanotechnology: On-command molecular Trojans

    Science.gov (United States)

    Niemeyer, Christof M.

    2017-12-01

    Lipid-motif-decorated DNA nanocapsules filled with photoresponsive polymers are capable of delivering signalling molecules into target organisms for biological perturbations at high spatiotemporal resolution.

  3. An essential GT motif in the lamin A promoter mediates activation by CREB-binding protein

    International Nuclear Information System (INIS)

    Janaki Ramaiah, M.; Parnaik, Veena K.

    2006-01-01

    Lamin A is an important component of nuclear architecture in mammalian cells. Mutations in the human lamin A gene lead to highly degenerative disorders that affect specific tissues. In studies directed towards understanding the mode of regulation of the lamin A promoter, we have identified an essential GT motif at -55 position by reporter gene assays and mutational analysis. Binding of this sequence to Sp transcription factors has been observed in electrophoretic mobility shift assays and by chromatin immunoprecipitation studies. Further functional analysis by co-expression of recombinant proteins and ChIP assays has shown an important regulatory role for CREB-binding protein in promoter activation, which is mediated by the GT motif

  4. Mycobacterium smegmatis PafBC is involved in regulation of DNA damage response.

    Science.gov (United States)

    Fudrini Olivencia, Begonia; Müller, Andreas U; Roschitzki, Bernd; Burger, Sibylle; Weber-Ban, Eilika; Imkamp, Frank

    2017-10-25

    Two genes, pafB and pafC, are organized in an operon with the Pup-ligase gene pafA, which is part of the Pup-proteasome system (PPS) present in mycobacteria and other actinobacteria. The PPS is crucial for Mycobacterium tuberculosis resistance towards reactive nitrogen intermediates (RNI). However, pafB and pafC apparently play only a minor role in RNI resistance. To characterize their function, we generated a pafBC deletion in Mycobacterium smegmatis (Msm). Proteome analysis of the mutant strain revealed decreased cellular levels of various proteins involved in DNA damage repair, including recombinase A (RecA). In agreement with this finding, Msm ΔpafBC displayed increased sensitivity to DNA damaging agents. In mycobacteria two pathways regulate DNA repair genes: the LexA/RecA-dependent SOS response and a predominant pathway that controls gene expression via a LexA/RecA-independent promoter, termed P1. PafB and PafC feature winged helix-turn-helix DNA binding motifs and we demonstrate that together they form a stable heterodimer in vitro, implying a function as a heterodimeric transcriptional regulator. Indeed, P1-driven transcription of recA was decreased in Msm ΔpafBC under standard conditions and induction of recA expression upon DNA damage was strongly impaired. Taken together, our data indicate an important regulatory function of PafBC in the mycobacterial DNA damage response.

  5. The MHC motif viewer: a visualization tool for MHC binding motifs

    DEFF Research Database (Denmark)

    Rapin, Nicolas; Hoof, Ilka; Lund, Ole

    2010-01-01

    is hampered by the lack of tools for browsing and comparing specificity of these molecules. We have developed a Web server, MHC Motif Viewer, which allows the display of the binding motif for MHC class I proteins for human, chimpanzee, rhesus monkey, mouse, and swine, as well as HLA-DR protein sequences...

  6. DNA nanotechnology

    Science.gov (United States)

    Seeman, Nadrian C.; Sleiman, Hanadi F.

    2018-01-01

    DNA is the molecule that stores and transmits genetic information in biological systems. The field of DNA nanotechnology takes this molecule out of its biological context and uses its information to assemble structural motifs and then to connect them together. This field has had a remarkable impact on nanoscience and nanotechnology, and has been revolutionary in our ability to control molecular self-assembly. In this Review, we summarize the approaches used to assemble DNA nanostructures and examine their emerging applications in areas such as biophysics, diagnostics, nanoparticle and protein assembly, biomolecule structure determination, drug delivery and synthetic biology. The introduction of orthogonal interactions into DNA nanostructures is discussed, and finally, a perspective on the future directions of this field is presented.

  7. Predicting gene regulatory networks of soybean nodulation from RNA-Seq transcriptome data.

    Science.gov (United States)

    Zhu, Mingzhu; Dahmen, Jeremy L; Stacey, Gary; Cheng, Jianlin

    2013-09-22

    High-throughput RNA sequencing (RNA-Seq) is a revolutionary technique to study the transcriptome of a cell under various conditions at a systems level. Despite the wide application of RNA-Seq techniques to generate experimental data in the last few years, few computational methods are available to analyze this huge amount of transcription data. The computational methods for constructing gene regulatory networks from RNA-Seq expression data of hundreds or even thousands of genes are particularly lacking and urgently needed. We developed an automated bioinformatics method to predict gene regulatory networks from the quantitative expression values of differentially expressed genes based on RNA-Seq transcriptome data of a cell in different stages and conditions, integrating transcriptional, genomic and gene function data. We applied the method to the RNA-Seq transcriptome data generated for soybean root hair cells in three different development stages of nodulation after rhizobium infection. The method predicted a soybean nodulation-related gene regulatory network consisting of 10 regulatory modules common for all three stages, and 24, 49 and 70 modules separately for the first, second and third stage, each containing both a group of co-expressed genes and several transcription factors collaboratively controlling their expression under different conditions. 8 of 10 common regulatory modules were validated by at least two kinds of validations, such as independent DNA binding motif analysis, gene function enrichment test, and previous experimental data in the literature. We developed a computational method to reliably reconstruct gene regulatory networks from RNA-Seq transcriptome data. The method can generate valuable hypotheses for interpreting biological data and designing biological experiments such as ChIP-Seq, RNA interference, and yeast two hybrid experiments.

  8. Regulatory elements of the floral homeotic gene AGAMOUS identified by phylogenetic footprinting and shadowing.

    Energy Technology Data Exchange (ETDEWEB)

    Hong, R. L., Hamaguchi, L., Busch, M. A., and Weigel, D.

    2003-06-01

    OAK-B135 In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.

  9. Fanconi anemia core complex gene promoters harbor conserved transcription regulatory elements.

    Science.gov (United States)

    Meier, Daniel; Schindler, Detlev

    2011-01-01

    The Fanconi anemia (FA) gene family is a recent addition to the complex network of proteins that respond to and repair certain types of DNA damage in the human genome. Since little is known about the regulation of this novel group of genes at the DNA level, we characterized the promoters of the eight genes (FANCA, B, C, E, F, G, L and M) that compose the FA core complex. The promoters of these genes show the characteristic attributes of housekeeping genes, such as a high GC content and CpG islands, a lack of TATA boxes and a low conservation. The promoters functioned in a monodirectional way and were, in their most active regions, comparable in strength to the SV40 promoter in our reporter plasmids. They were also marked by a distinctive transcriptional start site (TSS). In the 5' region of each promoter, we identified a region that was able to negatively regulate the promoter activity in HeLa and HEK 293 cells in isolation. The central and 3' regions of the promoter sequences harbor binding sites for several common and rare transcription factors, including STAT, SMAD, E2F, AP1 and YY1, which indicates that there may be cross-connections to several established regulatory pathways. Electrophoretic mobility shift assays and siRNA experiments confirmed the shared regulatory responses between the prominent members of the TGF-β and JAK/STAT pathways and members of the FA core complex. Although the promoters are not well conserved, they share region and sequence specific regulatory motifs and transcription factor binding sites (TBFs), and we identified a bi-partite nature to these promoters. These results support a hypothesis based on the co-evolution of the FA core complex genes that was expanded to include their promoters.

  10. Fanconi anemia core complex gene promoters harbor conserved transcription regulatory elements.

    Directory of Open Access Journals (Sweden)

    Daniel Meier

    The Fanconi anemia (FA) gene family is a recent addition to the complex network of proteins that respond to and repair certain types of DNA damage in the human genome. Since little is known about the regulation of this novel group of genes at the DNA level, we characterized the promoters of the eight genes (FANCA, B, C, E, F, G, L and M) that compose the FA core complex. The promoters of these genes show the characteristic attributes of housekeeping genes, such as a high GC content and CpG islands, a lack of TATA boxes and a low conservation. The promoters functioned in a monodirectional way and were, in their most active regions, comparable in strength to the SV40 promoter in our reporter plasmids. They were also marked by a distinctive transcriptional start site (TSS). In the 5' region of each promoter, we identified a region that was able to negatively regulate the promoter activity in HeLa and HEK 293 cells in isolation. The central and 3' regions of the promoter sequences harbor binding sites for several common and rare transcription factors, including STAT, SMAD, E2F, AP1 and YY1, which indicates that there may be cross-connections to several established regulatory pathways. Electrophoretic mobility shift assays and siRNA experiments confirmed the shared regulatory responses between the prominent members of the TGF-β and JAK/STAT pathways and members of the FA core complex. Although the promoters are not well conserved, they share region and sequence specific regulatory motifs and transcription factor binding sites (TBFs), and we identified a bi-partite nature to these promoters. These results support a hypothesis based on the co-evolution of the FA core complex genes that was expanded to include their promoters.

  11. Specific interactions between DNA and regulatory protein controlled by ligand-binding: Ab initio molecular simulation

    Energy Technology Data Exchange (ETDEWEB)

    Matsushita, Y.; Murakawa, T.; Shimamura, K.; Oishi, M.; Ohyama, T.; Kurita, N., E-mail: kurita@cs.tut.ac.jp [Department of Computer Science and Engineering, Toyohashi University of Technology, Tempaku-cho, Toyohashi, Aichi, 441-8580 (Japan)]

    2015-02-27

    The catabolite activator protein (CAP) is one of the regulatory proteins controlling the transcription mechanism of genes. Biochemical experiments elucidated that the complex of CAP with cyclic AMP (cAMP) is indispensable for controlling the mechanism, while previous molecular simulations for the monomer of the CAP+cAMP complex revealed the specific interactions between CAP and cAMP. However, the effect of cAMP binding to CAP on the specific interactions between CAP and DNA has not been elucidated at atomic and electronic levels. We here considered the ternary complex of CAP, cAMP and DNA in solvating water molecules and investigated the specific interactions between them at atomic and electronic levels using ab initio molecular simulations based on classical molecular dynamics and ab initio fragment molecular orbital methods. The results highlight the important amino acid residues of CAP for the interactions between CAP and cAMP and between CAP and DNA.

  12. Specific interactions between DNA and regulatory protein controlled by ligand-binding: Ab initio molecular simulation

    International Nuclear Information System (INIS)

    Matsushita, Y.; Murakawa, T.; Shimamura, K.; Oishi, M.; Ohyama, T.; Kurita, N.

    2015-01-01

    The catabolite activator protein (CAP) is one of the regulatory proteins controlling the transcription mechanism of genes. Biochemical experiments elucidated that the complex of CAP with cyclic AMP (cAMP) is indispensable for controlling the mechanism, while previous molecular simulations for the monomer of the CAP+cAMP complex revealed the specific interactions between CAP and cAMP. However, the effect of cAMP binding to CAP on the specific interactions between CAP and DNA has not been elucidated at atomic and electronic levels. We here considered the ternary complex of CAP, cAMP and DNA in solvating water molecules and investigated the specific interactions between them at atomic and electronic levels using ab initio molecular simulations based on classical molecular dynamics and ab initio fragment molecular orbital methods. The results highlight the important amino acid residues of CAP for the interactions between CAP and cAMP and between CAP and DNA.

  13. The NS1 polypeptide of the murine parvovirus minute virus of mice binds to DNA sequences containing the motif [ACCA]2-3.

    Science.gov (United States)

    Cotmore, S F; Christensen, J; Nüesch, J P; Tattersall, P

    1995-03-01

    A DNA fragment containing the minute virus of mice 3' replication origin was specifically coprecipitated in immune complexes containing the virally coded NS1, but not the NS2, polypeptide. Antibodies directed against the amino- or carboxy-terminal regions of NS1 precipitated the NS1-origin complexes, but antibodies directed against NS1 amino acids 284 to 459 blocked complex formation. Using affinity-purified histidine-tagged NS1 preparations, we have shown that the specific protein-DNA interaction is of moderate affinity, being stable in 0.1 M salt but rapidly lost at higher salt concentrations. In contrast, generalized (or nonspecific) DNA binding by NS1 could be demonstrated only in low salt. Addition of ATP or gamma S-ATP enhanced specific DNA binding by wild-type NS1 severalfold, but binding was lost under conditions which favored ATP hydrolysis. NS1 molecules with mutations in a critical lysine residue (amino acid 405) in the consensus ATP-binding site bound to the origin, but this binding could not be enhanced by ATP addition. DNase I protection assays carried out with wild-type NS1 in the presence of gamma S-ATP gave footprints which extended over 43 nucleotides on both DNA strands, from the middle of the origin bubble sequence to a position some 14 bp beyond the nick site. The DNA-binding site for NS1 was mapped to a 22-bp fragment from the middle of the 3' replication origin which contains the sequence ACCAACCA. This conforms to a reiterated motif (ACCA)2-3, which occurs, in more or less degenerate form, at many sites throughout the minute virus of mice genome (J. W. Bodner, Virus Genes 2:167-182, 1989). Insertion of a single copy of the sequence (ACCA)3 was shown to be sufficient to confer NS1 binding on an otherwise unrecognized plasmid fragment. The functions of NS1 in the viral life cycle are reevaluated in the light of this result.
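    The reiterated motif notation (ACCA)2-3 in the abstract above maps directly onto a regular-expression repeat. The following is a minimal sketch, assuming an exact, forward-strand match only; the function name and the test fragment are hypothetical, and the degenerate forms mentioned in the abstract are not covered.

```python
import re

# Two or three tandem copies of ACCA, i.e. the abstract's (ACCA)2-3 notation.
# Assumption: exact matches on the given strand only; degenerate variants are ignored.
ACCA_REPEAT = re.compile(r"(?:ACCA){2,3}")


def find_acca_repeats(seq):
    """Return (start, matched_text) for each non-overlapping ACCA repeat in seq."""
    return [(m.start(), m.group()) for m in ACCA_REPEAT.finditer(seq.upper())]


# Hypothetical fragment; only the ACCAACCA core is taken from the abstract.
example = "ttggACCAACCAtgca"
print(find_acca_repeats(example))  # [(4, 'ACCAACCA')]
```

    Because the quantifier is greedy, a run of three copies is reported as one hit rather than as overlapping two-copy hits.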

  14. Kopi dan Kakao dalam Kreasi Motif Batik Khas Jember [Coffee and Cocoa in the Creation of Distinctive Jember Batik Motifs]

    Directory of Open Access Journals (Sweden)

    Irfa'ina Rohana Salma

    2015-06-01

    ABSTRACT (Indonesian) Batik Jember has long been identified with the tobacco leaf motif. The visual rendering of the tobacco leaf in Batik Jember motifs is rather weak: it lacks character, because the resulting motif looks like a picture of a leaf in general. It is therefore necessary to create batik motif designs distinctive to Jember, with inspiration drawn from other natural resources of Jember that have specific, characteristic shapes, so that a stronger motif identity can be achieved. These characteristic natural products of Jember are coffee and cocoa. The aim of this art creation is to produce new batik motifs with the distinctive character of Jember. The methods used were data collection, in-depth observation of the objects of creation, study of the sources of inspiration, motif design, and realization as batik. This work produced six batik motifs: (1) Motif Uwoh Kopi; (2) Motif Godong Kopi; (3) Motif Ceplok Kakao; (4) Motif Kakao Raja; (5) Motif Kakao Biru; and (6) Motif Wiji Mukti. An "Aesthetic Taste" assessment showed that the most liked motifs were Motif Uwoh Kopi and Motif Kakao Raja. Keywords: Motif Woh Kopi, Motif Godong Kopi, Motif Ceplok Kakao, Motif Kakao Raja, Motif Kakao Biru, Motif Wiji Mukti ABSTRACT (English) Batik Jember is synonymous with the tobacco leaf motif. The tobacco leaf shape is quite weak in its visual appearance, as the motif emerges like a picture of leaves in general. Therefore, it is necessary to create a distinctive design motif extracted from other natural resources of Jember that have specific shapes and characteristics, so that a stronger motif identity can be obtained. The typical natural resources from Jember are coffee and cocoa. The purpose of the creation of this art is to produce unique, creative and innovative batik that has the specific characteristics of Jember. The methods used are data collection, observation of the object, reviewing inspiration sources

  15. Robustness and backbone motif of a cancer network regulated by miR-17-92 cluster during the G1/S transition.

    Directory of Open Access Journals (Sweden)

    Lijian Yang

    Based on interactions among transcription factors, oncogenes, tumor suppressors and microRNAs, a Boolean model of a cancer network regulated by the miR-17-92 cluster is constructed, and the network is associated with the control of the G1/S transition in the mammalian cell cycle. The robustness properties of this regulatory network are investigated by virtue of Boolean network theory. It is found that, during the G1/S transition in the cell cycle process, the regulatory networks are robustly constructed, and the robustness property is largely preserved with respect to small perturbations to the network. By using the unique process-based approach, the structure of this network is analyzed. It is shown that the network can be decomposed into a backbone motif which provides the main biological functions, and a remaining motif which makes the regulatory system more stable. The critical role of miR-17-92 in suppressing the G1/S cell cycle checkpoint and increasing the uncontrolled proliferation of the cancer cells by targeting a genetic network of interacting proteins is displayed with our model.
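    To make the Boolean-network idea concrete, the sketch below enumerates every state of a synchronous Boolean network, follows each to its attractor, and reports basin sizes. The three-node rules are a hypothetical toy and are not the miR-17-92 network from the record above; a dominant basin is used here only as one common proxy for robustness, since the abstract does not specify the measure applied.

```python
from itertools import product

# Hypothetical three-node toy rules (NOT the network from the paper).
# State order is (A, B, C); each rule gives one node's next value.
RULES = (
    lambda a, b, c: a and not c,  # A persists unless C represses it
    lambda a, b, c: a,            # B is activated by A
    lambda a, b, c: b or c,       # C is activated by B and sustains itself
)


def step(state):
    """Synchronously update every node from the current state."""
    return tuple(bool(rule(*state)) for rule in RULES)


def attractor(state):
    """Iterate until a state repeats; return the cycle in a canonical rotation."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    cycle = seen[seen.index(state):]
    i = cycle.index(min(cycle))
    return tuple(cycle[i:] + cycle[:i])


def basin_sizes(n_nodes=3):
    """Map each attractor to the number of initial states that flow into it."""
    sizes = {}
    for state in product((False, True), repeat=n_nodes):
        att = attractor(state)
        sizes[att] = sizes.get(att, 0) + 1
    return sizes


for att, size in basin_sizes().items():
    print(att, size)
```

    In this toy network, seven of the eight initial states converge on the fixed point with only C active, and the all-off state is a fixed point with a basin of one.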

  16. Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

    Science.gov (United States)

    Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn

    2009-01-01

    Background Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to being helpful in distinguishing between duplicate genome regions and in determining correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs, polyadenylation signal variation and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). This suggests that the remaining cDNA

  17. Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

    Directory of Open Access Journals (Sweden)

    Lunner Sigbjørn

    2009-10-01

    Abstract Background Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to being helpful in distinguishing between duplicate genome regions and in determining correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs, polyadenylation signal variation and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). This

  18. Unlocked nucleic acids with a pyrene-modified uracil: Synthesis, hybridization studies, fluorescent properties and i-motif stability

    DEFF Research Database (Denmark)

    Perlíková, P.; Karlsen, K.K.; Pedersen, E.B.

    2014-01-01

    The synthesis of two new phosphoramidite building blocks for the incorporation of 5-(pyren-1-yl)uracilyl unlocked nucleic acid (UNA) monomers into oligonucleotides has been developed. Monomers containing a pyrene-modified nucleobase component were found to destabilize an i-motif structure at pH 5...... intensities upon hybridization to DNA or RNA. Efficient quenching of fluorescence of pyrene-modified UNA monomers was observed after formation of i-motif structures at pH 5.2. The stabilizing/destabilizing effect of pyrene-modified nucleic acids might be useful for designing antisense oligonucleotides...

  19. Aberrant DNA Methylation in Human iPSCs Associates with MYC-Binding Motifs in a Clone-Specific Manner Independent of Genetics.

    Science.gov (United States)

    Panopoulos, Athanasia D; Smith, Erin N; Arias, Angelo D; Shepard, Peter J; Hishida, Yuriko; Modesto, Veronica; Diffenderfer, Kenneth E; Conner, Clay; Biggs, William; Sandoval, Efren; D'Antonio-Chronowska, Agnieszka; Berggren, W Travis; Izpisua Belmonte, Juan Carlos; Frazer, Kelly A

    2017-04-06

    Induced pluripotent stem cells (iPSCs) show variable methylation patterns between lines, some of which reflect aberrant differences relative to embryonic stem cells (ESCs). To examine whether this aberrant methylation results from genetic variation or non-genetic mechanisms, we generated human iPSCs from monozygotic twins to investigate how genetic background, clone, and passage number contribute. We found that aberrantly methylated CpGs are enriched in regulatory regions associated with MYC protein motifs and affect gene expression. We classified differentially methylated CpGs as being associated with genetic and/or non-genetic factors (clone and passage), and we found that aberrant methylation preferentially occurs at CpGs associated with clone-specific effects. We further found that clone-specific effects play a strong role in recurrent aberrant methylation at specific CpG sites across different studies. Our results argue that a non-genetic biological mechanism underlies aberrant methylation in iPSCs and that it is likely based on a probabilistic process involving MYC that takes place during or shortly after reprogramming. Published by Elsevier Inc.

  20. Structure of the central RNA recognition motif of human TIA-1 at 1.95 Å resolution

    International Nuclear Information System (INIS)

    Kumar, Amit O.; Swenson, Matthew C.; Benning, Matthew M.; Kielkopf, Clara L.

    2008-01-01

    T-cell-restricted intracellular antigen-1 (TIA-1) regulates alternative pre-mRNA splicing in the nucleus, and mRNA translation in the cytoplasm, by recognizing uridine-rich sequences of RNAs. As a step towards understanding RNA recognition by this regulatory factor, the X-ray structure of the central RNA recognition motif (RRM2) of human TIA-1 is presented at 1.95 Å resolution. Comparison with structurally homologous RRM-RNA complexes identifies residues at the RNA interfaces that are conserved in TIA-1-RRM2. The versatile capability of RNP motifs to interact with either proteins or RNA is reinforced by symmetry-related protein-protein interactions mediated by the RNP motifs of TIA-1-RRM2. Importantly, the TIA-1-RRM2 structure reveals the locations of mutations responsible for inhibiting nuclear import. In contrast with previous assumptions, the mutated residues are buried within the hydrophobic interior of the domain, where they would be likely to destabilize the RRM fold rather than directly inhibit RNA binding.

  1. Maternal Stress, Preterm Birth, and DNA Methylation at Imprint Regulatory Sequences in Humans

    Directory of Open Access Journals (Sweden)

    Adriana C. Vidal

    2014-01-01

    In infants exposed to maternal stress in utero, phenotypic plasticity through epigenetic events may mechanistically explain increased risk of preterm birth (PTB), which confers increased risk for neurodevelopmental disorders, cardiovascular disease, and cancers in adulthood. We examined associations between prenatal maternal stress and PTB, evaluating the role of DNA methylation at imprint regulatory regions. We enrolled women from prenatal clinics in Durham, NC. Stress was measured in 537 women at 12 weeks of gestation using the Perceived Stress Scale. DNA methylation at differentially methylated regions (DMRs) associated with H19, IGF2, MEG3, MEST, SGCE/PEG10, PEG3, NNAT, and PLAGL1 was measured from peripheral and cord blood using bisulfite pyrosequencing in a sub-sample of 79 mother–infant pairs. We examined associations between PTB and stress and evaluated differences in DNA methylation at each DMR by stress. Maternal stress was not associated with PTB (OR = 0.98; 95% CI, 0.40–2.40; P = 0.96), after adjustment for maternal body mass index (BMI), income, and raised blood pressure. However, elevated stress was associated with higher infant DNA methylation at the MEST DMR (2.8% difference, P < 0.01) after adjusting for PTB. Maternal stress may be associated with epigenetic changes at MEST, a gene relevant to maternal care and obesity. Reduced prenatal stress may support the epigenomic profile of a healthy infant.

  2. Generic Properties of Random Gene Regulatory Networks.

    Science.gov (United States)

    Li, Zhiyuan; Bianco, Simone; Zhang, Zhaoyang; Tang, Chao

    2013-12-01

    Modeling gene regulatory networks (GRNs) is an important topic in systems biology. Although there has been much work focusing on various specific systems, the generic behavior of GRNs with continuous variables is still elusive. In particular, it is not clear typically how attractors partition among the three types of orbits: steady state, periodic and chaotic, and how the dynamical properties change with network's topological characteristics. In this work, we first investigated these questions in random GRNs with different network sizes, connectivity, fraction of inhibitory links and transcription regulation rules. Then we searched for the core motifs that govern the dynamic behavior of large GRNs. We show that the stability of a random GRN is typically governed by a few embedding motifs of small sizes, and therefore can in general be understood in the context of these short motifs. Our results provide insights for the study and design of genetic networks.

  3. CombiMotif: A new algorithm for network motifs discovery in protein-protein interaction networks

    Science.gov (United States)

    Luo, Jiawei; Li, Guanghui; Song, Dan; Liang, Cheng

    2014-12-01

    Discovering motifs in protein-protein interaction networks has become a major challenge in computational biology, since the distribution of the number of network motifs can reveal significant systemic differences among species. However, this task can be computationally expensive because it involves graph isomorphism detection. In this paper, we present a new algorithm (CombiMotif) that incorporates combinatorial techniques to count non-induced occurrences of subgraph topologies in the form of trees. The efficiency of our algorithm is demonstrated by comparing the obtained results with the current state-of-the-art subgraph counting algorithms. We also show major differences between unicellular and multicellular organisms. The datasets and source code of CombiMotif are freely available upon request.
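
    A simple case of this kind of combinatorial counting: non-induced occurrences of a star-shaped tree can be counted from node degrees alone, without enumerating subgraphs or testing isomorphism. The sketch below is not the CombiMotif algorithm itself; the function name `count_star_occurrences` and the toy edge list are illustrative assumptions.

```python
from collections import defaultdict
from math import comb


def count_star_occurrences(edges, leaves):
    """Count non-induced stars with `leaves` leaves in an undirected simple graph.

    A node of degree d is the centre of C(d, leaves) such stars, because any
    choice of `leaves` distinct neighbours gives one occurrence.
    """
    if leaves < 2:
        # With one leaf the "star" is a single edge and would be counted twice.
        raise ValueError("leaves must be at least 2")
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbours[u].add(v)
            neighbours[v].add(u)
    return sum(comb(len(nbrs), leaves) for nbrs in neighbours.values())


# Hypothetical toy network: a triangle a-b-c with a pendant node d attached to c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(count_star_occurrences(toy_edges, 2))  # 3-node paths
print(count_star_occurrences(toy_edges, 3))  # 4-node stars
```

    Because the count is non-induced, the three paths inside the triangle are included even though their end nodes are also connected to each other.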

  4. MSDmotif: exploring protein sites and motifs

    Directory of Open Access Journals (Sweden)

    Henrick Kim

    2008-07-01

    Background: Protein structures have conserved features (motifs), which have a sufficient influence on the protein function. These motifs can be found in sequence as well as in 3D space. Understanding of these fragments is essential for 3D structure prediction, modelling and drug-design. The Protein Data Bank (PDB) is the source of this information; however, present search tools have limited 3D options to integrate protein sequence with its 3D structure. Results: We describe here a web application for querying the PDB for ligands, binding sites, small 3D structural and sequence motifs and the underlying database. Novel algorithms for chemical fragments, 3D motifs, ϕ/ψ sequences, super-secondary structure motifs and for small 3D structural motif associations searches are incorporated. The interface provides functionality for visualization, search criteria creation, sequence and 3D multiple alignment options. MSDmotif is an integrated system where a results page is also a search form. A set of motif statistics is available for analysis. This set includes molecule and motif binding statistics, distribution of motif sequences, occurrence of an amino-acid within a motif, correlation of amino-acids side-chain charges within a motif and Ramachandran plots for each residue. The binding statistics are presented in association with properties that include a ligand fragment library. Access is also provided through the Distributed Annotation System (DAS) protocol. An additional entry point facilitates XML requests with XML responses. Conclusion: MSDmotif is unique in combining chemical, sequence and 3D data in a single search engine with a range of search and visualisation options. It provides multiple views of data found in the PDB archive for exploring protein structures.

  5. Simultaneous genome-wide inference of physical, genetic, regulatory, and functional pathway components.

    Directory of Open Access Journals (Sweden)

    Christopher Y Park

    2010-11-01

    Biomolecular pathways are built from diverse types of pairwise interactions, ranging from physical protein-protein interactions and modifications to indirect regulatory relationships. One goal of systems biology is to bridge three aspects of this complexity: the growing body of high-throughput data assaying these interactions; the specific interactions in which individual genes participate; and the genome-wide patterns of interactions in a system of interest. Here, we describe methodology for simultaneously predicting specific types of biomolecular interactions using high-throughput genomic data. This results in a comprehensive compendium of whole-genome networks for yeast, derived from ∼3,500 experimental conditions and describing 30 interaction types, which range from general (e.g. physical or regulatory) to specific (e.g. phosphorylation or transcriptional regulation). We used these networks to investigate molecular pathways in carbon metabolism and cellular transport, proposing a novel connection between glycogen breakdown and glucose utilization supported by recent publications. Additionally, 14 specific predicted interactions in DNA topological change and protein biosynthesis were experimentally validated. We analyzed the systems-level network features within all interactomes, verifying the presence of small-world properties and enrichment for recurring network motifs. This compendium of physical, synthetic, regulatory, and functional interaction networks has been made publicly available through an interactive web interface for investigators to utilize in future research at http://function.princeton.edu/bioweaver/.

  6. GANN: Genetic algorithm neural networks for the detection of conserved combinations of features in DNA

    Directory of Open Access Journals (Sweden)

    Beiko Robert G

    2005-02-01

    Background: The multitude of motif detection algorithms developed to date have largely focused on the detection of patterns in primary sequence. Since sequence-dependent DNA structure and flexibility may also play a role in protein-DNA interactions, the simultaneous exploration of sequence- and structure-based hypotheses about the composition of binding sites and the ordering of features in a regulatory region should be considered as well. The consideration of structural features requires the development of new detection tools that can deal with data types other than primary sequence. Results: GANN (available at http://bioinformatics.org.au/gann) is a machine learning tool for the detection of conserved features in DNA. The software suite contains programs to extract different regions of genomic DNA from flat files and convert these sequences to indices that reflect sequence and structural composition or the presence of specific protein binding sites. The machine learning component allows the classification of different types of sequences based on subsamples of these indices, and can identify the best combinations of indices and machine learning architecture for sequence discrimination. Another key feature of GANN is the replicated splitting of data into training and test sets, and the implementation of negative controls. In validation experiments, GANN successfully merged important sequence and structural features to yield good predictive models for synthetic and real regulatory regions. Conclusion: GANN is a flexible tool that can search through large sets of sequence and structural feature combinations to identify those that best characterize a set of sequences.

  7. 5' Region of the human interleukin 4 gene: structure and potential regulatory elements

    Energy Technology Data Exchange (ETDEWEB)

    Eder, A; Krafft-Czepa, H; Krammer, P H

    1988-01-25

    The lymphokine Interleukin 4 (IL-4) is secreted by antigen or mitogen activated T lymphocytes. IL-4 stimulates activation and differentiation of B lymphocytes and growth of T lymphocytes and mast cells. The authors isolated the human IL-4 gene from a lambda EMBL3 genomic library. As a probe they used a synthetic oligonucleotide spanning position 40 to 79 of the published IL-4 cDNA sequence. The 5' promoter region contains several sequence elements which may have a cis-acting regulatory function for IL-4 gene expression. These elements include a TATA-box, three CCAAT-elements (two are on the non-coding strand) and an octamer motif. A comparison of the 5' flanking region of the human and murine IL-4 gene (4) shows that the region between position -306 and +44 is highly conserved (83% homology).

  8. Promoter Motifs in NCLDVs: An Evolutionary Perspective

    Directory of Open Access Journals (Sweden)

    Graziele Pereira Oliveira

    2017-01-01

    For many years, gene expression in the three cellular domains has been studied in an attempt to discover sequences associated with the regulation of the transcription process. Some specific transcriptional features were described in viruses, although few studies have been devoted to understanding the evolutionary aspects related to the spread of promoter motifs through related viral families. The discovery of giant viruses and the proposition of the new viral order Megavirales that comprise a monophyletic group, named nucleo-cytoplasmic large DNA viruses (NCLDV), raised new questions in the field. Some putative promoter sequences have already been described for some NCLDV members, bringing new insights into the evolutionary history of these complex microorganisms. In this review, we summarize the main aspects of the transcription regulation process in the three domains of life, followed by a systematic description of what is currently known about promoter regions in several NCLDVs. We also discuss how the analysis of the promoter sequences could bring new ideas about the giant viruses’ evolution. Finally, considering a possible common ancestor for the NCLDV group, we discussed possible promoters’ evolutionary scenarios and propose the term “MEGA-box” to designate an ancestor promoter motif (‘TATATAAAATTGA’) that could be evolved gradually by nucleotides’ gain and loss and point mutations.

  9. Promoter Motifs in NCLDVs: An Evolutionary Perspective

    Science.gov (United States)

    Oliveira, Graziele Pereira; Andrade, Ana Cláudia dos Santos Pereira; Rodrigues, Rodrigo Araújo Lima; Arantes, Thalita Souza; Boratto, Paulo Victor Miranda; Silva, Ludmila Karen dos Santos; Dornas, Fábio Pio; Trindade, Giliane de Souza; Drumond, Betânia Paiva; La Scola, Bernard; Kroon, Erna Geessien; Abrahão, Jônatas Santos

    2017-01-01

    For many years, gene expression in the three cellular domains has been studied in an attempt to discover sequences associated with the regulation of the transcription process. Some specific transcriptional features were described in viruses, although few studies have been devoted to understanding the evolutionary aspects related to the spread of promoter motifs through related viral families. The discovery of giant viruses and the proposition of the new viral order Megavirales that comprise a monophyletic group, named nucleo-cytoplasmic large DNA viruses (NCLDV), raised new questions in the field. Some putative promoter sequences have already been described for some NCLDV members, bringing new insights into the evolutionary history of these complex microorganisms. In this review, we summarize the main aspects of the transcription regulation process in the three domains of life, followed by a systematic description of what is currently known about promoter regions in several NCLDVs. We also discuss how the analysis of the promoter sequences could bring new ideas about the giant viruses’ evolution. Finally, considering a possible common ancestor for the NCLDV group, we discussed possible promoters’ evolutionary scenarios and propose the term “MEGA-box” to designate an ancestor promoter motif (‘TATATAAAATTGA’) that could be evolved gradually by nucleotides’ gain and loss and point mutations. PMID:28117683

  10. Sites of instability in the human TCF3 (E2A) gene adopt G-quadruplex DNA structures in vitro

    Science.gov (United States)

    Williams, Jonathan D.; Fleetwood, Sara; Berroyer, Alexandra; Kim, Nayun; Larson, Erik D.

    2015-01-01

    The formation of highly stable four-stranded DNA, called G-quadruplex (G4), promotes site-specific genome instability. G4 DNA structures fold from repetitive guanine sequences, and increasing experimental evidence connects G4 sequence motifs with specific gene rearrangements. The human transcription factor 3 (TCF3) gene (also termed E2A) is subject to genetic instability associated with severe disease, most notably a common translocation event t(1;19) associated with acute lymphoblastic leukemia. The sites of instability in TCF3 are not randomly distributed, but focused to certain sequences. We asked if G4 DNA formation could explain why TCF3 is prone to recombination and mutagenesis. Here we demonstrate that sequences surrounding the major t(1;19) break site and a region associated with copy number variations both contain G4 sequence motifs. The motifs identified readily adopt G4 DNA structures that are stable enough to interfere with DNA synthesis in physiological salt conditions in vitro. When introduced into the yeast genome, TCF3 G4 motifs promoted gross chromosomal rearrangements in a transcription-dependent manner. Our results provide a molecular rationale for the site-specific instability of human TCF3, suggesting that G4 DNA structures contribute to oncogenic DNA breaks and recombination. PMID:26029241
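
    To illustrate what scanning for G4 sequence motifs can look like, the sketch below uses a widely used textbook definition (four runs of at least three guanines separated by loops of 1 to 7 nucleotides). The abstract does not state which definition or tool the authors used, so the pattern, the function names and the test sequence are assumptions made for illustration only.

```python
import re

# Assumed motif definition: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+
G4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_g4_motifs(seq):
    """Return (start, end, strand) for non-overlapping G4 motifs on both strands.

    Coordinates are 0-based, end-exclusive, and always refer to the input strand.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), m.end(), "+") for m in G4_PATTERN.finditer(seq)]
    # A C-rich stretch on the input strand is a G-rich motif on the other strand.
    for m in G4_PATTERN.finditer(reverse_complement(seq)):
        hits.append((n - m.end(), n - m.start(), "-"))
    return sorted(hits)


example = "TTGGGAGGGTGGGCAGGGTT"  # hypothetical sequence, not taken from TCF3
print(find_g4_motifs(example))
```

    A sequence match of this kind only flags a candidate; whether it actually folds into G4 DNA has to be shown experimentally, as the authors did in vitro.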

  11. Genome-wide targeted prediction of ABA responsive genes in rice based on over-represented cis-motif in co-expressed genes.

    Science.gov (United States)

    Lenka, Sangram K; Lohia, Bikash; Kumar, Abhay; Chinnusamy, Viswanathan; Bansal, Kailash C

    2009-02-01

    Abscisic acid (ABA), the popular plant stress hormone, plays a key role in the regulation of a subset of stress-responsive genes. These genes respond to ABA through specific transcription factors which bind to cis-regulatory elements present in their promoters. We discovered the CGMCACGTGB motif, which contains the ABA Responsive Element (ABRE) core (ACGT), as an over-represented motif among the promoters of ABA responsive co-expressed genes in rice. A targeted gene prediction strategy using this motif led to the identification of 402 protein coding genes potentially regulated by an ABA-dependent molecular genetic network. RT-PCR analysis of 45 genes arbitrarily chosen from the 402 predicted genes confirmed 80% accuracy of our prediction. Plant Gene Ontology (GO) analysis of ABA responsive genes showed enrichment of signal transduction and stress related genes among diverse functional categories.
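
    The motif CGMCACGTGB is written with IUPAC degenerate nucleotide codes (M = A or C; B = C, G or T). As a minimal sketch of how such a motif could be located in promoter sequences, the code below expands the codes into a regular expression and scans both strands. It is not the authors' prediction pipeline; the function names and the example promoter are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def iupac_to_regex(motif):
    """Translate an IUPAC motif such as CGMCACGTGB into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(seq, motif="CGMCACGTGB"):
    """Return (start, strand, matched_site) for every hit on either strand.

    `start` is 0-based on the input strand; the site is reported 5'->3'
    on the strand where it matched. Overlapping hits are included.
    """
    seq = seq.upper()
    n, width = len(seq), len(motif)
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    rc = seq.translate(COMPLEMENT)[::-1]
    hits += [(n - m.start() - width, "-", m.group(1)) for m in pattern.finditer(rc)]
    return sorted(hits)


promoter = "AAACGCCACGTGTAAA"  # hypothetical promoter fragment
print(find_motif(promoter))
```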

  12. Helping Students Understand Gene Regulation with Online Tools: A Review of MEME and Melina II, Motif Discovery Tools for Active Learning in Biology

    Directory of Open Access Journals (Sweden)

    David Treves

    2012-08-01

    Review of MEME and Melina II, two free and easy-to-use online motif discovery tools that can be employed to actively engage students in learning about gene regulatory elements.

  13. cWords - systematic microRNA regulatory motif discovery from mRNA expression data

    DEFF Research Database (Denmark)

    Rasmussen, Simon Horskjær; Jacobsen, Anders; Krogh, Anders

    2013-01-01

    and statistical methods of cWords, resulting in at least a factor 100 speed gain over the previous implementation. On a benchmark dataset of 19 microRNA (miRNA) perturbation experiments cWords showed equal or better performance than two comparable methods, miReduce and Sylamer. We have developed rigorous motif...... that demonstrate comparable or better performance than other existing methods. Rich visualization of results promotes intuitive and efficient interpretation of data. cWords is available as a stand-alone Open Source program at GitHub https://github.com/simras/cWords and as a web-service at: http...

  14. Statistical tests to compare motif count exceptionalities

    Directory of Open Access Journals (Sweden)

    Vandewalle Vincent

    2007-03-01

    Background: Finding over- or under-represented motifs in biological sequences is now a common task in genomics. Thanks to p-value calculation for motif counts, exceptional motifs are identified and represent candidate functional motifs. The present work addresses the related question of comparing the exceptionality of one motif in two different sequences. Just comparing the motif count p-values in each sequence is indeed not sufficient to decide if this motif is significantly more exceptional in one sequence compared to the other one. A statistical test is required. Results: We develop and analyze two statistical tests, an exact binomial one and an asymptotic likelihood ratio test, to decide whether the exceptionality of a given motif is equivalent or significantly different in two sequences of interest. For that purpose, motif occurrences are modeled by Poisson processes, with special care for overlapping motifs. Both tests can take the sequence compositions into account. As an illustration, we compare the octamer exceptionalities in the Escherichia coli K-12 backbone versus variable strain-specific loops. Conclusion: The exact binomial test is particularly adapted for small counts. For large counts, we advise using the likelihood ratio test, which is asymptotic but strongly correlated with the exact binomial test and very simple to use.
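
    The idea behind an exact binomial comparison can be sketched as follows, under simplifying assumptions that are ours rather than the authors': the expected motif counts `e1` and `e2` under each sequence's background model are supplied by the caller, counts are plain Poisson, and the correction for overlapping occurrences is ignored. If the motif is equally exceptional in both sequences, then conditional on the total count the count in sequence 1 is binomial with success probability e1 / (e1 + e2).

```python
from math import comb


def binomial_upper_tail(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p), summed exactly term by term."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


def compare_exceptionality(n1, n2, e1, e2):
    """One-sided p-value for 'the motif is more exceptional in sequence 1'.

    n1, n2: observed motif counts in the two sequences.
    e1, e2: expected counts under each sequence's background model
            (assumed to be computed elsewhere).
    """
    if e1 <= 0 or e2 <= 0:
        raise ValueError("expected counts must be positive")
    p_null = e1 / (e1 + e2)
    return binomial_upper_tail(n1, n1 + n2, p_null)


# Hypothetical counts: 8 vs 2 occurrences with equal expected counts.
print(compare_exceptionality(8, 2, 5.0, 5.0))
```

    Summing exact terms like this is only practical for small counts, which matches the regime where the abstract recommends the exact test.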

  15. [Regulatory effect and mechanism of RNA binding motif protein 38 on the expression of progesterone receptor in human breast cancer ZR-75-1 cells].

    Science.gov (United States)

    Lou, P P; Li, C L; Xia, T S; Shi, L; Wu, J; Zhou, X J; Wang, Y; Ding, Q

    2016-06-23

    To investigate the regulatory mechanism of RNA binding motif protein 38 (RNPC1) on the expression of progesterone receptor (PR) in breast cancer cell line ZR-75-1. Lentiviral vector was used to induce overexpression of RNPC1 in ZR-75-1 cells. qRT-PCR and Western blot were used to assess the regulatory effect of RNPC1 on PR expression. Actinomycin was used to detect the regulatory mechanism involved. Immunohistochemical (IHC) staining was used to determine the protein expression of RNPC1 and PR in 80 breast cancer tissues. IHC staining showed that the expression of RNPC1 was significantly higher in the PR positive breast cancer tissues than that in the PR negative breast cancer tissues (P<0.05). The qRT-PCR results showed that overexpression of RNPC1 in ZR-75-1 cells significantly upregulated the mRNA level of PR (1.764±0.028 vs. 1.001±0.037, P<0.01), whereas knockdown of RNPC1 did the opposite (0.579±0.007 vs. 1.000±0.002, P<0.01). The Western blot results also showed that overexpression of RNPC1 up-regulated PR levels, while knockdown of RNPC1 resulted in down-regulation of PR levels in the ZR-75-1 cells. The actinomycin assay showed that overexpression of RNPC1 increased the mRNA stability of PR. The half-life of PR mRNA was increased from 4.0 h to 6.5 h. Knockdown of RNPC1 decreased the mRNA stability of PR and the half-life of PR transcript was decreased from 4.1 h to 3.0 h. RNPC1 plays a crucial role in regulating the expression of PR in breast cancer ZR-75-1 cells.

  16. Polymerase chain reaction-mediated DNA fingerprinting for epidemiological studies on Campylobacter spp

    NARCIS (Netherlands)

    Giesendorf, B A; Goossens, H; Niesters, H G; Van Belkum, A; Koeken, A; Endtz, H P; Stegeman, H; Quint, W G

    The applicability of polymerase chain reaction (PCR)-mediated DNA typing, with primers complementary to dispersed repetitive DNA sequences and arbitrarily chosen DNA motifs, to study the epidemiology of campylobacter infection was evaluated. With a single PCR reaction and simple gel electrophoresis,

  17. A recoding method to improve the humoral immune response to an HIV DNA vaccine.

    Directory of Open Access Journals (Sweden)

    Yaoxing Huang

    This manuscript describes a novel strategy to improve HIV DNA vaccine design. Employing a new information theory based bioinformatic algorithm, we identify a set of nucleotide motifs which are common in the coding region of HIV, but are under-represented in genes that are highly expressed in the human genome. We hypothesize that these motifs contribute to the poor protein expression of gag, pol, and env genes from the cDNAs of HIV clinical isolates. Using this approach and beginning with a codon optimized consensus gag gene, we recode the nucleotide sequence so as to remove these motifs without modifying the amino acid sequence. Transfecting the recoded DNA sequence into a human kidney cell line results in doubling the gag protein expression level compared to the codon optimized version. We then turn both sequences into DNA vaccines and compare induced antibody response in a murine model. Our sequence, which has the motifs removed, induces a five-fold increase in gag antibody response compared to the codon optimized vaccine.

  18. Extensive evolutionary changes in regulatory element activity during human origins are associated with altered gene expression and positive selection.

    Directory of Open Access Journals (Sweden)

    Yoichiro Shibata

    2012-06-01

    Understanding the molecular basis for phenotypic differences between humans and other primates remains an outstanding challenge. Mutations in non-coding regulatory DNA that alter gene expression have been hypothesized as a key driver of these phenotypic differences. This has been supported by differential gene expression analyses in general, but not by the identification of specific regulatory elements responsible for changes in transcription and phenotype. To identify the genetic source of regulatory differences, we mapped DNaseI hypersensitive (DHS) sites, which mark all types of active gene regulatory elements, genome-wide in the same cell type isolated from human, chimpanzee, and macaque. Most DHS sites were conserved among all three species, as expected based on their central role in regulating transcription. However, we found evidence that several hundred DHS sites were gained or lost on the lineages leading to modern human and chimpanzee. Species-specific DHS site gains are enriched near differentially expressed genes, are positively correlated with increased transcription, show evidence of branch-specific positive selection, and overlap with active chromatin marks. Species-specific sequence differences in transcription factor motifs found within these DHS sites are linked with species-specific changes in chromatin accessibility. Together, these indicate that the regulatory elements identified here are genetic contributors to transcriptional and phenotypic differences among primate species.

  19. Evolutionary dynamics of DNA-binding sites and direct target genes of a floral master regulatory transcription factor [RNA-Seq]

    NARCIS (Netherlands)

    Muiño, J.M.; Bruijn, de S.A.; Vingron, Martin; Angenent, G.C.; Kaufmann, Kerstin

    2015-01-01

    Plant development is controlled by transcription factors (TFs) which form complex gene-regulatory networks. Genome-wide TF DNA-binding studies revealed that these TFs have several thousands of binding sites in the Arabidopsis genome, and may regulate the expression of many genes directly. Given the

  20. Rapid identification of DNA-binding proteins by mass spectrometry

    DEFF Research Database (Denmark)

    Nordhoff, E.; Krogsdam, A.-M.; Jørgensen, H.F.

    1999-01-01

    We report a protocol for the rapid identification of DNA-binding proteins. Immobilized DNA probes harboring a specific sequence motif are incubated with cell or nuclear extract. Proteins are analyzed directly off the solid support by matrix-assisted laser desorption/ionization time-of-flight mass...... was validated by the identification of known prokaryotic and eukaryotic DNA-binding proteins, and its use provided evidence that poly(ADP-ribose) polymerase exhibits DNA sequence-specific binding to DNA....

  1. Identity and functions of CxxC-derived motifs.

    Science.gov (United States)

    Fomenko, Dmitri E; Gladyshev, Vadim N

    2003-09-30

    Two cysteines separated by two other residues (the CxxC motif) are employed by many redox proteins for formation, isomerization, and reduction of disulfide bonds and for other redox functions. The place of the C-terminal cysteine in this motif may be occupied by serine (the CxxS motif), modifying the functional repertoire of redox proteins. Here we found that the CxxC motif may also give rise to a motif, in which the C-terminal cysteine is replaced with threonine (the CxxT motif). Moreover, in contrast to a view that the N-terminal cysteine in the CxxC motif always serves as a nucleophilic attacking group, this residue could also be replaced with threonine (the TxxC motif), serine (the SxxC motif), or other residues. In each of these CxxC-derived motifs, the presence of a downstream alpha-helix was strongly favored. A search for conserved CxxC-derived motif/helix patterns in four complete genomes representing bacteria, archaea, and eukaryotes identified known redox proteins and suggested possible redox functions for several additional proteins. Catalytic sites in peroxiredoxins were major representatives of the TxxC motif, whereas those in glutathione peroxidases represented the CxxT motif. Structural assessments indicated that threonines in these enzymes could stabilize catalytic thiolates, suggesting revisions to previously proposed catalytic triads. Each of the CxxC-derived motifs was also observed in natural selenium-containing proteins, in which selenocysteine was present in place of a catalytic cysteine.
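
    As a rough illustration of the sequence side of such a search, the sketch below scans a protein sequence for the five tetrad patterns named in the abstract (CxxC, CxxS, CxxT, TxxC, SxxC). It deliberately omits the downstream alpha-helix requirement and any conservation filter, so it will report many more hits than the published analysis; the function name and the example peptide are hypothetical.

```python
import re

# Lookahead so that overlapping tetrads are all reported.
# C..[CST] covers CxxC, CxxS, CxxT; [ST]..C covers TxxC and SxxC.
CXXC_DERIVED = re.compile(r"(?=(C[A-Z]{2}[CST]|[ST][A-Z]{2}C))")


def find_cxxc_derived(protein):
    """Return (position, motif_class, tetrad) for each CxxC-derived tetrad.

    Positions are 0-based. motif_class is e.g. 'CxxC' or 'TxxC'.
    """
    hits = []
    for m in CXXC_DERIVED.finditer(protein.upper()):
        tetrad = m.group(1)
        hits.append((m.start(), f"{tetrad[0]}xx{tetrad[3]}", tetrad))
    return hits


peptide = "AWCGPCKMTLLCQ"  # hypothetical sequence for demonstration
for hit in find_cxxc_derived(peptide):
    print(hit)
```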

  2. How We Make DNA Origami.

    Science.gov (United States)

    Wagenbauer, Klaus F; Engelhardt, Floris A S; Stahl, Evi; Hechtl, Vera K; Stömmer, Pierre; Seebacher, Fabian; Meregalli, Letizia; Ketterer, Philip; Gerling, Thomas; Dietz, Hendrik

    2017-10-05

    DNA origami has attracted substantial attention since its invention ten years ago, due to the seemingly infinite possibilities that it affords for creating customized nanoscale objects. Although the basic concept of DNA origami is easy to understand, using custom DNA origami in practical applications requires detailed know-how for designing and producing the particles with sufficient quality and for preparing them at appropriate concentrations with the necessary degree of purity in custom environments. Such know-how is not readily available for newcomers to the field, thus slowing down the rate at which new applications outside the field of DNA nanotechnology may emerge. To foster faster progress, we share in this article the experience in making and preparing DNA origami that we have accumulated over recent years. We discuss design solutions for creating advanced structural motifs including corners and various types of hinges that expand the design space for the more rigid multilayer DNA origami and provide guidelines for preventing undesired aggregation and on how to induce specific oligomerization of multiple DNA origami building blocks. In addition, we provide detailed protocols and discuss the expected results for five key methods that allow efficient and damage-free preparation of DNA origami. These methods are agarose-gel purification, filtration through molecular cut-off membranes, PEG precipitation, size-exclusion chromatography, and ultracentrifugation-based sedimentation. The guide for creating advanced design motifs and the detailed protocols with their experimental characterization that we describe here should lower the barrier for researchers to accomplish the full DNA origami production workflow. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. Motif decomposition of the phosphotyrosine proteome reveals a new N-terminal binding motif for SHIP2

    DEFF Research Database (Denmark)

    Miller, Martin Lee; Hanke, S.; Hinsby, A. M.

    2008-01-01

    set of 481 unique phosphotyrosine (Tyr(P)) peptides by sequence similarity to known ligands of the Src homology 2 (SH2) and the phosphotyrosine binding (PTB) domains. From 20 clusters we extracted 16 known and four new interaction motifs. Using quantitative mass spectrometry we pulled down Tyr......(P)-specific binding partners for peptides corresponding to the extracted motifs. We confirmed numerous previously known interaction motifs and found 15 new interactions mediated by phosphosites not previously known to bind SH2 or PTB. Remarkably, a novel hydrophobic N-terminal motif ((L/V/I)(L/V/I)pY) was identified...

  4. Molecular Detection, Phylogenetic Analysis, and Identification of Transcription Motifs in Feline Leukemia Virus from Naturally Infected Cats in Malaysia

    Directory of Open Access Journals (Sweden)

    Faruku Bande

    2014-01-01

    A nested PCR assay was used to determine the viral RNA and proviral DNA status of naturally infected cats. Selected samples that were FeLV-positive by PCR were subjected to sequencing, phylogenetic analysis, and motif search. Of the 39 samples that were positive for FeLV p27 antigen, 87.2% (34/39) were confirmed positive with nested PCR. FeLV proviral DNA was detected in 38 (97.3%) of p27-antigen negative samples. Malaysian FeLV isolates are found to be highly similar with a homology of 91% to 100%. Phylogenetic analysis revealed that Malaysian FeLV isolates divided into two clusters, with a majority (86.2%) sharing similarity with FeLV-K01803 and fewer isolates (13.8%) with FeLV-GM1 strain. Different enhancer motifs including NF-GMa, Krox-20/WT1I-del2, BAF1, AP-2, TBP, TFIIF-beta, TRF, and TFIID are found to occur either in single, duplicate, triplicate, or sets of 5 in different positions within the U3-LTR-gag region. The present result confirms the occurrence of FeLV viral RNA and proviral DNA in naturally infected cats. Malaysian FeLV isolates are highly similar, and a majority of them are closely related to a UK isolate. This study provides the first molecular based information on FeLV in Malaysia. Additionally, different enhancer motifs likely associated with FeLV related pathogenesis have been identified.

  5. DNA-binding site of major regulatory protein alpha 4 specifically associated with promoter-regulatory domains of alpha genes of herpes simplex virus type 1.

    OpenAIRE

    Kristie, T M; Roizman, B

    1986-01-01

    Herpes simplex virus type 1 genes form at least five groups (alpha, beta 1, beta 2, gamma 1, and gamma 2) whose expression is coordinately regulated and sequentially ordered in a cascade fashion. Previous studies have shown that functional alpha 4 gene product is essential for the transition from alpha to beta protein synthesis and have suggested that alpha 4 gene expression is autoregulatory. We have previously reported that labeled DNA fragments containing promoter-regulatory domains of thr...

  6. Interaction of the Sliding Clamp β-Subunit and Hda, a DnaA-Related Protein

    Science.gov (United States)

    Kurz, Mareike; Dalrymple, Brian; Wijffels, Gene; Kongsuwan, Kritaya

    2004-01-01

    In Escherichia coli, interactions between the replication initiation protein DnaA, the β subunit of DNA polymerase III (the sliding clamp protein), and Hda, the recently identified DnaA-related protein, are required to convert the active ATP-bound form of DnaA to an inactive ADP-bound form through the accelerated hydrolysis of ATP. This rapid hydrolysis of ATP is proposed to be the main mechanism that blocks multiple initiations during cell cycle and acts as a molecular switch from initiation to replication. However, the biochemical mechanism for this crucial step in DNA synthesis has not been resolved. Using purified Hda and β proteins in a plate binding assay and Ni-nitrilotriacetic acid pulldown analysis, we show for the first time that Hda directly interacts with β in vitro. A new β-binding motif, a hexapeptide with the consensus sequence QL[SP]LPL, related to the previously identified β-binding pentapeptide motif (QL[SD]LF) was found in the amino terminus of the Hda protein. Mutants of Hda with amino acid changes in the hexapeptide motif are severely defective in their ability to bind β. A 10-amino-acid peptide containing the E. coli Hda β-binding motif was shown to compete with Hda for binding to β in an Hda-β interaction assay. These results establish that the interaction of Hda with β is mediated through the hexapeptide sequence. We propose that this interaction may be crucial to the events that lead to the inactivation of DnaA and the prevention of excess initiation of rounds of replication. PMID:15150238
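
    The two consensus sequences quoted above, QL[SP]LPL and QL[SD]LF, are written in a bracket notation that is also valid regular-expression syntax, so a candidate search can be sketched directly. This is only an illustration of the notation, not the authors' method; the function name is an assumption and the test peptide is invented, not the Hda sequence.

```python
import re

# Consensus patterns as quoted in the abstract.
BETA_BINDING_MOTIFS = {
    "hexapeptide": re.compile(r"QL[SP]LPL"),
    "pentapeptide": re.compile(r"QL[SD]LF"),
}


def find_beta_binding_motifs(protein):
    """Return sorted (position, motif_name, matched_peptide) tuples (0-based)."""
    protein = protein.upper()
    hits = []
    for name, pattern in BETA_BINDING_MOTIFS.items():
        for m in pattern.finditer(protein):
            hits.append((m.start(), name, m.group()))
    return sorted(hits)


invented = "MKQLSLPLAAQLDLFGG"  # made-up peptide containing one site of each type
print(find_beta_binding_motifs(invented))
```

    A consensus match only nominates a site; in the study, binding was confirmed with mutants and a competing 10-amino-acid peptide.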

  7. Interaction of the sliding clamp beta-subunit and Hda, a DnaA-related protein.

    Science.gov (United States)

    Kurz, Mareike; Dalrymple, Brian; Wijffels, Gene; Kongsuwan, Kritaya

    2004-06-01

    In Escherichia coli, interactions between the replication initiation protein DnaA, the beta subunit of DNA polymerase III (the sliding clamp protein), and Hda, the recently identified DnaA-related protein, are required to convert the active ATP-bound form of DnaA to an inactive ADP-bound form through the accelerated hydrolysis of ATP. This rapid hydrolysis of ATP is proposed to be the main mechanism that blocks multiple initiations during cell cycle and acts as a molecular switch from initiation to replication. However, the biochemical mechanism for this crucial step in DNA synthesis has not been resolved. Using purified Hda and beta proteins in a plate binding assay and Ni-nitrilotriacetic acid pulldown analysis, we show for the first time that Hda directly interacts with beta in vitro. A new beta-binding motif, a hexapeptide with the consensus sequence QL[SP]LPL, related to the previously identified beta-binding pentapeptide motif (QL[SD]LF) was found in the amino terminus of the Hda protein. Mutants of Hda with amino acid changes in the hexapeptide motif are severely defective in their ability to bind beta. A 10-amino-acid peptide containing the E. coli Hda beta-binding motif was shown to compete with Hda for binding to beta in an Hda-beta interaction assay. These results establish that the interaction of Hda with beta is mediated through the hexapeptide sequence. We propose that this interaction may be crucial to the events that lead to the inactivation of DnaA and the prevention of excess initiation of rounds of replication.

  8. [Personal motif in art].

    Science.gov (United States)

    Gerevich, József

    2015-01-01

    One of the basic questions of the art psychology is whether a personal motif is to be found behind works of art and if so, how openly or indirectly it appears in the work itself. Analysis of examples and documents from the fine arts and literature allow us to conclude that the personal motif that can be identified by the viewer through symbols, at times easily at others with more difficulty, gives an emotional plus to the artistic product. The personal motif may be found in traumatic experiences, in communication to the model or with other emotionally important persons (mourning, disappointment, revenge, hatred, rivalry, revolt etc.), in self-searching, or self-analysis. The emotions are expressed in artistic activity either directly or indirectly. The intention nourished by the artist's identity (Kunstwollen) may stand in the way of spontaneous self-expression, channelling it into hidden paths. Under the influence of certain circumstances, the artist may arouse in the viewer, consciously or unconsciously, an illusionary, misleading image of himself. An examination of the personal motif is one of the important research areas of art therapy.

  9. Temporal motifs in time-dependent networks

    International Nuclear Information System (INIS)

    Kovanen, Lauri; Karsai, Márton; Kaski, Kimmo; Kertész, János; Saramäki, Jari

    2011-01-01

    Temporal networks are commonly used to represent systems where connections between elements are active only for restricted periods of time, such as telecommunication, neural signal processing, biochemical reaction and human social interaction networks. We introduce the framework of temporal motifs to study the mesoscale topological–temporal structure of temporal networks in which the events of nodes do not overlap in time. Temporal motifs are classes of similar event sequences, where the similarity refers not only to topology but also to the temporal order of the events. We provide a mapping from event sequences to coloured directed graphs that enables an efficient algorithm for identifying temporal motifs. We discuss some aspects of temporal motifs, including causality and null models, and present basic statistics of temporal motifs in a large mobile call network.

  10. GNG Motifs Can Replace a GGG Stretch during G-Quadruplex Formation in a Context Dependent Manner.

    Directory of Open Access Journals (Sweden)

    Kohal Das

    G-quadruplexes are one of the most commonly studied non-B DNA structures. Generally, these structures are formed from a minimum of four tracts of three guanines each, with connecting loops ranging from one to seven nucleotides. Recent studies have reported deviations from this general convention. One such deviation is the involvement of bulges in the guanine tracts. In this study, guanines along with bulges, also referred to as GNG motifs, have been extensively studied using the recently reported HOX11 breakpoint fragile region I as a model template. Using a strategic mutagenesis approach, we show that the contribution from continuous G-tracts may be dispensable during G-quadruplex formation when such motifs are flanked by GNGs. Importantly, the positioning and number of GNG/GNGNG can also influence the formation of G-quadruplexes. Further, we assessed three genomic regions from the HIF1 alpha, VEGF and SHOX genes for G-quadruplex formation using GNG motifs. We show that the HIF1 alpha sequence harbouring GNG motifs can fold into an intramolecular G-quadruplex. In contrast, GNG motifs in a mutant VEGF sequence could not participate in structure formation, suggesting that the usage of GNG is context dependent. Importantly, we show that when two continuous stretches of guanines are flanked by two independent GNG motifs in a naturally occurring sequence (SHOX), it can fold into an intramolecular G-quadruplex. Finally, we show the specific binding of the G-quadruplex binding protein, Nucleolin, and the G-quadruplex antibody, BG4, to the SHOX G-quadruplex. Overall, our study provides novel insights into the role of GNG motifs in G-quadruplex structure formation, which may have both physiological and pathological implications.
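    The general convention described in this abstract (four guanine tracts joined by loops of one to seven nucleotides) is often written as a sequence pattern. The sketch below is one way to express it, with an optional relaxed mode in which any tract may also be a bulged GNG. The function name `find_g4_candidates`, the `allow_gng` flag and the test sequences are assumptions made for illustration; the relaxed pattern is deliberately loose and does not encode the context dependence the authors report.

```python
import re

LOOP = r"[ACGT]{1,7}"                 # connecting loop of one to seven nucleotides
GGG_TRACT = r"G{3,}"                  # continuous guanine tract
GNG_OR_GGG = r"(?:G{3,}|G[ACGT]G)"    # relaxed: tract may be a bulged GNG (assumption)


def find_g4_candidates(seq, allow_gng=False):
    """Return (0-based start, matched sequence) for putative G4-forming stretches."""
    tract = GNG_OR_GGG if allow_gng else GGG_TRACT
    pattern = re.compile(tract + (LOOP + tract) * 3)  # four tracts, three loops
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


# Made-up sequences for illustration only.
canonical = "TTGGGAGGGTGGGAGGGTT"   # four GGG tracts
bulged = "GGGAGTGAGGGAGGG"          # three GGG tracts plus one GTG

canonical_hits = find_g4_candidates(canonical)
bulged_strict = find_g4_candidates(bulged)
bulged_relaxed = find_g4_candidates(bulged, allow_gng=True)
```

    The strict pattern misses the second sequence because it has only three continuous tracts, while the relaxed pattern reports it. A match is only a candidate; whether such a sequence actually folds is the experimental question the study addresses.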

  11. Identification of group specific motifs in Beta-lactamase family of proteins

    Directory of Open Access Journals (Sweden)

    Saxena Akansha

    2009-12-01

    Background: Beta-lactamases are one of the most serious threats to public health. In order to combat this threat we need to study the molecular and functional diversity of these enzymes and identify signatures specific to these enzymes. These signatures will enable us to develop inhibitors and diagnostic probes specific to lactamases. The existing classification of beta-lactamases was developed nearly 30 years ago, when few lactamases were available. The DLact database contains more than 2000 beta-lactamases, which can be used to study the molecular diversity and to identify signatures specific to this family. Methods: A set of 2020 beta-lactamase proteins available in the DLact database (http://59.160.102.202/DLact) were classified using graph-based clustering of Best Bi-Directional Hits. Non-redundant (> 90 percent identical) protein sequences from each group were aligned using T-Coffee and annotated using information available in the literature. Motifs specific to each group were predicted using the PRATT program. Results: The graph-based classification of beta-lactamase proteins resulted in the formation of six groups (four major groups containing 191, 726, 774 and 73 proteins, and two minor groups containing 50 and 8 proteins). Based on the information available in the literature, we found that each of the four major groups corresponds to one of the four classes proposed by Ambler. The two minor groups were novel and do not contain molecular signatures of beta-lactamase proteins reported in the literature. The group-specific motifs showed high sensitivity (> 70%) and very high specificity (> 90%). The motifs from three groups (corresponding to classes A, C and D) had a high level of conservation at the DNA as well as the protein level, whereas the motifs from the fourth group (corresponding to class B) showed conservation at only the protein level. Conclusion: The graph-based classification of beta-lactamase proteins corresponds with the classification proposed by Ambler, thus there is
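    The sensitivity and specificity figures quoted for the group-specific motifs can be made concrete with a short sketch. It assumes the motifs are written in PROSITE-style syntax and that sensitivity is the fraction of group members matched and specificity the fraction of non-members not matched. The pattern `S-x(2)-K`, the helper names and the toy sequences are illustrative assumptions, not the motifs or data reported in the paper.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern (e.g. 'S-x(2)-K') into a regex string."""
    parts = []
    for element in pattern.split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P} = any residue except P
        element = element.replace("(", "{").replace(")", "}")   # x(2,4) = repeat 2 to 4 times
        element = element.replace("x", ".")                     # x = any residue
        parts.append(element)
    return "".join(parts)


def sensitivity_specificity(pattern, members, non_members):
    """Score a pattern against group members (positives) and non-members (negatives)."""
    rx = re.compile(prosite_to_regex(pattern))
    true_pos = sum(1 for s in members if rx.search(s))
    false_pos = sum(1 for s in non_members if rx.search(s))
    sensitivity = true_pos / len(members)
    specificity = (len(non_members) - false_pos) / len(non_members)
    return sensitivity, specificity


# Toy sequences invented for illustration only.
members = ["MASTFKAL", "GGSVAKLL", "AAAAAA"]
non_members = ["MKKLL", "SAAKQ", "PPPP", "GGGG"]
sens, spec = sensitivity_specificity("S-x(2)-K", members, non_members)
```

    With these toy inputs the pattern matches two of three members and one of four non-members.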

  12. Survey of protein–DNA interactions in Aspergillus oryzae on a genomic scale

    Science.gov (United States)

    Wang, Chao; Lv, Yangyong; Wang, Bin; Yin, Chao; Lin, Ying; Pan, Li

    2015-01-01

    The genome-scale delineation of in vivo protein–DNA interactions is key to understanding genome function. Only ∼5% of transcription factors (TFs) in the Aspergillus genus have been identified using traditional methods. Although the Aspergillus oryzae genome contains >600 TFs, knowledge of the in vivo genome-wide TF-binding sites (TFBSs) in aspergilli remains limited because of the lack of high-quality antibodies. We investigated the landscape of in vivo protein–DNA interactions across the A. oryzae genome through coupling the DNase I digestion of intact nuclei with massively parallel sequencing and the analysis of cleavage patterns in protein–DNA interactions at single-nucleotide resolution. The resulting map identified overrepresented de novo TF-binding motifs from genomic footprints, and provided the detailed chromatin remodeling patterns and the distribution of digital footprints near transcription start sites. The TFBSs of 19 known Aspergillus TFs were also identified based on DNase I digestion data surrounding potential binding sites in conjunction with TF binding specificity information. We observed that the cleavage patterns of TFBSs were dependent on the orientation of TF motifs and independent of strand orientation, consistent with the DNA shape features of binding motifs with flanking sequences. PMID:25883143

  13. Sequence-specific activation of the DNA sensor cGAS by Y-form DNA structures as found in primary HIV-1 cDNA.

    Science.gov (United States)

    Herzner, Anna-Maria; Hagmann, Cristina Amparo; Goldeck, Marion; Wolter, Steven; Kübler, Kirsten; Wittmann, Sabine; Gramberg, Thomas; Andreeva, Liudmila; Hopfner, Karl-Peter; Mertens, Christina; Zillinger, Thomas; Jin, Tengchuan; Xiao, Tsan Sam; Bartok, Eva; Coch, Christoph; Ackermann, Damian; Hornung, Veit; Ludwig, Janos; Barchet, Winfried; Hartmann, Gunther; Schlee, Martin

    2015-10-01

    Cytosolic DNA that emerges during infection with a retrovirus or DNA virus triggers antiviral type I interferon responses. So far, only double-stranded DNA (dsDNA) over 40 base pairs (bp) in length has been considered immunostimulatory. Here we found that unpaired DNA nucleotides flanking short base-paired DNA stretches, as in stem-loop structures of single-stranded DNA (ssDNA) derived from human immunodeficiency virus type 1 (HIV-1), activated the type I interferon-inducing DNA sensor cGAS in a sequence-dependent manner. DNA structures containing unpaired guanosines flanking short (12- to 20-bp) dsDNA (Y-form DNA) were highly stimulatory and specifically enhanced the enzymatic activity of cGAS. Furthermore, we found that primary HIV-1 reverse transcripts represented the predominant viral cytosolic DNA species during early infection of macrophages and that these ssDNAs were highly immunostimulatory. Collectively, our study identifies unpaired guanosines in Y-form DNA as a highly active, minimal cGAS recognition motif that enables detection of HIV-1 ssDNA.

  14. Discovering regulatory motifs in the Plasmodium genome using comparative genomics

    OpenAIRE

    Wu, Jie; Sieglaff, Douglas H.; Gervin, Joshua; Xie, Xiaohui S.

    2008-01-01

    Motivation: Understanding gene regulation in Plasmodium, the causative agent of malaria, is an important step in deciphering its complex life cycle as well as leading to possible new targets for therapeutic applications. Very little is known about gene regulation in Plasmodium, and in particular, few regulatory elements have been identified. Such discovery has been significantly hampered by the high A-T content of some of the genomes of Plasmodium species, as well as the challenge in associat...

  15. The Mapping of Predicted Triplex DNA:RNA in the Drosophila Genome Reveals a Prominent Location in Development- and Morphogenesis-Related Genes

    Directory of Open Access Journals (Sweden)

    Claude Pasquier

    2017-07-01

    Double-stranded DNA is able to form triple-helical structures by accommodating a third nucleotide strand. A nucleic acid triplex occurs according to Hoogsteen rules that predict the stability and affinity of the third strand bound to the Watson–Crick duplex. The “triplex-forming oligonucleotide” (TFO) can be a short sequence of RNA that binds to the major groove of the targeted duplex only when this duplex presents a sequence of purine or pyrimidine bases in one of the DNA strands. Many nuclear proteins are known to bind triplex DNA or DNA:RNA, but their biological functions are unexplored. We identified sequences that are capable of engaging as the “triplex-forming oligonucleotide” in both the pre-lncRNA and pre-mRNA collections of Drosophila melanogaster. These motifs were matched against the Drosophila genome in order to identify putative sequences of triplex formation in intergenic regions, promoters, and introns/exons. Most of the identified TFOs appear to be located in the intronic region of the analyzed genes. Computational prediction of the most targeted genes by TFOs originating from pre-lncRNAs and pre-mRNAs revealed that they are restrictively associated with development- and morphogenesis-related gene networks. The refined analysis by Gene Ontology enrichment demonstrates that some individual TFOs present genome-wide scale matches that are located in numerous genes and regulatory sequences. The triplex DNA:RNA computational mapping at the genome-wide scale suggests broad interference in the regulatory process of the gene networks orchestrated by TFO RNAs acting in association simultaneously at multiple sites.

  16. Transcription regulatory networks analysis using CAGE

    KAUST Repository

    Tegnér, Jesper N.

    2009-10-01

    Mapping out cellular networks in general and transcriptional networks in particular has proved to be a bottleneck hampering our understanding of biological processes. Integrative approaches fusing computational and experimental technologies for decoding transcriptional networks at a high level of resolution are therefore of utmost importance. Yet, this is challenging since the control of gene expression in eukaryotes is a complex multi-level process influenced by several epigenetic factors and the fine interplay between regulatory proteins and the promoter structure governing the combinatorial regulation of gene expression. In this chapter we review how the CAGE data can be integrated with other measurements such as expression, physical interactions and computational prediction of regulatory motifs, which together can provide a genome-wide picture of eukaryotic transcriptional regulatory networks at a new level of resolution. © 2010 by Pan Stanford Publishing Pte. Ltd. All rights reserved.

  17. Evolutionary dynamics of DNA-binding sites and direct target genes of a floral master regulatory transcription factor [ChIP-Seq]

    NARCIS (Netherlands)

    Muiño, J.M.; Bruijn, de S.A.; Vingron, Martin; Angenent, G.C.; Kaufmann, K.

    2015-01-01

    Plant development is controlled by transcription factors (TFs) which form complex gene-regulatory networks. Genome-wide TF DNA-binding studies revealed that these TFs have several thousands of binding sites in the Arabidopsis genome, and may regulate the expression of many genes directly. Given the

  18. oPOSSUM: integrated tools for analysis of regulatory motif over-representation

    Science.gov (United States)

    Ho Sui, Shannan J.; Fulton, Debra L.; Arenillas, David J.; Kwon, Andrew T.; Wasserman, Wyeth W.

    2007-01-01

    The identification of over-represented transcription factor binding sites from sets of co-expressed genes provides insights into the mechanisms of regulation for diverse biological contexts. oPOSSUM, an internet-based system for such studies of regulation, has been improved and expanded in this new release. New features include a worm-specific version for investigating binding sites conserved between Caenorhabditis elegans and C. briggsae, as well as a yeast-specific version for the analysis of co-expressed sets of Saccharomyces cerevisiae genes. The human and mouse applications feature improvements in ortholog mapping, sequence alignments and the delineation of multiple alternative promoters. oPOSSUM2, introduced for the analysis of over-represented combinations of motifs in human and mouse genes, has been integrated with the original oPOSSUM system. Analysis using user-defined background gene sets is now supported. The transcription factor binding site models have been updated to include new profiles from the JASPAR database. oPOSSUM is available at http://www.cisreg.ca/oPOSSUM/ PMID:17576675
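    Over-representation of a binding site in a co-expressed gene set is commonly judged against a background gene set. The abstract does not say which statistic oPOSSUM applies, so the sketch below uses a generic one-tailed hypergeometric test as a stand-in. The function name and the example counts are assumptions for illustration only.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k): chance of seeing at least k motif-bearing genes in a set of n,
    drawn without replacement from N background genes of which K carry the motif."""
    total = comb(N, n)
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical counts: 40 co-expressed genes, 12 with the site;
# 2000 background genes, 150 with the site.
p_value = hypergeom_enrichment_p(k=12, n=40, K=150, N=2000)
```

    A small p-value suggests the site occurs in the gene set more often than the background would predict. In practice many binding-site profiles are tested at once, so a multiple-testing correction would normally follow.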

  19. The regulatory G4 motif of the Kirsten ras (KRAS) gene is sensitive to guanine oxidation

    DEFF Research Database (Denmark)

    Cogoi, Susanna; Ferino, Annalisa; Miglietta, Giulia

    2018-01-01

    KRAS is one of the most mutated genes in human cancer. It is controlled by a G4 motif located upstream of the transcription start site. In this paper, we demonstrate that 8-oxoguanine (8-oxoG), being more abundant in G4 than in non-G4 regions, is a new player in the regulation of this oncogene. W...

  20. PREDICTION OF CHROMATIN STATES USING DNA SEQUENCE PROPERTIES

    KAUST Repository

    Bahabri, Rihab R.

    2013-06-01

    Activities of DNA are to a great extent controlled epigenetically through the internal structure of chromatin. This structure is dynamic and is influenced by different modifications of histone proteins. Various combinations of epigenetic modifications of histones point to different functional regions of the DNA, determining the so-called chromatin states. However, the characterization of chromatin states by DNA sequence properties remains largely unknown. In this study we aim to explore whether DNA sequence patterns in the human genome can characterize different chromatin states. Using DNA sequence motifs we built binary classifiers for each chromatin state to evaluate whether a given genomic sequence is a good candidate for belonging to a particular chromatin state. Of the four classification algorithms (C4.5, Naive Bayes, Random Forest, and SVM) used for this purpose, the decision tree based classifiers (C4.5 and Random Forest) yielded the best results. Our results suggest that in general these models lack sufficient predictive power, although for four chromatin states (insulators, heterochromatin, and two types of copy number variation) we found that the presence of certain motifs in DNA sequences does imply an increased probability that such a sequence is one of these chromatin states.

  1. UKIRAN KERAWANG ACEH GAYO SEBAGAI INSPIRASI PENCIPTAAN MOTIF BATIK KHAS GAYO [Aceh Gayo kerawang carvings as inspiration for creating distinctive Gayo batik motifs]

    Directory of Open Access Journals (Sweden)

    Irfa ina Rohana Salma

    2016-12-01

    The batik industry has begun to develop in Gayo, but the region does not yet have a distinctive batik motif of its own. Distinctive Gayo batik motifs therefore need to be created, taking inspiration from the carvings found on traditional houses, commonly called kerawang Gayo carvings. The purpose of this work is to create batik motifs with a Gayo character. The method consists of idea exploration, design, and realization as batik motifs. Six distinctive Gayo batik motifs were created: (1) Motif Ceplok Gayo; (2) Motif Gayo Tegak; (3) Motif Gayo Lurus; (4) Motif Parang Gayo; (5) Motif Gayo Lembut; and (6) Motif Geometris Gayo. A preference test with fifty respondents showed that Motif Ceplok Gayo was chosen most often (19%), followed by Motif Parang Gayo (18%), Motif Gayo Lembut (17%), Motif Geometris Gayo (17%), Motif Gayo Lurus (15%) and Motif Gayo Tegak (14%). On average the motifs were well received by respondents, so all of them are suitable for production as distinctive Gayo batik. Keywords: batik Gayo, Motif Ceplok Gayo, Motif Parang Gayo.

  2. DNA-imprinted polymer nanoparticles with monodispersity and prescribed DNA-strand patterns

    Science.gov (United States)

    Trinh, Tuan; Liao, Chenyi; Toader, Violeta; Barłóg, Maciej; Bazzi, Hassan S.; Li, Jianing; Sleiman, Hanadi F.

    2018-02-01

    As colloidal self-assembly increasingly approaches the complexity of natural systems, an ongoing challenge is to generate non-centrosymmetric structures. For example, patchy, Janus or living crystallization particles have significantly advanced the area of polymer assembly. It has remained difficult, however, to devise polymer particles that associate in a directional manner, with controlled valency and recognition motifs. Here, we present a method to transfer DNA patterns from a DNA cage to a polymeric nanoparticle encapsulated inside the cage in three dimensions. The resulting DNA-imprinted particles (DIPs), which are 'moulded' on the inside of the DNA cage, consist of a monodisperse crosslinked polymer core with a predetermined pattern of different DNA strands covalently 'printed' on their exterior, and further assemble with programmability and directionality. The number, orientation and sequence of DNA strands grafted onto the polymeric core can be controlled during the process, and the strands are addressable independently of each other.

  3. Computational and molecular dissection of an X-box cis-Regulatory module

    OpenAIRE

    Warrington, Timothy Burton

    2015-01-01

    Ciliopathies are a class of human diseases marked by dysfunction of the cellular organelle, cilia. While many of the molecular components that make up cilia have been identified and studied, comparatively little is understood about the transcriptional regulation of genes encoding these components. The conserved transcription factor Regulatory Factor X (RFX)/DAF-19, which acts through binding to the cis-regulatory motif known as X-box, has been shown to regulate ciliary genes in many animals f...

  4. Functional interaction of the DNA-binding transcription factor Sp1 through its DNA-binding domain with the histone chaperone TAF-I.

    Science.gov (United States)

    Suzuki, Toru; Muto, Shinsuke; Miyamoto, Saku; Aizawa, Kenichi; Horikoshi, Masami; Nagai, Ryozo

    2003-08-01

    Transcription involves molecular interactions between general and regulatory transcription factors with further regulation by protein-protein interactions (e.g. transcriptional cofactors). Here we describe functional interaction between DNA-binding transcription factor and histone chaperone. Affinity purification of factors interacting with the DNA-binding domain of the transcription factor Sp1 showed Sp1 to interact with the histone chaperone TAF-I, both alpha and beta isoforms. This interaction was specific as Sp1 did not interact with another histone chaperone CIA nor did other tested DNA-binding regulatory factors (MyoD, NFkappaB, p53) interact with TAF-I. Interaction of Sp1 and TAF-I occurs both in vitro and in vivo. Interaction with TAF-I results in inhibition of DNA-binding, and also likely as a result of such, inhibition of promoter activation by Sp1. Collectively, we describe interaction between DNA-binding transcription factor and histone chaperone which results in negative regulation of the former. This novel regulatory interaction advances our understanding of the mechanisms of eukaryotic transcription through DNA-binding regulatory transcription factors by protein-protein interactions, and also shows the DNA-binding domain to mediate important regulatory interactions.

  5. A Novel Dual-cre Motif Enables Two-Way Autoregulation of CcpA in Clostridium acetobutylicum.

    Science.gov (United States)

    Zhang, Lu; Liu, Yanqiang; Yang, Yunpeng; Jiang, Weihong; Gu, Yang

    2018-04-15

    The master regulator CcpA (catabolite control protein A) manages a large and complex regulatory network that is essential for cellular physiology and metabolism in Gram-positive bacteria. Although CcpA can affect the expression of target genes by binding to a cis-acting catabolite-responsive element (cre), whether and how the expression of CcpA is regulated remain poorly explored. Here, we report a novel dual-cre motif that is employed by the CcpA in Clostridium acetobutylicum, a typical solventogenic Clostridium species, for autoregulation. Two cre sites are involved in CcpA autoregulation, and they reside in the promoter and coding regions of CcpA. In this dual-cre motif, creP, in the promoter region, positively regulates ccpA transcription, whereas creORF, in the coding region, negatively regulates this transcription, thus enabling two-way autoregulation of CcpA. Although CcpA bound creP more strongly than creORF in vitro, the in vivo assay showed that creORF-based repression dominates CcpA autoregulation during the entire fermentation. Finally, a synonymous mutation of creORF was made within the coding region, achieving an increased intracellular CcpA expression and improved cellular performance. This study provides new insights into the regulatory role of CcpA in C. acetobutylicum and, moreover, contributes a new engineering strategy for this industrial strain. IMPORTANCE: CcpA is known to be a key transcription factor in Gram-positive bacteria. However, it is still unclear whether and how the intracellular CcpA level is regulated, which may be essential for maintaining normal cell physiology and metabolism. We discovered here that CcpA employs a dual-cre motif to autoregulate, enabling dynamic control of its own expression level during the entire fermentation process. This finding answers the questions above and fills a void in our understanding of the regulatory network of CcpA. Interference in CcpA autoregulation leads to improved cellular

  6. Environmental influences on DNA curvature

    DEFF Research Database (Denmark)

    Ussery, David; Higgins, C.F.; Bolshoy, A.

    1999-01-01

    DNA curvature plays an important role in many biological processes. To study environmental influences on DNA curvature we compared the anomalous migration on polyacrylamide gels of ligation ladders of 11 specifically-designed oligonucleotides. At low temperatures (25 °C and below) most......, whilst spermine enhanced the anomalous migration of a different set of sequences. Sequences with a GGC motif exhibited greater curvature than predicted by the presently-used angles for the nearest-neighbour wedge model and are especially sensitive to Mg2+. The data have implications for models...... for DNA curvature and for environmentally-sensitive DNA conformations in the regulation of gene expression....

  7. Recognition of prokaryotic and eukaryotic promoters using convolutional deep learning neural networks

    KAUST Repository

    Umarov, Ramzan; Solovyev, Victor

    2017-01-01

    Accurate computational identification of promoters remains a challenge as these key DNA regulatory regions have variable structures composed of functional motifs that provide gene-specific initiation of transcription. In this paper we utilize

  8. DNA Methylation Analysis of HTR2A Regulatory Region in Leukocytes of Autistic Subjects.

    Science.gov (United States)

    Hranilovic, Dubravka; Blazevic, Sofia; Stefulj, Jasminka; Zill, Peter

    2016-02-01

    Disturbed brain and peripheral serotonin homeostasis is often found in subjects with autism spectrum disorder (ASD). The role of the serotonin receptor 2A (HTR2A) in the regulation of central and peripheral serotonin homeostasis, as well as its altered expression in autistic subjects, have implicated the HTR2A gene as a major candidate for the serotonin disturbance seen in autism. Several studies, yielding so far inconclusive results, have attempted to associate autism with a functional SNP -1438 G/A (rs6311) in the HTR2A promoter region, while possible contribution of epigenetic mechanisms, such as DNA methylation, to HTR2A dysregulation in autism has not yet been investigated. In this study, we compared the mean DNA methylation within the regulatory region of the HTR2A gene between autistic and control subjects. DNA methylation was analysed in peripheral blood leukocytes using bisulfite conversion and sequencing of the HTR2A region containing rs6311 polymorphism. Autistic subjects of rs6311 AG genotype displayed higher mean methylation levels within the analysed region than the corresponding controls (P epigenetic mechanisms might contribute to HTR2A dysregulation observed in individuals with ASD. © 2015 International Society for Autism Research, Wiley Periodicals, Inc.

  9. Nutritional control of gene expression in Drosophila larvae via TOR, Myc and a novel cis-regulatory element

    Directory of Open Access Journals (Sweden)

    Grewal Savraj S

    2010-01-01

    Background: Nutrient availability is a key determinant of eukaryotic cell growth. In unicellular organisms many signaling and transcriptional networks link nutrient availability to the expression of metabolic genes required for growth. However, less is known about the corresponding mechanisms that operate in metazoans. We used gene expression profiling to explore this issue in developing Drosophila larvae. Results: We found that starvation for dietary amino acids (AAs) leads to dynamic changes in transcript levels of many metabolic genes. The conserved insulin/PI3K and TOR signaling pathways mediate nutrition-dependent growth in Drosophila and other animals. We found that many AA starvation-responsive transcripts were also altered in TOR mutants. In contrast, although PI3K overexpression induced robust changes in the expression of many metabolic genes, these changes showed limited overlap with the AA starvation expression profile. We did however identify a strong overlap between genes regulated by the transcription factor Myc and AA starvation-responsive genes, particularly those involved in ribosome biogenesis, protein synthesis and mitochondrial function. The consensus Myc DNA binding site is enriched in promoters of these AA starvation genes, and we found that Myc overexpression could bypass dietary AA to induce expression of these genes. We also identified another sequence motif (Motif 1) enriched in the promoters of AA starvation-responsive genes. We showed that Motif 1 was both necessary and sufficient to mediate transcriptional responses to dietary AA in larvae. Conclusions: Our data suggest that many of the transcriptional effects of amino acids are mediated via signaling through the TOR pathway in Drosophila larvae. We also find that these transcriptional effects are mediated through at least two mechanisms: via the transcription factor Myc, and via the Motif 1 cis-regulatory element. These studies begin to elucidate a nutrient

  10. MHC motif viewer

    DEFF Research Database (Denmark)

    Rapin, Nicolas Philippe Jean-Pierre; Hoof, Ilka; Lund, Ole

    2008-01-01

    . Algorithms that predict which peptides MHC molecules bind have recently been developed and cover many different alleles, but the utility of these algorithms is hampered by the lack of tools for browsing and comparing the specificity of these molecules. We have, therefore, developed a web server, MHC motif....... A special viewing feature, MHC fight, allows for display of the specificity of two different MHC molecules side by side. We show how the web server can be used to discover and display surprising similarities as well as differences between MHC molecules within and between different species. The MHC motif...

  11. Bounded search for de novo identification of degenerate cis-regulatory elements

    Directory of Open Access Journals (Sweden)

    Khetani Radhika S

    2006-05-01

    Full Text Available Abstract Background The identification of statistically overrepresented sequences in the upstream regions of coregulated genes should theoretically permit the identification of potential cis-regulatory elements. However, in practice many cis-regulatory elements are highly degenerate, precluding the use of an exhaustive word-counting strategy for their identification. While numerous methods exist for inferring base distributions using a position weight matrix, recent studies suggest that the independence assumptions inherent in the model, as well as the inability to reach a global optimum, limit this approach. Results In this paper, we report PRISM, a degenerate motif finder that leverages the relationship between the statistical significance of a set of binding sites and that of the individual binding sites. PRISM first identifies overrepresented, non-degenerate consensus motifs, then iteratively relaxes each one into a high-scoring degenerate motif. This approach requires no tunable parameters, thereby lending itself to unbiased performance comparisons. We therefore compare PRISM's performance against nine popular motif finders on 28 well-characterized S. cerevisiae regulons. PRISM consistently outperforms all other programs. Finally, we use PRISM to predict the binding sites of uncharacterized regulons. Our results support a proposed mechanism of action for the yeast cell-cycle transcription factor Stb1, whose binding site has not been determined experimentally. Conclusion The relationship between statistical measures of the binding sites and the set as a whole leads to a simple means of identifying the diverse range of cis-regulatory elements to which a protein binds. This approach leverages the advantages of word-counting, in that position dependencies are implicitly accounted for and local optima are more easily avoided. 
While we sacrifice guaranteed optimality to prevent the exponential blowup of exhaustive search, we prove that the error
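
    The word-counting idea in this abstract can be illustrated with a minimal sketch. This is not PRISM itself: it only counts how many upstream sequences contain an exact k-mer and then checks a degenerate (IUPAC-coded) relaxation of that k-mer. The toy sequences, the function names and the pattern CAYGTG are assumptions made up for illustration; the statistical scoring described in the abstract is not reproduced.

```python
from collections import Counter

# Standard IUPAC degenerate nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def kmer_presence(seqs, k):
    """For each exact k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for s in seqs:
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts

def matches(pattern, word):
    """True if every base of `word` is allowed by the IUPAC code at that position."""
    return len(pattern) == len(word) and all(
        base in IUPAC[code] for code, base in zip(pattern, word)
    )

def degenerate_presence(pattern, seqs):
    """Count sequences containing at least one match to a degenerate pattern."""
    k = len(pattern)
    return sum(
        any(matches(pattern, s[i:i + k]) for i in range(len(s) - k + 1))
        for s in seqs
    )

# Hypothetical upstream regions (toy data, not from the paper).
upstream = ["TTCACGTGAA", "GGCACGTGTT", "AACATGTGCC", "TTTTTTTTTT"]
exact_counts = kmer_presence(upstream, 6)
exact_hits = exact_counts["CACGTG"]                      # non-degenerate consensus
relaxed_hits = degenerate_presence("CAYGTG", upstream)   # one position relaxed to Y
print(exact_hits, relaxed_hits)
```

    In this toy data, relaxing one position of the consensus picks up an extra sequence that the exact word misses, which is the kind of gain an iterative relaxation step would then have to weigh against the loss of specificity.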

  12. Altered DNA methylation of glycolytic and lipogenic genes in liver from obese and type 2 diabetic patients

    DEFF Research Database (Denmark)

    Kirchner, Henriette; Sinha, Indranil; Gao, Hui

    2016-01-01

    OBJECTIVE: Epigenetic modifications contribute to the etiology of type 2 diabetes. METHOD: We performed genome-wide methylome and transcriptome analysis in liver from severely obese men with or without type 2 diabetes and non-obese men to discover aberrant pathways underlying the development...... in four of these genes in liver of severely obese non-diabetic and type 2 diabetic patients, suggesting epigenetic regulation of transcription by altered ATF-DNA binding. CONCLUSION: Severely obese non-diabetic and type 2 diabetic patients have distinct alterations in the hepatic methylome...... and transcriptome, with hypomethylation of several genes controlling glucose metabolism within the ATF-motif regulatory site. Obesity appears to shift the epigenetic program of the liver towards increased glycolysis and lipogenesis, which may exacerbate the development of insulin resistance....

  13. Motif discovery in ranked lists of sequences

    DEFF Research Database (Denmark)

    Nielsen, Morten Muhlig; Tataru, Paula; Madsen, Tobias

    2016-01-01

    Motif analysis has long been an important method to characterize biological functionality and the current growth of sequencing-based genomics experiments further extends its potential. These diverse experiments often generate sequence lists ranked by some functional property. There is therefore...... advantage of the regular expression feature, including enrichments for combinations of different microRNA seed sites. The method is implemented and made publicly available as an R package and supports high parallelization on multi-core machinery....... a growing need for motif analysis methods that can exploit this coupled data structure and be tailored for specific biological questions. Here, we present an exploratory motif analysis tool, Regmex (REGular expression Motif EXplorer), which offers several methods to evaluate the correlation of motifs...
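
    To make the idea of regular-expression motifs in a ranked list concrete, the following is a minimal sketch in Python (Regmex itself is an R package, and none of its functions are used or implied here). It finds the ranks of sequences matching a pattern and reports how far their mean rank lies from the middle of the list. The sequences, the pattern and the statistic are illustrative assumptions, not the methods the tool implements.

```python
import re

def match_ranks(pattern, ranked_seqs):
    """Return 1-based ranks of the sequences that contain a match to `pattern`."""
    rx = re.compile(pattern)
    return [rank for rank, seq in enumerate(ranked_seqs, start=1) if rx.search(seq)]

def mean_rank_shift(pattern, ranked_seqs):
    """Mean rank of matching sequences minus the mean rank of the whole list.

    Negative values mean matches concentrate near the top of the ranking.
    Returns None when nothing matches.
    """
    ranks = match_ranks(pattern, ranked_seqs)
    if not ranks:
        return None
    overall_mean = (len(ranked_seqs) + 1) / 2
    return sum(ranks) / len(ranks) - overall_mean

# Hypothetical ranked list (best first) and a hypothetical two-site pattern.
ranked = ["AAGCACTTA", "TTGCACTTG", "CCCCCCCCC", "GGGTATTTG", "ACGTACGTA", "TTTTGGGGA"]
pattern = r"GCACTT|GTATTT"   # alternation combines two different sites

ranks = match_ranks(pattern, ranked)
shift = mean_rank_shift(pattern, ranked)
print(ranks, shift)
```

    The alternation in the pattern is one way a regular expression can express a combination of different sites in a single query.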

  14. Transcription factor trapping by RNA in gene regulatory elements.

    Science.gov (United States)

    Sigova, Alla A; Abraham, Brian J; Ji, Xiong; Molinie, Benoit; Hannett, Nancy M; Guo, Yang Eric; Jangi, Mohini; Giallourakis, Cosmas C; Sharp, Phillip A; Young, Richard A

    2015-11-20

    Transcription factors (TFs) bind specific sequences in promoter-proximal and -distal DNA elements to regulate gene transcription. RNA is transcribed from both of these DNA elements, and some DNA binding TFs bind RNA. Hence, RNA transcribed from regulatory elements may contribute to stable TF occupancy at these sites. We show that the ubiquitously expressed TF Yin-Yang 1 (YY1) binds to both gene regulatory elements and their associated RNA species across the entire genome. Reduced transcription of regulatory elements diminishes YY1 occupancy, whereas artificial tethering of RNA enhances YY1 occupancy at these elements. We propose that RNA makes a modest but important contribution to the maintenance of certain TFs at gene regulatory elements and suggest that transcription of regulatory elements produces a positive-feedback loop that contributes to the stability of gene expression programs. Copyright © 2015, American Association for the Advancement of Science.

  15. Deciphering functional glycosaminoglycan motifs in development.

    Science.gov (United States)

    Townley, Robert A; Bülow, Hannes E

    2018-03-23

    Glycosaminoglycans (GAGs) such as heparan sulfate, chondroitin/dermatan sulfate, and keratan sulfate are linear glycans, which when attached to protein backbones form proteoglycans. GAGs are essential components of the extracellular space in metazoans. Extensive modifications of the glycans such as sulfation, deacetylation and epimerization create structural GAG motifs. These motifs regulate protein-protein interactions and are thereby responsible for many of the essential functions of GAGs. This review focusses on recent genetic approaches to characterize GAG motifs and their function in defined signaling pathways during development. We discuss a coding approach for GAGs that would enable computational analyses of GAG sequences such as alignments and the computation of position weight matrices to describe GAG motifs. Copyright © 2018 Elsevier Ltd. All rights reserved.
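
    As a rough sketch of what computing a position weight matrix from coded sequences could look like, the example below builds a log-odds matrix over an arbitrary alphabet. The single-letter symbols "A", "B" and "C" are hypothetical placeholders, not the coding scheme the review proposes, and the uniform background and pseudocount are assumptions chosen for simplicity.

```python
import math

def position_weight_matrix(aligned, alphabet, pseudocount=1.0):
    """Log2-odds PWM from equal-length aligned sequences, uniform background."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    background = 1.0 / len(alphabet)
    pwm = []
    for pos in range(length):
        column = [s[pos] for s in aligned]
        total = len(column) + pseudocount * len(alphabet)
        pwm.append({
            sym: math.log2(((column.count(sym) + pseudocount) / total) / background)
            for sym in alphabet
        })
    return pwm

def score(pwm, seq):
    """Sum of per-position log-odds for a sequence of the same length as the PWM."""
    return sum(column[sym] for column, sym in zip(pwm, seq))

# Hypothetical aligned motif instances over a placeholder alphabet.
aligned = ["ABA", "ABB", "ABA", "CBA"]
pwm = position_weight_matrix(aligned, "ABC")
print(score(pwm, "ABA"), score(pwm, "CCC"))
```

    A sequence resembling the aligned instances scores above zero, while one built from rarely observed symbols scores below zero.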

  16. Fitness for synchronization of network motifs

    DEFF Research Database (Denmark)

    Vega, Y.M.; Vázquez-Prada, M.; Pacheco, A.F.

    2004-01-01

    We study the synchronization of Kuramoto's oscillators in small parts of networks known as motifs. We first report on the system dynamics for the case of a scale-free network and show the existence of a non-trivial critical point. We compute the probability that network motifs synchronize, and fi...... that the fitness for synchronization correlates well with motifs interconnectedness and structural complexity. Possible implications for present debates about network evolution in biological and other systems are discussed....
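
    The Kuramoto dynamics on a small motif can be sketched as follows. This is a minimal illustration, assuming identical natural frequencies, undirected three-node motifs, simple Euler integration and hand-picked initial phases; none of these choices come from the paper, and the study's probability estimates over many realizations are not reproduced.

```python
import cmath
import math

def simulate_kuramoto(adjacency, omega, theta0, coupling, dt=0.01, steps=5000):
    """Euler integration of d(theta_i)/dt = omega_i + K * sum_j A_ij * sin(theta_j - theta_i)."""
    n = len(theta0)
    theta = list(theta0)
    for _ in range(steps):
        dtheta = [
            omega[i] + coupling * sum(
                adjacency[i][j] * math.sin(theta[j] - theta[i]) for j in range(n)
            )
            for i in range(n)
        ]
        theta = [t + dt * d for t, d in zip(theta, dtheta)]
    return theta

def order_parameter(theta):
    """Kuramoto order parameter r in [0, 1]; r = 1 means full phase synchrony."""
    return abs(sum(cmath.exp(1j * t) for t in theta)) / len(theta)

# Two three-node motifs (assumed undirected for this sketch).
TRIANGLE = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
CHAIN = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

omega = [1.0, 1.0, 1.0]    # identical natural frequencies (assumption)
theta0 = [0.0, 1.0, 2.0]   # initial phases in radians (assumption)

r_triangle = order_parameter(simulate_kuramoto(TRIANGLE, omega, theta0, coupling=1.0))
r_chain = order_parameter(simulate_kuramoto(CHAIN, omega, theta0, coupling=1.0))
r_uncoupled = order_parameter(simulate_kuramoto(TRIANGLE, omega, theta0, coupling=0.0))
print(r_triangle, r_chain, r_uncoupled)
```

    With heterogeneous frequencies, whether a given motif synchronizes would depend on the coupling strength and on how the nodes are wired, which is where a comparison between motifs becomes informative.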

  17. Novel Strategy for Discrimination of Transcription Factor Binding Motifs Employing Mathematical Neural Network

    Science.gov (United States)

    Sugimoto, Asuka; Sumi, Takuya; Kang, Jiyoung; Tateno, Masaru

    2017-07-01

    Recognition in biological macromolecular systems, such as DNA-protein recognition, is one of the most crucial problems to solve toward understanding the fundamental mechanisms of various biological processes. Since specific base sequences of genome DNA are discriminated by proteins, such as transcription factors (TFs), finding TF binding motifs (TFBMs) in whole genome DNA sequences is currently a central issue in interdisciplinary biophysical and information sciences. In the present study, a novel strategy to create a discriminant function for discrimination of TFBMs by constituting mathematical neural networks (NNs) is proposed, together with a method to determine the boundary of signals (TFBMs) and noise in the NN-score (output) space. This analysis also leads to the mathematical limitation of discrimination in the recognition of features representing TFBMs, in an information geometrical manifold. Thus, the present strategy enables the identification of the whole space of TFBMs, right up to the noise boundary.

  18. Flow Cytometry-Assisted Cloning of Specific Sequence Motifs from Complex 16S rRNA Gene Libraries

    DEFF Research Database (Denmark)

    Nielsen, Jeppe Lund; Schramm, Andreas; Bernhard, Anne E.

    2004-01-01

    Jeppe L. Nielsen, Andreas Schramm, Anne E. Bernhard, Gerrit J. van den Engh, and David A. Stahl (Department of Civil and Environmental Engineering, University of Washington, and Institute for Systems Biology, Seattle, Washington; Department of Ecological Microbiology, University of Bayreuth, Bayreuth, Germany). A flow cytometry method was developed for rapid screening and recovery of cloned DNA containing common sequence motifs. This approach, termed fluorescence-activated cell sorting-assisted cloning, was used to recover sequences affiliated with a unique lineage within the Bacteroidetes not abundant in a clone library of environmental 16S rRNA genes.

  19. Comparison of loline alkaloid gene clusters across fungal endophytes: predicting the co-regulatory sequence motifs and the evolutionary history.

    Science.gov (United States)

    Kutil, Brandi L; Greenwald, Charles; Liu, Gang; Spiering, Martin J; Schardl, Christopher L; Wilkinson, Heather H

    2007-10-01

    LOL, a fungal secondary metabolite gene cluster found in Epichloë and Neotyphodium species, is responsible for production of insecticidal loline alkaloids. To analyze the genetic architecture and to predict the evolutionary history of LOL, we compared five clusters from four fungal species (single clusters from Epichloë festucae, Neotyphodium sp. PauTG-1, Neotyphodium coenophialum, and two clusters we previously characterized in Neotyphodium uncinatum). Using PhyloCon to compare putative lol gene promoter regions, we have identified four motifs conserved across the lol genes in all five clusters. Each motif has significant similarity to known fungal transcription factor binding sites in the TRANSFAC database. Conservation of these motifs is further support for the hypothesis that the lol genes are co-regulated. Interestingly, the history of asexual Neotyphodium spp. includes multiple interspecific hybridization events. Comparing clusters from three Neotyphodium species and E. festucae allowed us to determine which Epichloë ancestors are the most likely contributors of LOL in these asexual species. For example, while no present day Epichloë typhina isolates are known to produce lolines, our data support the hypothesis that the E. typhina ancestor(s) of three asexual endophyte species contained a LOL gene cluster. Thus, these data support a model of evolution in which the polymorphism in loline alkaloid production phenotypes among endophyte species is likely due to the loss of the trait over time.

  20. Design of potent inhibitors of human RAD51 recombinase based on BRC motifs of BRCA2 protein: modeling and experimental validation of a chimera peptide.

    KAUST Repository

    Nomme, Julian; Renodon-Cornière, Axelle; Asanomi, Yuya; Sakaguchi, Kazuyasu; Stasiak, Alicja Z; Stasiak, Andrzej; Norden, Bengt; Tran, Vinh; Takahashi, Masayuki

    2010-01-01

    We have previously shown that a 28-amino acid peptide derived from the BRC4 motif of BRCA2 tumor suppressor inhibits selectively human RAD51 recombinase (HsRad51). With the aim of designing better inhibitors for cancer treatment, we combined an in silico docking approach with in vitro biochemical testing to construct a highly efficient chimera peptide from eight existing human BRC motifs. We built a molecular model of all BRC motifs complexed with HsRad51 based on the crystal structure of the BRC4 motif-HsRad51 complex, computed the interaction energy of each residue in each BRC motif, and selected the best amino acid residue at each binding position. This analysis enabled us to propose four amino acid substitutions in the BRC4 motif. Three of these increased the inhibitory effect in vitro, and this effect was found to be additive. We thus obtained a peptide that is about 10 times more efficient in inhibiting HsRad51-ssDNA complex formation than the original peptide.

  2. Aplikasi Ornamen Khas Maluku untuk Pengembangan Desain Motif Batik

    Directory of Open Access Journals (Sweden)

    Masiswo Masiswo

    2016-04-01

    Full Text Available ABSTRACT Maluku has a rich decorative cultural heritage handed down from its ancestors in the form of ethnic ornaments, which embody both art and craft skills. This heritage survives to the present day and can still be enjoyed as a source of spiritual satisfaction. To sustain the traditional ethnic values expressed in the regional ornaments of Maluku, the ornaments were developed into batik motifs on cloth. The development emphasizes representing the forms of these ornaments as distinctive Maluku motifs in batik craft. Three alternative batik motif designs derived from Maluku ornaments were developed, made into product prototypes, and tested for color fastness. In the test of color fastness to wet rubbing, the "Motif Siwa" prototype was rated excellent, and the "Siwa Talang" and "Matahari Siwa Talang" motifs were rated good. Keywords: design, Maluku, batik motif, ornament

  3. Control of DEMETER DNA demethylase gene transcription in male and female gamete companion cells in Arabidopsis thaliana.

    Science.gov (United States)

    Park, Jin-Sup; Frost, Jennifer M; Park, Kyunghyuk; Ohr, Hyonhwa; Park, Guen Tae; Kim, Seohyun; Eom, Hyunjoo; Lee, Ilha; Brooks, Janie S; Fischer, Robert L; Choi, Yeonhee

    2017-02-21

    The DEMETER (DME) DNA glycosylase initiates active DNA demethylation via the base-excision repair pathway and is vital for reproduction in Arabidopsis thaliana. DME-mediated DNA demethylation is preferentially targeted to small, AT-rich, and nucleosome-depleted euchromatic transposable elements, influencing expression of adjacent genes and leading to imprinting in the endosperm. In the female gametophyte, DME expression and subsequent genome-wide DNA demethylation are confined to the companion cell of the egg, the central cell. Here, we show that, in the male gametophyte, DME expression is limited to the companion cell of sperm, the vegetative cell, and to a narrow window of time: immediately after separation of the companion cell lineage from the germline. We define transcriptional regulatory elements of DME using reporter genes, showing that a small region, which surprisingly lies within the DME gene, controls its expression in male and female companion cells. DME expression from this minimal promoter is sufficient to rescue seed abortion and the aberrant DNA methylome associated with the null dme-2 mutation. Within this minimal promoter, we found short, conserved enhancer sequences necessary for the transcriptional activities of DME and combined predicted binding motifs with published transcription factor binding coordinates to produce a list of candidate upstream pathway members in the genetic circuitry controlling DNA demethylation in gamete companion cells. These data show how DNA demethylation is regulated to facilitate endosperm gene imprinting and potential transgenerational epigenetic regulation, without subjecting the germline to potentially deleterious transposable element demethylation.

  4. Parole, Sintagmatik, dan Paradigmatik Motif Batik Mega Mendung

    Directory of Open Access Journals (Sweden)

    Rudi - Nababan

    2012-04-01

    Full Text Available ABSTRACT A discussion of traditional batik is closely tied to the system of visual-art elements that accompanies it, both the pattern of the motif and the technique of its making. The Mega Mendung motif of Cirebon has patterns and rules that traditionally differ from those of motifs from other areas. Semiotic analysis, especially using the concepts of Saussure and Peirce, shows that batik with a Cirebon motif, in this case the Mega Mendung motif, has a parole and langue system, as a distinctive visual-art language in batik, and a visual syntagmatic and paradigmatic structure. As a visual-art language, the batik motif is related to the sign system as symbol and icon. Keywords: visual semiotic, Cirebon's batik.

  5. Zinc fingers, zinc clusters, and zinc twists in DNA-binding protein domains

    International Nuclear Information System (INIS)

    Vallee, B.L.; Auld, D.S.; Coleman, J.E.

    1991-01-01

    The authors recognize three distinct motifs of DNA-binding zinc proteins: (i) zinc fingers, (ii) zinc clusters, and (iii) zinc twists. Until very recently, x-ray crystallographic or NMR three-dimensional structure analyses of DNA-binding zinc proteins have not been available to serve as standards of reference for the zinc binding sites of these families of proteins. Those of the DNA-binding domains of the fungal transcription factor GAL4 and the rat glucocorticoid receptor are the first to have been determined. Both proteins contain two zinc binding sites, and in both, cysteine residues are the sole zinc ligands. In GAL4, two zinc atoms are bound to six cysteine residues which form a zinc cluster akin to that of metallothionein; the distance between the two zinc atoms of GAL4 is ∼3.5 angstrom. In the glucocorticoid receptor, each zinc atom is bound to four cysteine residues; the interatomic zinc-zinc distance is ∼13 angstrom, and in this instance, a zinc twist is represented by a helical DNA recognition site located between the two zinc atoms. Zinc clusters and zinc twists are here recognized as two distinctive motifs in DNA-binding proteins containing multiple zinc atoms. For native zinc fingers, structural data do not exist as yet; consequently, the interatomic distances between zinc atoms are not known. As further structural data become available, the structural and functional significance of these different motifs in their binding to DNA and other proteins participating in the transmission of the genetic message will become apparent

  6. Exploring the roles of DNA methylation in the metal-reducing bacterium Shewanella oneidensis MR-1

    Energy Technology Data Exchange (ETDEWEB)

    Bendall, Matthew L. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Luong, Khai [Pacific Biosciences, Menlo Park, CA (United States); Wetmore, Kelly M. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Blow, Matthew [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Korlach, Jonas [Pacific Biosciences, Menlo Park, CA (United States); Deutschbauer, Adam [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Malmstrom, Rex [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2013-08-30

    We performed whole genome analyses of DNA methylation in Shewanella oneidensis MR-1 to examine its possible role in regulating gene expression and other cellular processes. Single-Molecule Real Time (SMRT) sequencing revealed extensive methylation of adenine (N6mA) throughout the genome. These methylated bases were located in five sequence motifs, including three novel targets for Type I restriction/modification enzymes. The sequence motifs targeted by putative methyltransferases were determined via SMRT sequencing of gene knockout mutants. In addition, we found S. oneidensis MR-1 cultures grown under various culture conditions displayed different DNA methylation patterns. However, the small number of differentially methylated sites could not be directly linked to the much larger number of differentially expressed genes in these conditions, suggesting DNA methylation is not a major regulator of gene expression in S. oneidensis MR-1. The enrichment of methylated GATC motifs in the origin of replication indicates DNA methylation may regulate genome replication in a manner similar to that seen in Escherichia coli. Furthermore, comparative analyses suggest that many Gammaproteobacteria, including all members of the Shewanellaceae family, may also utilize DNA methylation to regulate genome replication.

  7. DNA nanotechnology: a future perspective

    Science.gov (United States)

    2013-01-01

    In addition to its genetic function, DNA is one of the most distinct and smart self-assembling nanomaterials. DNA nanotechnology exploits the predictable self-assembly of DNA oligonucleotides to design and assemble innovative and highly discrete nanostructures. Highly ordered DNA motifs are capable of providing an ultra-fine framework for the next generation of nanofabrications. The majority of these applications are based upon the complementarity of DNA base pairing: adenine with thymine, and guanine with cytosine. DNA provides an intelligent route for the creation of nanoarchitectures with programmable and predictable patterns. DNA strands twist along one helix for a number of bases before switching to the other helix by passing through a crossover junction. The association of two crossovers keeps the helices parallel and holds them tightly together, allowing the assembly of bigger structures. Because of the DNA molecule's unique and novel characteristics, it can easily be applied in a vast variety of multidisciplinary research areas like biomedicine, computer science, nano/optoelectronics, and bionanotechnology. PMID:23497147
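
    The base-pairing rule this abstract relies on (adenine with thymine, guanine with cytosine) is easy to express in code. The sketch below is a minimal illustration with made-up oligonucleotide sequences and function names; it checks only perfect Watson-Crick complementarity and ignores thermodynamics, mismatches and crossover geometry.

```python
# Watson-Crick pairing: A<->T and G<->C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the strand that pairs with `seq`, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def can_hybridize(strand_a, strand_b):
    """True if the two 5'->3' strands are perfect reverse complements."""
    return reverse_complement(strand_a) == strand_b.upper()

# Hypothetical oligonucleotides (toy data).
oligo = "ATGCCGTA"
partner = reverse_complement(oligo)
print(partner, can_hybridize(oligo, partner), can_hybridize(oligo, oligo))
```

    Designing a set of strands so that only the intended pairs satisfy a check like this is, in simplified form, what makes the self-assembly predictable.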

  8. Regulatory elements involved in tax-mediated transactivation of the HTLV-I LTR.

    Science.gov (United States)

    Seeler, J S; Muchardt, C; Podar, M; Gaynor, R B

    1993-10-01

    HTLV-I is the etiologic agent of adult T-cell leukemia. In this study, we investigated the regulatory elements and cellular transcription factors which function in modulating HTLV-I gene expression in response to the viral transactivator protein, tax. Transfection experiments into Jurkat cells of a variety of site-directed mutants in the HTLV-1 LTR indicated that each of the three motifs A, B, and C within the 21-bp repeats, the binding sites for the Ets family of proteins, and the TATA box all influenced the degree of tax-mediated activation. Tax is also able to activate gene expression of other viral and cellular promoters. Tax activation of the IL-2 receptor and the HIV-1 LTR is mediated through NF-kappa B motifs. Interestingly, sequences in the 21-bp repeat B and C motifs contain significant homology with NF-kappa B regulatory elements. We demonstrated that an NF-kappa B binding protein, PRDII-BF1, but not the rel protein, bound to the B and C motifs in the 21-bp repeat. PRDII-BF1 was also able to stimulate activation of HTLV-I gene expression by tax. The role of the Ets proteins on modulating tax activation was also studied. Ets 1 but not Ets 2 was capable of increasing the degree of tax activation of the HTLV-I LTR. These results suggest that tax activates gene expression by either direct or indirect interaction with several cellular transcription factors that bind to the HTLV-I LTR.

  9. Identification of a phosphorylation-dependent nuclear localization motif in interferon regulatory factor 2 binding protein 2.

    Directory of Open Access Journals (Sweden)

    Allen C T Teng

    Full Text Available Interferon regulatory factor 2 binding protein 2 (IRF2BP2) is a muscle-enriched transcription factor required to activate vascular endothelial growth factor-A (VEGFA) expression in muscle. IRF2BP2 is found in the nucleus of cardiac and skeletal muscle cells. During the process of skeletal muscle differentiation, some IRF2BP2 becomes relocated to the cytoplasm, although the functional significance of this relocation and the mechanisms that control nucleocytoplasmic localization of IRF2BP2 are not yet known. Here, by fusing IRF2BP2 to green fluorescent protein and testing a series of deletion and site-directed mutagenesis constructs, we mapped the nuclear localization signal (NLS) to an evolutionarily conserved sequence (354ARKRKPSP361) in IRF2BP2. This sequence corresponds to a classical nuclear localization motif bearing positively charged arginine and lysine residues. Substitution of arginine and lysine with negatively charged aspartic acid residues blocked nuclear localization. However, these residues were not sufficient because nuclear targeting of IRF2BP2 also required phosphorylation of serine 360 (S360). Many large-scale phosphopeptide proteomic studies had reported previously that serine 360 of IRF2BP2 is phosphorylated in numerous human cell types. Alanine substitution at this site abolished IRF2BP2 nuclear localization in C2C12 myoblasts and CV1 cells. In contrast, substituting serine 360 with aspartic acid forced nuclear retention and prevented cytoplasmic redistribution in differentiated C2C12 muscle cells. As for the effects of these mutations on VEGFA promoter activity, the S360A mutation interfered with VEGFA activation, as expected. Surprisingly, the S360D mutation also interfered with VEGFA activation, suggesting that this mutation, while enforcing nuclear entry, may disrupt an essential activation function of IRF2BP2. Nuclear localization of IRF2BP2 depends on phosphorylation near a conserved NLS. 
Changes in phosphorylation status

  10. Suppressive oligodeoxynucleotides containing TTAGGG motifs inhibit cGAS activation in human monocytes.

    Science.gov (United States)

    Steinhagen, Folkert; Zillinger, Thomas; Peukert, Konrad; Fox, Mario; Thudium, Marcus; Barchet, Winfried; Putensen, Christian; Klinman, Dennis; Latz, Eicke; Bode, Christian

    2018-04-01

    Type I interferon (IFN) is a critical mediator of autoimmune diseases such as systemic lupus erythematosus (SLE) and Aicardi-Goutières Syndrome (AGS). The recently discovered cyclic-GMP-AMP (cGAMP) synthase (cGAS) induces the production of type I IFN in response to cytosolic DNA and is potentially linked to SLE and AGS. Suppressive oligodeoxynucleotides (ODN) containing repetitive TTAGGG motifs present in mammalian telomeres have proven useful in the treatment of autoimmune diseases including SLE. In this study, we demonstrate that the suppressive ODN A151 effectively inhibits activation of cGAS in response to cytosolic DNA, thereby inhibiting type I IFN production by human monocytes. In addition, A151 abrogated cGAS activation in response to endogenous accumulation of DNA using TREX1-deficient monocytes. We demonstrate that A151 prevents cGAS activation in a manner that is competitive with DNA. This suppressive activity of A151 was dependent on both telomeric sequence and phosphorothioate backbone. To our knowledge this report presents the first cGAS inhibitor capable of blocking self-DNA. Collectively, these findings might lead to the development of new therapeutics against IFN-driven pathologies due to cGAS activation. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Motif statistics and spike correlations in neuronal networks

    International Nuclear Information System (INIS)

    Hu, Yu; Shea-Brown, Eric; Trousdale, James; Josić, Krešimir

    2013-01-01

    Motifs are patterns of subgraphs of complex networks. We studied the impact of such patterns of connectivity on the level of correlated, or synchronized, spiking activity among pairs of cells in a recurrent network of integrate and fire neurons. For a range of network architectures, we find that the pairwise correlation coefficients, averaged across the network, can be closely approximated using only three statistics of network connectivity. These are the overall network connection probability and the frequencies of two second order motifs: diverging motifs, in which one cell provides input to two others, and chain motifs, in which two cells are connected via a third intermediary cell. Specifically, the prevalence of diverging and chain motifs tends to increase correlation. Our method is based on linear response theory, which enables us to express spiking statistics using linear algebra, and a resumming technique, which extrapolates from second order motifs to predict the overall effect of coupling on network correlation. Our motif-based results seek to isolate the effect of network architecture perturbatively from a known network state. (paper)
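
    The three connectivity statistics named in this abstract can be counted directly from an edge list. The sketch below is a simplified illustration: the normalization (fractions of ordered pairs and ordered triples, and the excess over the squared connection probability) is an assumption and may differ from the definitions used in the paper, and the toy network is made up.

```python
from itertools import permutations

def motif_statistics(n, edges):
    """Connection probability and second-order motif frequencies of a directed graph.

    `edges` holds (source, target) pairs among nodes 0..n-1; self-loops are ignored.
    Diverging motif: one cell k projects to two distinct cells i and j.
    Chain motif: k -> j -> i through an intermediary cell j.
    """
    edge_set = {(s, t) for s, t in edges if s != t}
    p = len(edge_set) / (n * (n - 1))
    n_triples = n * (n - 1) * (n - 2)
    diverging = sum(
        (k, i) in edge_set and (k, j) in edge_set
        for i, j, k in permutations(range(n), 3)
    )
    chain = sum(
        (k, j) in edge_set and (j, i) in edge_set
        for i, j, k in permutations(range(n), 3)
    )
    return {
        "p": p,
        "diverging": diverging / n_triples,
        "chain": chain / n_triples,
        # Excess over what independent connections with probability p would give.
        "diverging_excess": diverging / n_triples - p ** 2,
        "chain_excess": chain / n_triples - p ** 2,
    }

# Hypothetical 4-cell network: cell 0 drives cells 1 and 2, and cell 1 drives cell 3.
stats = motif_statistics(4, [(0, 1), (0, 2), (1, 3)])
print(stats)
```

    In this toy network, diverging motifs are over-represented relative to independent wiring and chain motifs are under-represented.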

  12. Quantitative statistical analysis of cis-regulatory sequences in ABA/VP1- and CBF/DREB1-regulated genes of Arabidopsis.

    Science.gov (United States)

    Suzuki, Masaharu; Ketterling, Matthew G; McCarty, Donald R

    2005-09-01

    We have developed a simple quantitative computational approach for objective analysis of cis-regulatory sequences in promoters of coregulated genes. The program, designated MotifFinder, identifies oligo sequences that are overrepresented in promoters of coregulated genes. We used this approach to analyze promoter sequences of Viviparous1 (VP1)/abscisic acid (ABA)-regulated genes and cold-regulated genes, respectively, of Arabidopsis (Arabidopsis thaliana). We detected significantly enriched sequences in up-regulated genes but not in down-regulated genes. This result suggests that gene activation but not repression is mediated by specific and common sequence elements in promoters. The enriched motifs include several known cis-regulatory sequences as well as previously unidentified motifs. With respect to known cis-elements, we dissected the flanking nucleotides of the core sequences of Sph element, ABA response elements (ABREs), and the C repeat/dehydration-responsive element. This analysis identified the motif variants that may correlate with qualitative and quantitative differences in gene expression. While both VP1 and cold responses are mediated in part by ABA signaling via ABREs, these responses correlate with unique ABRE variants distinguished by nucleotides flanking the ACGT core. ABRE and Sph motifs are tightly associated uniquely in the coregulated set of genes showing a strict dependence on VP1 and ABA signaling. Finally, analysis of distribution of the enriched sequences revealed a striking concentration of enriched motifs in a proximal 200-base region of VP1/ABA and cold-regulated promoters. Overall, each class of coregulated genes possesses a discrete set of the enriched motifs with unique distributions in their promoters that may account for the specificity of gene regulation.
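
    The idea of finding oligo sequences overrepresented in promoters of coregulated genes can be sketched as follows. This is not the MotifFinder program: the hypergeometric test, the function names, and the default parameters are assumptions chosen for illustration, and no multiple-testing correction is applied.

```python
from math import comb


def kmers_in(seq, k):
    """Set of distinct k-mers occurring in one promoter sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hypergeom_tail(x, total, hits, drawn):
    """P(X >= x) when `drawn` promoters are sampled from `total`,
    `hits` of which contain the k-mer."""
    upper = min(hits, drawn)
    favourable = sum(comb(hits, i) * comb(total - hits, drawn - i)
                     for i in range(x, upper + 1))
    return favourable / comb(total, drawn)


def enriched_kmers(coregulated, background, k=6, alpha=0.01):
    """Map each k-mer over-represented in `coregulated` to its p-value.

    A promoter counts once per k-mer, however many times the k-mer occurs in it.
    """
    fg_sets = [kmers_in(s, k) for s in coregulated]
    all_sets = fg_sets + [kmers_in(s, k) for s in background]
    result = {}
    for kmer in set().union(*fg_sets):
        x = sum(kmer in s for s in fg_sets)       # coregulated promoters with the k-mer
        hits = sum(kmer in s for s in all_sets)   # all promoters with the k-mer
        pval = hypergeom_tail(x, len(all_sets), hits, len(fg_sets))
        if pval <= alpha:
            result[kmer] = pval
    return result
```

    Calling `enriched_kmers(coregulated, background, k=5)` on two lists of promoter strings returns only those 5-mers whose presence in the coregulated set is unlikely under random sampling from the combined set.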

  13. Bayesian centroid estimation for motif discovery.

    Science.gov (United States)

    Carvalho, Luis

    2013-01-01

    Biological sequences may contain patterns that signal important biomolecular functions; a classical example is regulation of gene expression by transcription factors that bind to specific patterns in genomic promoter regions. In motif discovery we are given a set of sequences that share a common motif and aim to identify not only the motif composition, but also the binding sites in each sequence of the set. We propose a new centroid estimator that arises from a refined and meaningful loss function for binding site inference. We discuss the main advantages of centroid estimation for motif discovery, including computational convenience, and how its principled derivation offers further insights about the posterior distribution of binding site configurations. We also illustrate, using simulated and real datasets, that the centroid estimator can differ from the traditional maximum a posteriori or maximum likelihood estimators.

  14. Bayesian centroid estimation for motif discovery.

    Directory of Open Access Journals (Sweden)

    Luis Carvalho

    Biological sequences may contain patterns that signal important biomolecular functions; a classical example is regulation of gene expression by transcription factors that bind to specific patterns in genomic promoter regions. In motif discovery we are given a set of sequences that share a common motif and aim to identify not only the motif composition, but also the binding sites in each sequence of the set. We propose a new centroid estimator that arises from a refined and meaningful loss function for binding site inference. We discuss the main advantages of centroid estimation for motif discovery, including computational convenience, and how its principled derivation offers further insights about the posterior distribution of binding site configurations. We also illustrate, using simulated and real datasets, that the centroid estimator can differ from the traditional maximum a posteriori or maximum likelihood estimators.

  15. Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.

    Directory of Open Access Journals (Sweden)

    Rahul Karnik

    The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs, in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than, motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs.
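
    How a PWM combined with a score threshold can classify sequences may be easier to see in a small sketch. This is not the MotifSpec algorithm: the threshold here is passed in by hand rather than learned, the uniform background of 0.25 is an assumption, and the PWM values are made up for illustration.

```python
import math


def window_score(window, pwm, background=0.25):
    """Log-odds score of one window; pwm[i][base] must be a probability > 0."""
    return sum(math.log2(pwm[i][base] / background)
               for i, base in enumerate(window))


def best_window_score(seq, pwm):
    """Highest log-odds score over all windows of the PWM's width.

    Scans the given strand only and assumes len(seq) >= len(pwm).
    """
    width = len(pwm)
    return max(window_score(seq[i:i + width], pwm)
               for i in range(len(seq) - width + 1))


def is_bound(seq, pwm, threshold):
    """Predict binding when the best window reaches the threshold."""
    return best_window_score(seq, pwm) >= threshold


def column(consensus):
    """Hypothetical PWM column: 0.85 for the consensus base, 0.05 otherwise."""
    return {b: (0.85 if b == consensus else 0.05) for b in "ACGT"}


# Hypothetical 3-column PWM favouring the word "ACG".
pwm = [column("A"), column("C"), column("G")]
```

    Unlike a fixed k-mer word, the PWM assigns a graded score to every window, so the threshold controls how much deviation from the consensus is tolerated.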

  16. Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease.

    Science.gov (United States)

    Anders, Carolin; Niewoehner, Ole; Duerst, Alessia; Jinek, Martin

    2014-09-25

    The CRISPR-associated protein Cas9 is an RNA-guided endonuclease that cleaves double-stranded DNA bearing sequences complementary to a 20-nucleotide segment in the guide RNA. Cas9 has emerged as a versatile molecular tool for genome editing and gene expression control. RNA-guided DNA recognition and cleavage strictly require the presence of a protospacer adjacent motif (PAM) in the target DNA. Here we report a crystal structure of Streptococcus pyogenes Cas9 in complex with a single-molecule guide RNA and a target DNA containing a canonical 5'-NGG-3' PAM. The structure reveals that the PAM motif resides in a base-paired DNA duplex. The non-complementary strand GG dinucleotide is read out via major-groove interactions with conserved arginine residues from the carboxy-terminal domain of Cas9. Interactions with the minor groove of the PAM duplex and the phosphodiester group at the +1 position in the target DNA strand contribute to local strand separation immediately upstream of the PAM. These observations suggest a mechanism for PAM-dependent target DNA melting and RNA-DNA hybrid formation. Furthermore, this study establishes a framework for the rational engineering of Cas9 enzymes with novel PAM specificities.
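
    The sequence requirement described in this abstract, a 20-nucleotide segment followed by a 5'-NGG-3' PAM, can be expressed as a short scan. This is a minimal sketch: the function name `find_ngg_sites` is hypothetical, the input is assumed to be the non-complementary strand written 5' to 3', and the opposite strand is not scanned.

```python
import re

# A 20-nt protospacer followed by an NGG PAM.
# The lookahead lets overlapping candidate sites be reported.
SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def find_ngg_sites(seq):
    """Return (start, protospacer, pam) for every candidate site on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.group(1), m.group(2)) for m in SITE.finditer(seq)]
```

    A sequence shorter than 23 nucleotides, or one without a GG dinucleotide in the right position, yields an empty list.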

  17. An efficient identification strategy of clonal tea cultivars using long-core motif SSR markers.

    Science.gov (United States)

    Wang, Rang Jian; Gao, Xiang Feng; Kong, Xiang Rui; Yang, Jun

    2016-01-01

    Microsatellites, or simple sequence repeats (SSRs), especially those with long-core motifs (tri-, tetra-, penta-, and hexa-nucleotide), represent an excellent tool for DNA fingerprinting. SSRs with long-core motifs are preferred since neighboring alleles are more easily separated and distinguished from each other, which makes the interpretation of electropherograms and the calling of true alleles more reliable. In the present work, with the purpose of characterizing a set of core SSR markers with long-core motifs for reliable fingerprinting of clonal cultivars of tea (Camellia sinensis), we analyzed 66 elite clonal tea cultivars in China with 33 initially chosen long-core motif SSR markers covering all 15 linkage groups of the tea plant genome. A set of 6 SSR markers was conclusively selected as core SSR markers after further selection. The polymorphic information content (PIC) of the core SSR markers was >0.5, with ≤5 alleles in each marker containing 10 or fewer genotypes. Phylogenetic analysis revealed that the core SSR markers were not strongly correlated with the trait 'cultivar processing-property'. The combined probability of identity (PID) between two random cultivars for the whole set of 6 SSR markers was estimated to be 2.22 × 10^-5, which is quite low and confirms the usefulness of the proposed SSR markers for fingerprinting analyses in Camellia sinensis. Moreover, to discriminate the clonal tea cultivars quickly, a cultivar identification diagram (CID) was subsequently established using these core markers, which fully reflected the identification process and provided immediate information about which SSR markers were needed to identify a cultivar chosen among the tested ones. The results suggested that the long-core motif SSR markers used in the investigation contributed to the accurate and efficient identification of the clonal tea cultivars and enabled the protection of intellectual property.

  18. Mutations in the putative zinc-binding motif of UL52 demonstrate a complex interdependence between the UL5 and UL52 subunits of the human herpes simplex virus type 1 helicase/primase complex.

    Science.gov (United States)

    Chen, Yan; Carrington-Lawrence, Stacy D; Bai, Ping; Weller, Sandra K

    2005-07-01

    Herpes simplex virus type 1 (HSV-1) encodes a heterotrimeric helicase-primase (UL5/8/52) complex. UL5 contains seven motifs found in helicase superfamily 1, and UL52 contains conserved motifs found in primases. The contributions of each subunit to the biochemical activities of the complex, however, remain unclear. We have previously demonstrated that a mutation in the putative zinc finger at the UL52 C terminus abrogates not only primase but also ATPase, helicase, and DNA-binding activities of a UL5/UL52 subcomplex, indicating a complex interdependence between the two subunits. To test this hypothesis and to further investigate the role of the zinc finger in the enzymatic activities of the helicase-primase, a series of mutations were constructed in this motif. They differed in their ability to complement a UL52 null virus: totally defective, partially complementing, and potentiating. In this study, four of these mutants were studied biochemically after expression and purification from insect cells infected with recombinant baculoviruses. All mutants show greatly reduced primase activity. Complementation-defective mutants exhibited severe defects in ATPase, helicase, and DNA-binding activities. Partially complementing mutants displayed intermediate levels of these activities, except that one showed a wild-type level of helicase activity. These data suggest that the UL52 zinc finger motif plays an important role in the activities of the helicase-primase complex. The observation that mutations in UL52 affected helicase, ATPase, and DNA-binding activities indicates that UL52 binding to DNA via the zinc finger may be necessary for loading UL5. Alternatively, UL5 and UL52 may share a DNA-binding interface.

  19. Controllability analysis of transcriptional regulatory networks reveals circular control patterns among transcription factors

    DEFF Research Database (Denmark)

    Österlund, Tobias; Bordel, Sergio; Nielsen, Jens

    2015-01-01

    % for the human network. The high controllability (low number of drivers needed to control the system) in yeast, mouse and human is due to the presence of internal loops in their regulatory networks where the TFs regulate each other in a circular fashion. We refer to these internal loops as circular control...... motifs (CCM). The E. coli transcriptional regulatory network, which does not have any CCMs, shows a hierarchical structure of the transcriptional regulatory network in contrast to the eukaryal networks. The presence of CCMs also has influence on the stability of these networks, as the presence of cycles...

  20. A Parzen window-based approach for the detection of locally enriched transcription factor binding sites.

    Science.gov (United States)

    Vandenbon, Alexis; Kumagai, Yutaro; Teraguchi, Shunsuke; Amada, Karlou Mar; Akira, Shizuo; Standley, Daron M

    2013-01-21

    Identification of cis- and trans-acting factors regulating gene expression remains an important problem in biology. Bioinformatics analyses of regulatory regions are hampered by several difficulties. One is that binding sites for regulatory proteins are often not significantly over-represented in the set of DNA sequences of interest, because of high levels of false positive predictions, and because of positional restrictions on functional binding sites with regard to the transcription start site. We have developed a novel method for the detection of regulatory motifs based on their local over-representation in sets of regulatory regions. The method makes use of a Parzen window-based approach for scoring local enrichment, and during evaluation of significance it takes into account GC content of sequences. We show that the accuracy of our method compares favourably to that of other methods, and that our method is capable of detecting not only generally over-represented regulatory motifs, but also locally over-represented motifs that are often missed by standard motif detection approaches. Using a number of examples we illustrate the validity of our approach and suggest applications, such as the analysis of weaker binding sites. Our approach can be used to suggest testable hypotheses for wet-lab experiments. It has potential for future analyses, such as the prediction of weaker binding sites. An online application of our approach, called LocaMo Finder (Local Motif Finder), is available at http://sysimm.ifrec.osaka-u.ac.jp/tfbs/locamo/.
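
    Scoring local enrichment with a Parzen window can be sketched as a kernel-weighted hit rate around a position relative to the transcription start site. This is not the LocaMo Finder implementation: the Gaussian kernel, the bandwidth, the pseudocount, and the function names are assumptions, and the GC-content correction mentioned in the abstract is not modelled.

```python
import math


def local_hit_rate(hit_positions, n_sequences, x, bandwidth=20.0):
    """Gaussian Parzen-window estimate of motif hits per sequence per bp at position x."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - pos) / bandwidth) ** 2)
                for pos in hit_positions)
    return total / (norm * n_sequences)


def local_enrichment(fg_hits, n_fg, bg_hits, n_bg, x, bandwidth=20.0, pseudo=1e-6):
    """Ratio of foreground to background local hit rates at position x."""
    fg = local_hit_rate(fg_hits, n_fg, x, bandwidth)
    bg = local_hit_rate(bg_hits, n_bg, x, bandwidth)
    # The pseudocount keeps the ratio finite where the background has no nearby hits.
    return (fg + pseudo) / (bg + pseudo)
```

    A motif whose hits cluster near one position in the sequences of interest gives a high ratio there even when its total hit count over the whole region is unremarkable, which is the situation the abstract calls local over-representation.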

  1. Interaction of Cu+ with cytosine and formation of i-motif-like C-M+-C complexes: alkali versus coinage metals

    NARCIS (Netherlands)

    Gao, J.; Berden, G.; Rodgers, M.T.; Oomens, J.

    2016-01-01

    The Watson-Crick structure of DNA is among the most well-known molecular structures of our time. However, alternative base-pairing motifs are also known to occur, often depending on base sequence, pH, or the presence of cations. Pairing of cytosine (C) bases induced by the sharing of a single proton

  2. Biophysical properties of regions flanking the bHLH-Zip motif in the p22 Max protein

    International Nuclear Information System (INIS)

    Pursglove, Sharon E.; Fladvad, Malin; Bellanda, Massimo; Moshref, Ahmad; Henriksson, Marie; Carey, Jannette; Sunnerhagen, Maria

    2004-01-01

    The Max protein is the central dimerization partner in the Myc-Max-Mad network of transcriptional regulators, and a founding structural member of the family of basic-helix-loop-helix (bHLH)-leucine zipper (Zip) proteins. Biologically important regions flanking its bHLH-Zip motif have been disordered or absent in crystal structures. The present study shows that these regions are resistant to proteolysis in both the presence and absence of DNA, and that Max dimers containing both flanking regions have significantly higher helix content as measured by circular dichroism than that predicted from the crystal structures. Nuclear magnetic resonance measurements in the absence of DNA also support the inferred structural order. Deletion of both flanking regions is required to achieve maximal DNA affinity as measured by EMSA. Thus, the previously observed functionalities of these Max regions in DNA binding, phosphorylation, and apoptosis are suggested to be linked to structural properties

  3. The primary structure of L37--a rat ribosomal protein with a zinc finger-like motif.

    Science.gov (United States)

    Chan, Y L; Paz, V; Olvera, J; Wool, I G

    1993-04-30

    The amino acid sequence of the rat 60S ribosomal subunit protein L37 was deduced from the sequence of nucleotides in a recombinant cDNA. Ribosomal protein L37 has 96 amino acids, the NH2-terminal methionine is removed after translation of the mRNA, and has a molecular weight of 10,939. Ribosomal protein L37 has a single zinc finger-like motif of the C2-C2 type. Hybridization of the cDNA to digests of nuclear DNA suggests that there are 13 or 14 copies of the L37 gene. The mRNA for the protein is about 500 nucleotides in length. Rat L37 is related to Saccharomyces cerevisiae ribosomal protein YL35 and to Caenorhabditis elegans L37. We have identified in the data base a DNA sequence that encodes the chicken homolog of rat L37.

  4. CONTEMPORARY USAGE OF TRADITIONAL TURKISH MOTIFS IN PRODUCT DESIGNS

    Directory of Open Access Journals (Sweden)

    Tulay Gumuser

    2012-12-01

    The aim of this study is to identify traditional Turkish motifs and their relation to present-day industrial designs. Traditional Turkish motifs played a very important role from the 16th century onwards. The arts of the Ottoman Empire were used because of their symbolic meanings and unique styles. When we examine these motifs we encounter the Tiger Stripe, Three Spot (Çintemani), Rumi, Hatayi, Penç, Cloud, Crescent, Star, Crown, Hyacinth, Tulip and Carnation motifs. Nowadays, Turkish designers have begun to use these traditional Turkish motifs in their designs so as to create distinction and awareness in world design. The examples of these industrial designs, using the Turkish motifs, have survived and have Ottoman heritage and historical value. In this study, the Turkish motifs will be examined with a focus on the contemporary Turkish industrial designs in use today.

  5. A New Algorithm for Identifying Cis-Regulatory Modules Based on Hidden Markov Model

    Directory of Open Access Journals (Sweden)

    Haitao Guo

    2017-01-01

    The discovery of cis-regulatory modules (CRMs) is the key to understanding mechanisms of transcription regulation. Since CRMs have specific regulatory structures that are the basis for the regulation of gene expression, how to model the regulatory structure of CRMs has a considerable impact on the performance of CRM identification. The paper proposes a CRM discovery algorithm called ComSPS. ComSPS builds a regulatory structure model of CRMs based on HMM by exploring the rules of CRM transcriptional grammar that governs the internal motif site arrangement of CRMs. We test ComSPS on three benchmark datasets and compare it with five existing methods. Experimental results show that ComSPS performs better than them.

  6. A New Algorithm for Identifying Cis-Regulatory Modules Based on Hidden Markov Model

    Science.gov (United States)

    2017-01-01

    The discovery of cis-regulatory modules (CRMs) is the key to understanding mechanisms of transcription regulation. Since CRMs have specific regulatory structures that are the basis for the regulation of gene expression, how to model the regulatory structure of CRMs has a considerable impact on the performance of CRM identification. The paper proposes a CRM discovery algorithm called ComSPS. ComSPS builds a regulatory structure model of CRMs based on HMM by exploring the rules of CRM transcriptional grammar that governs the internal motif site arrangement of CRMs. We test ComSPS on three benchmark datasets and compare it with five existing methods. Experimental results show that ComSPS performs better than them. PMID:28497059

  7. Isolation of deletion alleles by G4 DNA-induced mutagenesis

    NARCIS (Netherlands)

    Pontier, Daphne B; Kruisselbrink, Evelien; Guryev, Victor; Tijsterman, Marcel

    Metazoan genomes contain thousands of sequence tracts that match the guanine-quadruplex (G4) DNA signature G(3)N(x)G(3)N(x)G(3)N(x)G(3), a motif that is intrinsically mutagenic, probably because it can form secondary structures during DNA replication. Here we show how and to what extent this feature
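
    The G4 signature quoted above, G(3)N(x)G(3)N(x)G(3)N(x)G(3), maps directly onto a regular expression. The abstract does not state the allowed loop length x, so the range used here (`loop_min`, `loop_max`, defaulting to 1 and 7) is a hypothetical parameter, as are the function names; the three loops are allowed to differ in length, and only the given strand is scanned.

```python
import re


def g4_pattern(loop_min=1, loop_max=7):
    """Regex for G3-Nx-G3-Nx-G3-Nx-G3 with each loop length x in [loop_min, loop_max]."""
    loop = "[ACGT]{%d,%d}" % (loop_min, loop_max)
    return re.compile("GGG" + (loop + "GGG") * 3)


def find_g4_tracts(seq, loop_min=1, loop_max=7):
    """Return (start, matched_tract) for non-overlapping matches on the given strand."""
    pattern = g4_pattern(loop_min, loop_max)
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]
```

    To find tracts on the opposite strand, the same function could be applied to the reverse complement of the sequence.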

  8. The human Ago2 MC region does not contain an eIF4E-like mRNA cap binding motif

    Directory of Open Access Journals (Sweden)

    Grishin Nick V

    2009-01-01

    Background: Argonaute (Ago) proteins interact with small regulatory RNAs to mediate gene regulatory pathways. A recent report by Kiriakidou et al. [1] describes an MC sequence region identified in Ago2 that displays similarity to the cap-binding motif in translation initiation factor 4E (eIF4E). In a cap-bound eIF4E structure, two important aromatic residues of the motif stack on either side of a 7-methylguanosine 5'-triphosphate (m7Gppp) base. The corresponding Ago2 aromatic residues (F450 and F505) were hypothesized to perform the same cap-binding function. However, the detected similarity between the MC sequence and the eIF4E cap-binding motif was questionable. Results: A number of sequence-based and structure-based bioinformatics methods reveal the reported similarity between the Ago2 MC sequence region and the eIF4E cap-binding motif to be spurious. Instead, the MC sequence region is confidently assigned to the N-terminus of the Ago piwi module, within the mid domain of experimentally determined prokaryotic Ago structures. Confident mapping of the Ago2 MC sequence region to the piwi mid domain results in a homology-based structure model that positions the identified aromatic residues over 20 Å apart, with one of the aromatic side chains (F450) contributing instead to the hydrophobic core of the domain. Conclusion: Correct functional prediction based on weak sequence similarity requires substantial evolutionary and structural support. The evolutionary context of the Ago mid domain suggested by multiple sequence alignment is limited to a conserved hydrophobicity profile required for the fold and a motif following the MC region that binds guide RNA. Mapping of the MC sequence to the mid domain structure reveals Ago2 aromatics that are incompatible with eIF4E-like mRNA cap-binding, yet display some limited local structure similarities that cause the chance sequence match to eIF4E. Reviewers: This article was reviewed by Arcady Mushegian

  9. A systems biology approach to transcription factor binding site prediction.

    Directory of Open Access Journals (Sweden)

    Xiang Zhou

    2010-03-01

    The elucidation of mammalian transcriptional regulatory networks holds great promise for both basic and translational research and remains one of the greatest challenges to systems biology. Recent reverse engineering methods deduce regulatory interactions from large-scale mRNA expression profiles and cross-species conserved regulatory regions in DNA. Technical challenges faced by these methods include distinguishing between direct and indirect interactions, associating transcription regulators with predicted transcription factor binding sites (TFBSs), identifying non-linearly conserved binding sites across species, and providing realistic accuracy estimates. We address these challenges by closely integrating proven methods for regulatory network reverse engineering from mRNA expression data, linearly and non-linearly conserved regulatory region discovery, and TFBS evaluation and discovery. Using an extensive test set of high-likelihood interactions, which we collected in order to provide realistic prediction-accuracy estimates, we show that a careful integration of these methods leads to significant improvements in prediction accuracy. To verify our methods, we biochemically validated TFBS predictions made for both transcription factors (TFs) and co-factors; we validated binding site predictions made using a known E2F1 DNA-binding motif on E2F1 predicted promoter targets, known E2F1 and JUND motifs on JUND predicted promoter targets, and a de novo discovered motif for BCL6 on BCL6 predicted promoter targets. Finally, to demonstrate accuracy of prediction using an external dataset, we showed that sites matching predicted motifs for ZNF263 are significantly enriched in recent ZNF263 ChIP-seq data. Using an integrative framework, we were able to address technical challenges faced by state-of-the-art network reverse engineering methods, leading to significant improvement in direct-interaction detection and TFBS-discovery accuracy. We estimated the accuracy

  10. Identification of sequence motifs significantly associated with antisense activity

    Directory of Open Access Journals (Sweden)

    Peek Andrew S

    2007-06-01

    Background: Predicting the suppression activity of antisense oligonucleotide sequences is the main goal of the rational design of nucleic acids. To create an effective predictive model, it is important to know what properties of an oligonucleotide sequence associate significantly with antisense activity. Also, for the model to be efficient we must know what properties do not associate significantly and can be omitted from the model. This paper will discuss the results of a randomization procedure to find motifs that associate significantly with either high or low antisense suppression activity, analysis of their properties, as well as the results of support vector machine modelling using these significant motifs as features. Results: We discovered 155 motifs that associate significantly with high antisense suppression activity and 202 motifs that associate significantly with low suppression activity. The motifs range in length from 2 to 5 bases, contain several motifs that have been previously discovered as associating highly with antisense activity, and have thermodynamic properties consistent with previous work associating thermodynamic properties of sequences with their antisense activity. Statistical analysis revealed no correlation between a motif's position within an antisense sequence and that sequence's antisense activity. Also, many significant motifs existed as subwords of other significant motifs. Support vector regression experiments indicated that the feature set of significant motifs increased correlation compared to all possible motifs as well as several subsets of the significant motifs. Conclusion: The thermodynamic properties of the significantly associated motifs support existing data correlating the thermodynamic properties of the antisense oligonucleotide with antisense efficiency, reinforcing our hypothesis that antisense suppression is strongly associated with probe/target thermodynamics, as there are no enzymatic

  11. The crystal structure of the regulatory domain of the human sodium-driven chloride/bicarbonate exchanger.

    Science.gov (United States)

    Alvadia, Carolina M; Sommer, Theis; Bjerregaard-Andersen, Kaare; Damkier, Helle Hasager; Montrasio, Michele; Aalkjaer, Christian; Morth, J Preben

    2017-09-21

    The sodium-driven chloride/bicarbonate exchanger (NDCBE) is essential for maintaining homeostatic pH in neurons. The crystal structure at 2.8 Å resolution of the regulatory N-terminal domain of human NDCBE represents the first crystal structure of an electroneutral sodium-bicarbonate cotransporter. The crystal structure forms an equivalent dimeric interface as observed for the cytoplasmic domain of Band 3, and thus establishes that the consensus motif VTVLP is the key minimal dimerization motif. The VTVLP motif is highly conserved and likely to be the physiologically relevant interface for all other members of the SLC4 family. A novel conserved Zn2+-binding motif present in the N-terminal domain of NDCBE is identified and characterized in vitro. Cellular studies confirm the Zn2+-dependent transport of two electroneutral bicarbonate transporters, NCBE and NBCn1. The Zn2+ site is mapped to a cluster of histidines close to the conserved ETARWLKFEE motif and likely plays a role in the regulation of this important motif. The combined structural and bioinformatics analysis provides a model that predicts with additional confidence the physiologically relevant interface between the cytoplasmic domain and the transmembrane domain.

  12. Radiation and desiccation response motif mediates radiation induced gene expression in D. radiodurans

    International Nuclear Information System (INIS)

    Anaganti, Narasimha; Basu, Bhakti; Apte, Shree Kumar

    2015-01-01

    Deinococcus radiodurans is an extremophile that withstands lethal doses of several DNA damaging agents such as gamma irradiation, UV rays, desiccation and chemical mutagens. The organism responds to DNA damage by inducing expression of several DNA repair genes. At least 25 radiation-inducible gene promoters harbour a 17 bp palindromic sequence known as the radiation and desiccation response motif (RDRM), implicated in gamma radiation-inducible gene expression. However, mechanistic details of gamma radiation-responsive up-regulation of gene expression remain enigmatic. The promoters of the highly radiation-induced genes ddrB (DR0070), gyrB (DR0906), gyrA (DR1913), a hypothetical gene (DR1143) and recA (DR2338) from D. radiodurans were cloned in a green fluorescent protein (GFP)-based promoter probe shuttle vector pKG and their promoter activity was assessed in both E. coli and D. radiodurans. The gyrA, gyrB and DR1143 gene promoters were active in E. coli, although the ddrB and recA promoters showed very weak activity. In D. radiodurans, all five promoters were induced several fold following 6 kGy gamma irradiation. The highest induction was observed for the ddrB promoter (25 fold), followed by the DR1143 promoter (15 fold). The induction in the activity of the gyrB, gyrA and recA promoters was 5, 3 and 2 fold, respectively. To assess the role of RDRM, the 17 bp palindromic sequence was deleted from these promoters. The promoters devoid of the RDRM sequence displayed an increase in basal expression activity, but the radiation-responsive induction in promoter activity was completely lost. The substitution of two conserved bases of the RDRM sequence yielded decreased radiation induction of the PDR0070 promoter. Deletion of 5 bases from the 5'-end of the PDR0070 RDRM increased basal promoter activity, but radiation induction was completely abolished. Replacement of RDRM with a non-specific sequence of PDR0070 resulted in loss of basal expression and radiation induction. The results demonstrate that

  13. The architecture of ArgR-DNA complexes at the genome-scale in Escherichia coli

    DEFF Research Database (Denmark)

    Cho, Suhyung; Cho, Yoo-Bok; Kang, Taek Jin

    2015-01-01

    DNA-binding motifs that are recognized by transcription factors (TFs) have been well studied; however, challenges remain in determining the in vivo architecture of TF-DNA complexes on a genome-scale. Here, we determined the in vivo architecture of Escherichia coli arginine repressor (ArgR)-DNA co...

  14. DNA mimic proteins: functions, structures, and bioinformatic analysis.

    Science.gov (United States)

    Wang, Hao-Ching; Ho, Chun-Han; Hsu, Kai-Cheng; Yang, Jinn-Moon; Wang, Andrew H-J

    2014-05-13

    DNA mimic proteins have DNA-like negative surface charge distributions, and they function by occupying the DNA binding sites of DNA binding proteins to prevent these sites from being accessed by DNA. DNA mimic proteins control the activities of a variety of DNA binding proteins and are involved in a wide range of cellular mechanisms such as chromatin assembly, DNA repair, transcription regulation, and gene recombination. However, the sequences and structures of DNA mimic proteins are diverse, making them difficult to predict by bioinformatic search. To date, only a few DNA mimic proteins have been reported. These DNA mimics were not found by searching for functional motifs in their sequences but were revealed only by structural analysis of their charge distribution. This review highlights the biological roles and structures of 16 reported DNA mimic proteins. We also discuss approaches that might be used to discover new DNA mimic proteins.

  15. The transcriptional and gene regulatory network of Lactococcus lactis MG1363 during growth in milk.

    Directory of Open Access Journals (Sweden)

    Anne de Jong

    In the present study we examine the changes in the expression of genes of Lactococcus lactis subspecies cremoris MG1363 during growth in milk. To reveal which specific classes of genes (pathways, operons, regulons, COGs) are important, we performed a transcriptome time series experiment. Global analysis of gene expression over time showed that L. lactis adapted quickly to the environmental changes. Using upstream sequences of genes with correlated gene expression profiles, we uncovered a substantial number of putative DNA binding motifs that may be relevant for L. lactis fermentative growth in milk. All available novel and literature-derived data were integrated into network reconstruction building blocks, which were used to reconstruct and visualize the L. lactis gene regulatory network. This network enables easy mining in the chrono-transcriptomics data. A freely available website at http://milkts.molgenrug.nl gives full access to all transcriptome data, to the reconstructed network and to the individual network building blocks.

  16. Analisis Unsur Matematika pada Motif Sulam Usus

    Directory of Open Access Journals (Sweden)

    Fredi Ganda Putra

    2017-12-01

    Interviews with informants indicate that sulam usus ("intestine embroidery") began as an authentic handicraft art. It takes its name from the technique, in which strands of cloth resembling intestines are arranged according to a pattern and joined by embroidering with thread. The technique was originally used to make the covering of the traditional women's dress of Lampung, often referred to as bebe. Many people, including residents of Lampung, are still unfamiliar with intestine embroidery, because most know only tapis as characteristic of Lampung and are unaware of this other cultural product. Many are also unaware that the intestine embroidery motif embodies mathematical knowledge. The research question is whether the intestine embroidery motif contains mathematical elements based on the concept of geometry, and the purpose of the study is to determine whether it does. The subjects of the study were 4 people selected by purposive sampling. Descriptive analysis of the data gave the following results: (1) The intestine embroidery motif carries both mathematical and cultural meaning, often called ethnomathematics (etnomatematika). In terms of culture, the embroidery is linked to earlier cultures, such as Hindu and Buddhist beliefs, and its motifs and decorative patterns resemble ornamental styles found in Indonesia. (2) In terms of mathematics, the motifs contain elements of geometry, in the form of one-dimensional and two-dimensional geometry, and the

  17. Translational Control of Host Gene Expression by a Cys-Motif Protein Encoded in a Bracovirus.

    Directory of Open Access Journals (Sweden)

    Eunseong Kim

    Translational control is a strategy that various viruses use to manipulate their hosts to suppress acute antiviral response. Polydnaviruses, a group of insect double-stranded DNA viruses symbiotic to some endoparasitoid wasps, are divided into two genera: ichnovirus (IV) and bracovirus (BV). In IV, some Cys-motif genes are known as host translation-inhibitory factors (HTIF). The genome of the endoparasitoid wasp Cotesia plutellae contains a Cys-motif gene (Cp-TSP13) homologous to an HTIF known as teratocyte-secretory protein 14 (TSP14) of Microplitis croceipes. Cp-TSP13 consists of 129 amino acid residues with a predicted molecular weight of 13.987 kDa and pI value of 7.928. The genomic DNA region encoding its open reading frame has three introns. Cp-TSP13 possesses six conserved cysteine residues, as do other Cys-motif genes functioning as HTIF. Cp-TSP13 was expressed in Plutella xylostella larvae parasitized by C. plutellae. C. plutellae bracovirus (CpBV) was purified and injected into non-parasitized P. xylostella that expressed Cp-TSP13. Cp-TSP13 was cloned into a eukaryotic expression vector and used to infect Sf9 cells to transiently express Cp-TSP13. The synthesized Cp-TSP13 protein was detected in culture broth. An overlaying experiment showed that the purified Cp-TSP13 entered hemocytes. It was localized in the cytosol. Recombinant Cp-TSP13 significantly inhibited protein synthesis of secretory proteins when it was added to in vitro cultured fat body. In addition, the recombinant Cp-TSP13 directly inhibited the translation of fat body mRNAs in an in vitro translation assay using rabbit reticulocyte lysate. Moreover, the recombinant Cp-TSP13 significantly suppressed cellular immune responses by inhibiting hemocyte-spreading behavior. It also exhibited significant insecticidal activities by both injection and feeding routes. These results indicate that Cp-TSP13 is a viral HTIF.

  18. Computational modeling identifies key gene regulatory interactions underlying phenobarbital-mediated tumor promotion

    Science.gov (United States)

    Luisier, Raphaëlle; Unterberger, Elif B.; Goodman, Jay I.; Schwarz, Michael; Moggs, Jonathan; Terranova, Rémi; van Nimwegen, Erik

    2014-01-01

    Gene regulatory interactions underlying the early stages of non-genotoxic carcinogenesis are poorly understood. Here, we have identified key candidate regulators of phenobarbital (PB)-mediated mouse liver tumorigenesis, a well-characterized model of non-genotoxic carcinogenesis, by applying a new computational modeling approach to a comprehensive collection of in vivo gene expression studies. We have combined our previously developed motif activity response analysis (MARA), which models gene expression patterns in terms of computationally predicted transcription factor binding sites with singular value decomposition (SVD) of the inferred motif activities, to disentangle the roles that different transcriptional regulators play in specific biological pathways of tumor promotion. Furthermore, transgenic mouse models enabled us to identify which of these regulatory activities was downstream of constitutive androstane receptor and β-catenin signaling, both crucial components of PB-mediated liver tumorigenesis. We propose novel roles for E2F and ZFP161 in PB-mediated hepatocyte proliferation and suggest that PB-mediated suppression of ESR1 activity contributes to the development of a tumor-prone environment. Our study shows that combining MARA with SVD allows for automated identification of independent transcription regulatory programs within a complex in vivo tissue environment and provides novel mechanistic insights into PB-mediated hepatocarcinogenesis. PMID:24464994

  19. DNA-binding properties of the Bacillus subtilis and Aeribacillus pallidus AC6 σ(D) proteins.

    Science.gov (United States)

    Sevim, Elif; Gaballa, Ahmed; Beldüz, A Osman; Helmann, John D

    2011-01-01

    σ(D) proteins from Aeribacillus pallidus AC6 and Bacillus subtilis bound specifically, albeit weakly, to promoter DNA even in the absence of core RNA polymerase. Binding required a conserved CG motif within the -10 element, and this motif is known to be recognized by σ region 2.4 and critical for promoter activity.

  20. DNA-Binding Properties of the Bacillus subtilis and Aeribacillus pallidus AC6 σD Proteins

    OpenAIRE

    Sevim, Elif; Gaballa, Ahmed; Beldüz, A. Osman; Helmann, John D.

    2010-01-01

    σD proteins from Aeribacillus pallidus AC6 and Bacillus subtilis bound specifically, albeit weakly, to promoter DNA even in the absence of core RNA polymerase. Binding required a conserved CG motif within the −10 element, and this motif is known to be recognized by σ region 2.4 and critical for promoter activity.

  1. Motif signatures of transcribed enhancers

    KAUST Repository

    Kleftogiannis, Dimitrios

    2017-09-14

    In mammalian cells, transcribed enhancers (TrEn) play important roles in the initiation of gene expression and the maintenance of gene expression levels in a spatiotemporal manner. One of the most challenging questions in biology today is how the genomic characteristics of enhancers relate to enhancer activities. This is particularly critical, as several recent studies have linked enhancer sequence motifs to specific functional roles. To date, only a limited number of enhancer sequence characteristics have been investigated, leaving room for exploring the enhancer genomic code in a more systematic way. To address this problem, we developed a novel computational method, TELS, aimed at identifying predictive cell type/tissue specific motif signatures. We used TELS to compile a comprehensive catalog of motif signatures for all known TrEn identified by the FANTOM5 consortium across 112 human primary cells and tissues. Our results confirm that distinct cell type/tissue specific motif signatures characterize TrEn. These signatures successfully discriminate (a) TrEn from random controls, a proxy for non-enhancer activity, and (b) cell type/tissue specific TrEn from enhancers expressed and transcribed in different cell types/tissues. TELS codes and datasets are publicly available at http://www.cbrc.kaust.edu.sa/TELS.

  2. Counting of oligomers in sequences generated by Markov chains for DNA motif discovery.

    Science.gov (United States)

    Shan, Gao; Zheng, Wei-Mou

    2009-02-01

    By means of the technique of the imbedded Markov chain, an efficient algorithm is proposed to calculate exactly the first and second moments of word counts and the probability for a word to occur at least once in random texts generated by a Markov chain. A generating function is introduced directly from the imbedded Markov chain to derive asymptotic approximations for the problem. Two Z-scores, one based on the number of sequences with hits and the other on the total number of word hits in a set of sequences, are examined for discovery of motifs on a set of promoter sequences extracted from the A. thaliana genome. Source code is available at http://www.itp.ac.cn/zheng/oligo.c.
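
    As a rough illustration of the simplest quantity mentioned above, the first moment of a word count, the following Python sketch computes the expected number of occurrences of a word under a first-order Markov chain. It is not the imbedded-Markov-chain algorithm of the record and does not compute second moments or occurrence probabilities; the function names, and the assumption that the chain starts in its stationary distribution, are illustrative only.

```python
BASES = "ACGT"


def word_probability(word, stationary, transition):
    """Probability that the chain spells `word` starting at a fixed position.

    Assumes a first-order chain started in its stationary distribution.
    """
    p = stationary[word[0]]
    for prev, cur in zip(word, word[1:]):
        p *= transition[prev][cur]
    return p


def expected_word_count(word, seq_len, stationary, transition):
    """Expected number of (possibly overlapping) occurrences of `word`.

    By linearity of expectation this is the number of start positions
    times the per-position probability.
    """
    positions = max(seq_len - len(word) + 1, 0)
    return positions * word_probability(word, stationary, transition)


# Hypothetical background model: every base equally likely after every base.
uniform_stationary = {b: 0.25 for b in BASES}
uniform_transition = {a: {b: 0.25 for b in BASES} for a in BASES}

expected = expected_word_count("ACGT", 100, uniform_stationary, uniform_transition)
```

    A Z-score of the kind the record describes would compare an observed count with this expectation, scaled by a standard deviation that requires the second moment, which this sketch does not provide.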

  3. Mcm10 regulates DNA replication elongation by stimulating the CMG replicative helicase.

    Science.gov (United States)

    Lõoke, Marko; Maloney, Michael F; Bell, Stephen P

    2017-02-01

    Activation of the Mcm2-7 replicative DNA helicase is the committed step in eukaryotic DNA replication initiation. Although Mcm2-7 activation requires binding of the helicase-activating proteins Cdc45 and GINS (forming the CMG complex), an additional protein, Mcm10, drives initial origin DNA unwinding by an unknown mechanism. We show that Mcm10 binds a conserved motif located between the oligonucleotide/oligosaccharide fold (OB-fold) and A subdomain of Mcm2. Although buried in the interface between these domains in Mcm2-7 structures, mutations predicted to separate the domains and expose this motif restore growth to conditional-lethal MCM10 mutant cells. We found that, in addition to stimulating initial DNA unwinding, Mcm10 stabilizes Cdc45 and GINS association with Mcm2-7 and stimulates replication elongation in vivo and in vitro. Furthermore, we identified a lethal allele of MCM10 that stimulates initial DNA unwinding but is defective in replication elongation and CMG binding. Our findings expand the roles of Mcm10 during DNA replication and suggest a new model for Mcm10 function as an activator of the CMG complex throughout DNA replication. © 2017 Lõoke et al.; Published by Cold Spring Harbor Laboratory Press.

  4. The EPIYA-ABCC motif pattern in CagA of Helicobacter pylori is associated with peptic ulcer and gastric cancer in Mexican population.

    Science.gov (United States)

    Beltrán-Anaya, Fredy Omar; Poblete, Tomás Manuel; Román-Román, Adolfo; Reyes, Salomón; de Sampedro, José; Peralta-Zaragoza, Oscar; Rodríguez, Miguel Ángel; del Moral-Hernández, Oscar; Illades-Aguiar, Berenice; Fernández-Tilapa, Gloria

    2014-12-24

    Helicobacter pylori chronic infection is associated with chronic gastritis, peptic ulcer, and gastric cancer. Cytotoxin-associated gene A (cagA)-positive H. pylori strains increase the risk of gastric pathology. The carcinogenic potential of CagA is linked to its polymorphic EPIYA motif variants. The goals of this study were to investigate the frequency of cagA-positive Helicobacter pylori in Mexican patients with gastric pathologies and to assess the association of cagA EPIYA motif patterns with peptic ulcer and gastric cancer. A total of 499 patients were studied; of these, 402 had chronic gastritis, 77 had peptic ulcer, and 20 had gastric cancer. H. pylori DNA, cagA, and the EPIYA motifs were detected in total DNA from gastric biopsies by PCR. The type and number of EPIYA segments were determined by the electrophoretic patterns. To confirm the PCR results, 20 amplicons of the cagA 3' variable region were sequenced, and analyzed in silico, and the amino acid sequence was predicted with MEGA software, version 5. The odds ratio (OR) was calculated to determine the associations between the EPIYA motif type and gastric pathology and between the number of EPIYA-C segments and peptic ulcers and gastric cancer. H. pylori DNA was found in 287 (57.5%) of the 499 patients, and 214 (74%) of these patients were cagA-positive. The frequency of cagA-positive H. pylori was 74.6% (164/220) in chronic gastritis patients, 73.6% (39/53) in peptic ulcer patients, and 78.6% (11/14) in gastric cancer patients. The EPIYA-ABC pattern was more frequently observed in chronic gastritis patients (79.3%, 130/164), while the EPIYA-ABCC sequence was more frequently observed in peptic ulcer (64.1%, 25/39) and gastric cancer patients (54.5%, 6/11). However, the risks of peptic ulcer (OR = 7.0, 95% CI = 3.3-15.1; p peptic ulcers and gastric cancer.
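
    The odds ratio and confidence interval used in this record can be illustrated with a short Python sketch. The 2x2 table below is built from the cagA-positive counts quoted above (11 of 14 gastric cancer patients, 164 of 220 chronic gastritis patients); this particular contrast is not an association the abstract reports, and the Woolf (log) method for the interval is an assumption about how such intervals are commonly computed, not a statement of the authors' procedure.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI (Woolf log method) for a 2x2 table.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# cagA-positive vs cagA-negative, gastric cancer vs chronic gastritis,
# derived from the counts quoted in the abstract (illustrative contrast only).
cancer_pos, cancer_neg = 11, 14 - 11
gastritis_pos, gastritis_neg = 164, 220 - 164

odds_ratio, lower, upper = odds_ratio_ci(cancer_pos, cancer_neg, gastritis_pos, gastritis_neg)
```

    An interval that spans 1 would indicate that the contrast is not statistically distinguishable from no association at the chosen level.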

  5. The carboxy-terminal domain of Dictyostelium C-module-binding factor is an independent gene regulatory entity.

    Directory of Open Access Journals (Sweden)

    Jörg Lucas

    The C-module-binding factor (CbfA) is a multidomain protein that belongs to the family of jumonji-type (JmjC) transcription regulators. In the social amoeba Dictyostelium discoideum, CbfA regulates gene expression during the unicellular growth phase and multicellular development. CbfA and a related D. discoideum CbfA-like protein, CbfB, share a paralogous domain arrangement that includes the JmjC domain, presumably a chromatin-remodeling activity, and two zinc finger-like (ZF) motifs. On the other hand, the CbfA and CbfB proteins have completely different carboxy-terminal domains, suggesting that the plasticity of such domains may have contributed to the adaptation of the CbfA-like transcription factors to the rapid genome evolution in the dictyostelid clade. To support this hypothesis we performed DNA microarray and real-time RT-PCR measurements and found that CbfA regulates at least 160 genes during the vegetative growth of D. discoideum cells. Functional annotation of these genes revealed that CbfA predominantly controls the expression of gene products involved in housekeeping functions, such as carbohydrate, purine nucleoside/nucleotide, and amino acid metabolism. The CbfA protein displays two different mechanisms of gene regulation. The expression of one set of CbfA-dependent genes requires at least the JmjC/ZF domain of the CbfA protein and thus may depend on chromatin modulation. Regulation of the larger group of genes, however, does not depend on the entire CbfA protein and requires only the carboxy-terminal domain of CbfA (CbfA-CTD). An AT-hook motif located in CbfA-CTD, which is known to mediate DNA binding to A+T-rich sequences in vitro, contributed to CbfA-CTD-dependent gene regulatory functions in vivo.

  6. In silico discovery of transcription regulatory elements in Plasmodium falciparum

    Directory of Open Access Journals (Sweden)

    Le Roch Karine G

    2008-02-01

    Background: With the sequence of the Plasmodium falciparum genome and several global mRNA and protein life cycle expression profiling projects now completed, elucidating the underlying networks of transcriptional control important for the progression of the parasite life cycle is highly pertinent to the development of new anti-malarials. To date, relatively little is known regarding the specific mechanisms the parasite employs to regulate gene expression at the mRNA level, with studies of the P. falciparum genome sequence having revealed few cis-regulatory elements and associated transcription factors. Although it is possible the parasite may evoke mechanisms of transcriptional control drastically different from those used by other eukaryotic organisms, the extreme AT-rich nature of P. falciparum intergenic regions (~90% AT) presents significant challenges to in silico cis-regulatory element discovery. Results: We have developed an algorithm called Gene Enrichment Motif Searching (GEMS) that uses a hypergeometric-based scoring function and a position-weight matrix optimization routine to identify with high confidence regulatory elements in the nucleotide-biased and repeat sequence-rich P. falciparum genome. When applied to promoter regions of genes contained within 21 co-expression gene clusters generated from P. falciparum life cycle microarray data using the semi-supervised clustering algorithm Ontology-based Pattern Identification, GEMS identified 34 putative cis-regulatory elements associated with a variety of parasite processes including sexual development, cell invasion, antigenic variation and protein biosynthesis. Among these candidates were novel motifs, as well as many of the elements for which biological experimental evidence already exists in the Plasmodium literature. To provide evidence for the biological relevance of a cell invasion-related element predicted by GEMS, reporter gene and electrophoretic mobility shift assays
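
    A hypergeometric enrichment score of the general kind named in this record can be sketched in a few lines of Python. This is a generic upper-tail test, not the GEMS scoring function itself, whose exact form the abstract does not give; the function name and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(total_genes, genes_with_motif, cluster_size, cluster_hits):
    """P(X >= cluster_hits) for X ~ Hypergeometric.

    total_genes:      promoters in the background set
    genes_with_motif: promoters in the background that contain the motif
    cluster_size:     promoters in the co-expression cluster
    cluster_hits:     promoters in the cluster that contain the motif
    """
    max_hits = min(genes_with_motif, cluster_size)
    denominator = comb(total_genes, cluster_size)
    numerator = sum(
        comb(genes_with_motif, i) * comb(total_genes - genes_with_motif, cluster_size - i)
        for i in range(cluster_hits, max_hits + 1)
    )
    return numerator / denominator


# Hypothetical example: 10 promoters, 5 carry the motif, and a cluster of
# 5 promoters happens to contain all 5 motif-bearing ones.
p_value = hypergeom_enrichment_pvalue(10, 5, 5, 5)
```

    A small p-value means the motif occurs in the cluster more often than random sampling from the background would explain, which is the sense in which a motif is "enriched" in a co-expression cluster.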

  7. In vivo protein-DNA interactions at the β-globin gene locus

    International Nuclear Information System (INIS)

    Tohru Ikuta; Yuet Wai Kan

    1991-01-01

    The authors have investigated in vivo protein-DNA interactions in the β-globin gene locus by dimethyl sulfate (DMS) footprinting in K562 cells, which express ε- and γ-globin but not β-globin. In the locus control region, hypersensitive site 2 (HS-2) exhibited footprints in several putative protein binding motifs. HS-3 was not footprinted. The β promoter was also not footprinted, while extensive footprints were observed in the promoter of the active γ-globin gene. No footprints were seen in the Aγ and β 3' enhancers. With several motifs, additional protein interactions and alterations in binding patterns occurred with hemin induction. In HeLa cells, some footprints were observed in some of the motifs in HS-2, compatible with the finding that HS-2 has some enhancer function in HeLa cells, albeit much weaker than its activity in K562 cells. No footprint was seen in B lymphocytes. In vivo footprinting is a useful method for studying relevant protein-DNA interactions in erythroid cells.

  8. DNA-Binding Properties of the Bacillus subtilis and Aeribacillus pallidus AC6 σD Proteins

    Science.gov (United States)

    Sevim, Elif; Gaballa, Ahmed; Beldüz, A. Osman; Helmann, John D.

    2011-01-01

    σD proteins from Aeribacillus pallidus AC6 and Bacillus subtilis bound specifically, albeit weakly, to promoter DNA even in the absence of core RNA polymerase. Binding required a conserved CG motif within the −10 element, and this motif is known to be recognized by σ region 2.4 and critical for promoter activity. PMID:21097624

  9. Common and distinct DNA-binding and regulatory activities of the BEN-solo transcription factor family.

    Science.gov (United States)

    Dai, Qi; Ren, Aiming; Westholm, Jakub O; Duan, Hong; Patel, Dinshaw J; Lai, Eric C

    2015-01-01

    Recently, the BEN (BANP, E5R, and NAC1) domain was recognized as a new class of conserved DNA-binding domain. The fly genome encodes three proteins that bear only a single BEN domain ("BEN-solo" factors); namely, Insensitive (Insv), Bsg25A (Elba1), and CG9883 (Elba2). Insv homodimers preferentially bind CCAATTGG palindromes throughout the genome to mediate transcriptional repression, whereas Bsg25A and Elba2 heterotrimerize with their obligate adaptor, Elba3 (i.e., the ELBA complex), to recognize a CCAATAAG motif in the Fab-7 insulator. While these data suggest distinct DNA-binding properties of BEN-solo proteins, we performed reporter assays that indicate that both Bsg25A and Elba2 can individually recognize Insv consensus sites efficiently. We confirmed this by solving the structure of Bsg25A complexed to the Insv site, which showed that key aspects of the BEN:DNA recognition strategy are similar between these proteins. We next show that both Insv and ELBA proteins are competent to mediate transcriptional repression via Insv consensus sequences but that the ELBA complex appears to be selective for the ELBA site. Reciprocally, genome-wide analysis reveals that Insv exhibits significant cobinding to class I insulator elements, indicating that it may also contribute to insulator function. Indeed, we observed abundant Insv binding within the Hox complexes with substantial overlaps with class I insulators, many of which bear Insv consensus sites. Moreover, Insv coimmunoprecipitates with the class I insulator factor CP190. Finally, we observed that Insv harbors exclusive activity among fly BEN-solo factors with respect to regulation of Notch-mediated cell fate choices in the peripheral nervous system. This in vivo activity is recapitulated by BEND6, a mammalian BEN-solo factor that conserves the Notch corepressor function of Insv but not its capacity to bind Insv consensus sites. Altogether, our data define an array of common and distinct biochemical and functional
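
    The distinction drawn in this record between the palindromic Insv site (CCAATTGG) and the ELBA site (CCAATAAG) can be checked with a minimal Python sketch. A DNA site is palindromic in this sense when it equals its own reverse complement; the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True when the site reads the same on both strands (5' to 3')."""
    return seq == reverse_complement(seq)


insv_site = "CCAATTGG"   # site bound by Insv homodimers, per the abstract
elba_site = "CCAATAAG"   # motif recognized by the ELBA complex, per the abstract

insv_is_palindrome = is_palindromic(insv_site)
elba_is_palindrome = is_palindromic(elba_site)
```

    The two sites share the first five bases (CCAAT) and differ only in the final three.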

  10. A single-laboratory validated method for the generation of DNA barcodes for the identification of fish for regulatory compliance.

    Science.gov (United States)

    Handy, Sara M; Deeds, Jonathan R; Ivanova, Natalia V; Hebert, Paul D N; Hanner, Robert H; Ormos, Andrea; Weigt, Lee A; Moore, Michelle M; Yancy, Haile F

    2011-01-01

    The U.S. Food and Drug Administration is responsible for ensuring that the nation's food supply is safe and accurately labeled. This task is particularly challenging in the case of seafood where a large variety of species are marketed, most of this commodity is imported, and processed product is difficult to identify using traditional morphological methods. Reliable species identification is critical for both foodborne illness investigations and for prevention of deceptive practices, such as those where species are intentionally mislabeled to circumvent import restrictions or for resale as species of higher value. New methods that allow accurate and rapid species identifications are needed, but any new methods to be used for regulatory compliance must be both standardized and adequately validated. "DNA barcoding" is a process by which species discriminations are achieved through the use of short, standardized gene fragments. For animals, a fragment (655 base pairs starting near the 5' end) of the cytochrome c oxidase subunit 1 mitochondrial gene has been shown to provide reliable species level discrimination in most cases. We provide here a protocol with single-laboratory validation for the generation of DNA barcodes suitable for the identification of seafood products, specifically fish, in a manner that is suitable for FDA regulatory use.

  11. Triadic motifs in the dependence networks of virtual societies

    Science.gov (United States)

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-06-01

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.
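
    Counting a triadic motif in a directed network can be sketched as follows. The code counts unidirectional loops, taken here to mean three nodes joined as A→B→C→A with no reciprocated edge; that this matches the motif labelled M9 in the record is an assumption, and the brute-force enumeration over node triples is only practical for small graphs.

```python
from itertools import combinations


def count_unidirectional_loops(edges):
    """Count node triples forming a directed 3-cycle with no reverse edges."""
    edge_set = set(edges)
    nodes = sorted({node for edge in edge_set for node in edge})
    count = 0
    for a, b, c in combinations(nodes, 3):
        # Each unordered triple has two possible cyclic orientations.
        for x, y, z in ((a, b, c), (a, c, b)):
            forward = ((x, y), (y, z), (z, x))
            backward = ((y, x), (z, y), (x, z))
            if all(e in edge_set for e in forward) and not any(e in edge_set for e in backward):
                count += 1
    return count


# Hypothetical toy dependence network: 1 -> 2 -> 3 -> 1 is a loop,
# while 3 -> 4 -> 5 with 3 -> 5 is a feed-forward triad, not a loop.
toy_edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (3, 5)]
loops = count_unidirectional_loops(toy_edges)
```

    Deciding whether a motif is under- or overrepresented, as the record does, additionally requires comparing such counts against randomized networks, which this sketch omits.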

  12. Triadic motifs in the dependence networks of virtual societies.

    Science.gov (United States)

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-06-10

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.

  13. [Cloning of cDNA for RNA polymerase subunit from the fission yeast Schizosaccharomyces pombe by heterospecific complementation in Saccharomyces cerevisiae].

    Science.gov (United States)

    Shpakovskiĭ, G V; Lebedenko, E N; Thuriaux, P

    1997-02-01

    The rpb10 cDNA of the fission yeast Schizosaccharomyces pombe, encoding one of the five small subunits common to all three nuclear DNA-dependent RNA polymerases, was isolated from an expression cDNA library by two independent approaches: PCR-based screening and direct suppression by means of heterospecific complementation of a temperature-sensitive mutant defective in the corresponding gene of Saccharomyces cerevisiae. The cloned Sz. pombe cDNA encodes a protein Rpb10 of 71 amino acids with an M of 8,275 Da, sharing 51 amino acids (71% identity) with the subunit ABC10 beta of RNA polymerases I-III from S. cerevisiae. All eukaryotic members of this protein family have the same general organization featuring two highly conserved motifs (RCFT/SCGK and RYCCRRM) around an atypical zinc finger and an additional invariant HVDLIEK motif toward the C-terminal end. The last motif is characteristic only of homologs from eukaryotes. In keeping with this remarkable structural conservation, the Sz. pombe cDNA also fully complemented a S. cerevisiae deletion mutant lacking subunit ABC10 beta (null allele rpb10-delta 1::HIS3).

  14. Phospho-Ser/Thr-binding domains: navigating the cell cycle and DNA damage response.

    Science.gov (United States)

    Reinhardt, H Christian; Yaffe, Michael B

    2013-09-01

    Coordinated progression through the cell cycle is a complex challenge for eukaryotic cells. Following genotoxic stress, diverse molecular signals must be integrated to establish checkpoints specific for each cell cycle stage, allowing time for various types of DNA repair. Phospho-Ser/Thr-binding domains have emerged as crucial regulators of cell cycle progression and DNA damage signalling. Such domains include 14-3-3 proteins, WW domains, Polo-box domains (in PLK1), WD40 repeats (including those in the E3 ligase SCF(βTrCP)), BRCT domains (including those in BRCA1) and FHA domains (such as in CHK2 and MDC1). Progress has been made in our understanding of the motif (or motifs) that these phospho-Ser/Thr-binding domains connect with on their targets and how these interactions influence the cell cycle and DNA damage response.

  15. A compact, in vivo screen of all 6-mers reveals drivers of tissue-specific expression and guides synthetic regulatory element design.

    Science.gov (United States)

    Smith, Robin P; Riesenfeld, Samantha J; Holloway, Alisha K; Li, Qiang; Murphy, Karl K; Feliciano, Natalie M; Orecchia, Lorenzo; Oksenberg, Nir; Pollard, Katherine S; Ahituv, Nadav

    2013-07-18

    Large-scale annotation efforts have improved our ability to coarsely predict regulatory elements throughout vertebrate genomes. However, it is unclear how complex spatiotemporal patterns of gene expression driven by these elements emerge from the activity of short, transcription factor binding sequences. We describe a comprehensive promoter extension assay in which the regulatory potential of all 6 base-pair (bp) sequences was tested in the context of a minimal promoter. To enable this large-scale screen, we developed algorithms that use a reverse-complement aware decomposition of the de Bruijn graph to design a library of DNA oligomers incorporating every 6-bp sequence exactly once. Our library multiplexes all 4,096 unique 6-mers into 184 double-stranded 15-bp oligomers, which is sufficiently compact for in vivo testing. We injected each multiplexed construct into zebrafish embryos and scored GFP expression in 15 tissues at two developmental time points. Twenty-seven constructs produced consistent expression patterns, with the majority doing so in only one tissue. Functional sequences are enriched near biologically relevant genes, match motifs for developmental transcription factors, and are required for enhancer activity. By concatenating tissue-specific functional sequences, we generated completely synthetic enhancers for the notochord, epidermis, spinal cord, forebrain and otic lateral line, and show that short regulatory sequences do not always function modularly. This work introduces a unique in vivo catalog of short, functional regulatory sequences and demonstrates several important principles of regulatory element organization. Furthermore, we provide resources for designing compact, reverse-complement aware k-mer libraries.
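
    The notion of a reverse-complement aware k-mer set can be made concrete with a short Python sketch. It enumerates all 6-mers, groups each with its reverse complement, and lists the distinct 6-mer classes covered by a double-stranded oligomer. It does not reproduce the de Bruijn graph decomposition or the library design described in the record; the helper names and the example oligomer are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Represent a k-mer and its reverse complement by one shared key."""
    return min(kmer, reverse_complement(kmer))


def classes_covered(oligo, k=6):
    """Distinct k-mer classes present on either strand of a double-stranded oligo."""
    return {canonical(oligo[i:i + k]) for i in range(len(oligo) - k + 1)}


all_6mers = ["".join(p) for p in product("ACGT", repeat=6)]
canonical_6mers = {canonical(kmer) for kmer in all_6mers}
self_complementary = [kmer for kmer in all_6mers if kmer == reverse_complement(kmer)]

# Hypothetical 15-bp oligomer: it has 15 - 6 + 1 = 10 windows per strand.
example_classes = classes_covered("ACGTTGCAAGGCTAT")
```

    There are 4^6 = 4,096 6-mers, of which 4^3 = 64 are their own reverse complement, giving (4,096 + 64) / 2 = 2,080 classes once each sequence is identified with its reverse complement.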

  16. The Rev1 interacting region (RIR) motif in the scaffold protein XRCC1 mediates a low-affinity interaction with polynucleotide kinase/phosphatase (PNKP) during DNA single-strand break repair.

    Science.gov (United States)

    Breslin, Claire; Mani, Rajam S; Fanta, Mesfin; Hoch, Nicolas; Weinfeld, Michael; Caldecott, Keith W

    2017-09-29

    The scaffold protein X-ray repair cross-complementing 1 (XRCC1) interacts with multiple enzymes involved in DNA base excision repair and single-strand break repair (SSBR) and is important for genetic integrity and normal neurological function. One of the most important interactions of XRCC1 is that with polynucleotide kinase/phosphatase (PNKP), a dual-function DNA kinase/phosphatase that processes damaged DNA termini and that, if mutated, results in ataxia with oculomotor apraxia 4 (AOA4) and microcephaly with early-onset seizures and developmental delay (MCSZ). XRCC1 and PNKP interact via a high-affinity phosphorylation-dependent interaction site in XRCC1 and a forkhead-associated domain in PNKP. Here, we identified using biochemical and biophysical approaches a second PNKP interaction site in XRCC1 that binds PNKP with lower affinity and independently of XRCC1 phosphorylation. However, this interaction nevertheless stimulated PNKP activity and promoted SSBR and cell survival. The low-affinity interaction site required the highly conserved Rev1-interacting region (RIR) motif in XRCC1 and included three critical and evolutionarily invariant phenylalanine residues. We propose a bipartite interaction model in which the previously identified high-affinity interaction acts as a molecular tether, holding XRCC1 and PNKP together and thereby promoting the low-affinity interaction identified here, which then stimulates PNKP directly. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  17. A cis-regulatory sequence driving metabolic insecticide resistance in mosquitoes: functional characterisation and signatures of selection.

    Science.gov (United States)

    Wilding, Craig S; Smith, Ian; Lynd, Amy; Yawson, Alexander Egyir; Weetman, David; Paine, Mark J I; Donnelly, Martin J

    2012-09-01

    Although cytochrome P450 (CYP450) enzymes are frequently up-regulated in mosquitoes resistant to insecticides, no regulatory motifs driving these expression differences with relevance to wild populations have been identified. Transposable elements (TEs) are often enriched upstream of those CYP450s involved in insecticide resistance, leading to the assumption that they contribute regulatory motifs that directly underlie the resistance phenotype. A partial CuRE1 (Culex Repetitive Element 1) transposable element is found directly upstream of CYP9M10, a cytochrome P450 implicated previously in larval resistance to permethrin in the ISOP450 strain of Culex quinquefasciatus, but is absent from the equivalent genomic region of a susceptible strain. Via expression of CYP9M10 in Escherichia coli we have now demonstrated time- and NADPH-dependent permethrin metabolism, prerequisites for confirmation of a role in metabolic resistance, and through qPCR shown that CYP9M10 is >20-fold over-expressed in ISOP450 compared to a susceptible strain. In a fluorescent reporter assay the region upstream of CYP9M10 from ISOP450 drove 10× expression compared to the equivalent region (lacking CuRE1) from the susceptible strain. Close correspondence with the gene expression fold-change implicates the upstream region including CuRE1 as a cis-regulatory element involved in resistance. Only a single CuRE1-bearing allele, identical to the CuRE1-bearing allele in the resistant strain, is found throughout Sub-Saharan Africa, in contrast to the diversity encountered in non-CuRE1 alleles. This suggests a single origin and subsequent spread due to selective advantage. CuRE1 is detectable using a simple diagnostic. When applied to C. quinquefasciatus larvae from Ghana we have demonstrated a significant association with permethrin resistance in multiple field sites (mean Odds Ratio = 3.86) suggesting this marker has relevance to natural populations of vector mosquitoes. However, when CuRE1 was excised

  18. Piv site-specific invertase requires a DEDD motif analogous to the catalytic center of the RuvC Holliday junction resolvases.

    Science.gov (United States)

    Buchner, John M; Robertson, Anne E; Poynter, David J; Denniston, Shelby S; Karls, Anna C

    2005-05-01

    Piv, a unique prokaryotic site-specific DNA invertase, is related to transposases of the insertion elements from the IS110/IS492 family and shows no similarity to the site-specific recombinases of the tyrosine- or serine-recombinase families. Piv tertiary structure is predicted to include the RNase H-like fold that typically encompasses the catalytic site of the recombinases or nucleases of the retroviral integrase superfamily, including transposases and RuvC-like Holliday junction resolvases. Analogous to the DDE and DEDD catalytic motifs of transposases and RuvC, respectively, four Piv acidic residues D9, E59, D101, and D104 appear to be positioned appropriately within the RNase H fold to coordinate two divalent metal cations. This suggests mechanistic similarity between site-specific inversion mediated by Piv and transposition or endonucleolytic reactions catalyzed by enzymes of the retroviral integrase superfamily. The role of the DEDD motif in Piv catalytic activity was addressed using Piv variants that are substituted individually or multiply at these acidic residues and assaying for in vivo inversion, intermolecular recombination, and DNA binding activities. The results indicate that all four residues of the DEDD motif are required for Piv catalytic activity. The DEDD residues are not essential for inv recombination site recognition and binding, but this acidic tetrad does appear to contribute to the stability of Piv-inv interactions. On the basis of these results, a working model for Piv-mediated inversion that includes resolution of a Holliday junction is presented.

  19. In silico modeling of epigenetic-induced changes in photoreceptor cis-regulatory elements.

    Science.gov (United States)

    Hossain, Reafa A; Dunham, Nicholas R; Enke, Raymond A; Berndsen, Christopher E

    2018-01-01

    DNA methylation is a well-characterized epigenetic repressor of mRNA transcription in many plant and vertebrate systems. However, the mechanism of this repression is not fully understood. The process of transcription is controlled by proteins that regulate recruitment and activity of RNA polymerase by binding to specific cis-regulatory sequences. Cone-rod homeobox (CRX) is a well-characterized mammalian transcription factor that controls photoreceptor cell-specific gene expression. Although much is known about the functions and DNA binding specificity of CRX, little is known about how DNA methylation modulates CRX binding affinity to genomic cis-regulatory elements. We used bisulfite pyrosequencing of human ocular tissues to measure DNA methylation levels of the regulatory regions of RHO , PDE6B, PAX6 , and LINE1 retrotransposon repeats. To describe the molecular mechanism of repression, we used molecular modeling to illustrate the effect of DNA methylation on human RHO regulatory sequences. In this study, we demonstrate an inverse correlation between DNA methylation in regulatory regions adjacent to the human RHO and PDE6B genes and their subsequent transcription in human ocular tissues. Docking of CRX to the DNA models shows that CRX interacts with the grooves of these sequences, suggesting changes in groove structure could regulate binding. Molecular dynamics simulations of the RHO promoter and enhancer regions show changes in the flexibility and groove width upon epigenetic modification. Models also demonstrate changes in the local dynamics of CRX binding sites within RHO regulatory sequences which may account for the repression of CRX-dependent transcription. Collectively, these data demonstrate epigenetic regulation of CRX binding sites in human retinal tissue and provide insight into the mechanism of this mode of epigenetic regulation to be tested in future experiments.

  20. A Conserved EAR Motif Is Required for Avirulence and Stability of the Ralstonia solanacearum Effector PopP2 In Planta

    Directory of Open Access Journals (Sweden)

    Cécile Segonzac

    2017-08-01

    Full Text Available Ralstonia solanacearum is the causal agent of the devastating bacterial wilt disease in many high value Solanaceae crops. R. solanacearum secretes around 70 effectors into host cells in order to promote infection. Plants have, however, evolved specialized immune receptors that recognize corresponding effectors and confer qualitative disease resistance. In the model species Arabidopsis thaliana, the paired immune receptors RRS1 (resistance to Ralstonia solanacearum 1) and RPS4 (resistance to Pseudomonas syringae 4) cooperatively recognize the R. solanacearum effector PopP2 in the nuclei of infected cells. PopP2 is an acetyltransferase that binds to and acetylates the RRS1 WRKY DNA-binding domain, resulting in reduced RRS1-DNA association and thereby activating plant immunity. Here, we surveyed the naturally occurring variation in PopP2 sequence among the R. solanacearum strains isolated from diseased tomato and pepper fields across the Republic of Korea. Our analysis revealed high conservation of popP2 sequence with only three polymorphic alleles present amongst 17 strains. Only one variation (a premature stop codon) caused the loss of RPS4/RRS1-dependent recognition in Arabidopsis. We also found that PopP2 harbors a putative eukaryotic transcriptional repressor motif (ethylene-responsive element binding factor-associated amphiphilic repression, or EAR), which is known to be involved in the recruitment of transcriptional co-repressors. Remarkably, mutation of the EAR motif disabled PopP2 avirulence function as measured by the development of hypersensitive response, electrolyte leakage, defense marker gene expression and bacterial growth in Arabidopsis. This lack of recognition was partially but significantly reverted by the C-terminal addition of a synthetic EAR motif. We show that the EAR motif-dependent gain of avirulence correlated with the stability of the PopP2 protein. Furthermore, we demonstrated the requirement of the PopP2 EAR motif for PTI

  1. Distribution of CpG Motifs in Upstream Gene Domains in a Reef Coral and Sea Anemone: Implications for Epigenetics in Cnidarians.

    Science.gov (United States)

    Marsh, Adam G; Hoadley, Kenneth D; Warner, Mark E

    2016-01-01

    Coral reefs are under assault from stressors including global warming, ocean acidification, and urbanization. Knowing how these factors impact the future fate of reefs requires delineating stress responses across ecological, organismal and cellular scales. Recent advances in coral reef biology have integrated molecular processes with ecological fitness and have identified putative suites of temperature acclimation genes in a Scleractinian coral Acropora hyacinthus. We wondered what unique characteristics of these genes determined their coordinate expression in response to temperature acclimation, and whether or not other corals and cnidarians would likewise possess these features. Here, we focus on cytosine methylation as an epigenetic DNA modification that is responsive to environmental stressors. We identify common conserved patterns of cytosine-guanosine dinucleotide (CpG) motif frequencies in upstream promoter domains of different functional gene groups in two cnidarian genomes: a coral (Acropora digitifera) and an anemone (Nematostella vectensis). Our analyses show that CpG motif frequencies are prominent in the promoter domains of functional genes associated with environmental adaptation, particularly those identified in A. hyacinthus. Densities of CpG sites in upstream promoter domains near the transcriptional start site (TSS) are 1.38x higher than genomic background levels upstream of -2000 bp from the TSS. The increase in CpG usage suggests selection to allow for DNA methylation events to occur more frequently within 1 kb of the TSS. In addition, observed shifts in CpG densities among functional groups of genes suggest a potential role for epigenetic DNA methylation within promoter domains to impact functional gene expression responses in A. digitifera and N. vectensis. Identifying promoter epigenetic sequence motifs among genes within specific functional groups establishes an approach to describe integrated cellular responses to environmental stress in
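
    A CpG density comparison of the kind reported in this abstract (promoter-proximal density relative to a background region) can be sketched as follows. This is an illustrative calculation under simple assumptions, not the authors' pipeline: density is taken as CpG dinucleotides per dinucleotide position, and the names `cpg_density` and `fold_enrichment` as well as the toy sequences are hypothetical.

```python
def cpg_density(seq):
    """Fraction of dinucleotide positions in seq that are CpG (5'-CG-3')."""
    seq = seq.upper()
    if len(seq) < 2:
        return 0.0
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    return seq.count("CG") / (len(seq) - 1)


def fold_enrichment(promoter, background):
    """Ratio of CpG density in a promoter window to a background window."""
    bg = cpg_density(background)
    if bg == 0:
        raise ValueError("background window contains no CpG sites")
    return cpg_density(promoter) / bg


if __name__ == "__main__":
    promoter = "ACGCGT"    # toy sequence, 2 CpG sites in 5 positions
    background = "ACGTTT"  # toy sequence, 1 CpG site in 5 positions
    print(cpg_density(promoter), fold_enrichment(promoter, background))
```

    In a real analysis the windows would be defined relative to annotated transcriptional start sites, and the background would be estimated over many genes rather than a single sequence.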

  2. Distribution of CpG Motifs in Upstream Gene Domains in a Reef Coral and Sea Anemone: Implications for Epigenetics in Cnidarians.

    Directory of Open Access Journals (Sweden)

    Adam G Marsh

    Full Text Available Coral reefs are under assault from stressors including global warming, ocean acidification, and urbanization. Knowing how these factors impact the future fate of reefs requires delineating stress responses across ecological, organismal and cellular scales. Recent advances in coral reef biology have integrated molecular processes with ecological fitness and have identified putative suites of temperature acclimation genes in a Scleractinian coral Acropora hyacinthus. We wondered what unique characteristics of these genes determined their coordinate expression in response to temperature acclimation, and whether or not other corals and cnidarians would likewise possess these features. Here, we focus on cytosine methylation as an epigenetic DNA modification that is responsive to environmental stressors. We identify common conserved patterns of cytosine-guanosine dinucleotide (CpG) motif frequencies in upstream promoter domains of different functional gene groups in two cnidarian genomes: a coral (Acropora digitifera) and an anemone (Nematostella vectensis). Our analyses show that CpG motif frequencies are prominent in the promoter domains of functional genes associated with environmental adaptation, particularly those identified in A. hyacinthus. Densities of CpG sites in upstream promoter domains near the transcriptional start site (TSS) are 1.38x higher than genomic background levels upstream of -2000 bp from the TSS. The increase in CpG usage suggests selection to allow for DNA methylation events to occur more frequently within 1 kb of the TSS. In addition, observed shifts in CpG densities among functional groups of genes suggest a potential role for epigenetic DNA methylation within promoter domains to impact functional gene expression responses in A. digitifera and N. vectensis. Identifying promoter epigenetic sequence motifs among genes within specific functional groups establishes an approach to describe integrated cellular responses to

  3. SAMHD1 Sheds Moonlight on DNA Double-Strand Break Repair.

    Science.gov (United States)

    Cabello-Lobato, Maria Jose; Wang, Siyue; Schmidt, Christine Katrin

    2017-12-01

    SAMHD1 (sterile α motif and histidine (H) aspartate (D) domain-containing protein 1) is known for its antiviral activity of hydrolysing deoxynucleotides required for virus replication. Daddacha et al. identify a hydrolase-independent, moonlighting function of SAMHD1 that facilitates homologous recombination of DNA double-strand breaks (DSBs) by promoting recruitment of C-terminal binding protein interacting protein (CTIP), a DNA-end resection factor, to damaged DNA. These findings could benefit anticancer treatment. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

  4. Structure solution of DNA-binding proteins and complexes with ARCIMBOLDO libraries

    Energy Technology Data Exchange (ETDEWEB)

    Pröpper, Kevin [University of Göttingen, (Germany); Instituto de Biologia Molecular de Barcelona (IBMB-CSIC), (Spain); Meindl, Kathrin; Sammito, Massimo [Instituto de Biologia Molecular de Barcelona (IBMB-CSIC), (Spain); Dittrich, Birger; Sheldrick, George M. [University of Göttingen, (Germany); Pohl, Ehmke, E-mail: ehmke.pohl@durham.ac.uk [Durham University, (United Kingdom); Usón, Isabel, E-mail: ehmke.pohl@durham.ac.uk [Instituto de Biologia Molecular de Barcelona (IBMB-CSIC), (Spain); Institucio Catalana de Recerca i Estudis Avancats (ICREA), (Spain); University of Göttingen, (Germany)

    2014-06-01

    The structure solution of DNA-binding protein structures and complexes based on the combination of location of DNA-binding protein motif fragments with density modification in a multi-solution frame is described. Protein–DNA interactions play a major role in all aspects of genetic activity within an organism, such as transcription, packaging, rearrangement, replication and repair. The molecular detail of protein–DNA interactions can be best visualized through crystallography, and structures emphasizing insight into the principles of binding and base-sequence recognition are essential to understanding the subtleties of the underlying mechanisms. An increasing number of high-quality DNA-binding protein structure determinations have been witnessed despite the fact that the crystallographic particularities of nucleic acids tend to pose specific challenges to methods primarily developed for proteins. Crystallographic structure solution of protein–DNA complexes therefore remains a challenging area that is in need of optimized experimental and computational methods. The potential of the structure-solution program ARCIMBOLDO for the solution of protein–DNA complexes has therefore been assessed. The method is based on the combination of locating small, very accurate fragments using the program Phaser and density modification with the program SHELXE. Whereas for typical proteins main-chain α-helices provide the ideal, almost ubiquitous, small fragments to start searches, in the case of DNA complexes the binding motifs and DNA double helix constitute suitable search fragments. The aim of this work is to provide an effective library of search fragments as well as to determine the optimal ARCIMBOLDO strategy for the solution of this class of structures.

  5. DNA Methylation of Regulatory Regions of Imprinted Genes at Birth and Its Relation to Infant Temperament

    Directory of Open Access Journals (Sweden)

    Bernard F. Fuemmeler

    2016-01-01

    Full Text Available BACKGROUND: DNA methylation of the differentially methylated regions (DMRs) of imprinted genes is relevant to neurodevelopment. METHODS: DNA methylation status of the DMRs of nine imprinted genes in umbilical cord blood leukocytes was analyzed in relation to infant behaviors and temperament (n = 158). RESULTS: MEG3 DMR levels were positively associated with internalizing (β = 0.15, P = 0.044) and surgency (β = 0.19, P = 0.018) behaviors, after adjusting for birth weight, gender, gestational age at birth, maternal age at delivery, race/ethnicity, education level, smoking status, parity, and a history of anxiety or depression. Higher methylation levels at the intergenic MEG3-IG methylation regions were associated with surgency (β = 0.28, P = 0.0003), and PEG3 was positively related to externalizing (β = 0.20, P = 0.01) and negative affectivity (β = 0.18, P = 0.02). CONCLUSION: While the small sample size limits inference, these pilot data support gene-specific associations between epigenetic differences in regulatory regions of imprinted domains at birth and later infant temperament.

  6. A Built-In CpG Adjuvant in RSV F Protein DNA Vaccine Drives a Th1 Polarized and Enhanced Protective Immune Response

    Directory of Open Access Journals (Sweden)

    Yao Ma

    2018-01-01

    Full Text Available Human respiratory syncytial virus (RSV) is the most significant cause of acute lower respiratory infection in children. However, there is no licensed vaccine available. Here, we investigated the effect of five or 20 copies of C-Class of CpG ODN (CpG-C) motif incorporated into a plasmid DNA vaccine encoding RSV fusion (F) glycoprotein on the vaccine-induced immune response. The addition of CpG-C motif enhanced serum binding and virus-neutralizing antibody responses in BALB/c mice immunized with the DNA vaccines. Moreover, mice vaccinated with CpG-modified vaccines, especially with the higher 20 copies, showed an enhanced shift toward a Th1-biased antibody and T-cell response, a decrease in pulmonary pathology and virus replication, and a decrease in weight loss after RSV challenge. This study suggests that CpG-C motif, cloned into the backbone of DNA vaccine encoding RSV F glycoprotein, functions as a built-in adjuvant capable of improving the efficacy of DNA vaccine against RSV infection.

  7. Evolution of New cis-Regulatory Motifs Required for Cell-Specific Gene Expression in Caenorhabditis.

    Directory of Open Access Journals (Sweden)

    Michalis Barkoulas

    2016-09-01

    Full Text Available Patterning of C. elegans vulval cell fates relies on inductive signaling. In this induction event, a single cell, the gonadal anchor cell, secretes LIN-3/EGF and induces three out of six competent precursor cells to acquire a vulval fate. We previously showed that this developmental system is robust to a four-fold variation in lin-3/EGF genetic dose. Here using single-molecule FISH, we find that the mean level of expression of lin-3 in the anchor cell is remarkably conserved. No change in lin-3 expression level could be detected among C. elegans wild isolates and only a low level of change (less than 30%) in the Caenorhabditis genus and in Oscheius tipulae. In C. elegans, lin-3 expression in the anchor cell is known to require three transcription factor binding sites, specifically two E-boxes and a nuclear-hormone-receptor (NHR) binding site. Mutation of any of these three elements in C. elegans results in a dramatic decrease in lin-3 expression. Yet only a single E-box is found in the Drosophilae supergroup of Caenorhabditis species, including C. angaria, while the NHR-binding site likely only evolved at the base of the Elegans group. We find that a transgene from C. angaria bearing a single E-box is sufficient for normal expression in C. elegans. Even a short 58 bp cis-regulatory fragment from C. angaria with this single E-box is able to replace the three transcription factor binding sites at the endogenous C. elegans lin-3 locus, resulting in the wild-type expression level. Thus, regulatory evolution occurring in cis within a 58 bp lin-3 fragment results in a strict requirement for the NHR binding site and a second E-box in C. elegans. This single-cell, single-molecule, quantitative and functional evo-devo study demonstrates that conserved expression levels can hide extensive change in cis-regulatory site requirements and highlights the evolution of new cis-regulatory elements required for cell-specific gene expression.

  8. Substrate sequence selectivity of APOBEC3A implicates intra-DNA interactions.

    Science.gov (United States)

    Silvas, Tania V; Hou, Shurong; Myint, Wazo; Nalivaika, Ellen; Somasundaran, Mohan; Kelch, Brian A; Matsuo, Hiroshi; Kurt Yilmaz, Nese; Schiffer, Celia A

    2018-05-14

    The APOBEC3 (A3) family of human cytidine deaminases is renowned for providing a first line of defense against many exogenous and endogenous retroviruses. However, the ability of these proteins to deaminate deoxycytidines in ssDNA makes A3s a double-edged sword. When overexpressed, A3s can mutate endogenous genomic DNA, resulting in a variety of cancers. Although the sequence context for mutating DNA varies among A3s, the mechanism for substrate sequence specificity is not well understood. To characterize substrate specificity of A3A, a systematic approach was used to quantify the affinity for substrate as a function of sequence context, length, secondary structure, and solution pH. We identified the A3A ssDNA binding motif as (T/C)TC(A/G), which correlated with enzymatic activity. We also validated that A3A binds RNA in a sequence-specific manner. A3A bound more tightly to the substrate binding motif within a hairpin loop than in a linear oligonucleotide, suggesting A3A affinity is modulated by substrate structure. Based on these findings and previously published A3A-ssDNA co-crystal structures, we propose a new model with intra-DNA interactions for the molecular mechanism underlying A3A sequence preference. Overall, the sequence and structural preferences identified for A3A lead to a new paradigm for identifying A3A's involvement in mutation of endogenous or exogenous DNA.
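
    The motif notation (T/C)TC(A/G) reported in this abstract maps directly onto a regular-expression character class, which gives a quick way to locate candidate sites in a sequence. The sketch below is only an illustration of reading the motif; the function name `find_a3a_motif_sites` and the example sequence are hypothetical, and a sequence match says nothing about the structural context (such as hairpin loops) that the abstract reports as affecting affinity.

```python
import re

# (T/C)TC(A/G): T or C, then T, then C, then A or G.
# The lookahead lets the scan advance one base at a time.
A3A_MOTIF = re.compile(r"(?=([TC]TC[AG]))")


def find_a3a_motif_sites(seq):
    """Return (0-based start, matched 4-mer) for every motif occurrence."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in A3A_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence with two candidate sites.
    for start, site in find_a3a_motif_sites("AATTCAGGCTCG"):
        print(start, site)
```

    For double-stranded input one would also scan the reverse complement, since the motif is defined on the single strand that is bound.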

  9. Structure-aided prediction of mammalian transcription factor complexes in conserved non-coding elements

    KAUST Repository

    Guturu, H.

    2013-11-11

    Mapping the DNA-binding preferences of transcription factor (TF) complexes is critical for deciphering the functions of cis-regulatory elements. Here, we developed a computational method that compares co-occurring motif spacings in conserved versus unconserved regions of the human genome to detect evolutionarily constrained binding sites of rigid TF complexes. Structural data were used to estimate TF complex physical plausibility, explore overlapping motif arrangements seldom tackled by non-structure-aware methods, and generate and analyse three-dimensional models of the predicted complexes bound to DNA. Using this approach, we predicted 422 physically realistic TF complex motifs at 18% false discovery rate, the majority of which (326, 77%) contain some sequence overlap between binding sites. The set of mostly novel complexes is enriched in known composite motifs, predictive of binding site configurations in TF-TF-DNA crystal structures, and supported by ChIP-seq datasets. Structural modelling revealed three cooperativity mechanisms: direct protein-protein interactions, potentially indirect interactions and 'through-DNA' interactions. Indeed, 38% of the predicted complexes were found to contain four or more bases in which TF pairs appear to synergize through overlapping binding to the same DNA base pairs in opposite grooves or strands. Our TF complex and associated binding site predictions are available as a web resource at http://bejerano.stanford.edu/complex.

  10. Structure-aided prediction of mammalian transcription factor complexes in conserved non-coding elements

    KAUST Repository

    Guturu, H.; Doxey, A. C.; Wenger, A. M.; Bejerano, G.

    2013-01-01

    Mapping the DNA-binding preferences of transcription factor (TF) complexes is critical for deciphering the functions of cis-regulatory elements. Here, we developed a computational method that compares co-occurring motif spacings in conserved versus unconserved regions of the human genome to detect evolutionarily constrained binding sites of rigid TF complexes. Structural data were used to estimate TF complex physical plausibility, explore overlapping motif arrangements seldom tackled by non-structure-aware methods, and generate and analyse three-dimensional models of the predicted complexes bound to DNA. Using this approach, we predicted 422 physically realistic TF complex motifs at 18% false discovery rate, the majority of which (326, 77%) contain some sequence overlap between binding sites. The set of mostly novel complexes is enriched in known composite motifs, predictive of binding site configurations in TF-TF-DNA crystal structures, and supported by ChIP-seq datasets. Structural modelling revealed three cooperativity mechanisms: direct protein-protein interactions, potentially indirect interactions and 'through-DNA' interactions. Indeed, 38% of the predicted complexes were found to contain four or more bases in which TF pairs appear to synergize through overlapping binding to the same DNA base pairs in opposite grooves or strands. Our TF complex and associated binding site predictions are available as a web resource at http://bejerano.stanford.edu/complex.

  11. In silico analysis of cis-acting regulatory elements in 5' regulatory regions of sucrose transporter gene families in rice (Oryza sativa Japonica) and Arabidopsis thaliana.

    Science.gov (United States)

    Ibraheem, Omodele; Botha, Christiaan E J; Bradley, Graeme

    2010-12-01

    The regulation of gene expression involves a multifarious regulatory system. Each gene contains a unique combination of cis-acting regulatory sequence elements in the 5' regulatory region that determines its temporal and spatial expression. Cis-acting regulatory elements are essential transcriptional gene regulatory units; they control many biological processes and stress responses. Thus a full understanding of the transcriptional gene regulation system will depend on successful functional analyses of cis-acting elements. Cis-acting regulatory elements present within the 5' regulatory region of the sucrose transporter gene families in rice (Oryza sativa Japonica cultivar-group) and Arabidopsis thaliana were identified using a bioinformatics approach. The possible cis-acting regulatory elements were predicted by scanning 1.5 kbp of the 5' regulatory regions upstream of the sucrose transporter genes' translational start sites, using the PlantCARE, PLACE and Genomatix MatInspector Professional databases. Several cis-acting regulatory elements that are associated with plant development, plant hormonal regulation and stress response were identified, and were present in varying frequencies within the 1.5 kbp of 5' regulatory region, including A-box, RY, CAT, Pyrimidine-box, Sucrose-box, ABRE, ARF, ERE, GARE, Me-JA, ARE, DRE, GA-motif, GATA, GT-1, MYC, MYB, W-box, and I-box. This result reveals the probable cis-acting regulatory elements that are possibly involved in the expression and regulation of sucrose transporter gene families in rice and Arabidopsis thaliana during cellular development or environmental stress conditions. Copyright © 2010 Elsevier Ltd. All rights reserved.

  12. A conserved motif in the linker domain of STAT1 transcription factor is required for both recognition and release from high-affinity DNA-binding sites.

    Science.gov (United States)

    Hüntelmann, Bettina; Staab, Julia; Herrmann-Lingen, Christoph; Meyer, Thomas

    2014-01-01

    Binding to specific palindromic sequences termed gamma-activated sites (GAS) is a hallmark of gene activation by members of the STAT (signal transducer and activator of transcription) family of cytokine-inducible transcription factors. However, the precise molecular mechanisms involved in the signal-dependent finding of target genes by STAT dimers have not yet been very well studied. In this study, we have characterized a sequence motif in the STAT1 linker domain which is highly conserved among the seven human STAT proteins and includes surface-exposed residues in close proximity to the bound DNA. Using site-directed mutagenesis, we have demonstrated that a lysine residue in position 567 of the full-length molecule is required for GAS recognition. The substitution of alanine for this residue completely abolished both binding to high-affinity GAS elements and transcriptional activation of endogenous target genes in cells stimulated with interferon-γ (IFNγ), while the time course of transient nuclear accumulation and tyrosine phosphorylation were virtually unchanged. In contrast, two glutamic acid residues (E559 and E563) on each monomer are important for the dissociation of dimeric STAT1 from DNA and, when mutated to alanine, result in elevated levels of tyrosine-phosphorylated STAT1 as well as prolonged IFNγ-stimulated nuclear accumulation. In conclusion, our data indicate that the kinetics of signal-dependent GAS binding is determined by an array of glutamic acid residues located at the interior surface of the STAT1 dimer. These negatively charged residues appear to align the long axis of the STAT1 dimer in a position perpendicular to the DNA, thereby facilitating the interaction between lysine 567 and the phosphodiester backbone of a bound GAS element, which is a prerequisite for transient gene induction.

  13. Frequent non-reciprocal exchange in microsatellite-containing-DNA-regions of vertebrates

    DEFF Research Database (Denmark)

    Ziegler, J.O.; Wälther, M.; Linzer, T.R.

    2009-01-01

    Microsatellites are DNA-fragments containing short repetitive motifs with 2-10 bp. They are highly variable in most species and distributed throughout the whole genome. It is broadly accepted that their high degree of variability is closely associated with mispairing of DNA-strands during...... on stepwise mutation models should be interpreted with caution if no detailed information on the allelic variation of microsatellites is available....

  14. High resolution optical DNA mapping

    Science.gov (United States)

    Baday, Murat

    Many types of diseases, including cancer and autism, are associated with copy-number variations in the genome. Most of these variations could not be identified with existing sequencing and optical DNA mapping methods. We have developed a multi-color super-resolution technique, with potential for high throughput and low cost, which can allow us to recognize more of these variations. Our technique has made a 10-fold improvement in the resolution of optical DNA mapping. Using a 180 kb BAC clone as a model system, we resolved dense patterns from 108 fluorescent labels of two different colors representing two different sequence motifs. Overall, a detailed DNA map with 100 bp resolution was achieved, which has the potential to reveal detailed information about genetic variance and to facilitate medical diagnosis of genetic disease.

  15. MIR@NT@N: a framework integrating transcription factors, microRNAs and their targets to identify sub-network motifs in a meta-regulation network model

    Directory of Open Access Journals (Sweden)

    Wasserman Wyeth W

    2011-03-01

    Background: To understand biological processes and diseases, it is crucial to unravel the concerted interplay of transcription factors (TFs), microRNAs (miRNAs) and their targets within regulatory networks and fundamental sub-networks. An integrative computational resource generating a comprehensive view of these regulatory molecular interactions at a genome-wide scale would be of great interest to biologists, but is not available to date. Results: To identify and analyze molecular interaction networks, we developed MIR@NT@N, an integrative approach based on a meta-regulation network model and a large-scale database. MIR@NT@N uses a graph-based approach to predict novel molecular actors across multiple regulatory processes (i.e. TFs acting on protein-coding or miRNA genes, or miRNAs acting on messenger RNAs). Exploiting these predictions, the user can generate networks and further analyze them to identify sub-networks, including motifs such as feedback and feedforward loops (FBL and FFL). In addition, networks can be built from lists of molecular actors with an a priori role in a given biological process to predict novel and unanticipated interactions. Analyses can be contextualized and filtered by integrating additional information such as microarray expression data. All results, including generated graphs, can be visualized, saved and exported into various formats. The performance of MIR@NT@N has been evaluated using published data and then applied to the regulatory program underlying epithelium to mesenchyme transition (EMT), an evolutionarily conserved process which is implicated in embryonic development and disease. Conclusions: MIR@NT@N is an effective computational approach to identify novel molecular regulations and to predict gene regulatory networks and sub-networks including conserved motifs within a given biological context. Taking advantage of the M@IA environment, MIR@NT@N is a user-friendly web resource freely available at http

  16. Automatic annotation of protein motif function with Gene Ontology terms

    Directory of Open Access Journals (Sweden)

    Gopalakrishnan Vanathi

    2004-09-01

    Background: Conserved protein sequence motifs are short stretches of amino acid sequence patterns that potentially encode the function of proteins. Several sequence pattern searching algorithms and programs exist for identifying candidate protein motifs at the whole genome level. However, a much needed and important task is to determine the functions of the newly identified protein motifs. The Gene Ontology (GO) project is an endeavor to annotate the function of genes or protein sequences with terms from a dynamic, controlled vocabulary and these annotations serve well as a knowledge base. Results: This paper presents methods to mine the GO knowledge base and use the association between the GO terms assigned to a sequence and the motifs matched by the same sequence as evidence for predicting the functions of novel protein motifs automatically. The task of assigning GO terms to protein motifs is viewed as both a binary classification and information retrieval problem, where PROSITE motifs are used as samples for model training and functional prediction. The mutual information of a motif and a GO term association is found to be a very useful feature. We take advantage of the known motifs to train a logistic regression classifier, which allows us to combine mutual information with other frequency-based features and obtain a probability of correct association. The trained logistic regression model has intuitively meaningful and logically plausible parameter values, and performs very well empirically according to our evaluation criteria. Conclusions: In this research, different methods for automatic annotation of protein motifs have been investigated. Empirical results demonstrated that the methods have a great potential for detecting and augmenting information about the functions of newly discovered candidate protein motifs.
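    The mutual-information feature mentioned in this abstract can be illustrated with a minimal sketch. It assumes the association between a motif and a GO term is summarized as a 2x2 table of sequence counts; the function name and the count layout are hypothetical and are not taken from the paper.

```python
import math


def mutual_information(n11, n10, n01, n00):
    """Mutual information (in bits) between two binary events.

    Assumed (hypothetical) count layout over a set of sequences:
      n11: matches the motif and is annotated with the GO term
      n10: matches the motif only
      n01: annotated with the GO term only
      n00: neither
    """
    total = n11 + n10 + n01 + n00
    if total == 0:
        raise ValueError("empty contingency table")
    cells = {(1, 1): n11, (1, 0): n10, (0, 1): n01, (0, 0): n00}
    p_motif = {1: (n11 + n10) / total, 0: (n01 + n00) / total}
    p_term = {1: (n11 + n01) / total, 0: (n10 + n00) / total}
    mi = 0.0
    for (m, t), count in cells.items():
        if count == 0:
            continue  # a zero cell contributes nothing to the sum
        p_joint = count / total
        mi += p_joint * math.log2(p_joint / (p_motif[m] * p_term[t]))
    return mi
```

    A score near zero means the motif and the term co-occur about as often as chance predicts; in the abstract this kind of score is one input to a logistic regression classifier rather than a decision rule on its own.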

  17. Verification of the MOTIF code version 3.0

    International Nuclear Information System (INIS)

    Chan, T.; Guvanasen, V.; Nakka, B.W.; Reid, J.A.K.; Scheier, N.W.; Stanchell, F.W.

    1996-12-01

    As part of the Canadian Nuclear Fuel Waste Management Program (CNFWMP), AECL has developed a three-dimensional finite-element code, MOTIF (Model Of Transport In Fractured/porous media), for detailed modelling of groundwater flow, heat transport and solute transport in a fractured rock mass. The code solves the transient and steady-state equations of groundwater flow, solute (including one-species radionuclide) transport, and heat transport in variably saturated fractured/porous media. The initial development was completed in 1985 (Guvanasen 1985) and version 3.0 was completed in 1986. This version is documented in detail in Guvanasen and Chan (in preparation). This report describes a series of fourteen verification cases which has been used to test the numerical solution techniques and coding of MOTIF, as well as demonstrate some of the MOTIF analysis capabilities. For each case the MOTIF solution has been compared with a corresponding analytical or independently developed alternate numerical solution. Several of the verification cases were included in Level 1 of the International Hydrologic Code Intercomparison Project (HYDROCOIN). The MOTIF results for these cases were also described in the HYDROCOIN Secretariat's compilation and comparison of results submitted by the various project teams (Swedish Nuclear Power Inspectorate 1988). It is evident from the graphical comparisons presented that the MOTIF solutions for the fourteen verification cases are generally in excellent agreement with known analytical or numerical solutions obtained from independent sources. This series of verification studies has established the ability of the MOTIF finite-element code to accurately model the groundwater flow and solute and heat transport phenomena for which it is intended. (author). 20 refs., 14 tabs., 32 figs

  18. A Survey of 6,300 Genomic Fragments for cis-Regulatory Activity in the Imaginal Discs of Drosophila melanogaster

    Directory of Open Access Journals (Sweden)

    Aurélie Jory

    2012-10-01

    Over 6,000 fragments from the genome of Drosophila melanogaster were analyzed for their ability to drive expression of GAL4 reporter genes in the third-instar larval imaginal discs. About 1,200 reporter genes drove expression in the eye, antenna, leg, wing, haltere, or genital imaginal discs. The patterns ranged from large regions to individual cells. About 75% of the active fragments drove expression in multiple discs; 20% were expressed in ventral, but not dorsal, discs (legs, genital, and antenna), whereas ∼23% were expressed in dorsal but not ventral discs (wing, haltere, and eye). Several patterns, for example, within the leg chordotonal organ, appeared a surprisingly large number of times. Unbiased searches for DNA sequence motifs suggest candidate transcription factors that may regulate enhancers with shared activities. Together, these expression patterns provide a valuable resource to the community and offer a broad overview of how transcriptional regulatory information is distributed in the Drosophila genome.

  19. Purification and functional motifs of the recombinant ATPase of orf virus.

    Science.gov (United States)

    Lin, Fong-Yuan; Chan, Kun-Wei; Wang, Chi-Young; Wong, Min-Liang; Hsu, Wei-Li

    2011-10-01

    Our previous study showed that the recombinant ATPase encoded by the A32L gene of orf virus displayed ATP hydrolysis activity as predicted from its amino acid sequence. This viral ATPase contains four known functional motifs (motifs I-IV) and a novel AYDG motif; they are essential for the ATP hydrolysis reaction by binding ATP and magnesium ions. Motifs I and II correspond to the Walker A and B motifs of the typical ATPase, respectively. To examine the biochemical roles of these five conserved motifs, recombinant ATPases of five deletion mutants derived from the Taiping strain were expressed and purified. Their ATPase functions were assayed and compared with those of two wild type strains, Taiping and Nantou, isolated in Taiwan. Our results showed that deletions at motifs I-III or IV exhibited lower activity than that of the wild type. Interestingly, deletion of the AYDG motif decreased the ATPase activity more significantly than deletions of motifs I-IV. Divalent ions such as magnesium and calcium were essential for ATPase activity. Moreover, our recombinant proteins of orf virus also demonstrated GTPase activity, though weaker than the original ATPase activity. Copyright © 2011 Elsevier Inc. All rights reserved.

  20. Hybrids of the bHLH and bZIP protein motifs display different DNA-binding activities in vivo vs. in vitro.

    Directory of Open Access Journals (Sweden)

    Hiu-Kwan Chow

    Minimalist hybrids comprising the DNA-binding domain of bHLH/PAS (basic-helix-loop-helix/Per-Arnt-Sim) protein Arnt fused to the leucine zipper (LZ) dimerization domain from bZIP (basic region-leucine zipper) protein C/EBP were designed to bind the E-box DNA site, CACGTG, targeted by bHLHZ (basic-helix-loop-helix-zipper) proteins Myc and Max, as well as the Arnt homodimer. The bHLHZ-like structure of ArntbHLH-C/EBP comprises the Arnt bHLH domain fused to the C/EBP LZ: i.e. swap of the 330 aa PAS domain for the 29 aa LZ. In the yeast one-hybrid assay (Y1H), transcriptional activation from the E-box was strong by ArntbHLH-C/EBP, and undetectable for the truncated ArntbHLH (PAS removed), as detected via readout from the HIS3 and lacZ reporters. In contrast, fluorescence anisotropy titrations showed affinities for the E-box with ArntbHLH-C/EBP and ArntbHLH comparable to other transcription factors (K(d) 148.9 nM and 40.2 nM, respectively), but only under select conditions that maintained folded protein. Although in vivo yeast results and in vitro spectroscopic studies for ArntbHLH-C/EBP targeting the E-box correlate well, the same does not hold for ArntbHLH. As circular dichroism confirms that ArntbHLH-C/EBP is a much more strongly alpha-helical structure than ArntbHLH, we conclude that the nonfunctional ArntbHLH in the Y1H must be due to misfolding, leading to the false negative that this protein is incapable of targeting the E-box. Many experiments, including protein design and selections from large libraries, depend on protein domains remaining well-behaved in the nonnative experimental environment, especially small motifs like the bHLH (60-70 aa). Interestingly, a short helical LZ can serve as a folding- and/or solubility-enhancing tag, an important device given the focus of current research on exploration of vast networks of biomolecular interactions.

  1. A type III-B CRISPR-Cas effector complex mediating massive target DNA destruction.

    Science.gov (United States)

    Han, Wenyuan; Li, Yingjun; Deng, Ling; Feng, Mingxia; Peng, Wenfang; Hallstrøm, Søren; Zhang, Jing; Peng, Nan; Liang, Yun Xiang; White, Malcolm F; She, Qunxin

    2017-02-28

    The CRISPR (clustered regularly interspaced short palindromic repeats) system protects archaea and bacteria by eliminating nucleic acid invaders in a crRNA-guided manner. The Sulfolobus islandicus type III-B Cmr-α system targets invading nucleic acid at both RNA and DNA levels and DNA targeting relies on the directional transcription of the protospacer in vivo. To gain further insight into the involved mechanism, we purified a native effector complex of III-B Cmr-α from S. islandicus and characterized it in vitro. Cmr-α cleaved RNAs complementary to crRNA present in the complex and its ssDNA destruction activity was activated by target RNA. The ssDNA cleavage required mismatches between the 5′-tag of crRNA and the 3′-flanking region of target RNA. An invader plasmid assay showed that mutation either in the histidine-aspartate acid (HD) domain (a quadruple mutation) or in the GGDD motif of the Cmr-2α protein resulted in attenuation of the DNA interference in vivo. However, double mutation of the HD motif only abolished the DNase activity in vitro. Furthermore, the activated Cmr-α binary complex functioned as a highly active DNase to destroy a large excess DNA substrate, which could provide a powerful means to rapidly degrade replicating viral DNA. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. Site-specific DNA Inversion by Serine Recombinases

    Science.gov (United States)

    2015-01-01

    Reversible site-specific DNA inversion reactions are widely distributed in bacteria and their viruses. They control a range of biological reactions that most often involve alterations of molecules on the surface of cells or phage. These programmed DNA rearrangements usually occur at a low frequency, thereby preadapting a small subset of the population to a change in environmental conditions, or in the case of phages, an expanded host range. A dedicated recombinase, sometimes with the aid of additional regulatory or DNA architectural proteins, catalyzes the inversion of DNA. RecA or other components of the general recombination-repair machinery are not involved. This chapter discusses site-specific DNA inversion reactions mediated by the serine recombinase family of enzymes and focuses on the extensively studied serine DNA invertases that are stringently controlled by the Fis-bound enhancer regulatory system. The first section summarizes biological features and general properties of inversion reactions by the Fis/enhancer-dependent serine invertases and the recently described serine DNA invertases in Bacteroides. Mechanistic studies of reactions catalyzed by the Hin and Gin invertases are then discussed in more depth, particularly with regards to recent advances in our understanding of the function of the Fis/enhancer regulatory system, the assembly of the active recombination complex (invertasome) containing the Fis/enhancer, and the process of DNA strand exchange by rotation of synapsed subunit pairs within the invertasome. The role of DNA topological forces that function in concert with the Fis/enhancer controlling element in specifying the overwhelming bias for DNA inversion over deletion and intermolecular recombination is emphasized. PMID:25844275

  3. Ubiquinol decreases monocytic expression and DNA methylation of the pro-inflammatory chemokine ligand 2 gene in humans

    Directory of Open Access Journals (Sweden)

    Fischer Alexandra

    2012-10-01

    Background: Coenzyme Q10 is an essential cofactor in the respiratory chain and serves in its reduced form, ubiquinol, as a potent antioxidant. Studies in vitro and in vivo provide evidence that ubiquinol reduces inflammatory processes via gene expression. Here we investigate the putative link between expression and DNA methylation of ubiquinol-sensitive genes in monocytes obtained from human volunteers supplemented with 150 mg/day ubiquinol for 14 days. Findings: Ubiquinol decreases the expression of the pro-inflammatory chemokine (C-X-C motif) ligand 2 gene (CXCL2) more than 10-fold. Bisulfite-/MALDI-TOF-based analysis of regulatory regions of the CXCL2 gene identified six adjacent CpG islands which showed a 3.4-fold decrease of methylation status after ubiquinol supplementation. This effect seems to be rather gene specific, because ubiquinol reduced the expression of two other pro-inflammatory genes (PMAIP1, MMD) without changing the methylation pattern of the respective gene. Conclusion: Ubiquinol decreases monocytic expression and DNA methylation of the pro-inflammatory CXCL2 gene in humans. Current Controlled Trials ISRCTN26780329.

  4. An experimental test of a fundamental food web motif.

    Science.gov (United States)

    Rip, Jason M K; McCann, Kevin S; Lynn, Denis H; Fawcett, Sonia

    2010-06-07

    Large-scale changes to the world's ecosystem are resulting in the deterioration of biostructure-the complex web of species interactions that make up ecological communities. A difficult, yet crucial task is to identify food web structures, or food web motifs, that are the building blocks of this baroque network of interactions. Once identified, these food web motifs can then be examined through experiments and theory to provide mechanistic explanations for how structure governs ecosystem stability. Here, we synthesize recent ecological research to show that generalist consumers coupling resources with different interaction strengths, is one such motif. This motif amazingly occurs across an enormous range of spatial scales, and so acts to distribute coupled weak and strong interactions throughout food webs. We then perform an experiment that illustrates the importance of this motif to ecological stability. We find that weak interactions coupled to strong interactions by generalist consumers dampen strong interaction strengths and increase community stability. This study takes a critical step by isolating a common food web motif and through clear, experimental manipulation, identifies the fundamental stabilizing consequences of this structure for ecological communities.

  5. When gene medication is also genetic modification--regulating DNA treatment.

    Science.gov (United States)

    Foss, Grethe S; Rogne, Sissel

    2007-07-26

    The molecular methods used in DNA vaccination and gene therapy resemble in many ways the methods applied in genetic modification of organisms. In some regulatory regimes, this creates an overlap between 'gene medication' and genetic modification. In Norway, an animal injected with plasmid DNA, in the form of DNA vaccine or gene therapy, currently is viewed as being genetically modified for as long as the added DNA is present in the animal. However, regulating a DNA-vaccinated animal as genetically modified creates both regulatory and practical challenges. It is also counter-intuitive to many biologists. Since immune responses can be elicited also to alter traits, the borderline between vaccination and the modification of properties is no longer distinct. In this paper, we discuss the background for the Norwegian interpretation and ways in which the regulatory challenge can be handled.

  6. Highly scalable Ab initio genomic motif identification

    KAUST Repository

    Marchand, Benoit; Bajic, Vladimir B.; Kaushik, Dinesh

    2011-01-01

    We present results of scaling an ab initio motif family identification system, Dragon Motif Finder (DMF), to 65,536 processor cores of IBM Blue Gene/P. DMF seeks groups of mutually similar polynucleotide patterns within a set of genomic sequences and builds various motif families from them. Such information is of relevance to many problems in life sciences. Prior attempts to scale such ab initio motif-finding algorithms achieved limited success. We solve the scalability issues using a combination of mixed-mode MPI-OpenMP parallel programming, master-slave work assignment, multi-level workload distribution, multi-level MPI collectives, and serial optimizations. While the scalability of our algorithm was excellent (94% parallel efficiency on 65,536 cores relative to 256 cores on a modest-size problem), the final speedup with respect to the original serial code exceeded 250,000 when serial optimizations are included. This enabled us to carry out many large-scale ab initio motif-finding simulations in a few hours while the original serial code would have needed decades of execution time. Copyright 2011 ACM.
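    To make the "94% parallel efficiency on 65,536 cores relative to 256 cores" figure concrete, the sketch below uses the usual definition of relative parallel efficiency. The function names are illustrative, and no timings are taken from the paper; only the core counts and the 94% figure come from the abstract.

```python
def relative_efficiency(t_base, p_base, t_scaled, p_scaled):
    """Parallel efficiency of a scaled run relative to a baseline run.

    t_base, t_scaled: wall-clock times of the same problem on
    p_base and p_scaled cores (strong scaling). A value of 1.0
    means perfect scaling.
    """
    return (t_base * p_base) / (t_scaled * p_scaled)


def relative_speedup(efficiency, p_base, p_scaled):
    """Speedup over the baseline run implied by a given efficiency."""
    return efficiency * p_scaled / p_base


# Core counts and efficiency as quoted in the abstract.
speedup_vs_256 = relative_speedup(0.94, 256, 65_536)
print(f"implied speedup over the 256-core run: {speedup_vs_256:.2f}x")
```

    Under this definition, a 256-fold increase in cores at 94% efficiency corresponds to roughly a 240-fold reduction in run time relative to the 256-core run. The speedup of more than 250,000 quoted in the abstract is measured against the original serial code and also includes serial optimizations, so the two numbers are not directly comparable.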

  7. Mechanisms of zero-lag synchronization in cortical motifs.

    Directory of Open Access Journals (Sweden)

    Leonardo L Gollo

    2014-04-01

    Zero-lag synchronization between distant cortical areas has been observed in a diversity of experimental data sets and between many different regions of the brain. Several computational mechanisms have been proposed to account for such isochronous synchronization in the presence of long conduction delays. Of these, the phenomenon of "dynamical relaying", a mechanism that relies on a specific network motif, has proven to be the most robust with respect to parameter mismatch and system noise. Surprisingly, despite a contrary belief in the community, the common driving motif is an unreliable means of establishing zero-lag synchrony. Although dynamical relaying has been validated in empirical and computational studies, an understanding of the deeper dynamical mechanisms and a comparison to dynamics on other motifs are lacking. By systematically comparing synchronization on a variety of small motifs, we establish that the presence of a single reciprocally connected pair, a "resonance pair", plays a crucial role in disambiguating those motifs that foster zero-lag synchrony in the presence of conduction delays (such as dynamical relaying) from those that do not (such as the common driving triad). Remarkably, minor structural changes to the common driving motif that incorporate a reciprocal pair recover robust zero-lag synchrony. The findings are observed in computational models of spiking neurons, populations of spiking neurons and neural mass models, and arise whether the oscillatory systems are periodic, chaotic, noise-free or driven by stochastic inputs. The influence of the resonance pair is also robust to parameter mismatch and asymmetrical time delays amongst the elements of the motif. We call this manner of facilitating zero-lag synchrony resonance-induced synchronization, outline the conditions for its occurrence, and propose that it may be a general mechanism to promote zero-lag synchrony in the brain.

  8. Prenatal exposure of mice to diethylstilbestrol disrupts T-cell differentiation by regulating Fas/Fas ligand expression through estrogen receptor element and nuclear factor-κB motifs.

    Science.gov (United States)

    Singh, Narendra P; Singh, Udai P; Nagarkatti, Prakash S; Nagarkatti, Mitzi

    2012-11-01

    Prenatal exposure to diethylstilbestrol (DES) is known to cause altered immune functions and increased susceptibility to autoimmune disease in humans. In the current study, we investigated the effect of prenatal exposure to DES on thymocyte differentiation involving apoptotic pathways. Prenatal DES exposure caused thymic atrophy, apoptosis, and up-regulation of Fas and Fas ligand (FasL) expression in thymocytes. To examine the mechanism underlying DES-mediated regulation of Fas and FasL, we performed luciferase assays using T cells transfected with luciferase reporter constructs containing full-length Fas or FasL promoters. There was significant luciferase induction in the presence of Fas or FasL promoters after DES exposure. Further analysis demonstrated the presence of several cis-regulatory motifs on both Fas and FasL promoters. When DES-induced transcription factors were analyzed, estrogen receptor element (ERE), nuclear factor κB (NF-κB), nuclear factor of activated T cells (NF-AT), and activator protein-1 motifs on the Fas promoter, as well as ERE, NF-κB, and NF-AT motifs on the FasL promoter, showed binding affinity with the transcription factors. Electrophoretic mobility-shift assays were performed to verify the binding affinity of cis-regulatory motifs of Fas or FasL promoters with transcription factors. There was shift in mobility of probes (ERE or NF-κB2) of both Fas and FasL in the presence of nuclear proteins from DES-treated cells, and the shift was specific to DES because these probes failed to shift their mobility in the presence of nuclear proteins from vehicle-treated cells. Together, the current study demonstrates that prenatal exposure to DES triggers significant alterations in apoptotic molecules expressed on thymocytes, which may affect T-cell differentiation and cause long-term effects on the immune functions.

  9. Expression, purification and characterization of hepatitis B virus X protein BH3-like motif-linker-Bcl-xL fusion protein for structural studies

    Directory of Open Access Journals (Sweden)

    Hideki Kusunoki

    2017-03-01

    Hepatitis B virus X protein (HBx) is a multifunctional protein that interacts directly with many host proteins. For example, HBx interacts with anti-apoptotic proteins, Bcl-2 and Bcl-xL, through its BH3-like motif, which leads to elevated cytosolic calcium levels, efficient viral DNA replication and the induction of apoptosis. To facilitate sample preparation and perform detailed structural characterization of the complex between HBx and Bcl-xL, we designed and purified a recombinant HBx BH3-like motif-linker-Bcl-xL fusion protein produced in E. coli. The fusion protein was characterized by size exclusion chromatography, circular dichroism and nuclear magnetic resonance experiments. Our results show that the fusion protein is a monomer in aqueous solution, forms a stable intramolecular complex, and likely retains the native conformation of the complex between Bcl-xL and the HBx BH3-like motif. Furthermore, the HBx BH3-like motif of the intramolecular complex forms an α-helix. These observations indicate that the fusion protein should facilitate structural studies aimed at understanding the interaction between HBx and Bcl-xL at the atomic level.

  10. Transduction motif analysis of gastric cancer based on a human signaling network

    Energy Technology Data Exchange (ETDEWEB)

    Liu, G.; Li, D.Z.; Jiang, C.S.; Wang, W. [Department of Gastroenterology, Fuzhou General Hospital of Nanjing Command, Fuzhou (China)]

    2014-04-04

    To investigate signal regulation models of gastric cancer, databases and literature were used to construct the signaling network in humans. Topological characteristics of the network were analyzed by CytoScape. After marking gastric cancer-related genes extracted from the CancerResource, GeneRIF, and COSMIC databases, the FANMOD software was used for the mining of gastric cancer-related motifs in a network with three vertices. The significant motif difference method was adopted to identify significantly different motifs in the normal and cancer states. Finally, we conducted a series of analyses of the significantly different motifs, including gene ontology, function annotation of genes, and model classification. A human signaling network was constructed, with 1643 nodes and 5089 regulating interactions. The network was configured to have the characteristics of other biological networks. There were 57,942 motifs marked with gastric cancer-related genes out of a total of 69,492 motifs, and 264 motifs were selected as significantly different motifs by calculating the significant motif difference (SMD) scores. Genes in significantly different motifs were mainly enriched in functions associated with cancer genesis, such as regulation of cell death, amino acid phosphorylation of proteins, and intracellular signaling cascades. The top five significantly different motifs were mainly cascade and positive feedback types. Almost all genes in the five motifs were cancer related, including EPOR, MAPK14, BCL2L1, KRT18, PTPN6, CASP3, TGFBR2, AR, and CASP7. The development of cancer might be curbed by inhibiting signal transductions upstream and downstream of the selected motifs.

  11. Armadillo motifs involved in vesicular transport.

    Directory of Open Access Journals (Sweden)

    Harald Striegl

    Armadillo (ARM) repeat proteins function in various cellular processes including vesicular transport and membrane tethering. They contain an imperfect repeating sequence motif that forms a conserved three-dimensional structure. Recently, structural and functional insight into tethering mediated by the ARM-repeat protein p115 has been provided. Here we describe the p115 ARM-motifs for reasons of clarity and nomenclature and show that both sequence and structure are highly conserved among ARM-repeat proteins. We argue that there is no need to invoke repeat types other than ARM repeats for a proper description of the structure of the p115 globular head region. Additionally, we propose to define a new subfamily of ARM-like proteins and show lack of evidence that the ARM motifs found in p115 are present in other long coiled-coil tethering factors of the golgin family.

  12. A tandem sequence motif acts as a distance-dependent enhancer in a set of genes involved in translation by binding the proteins NonO and SFPQ

    Directory of Open Access Journals (Sweden)

    Roepcke Stefan

    2011-12-01

    Background: Bioinformatic analyses of expression control sequences in promoters of co-expressed or functionally related genes enable the discovery of common regulatory sequence motifs that might be involved in co-ordinated gene expression. By studying promoter sequences of the human ribosomal protein genes we recently identified a novel highly specific Localized Tandem Sequence Motif (LTSM). In this work we sought to identify additional genes and LTSM-binding proteins to elucidate potential regulatory mechanisms. Results: Genome-wide analyses identified a considerable number of additional LTSM-positive genes, the products of which are involved in translation, among them translation initiation and elongation factors, and 5S rRNA. Electromobility shift assays then showed specific signals demonstrating the binding of protein complexes to LTSM in ribosomal protein gene promoters. Pull-down assays with LTSM-containing oligonucleotides and subsequent mass spectrometric analysis identified the related multifunctional nucleotide binding proteins NonO and SFPQ in the binding complex. Functional characterization then revealed that LTSM enhances the transcriptional activity of the promoters depending on the distance from the transcription start site. Conclusions: Our data demonstrate the power of bioinformatic analyses for the identification of biologically relevant sequence motifs. LTSM and the LTSM-binding proteins NonO and SFPQ found here were discovered through a synergistic combination of bioinformatic and biochemical methods and are regulators of the expression of a set of genes of the translational apparatus in a distance-dependent manner.

  13. Characterizing Motif Dynamics of Electric Brain Activity Using Symbolic Analysis

    Directory of Open Access Journals (Sweden)

    Massimiliano Zanin

    2014-10-01

    Motifs are small recurring circuits of interactions which constitute the backbone of networked systems. Characterizing motif dynamics is therefore key to understanding the functioning of such systems. Here we propose a method to define and quantify the temporal variability and time scales of electroencephalogram (EEG) motifs of resting brain activity. Given a triplet of EEG sensors, links between them are calculated by means of linear correlation; each pattern of links (i.e., each motif) is then associated to a symbol, and its appearance frequency is analyzed by means of Shannon entropy. Our results show that each motif becomes observable with different coupling thresholds and evolves at its own time scale, with fronto-temporal sensors emerging at high thresholds and changing at fast time scales, and parietal ones at low thresholds and changing at slower rates. Finally, while motif dynamics differed across individuals, for each subject, it showed robustness across experimental conditions, indicating that it could represent an individual dynamical signature.
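    The procedure outlined in this abstract (correlation links within a sensor triplet, one symbol per link pattern, Shannon entropy of the symbol frequencies) might be sketched roughly as follows. The non-overlapping windows, the use of absolute Pearson correlation, and all function names are assumptions made for illustration, not details from the paper.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson linear correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def motif_symbols(a, b, c, window, threshold):
    """Encode each non-overlapping window of a sensor triplet as a symbol 0..7.

    A link exists when |correlation| >= threshold (an assumed rule).
    Bit 0 = link a-b, bit 1 = link a-c, bit 2 = link b-c.
    """
    symbols = []
    for start in range(0, len(a) - window + 1, window):
        s = slice(start, start + window)
        links = (
            abs(pearson(a[s], b[s])) >= threshold,
            abs(pearson(a[s], c[s])) >= threshold,
            abs(pearson(b[s], c[s])) >= threshold,
        )
        symbols.append(sum(int(bit) << i for i, bit in enumerate(links)))
    return symbols


def shannon_entropy(symbols):
    """Shannon entropy (bits) of the empirical symbol distribution."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())
```

    Calling `shannon_entropy(motif_symbols(a, b, c, window, threshold))` over a range of thresholds would give one way to look at how motif variability depends on the coupling threshold, in the spirit of the abstract.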

  14. Discriminative motif discovery via simulated evolution and random under-sampling.

    Directory of Open Access Journals (Sweden)

    Tao Song

    Conserved motifs in biological sequences are closely related to their structure and functions. Recently, discriminative motif discovery methods have attracted more and more attention. However, little attention has been devoted to the data imbalance problem, which is one of the main reasons affecting the performance of the discriminative models. In this article, a simulated evolution method is applied to solve the multi-class imbalance problem at the stage of data preprocessing, and at the stage of Hidden Markov Models (HMMs) training, a random under-sampling method is introduced for the imbalance between the positive and negative datasets. It is shown that, in the task of discovering targeting motifs of nine subcellular compartments, the motifs found by our method are more conserved than the methods without considering data imbalance problem and recover the most known targeting motifs from Minimotif Miner and InterPro. Meanwhile, we use the found motifs to predict protein subcellular localization and achieve higher prediction precision and recall for the minority classes.

  15. Discriminative motif discovery via simulated evolution and random under-sampling.

    Science.gov (United States)

    Song, Tao; Gu, Hong

    2014-01-01

    Conserved motifs in biological sequences are closely related to their structure and functions. Recently, discriminative motif discovery methods have attracted more and more attention. However, little attention has been devoted to the data imbalance problem, which is one of the main reasons affecting the performance of the discriminative models. In this article, a simulated evolution method is applied to solve the multi-class imbalance problem at the stage of data preprocessing, and at the stage of Hidden Markov Models (HMMs) training, a random under-sampling method is introduced for the imbalance between the positive and negative datasets. It is shown that, in the task of discovering targeting motifs of nine subcellular compartments, the motifs found by our method are more conserved than the methods without considering data imbalance problem and recover the most known targeting motifs from Minimotif Miner and InterPro. Meanwhile, we use the found motifs to predict protein subcellular localization and achieve higher prediction precision and recall for the minority classes.
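
    Random under-sampling, as mentioned in this abstract, can be sketched generically as follows. This is the textbook form of the technique, not the authors' procedure: the abstract does not state the sampling ratio or how often sampling is repeated during HMM training, so the 1:1 balance and the function name `random_undersample` are assumptions.

```python
import random


def random_undersample(positives, negatives, seed=None):
    """Return equally sized random subsets of the two classes.

    Assumption: the larger class is reduced to the size of the smaller one (1:1 balance).
    """
    rng = random.Random(seed)
    k = min(len(positives), len(negatives))
    # sample() draws without replacement, so no sequence is duplicated.
    return rng.sample(positives, k), rng.sample(negatives, k)
```

    Drawing a fresh subset for each training round, for example by varying `seed`, lets every negative sequence contribute over time while each individual round stays balanced.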

  16. DNA Packaging by λ-Like Bacteriophages: Mutations Broadening the Packaging Specificity of Terminase, the λ-Packaging Enzyme

    OpenAIRE

    Feiss, Michael; Reynolds, Erin; Schrock, Morgan; Sippy, Jean

    2010-01-01

    The DNA-packaging specificities of phages λ and 21 depend on the specific DNA interactions of the small terminase subunits, which have support helix-turn-recognition helix-wing DNA-binding motifs. λ-Terminase with the recognition helix of 21 preferentially packages 21 DNA. This chimeric terminase's ability to package λDNA is reduced ∼20-fold. Phage λ with the chimeric terminase is unable to form plaques, but pseudorevertants are readily obtained. Some pseudorevertants have trans-acting suppre...

  17. The ARTT motif and a unified structural understanding of substrate recognition in ADP-ribosylating bacterial toxins and eukaryotic ADP-ribosyltransferases

    Energy Technology Data Exchange (ETDEWEB)

    Han, S.; Tainer, J.A.

    2001-08-01

    ADP-ribosylation is a widely occurring and biologically critical covalent chemical modification process in pathogenic mechanisms, intracellular signaling systems, DNA repair, and cell division. The reaction is catalyzed by ADP-ribosyltransferases, which transfer the ADP-ribose moiety of NAD to a target protein with nicotinamide release. A family of bacterial toxins and eukaryotic enzymes has been termed the mono-ADP-ribosyltransferases, in distinction to the poly-ADP-ribosyltransferases, which catalyze the addition of multiple ADP-ribose groups to the carboxyl terminus of eukaryotic nucleoproteins. Despite the limited primary sequence homology among the different ADP-ribosyltransferases, a central cleft bearing an NAD-binding pocket formed by the two perpendicular β-sheet core has been remarkably conserved between bacterial toxins and eukaryotic mono- and poly-ADP-ribosyltransferases. The majority of bacterial toxins and eukaryotic mono-ADP-ribosyltransferases are characterized by conserved His and catalytic Glu residues. In contrast, Diphtheria toxin, Pseudomonas exotoxin A, and eukaryotic poly-ADP-ribosyltransferases are characterized by conserved Arg and catalytic Glu residues. The NAD-binding core of a binary toxin and a C3-like toxin family identified an ARTT motif (ADP-ribosylating turn-turn motif) that is implicated in substrate specificity and recognition by structural and mutagenic studies. Here we apply structure-based sequence alignment and comparative structural analyses of all known structures of ADP-ribosyltransferases to suggest that this ARTT motif is functionally important in many ADP-ribosylating enzymes that bear a NAD binding cleft as characterized by conserved Arg and catalytic Glu residues. Overall, structure-based sequence analysis reveals common core structures and conserved active sites of ADP-ribosyltransferases to support similar NAD binding mechanisms but differing mechanisms of target protein binding via sequence variations within the ARTT

  18. Recombinant DNA. Rifkin's regulatory revivalism runs riot.

    Science.gov (United States)

    David, P

    Jeremy Rifkin, activist opponent of genetic engineering, has adopted tactics of litigation, persuasion, and confrontation in his campaign to halt genetic experimentation. The Recombinant DNA Advisory Committee of the National Institutes of Health has often been the target of his criticism, most recently for its failure to prepare an environmental risk assessment for some DNA tests it approved. Rifkin has won support for his position from religious organizations in the United States, and in June 1983 persuaded an ecumenical group of religious leaders to ask Congress to ban genetic experiments that would affect the human germ line.

  19. Proteomic investigations reveal a role for RNA processing factor THRAP3 in the DNA damage response

    DEFF Research Database (Denmark)

    Beli, Petra; Lukashchuk, Natalia; Wagner, Sebastian A

    2012-01-01

    /ATR/DNA-PK target consensus motif, suggesting an important role of downstream kinases in amplifying DDR signals. We show that the splicing-regulator phosphatase PPM1G is recruited to sites of DNA damage, while the splicing-associated protein THRAP3 is excluded from these regions. Moreover, THRAP3 depletion causes...

  20. Phyloproteomic Analysis of 11780 Six-Residue-Long Motifs Occurrences

    Directory of Open Access Journals (Sweden)

    O. V. Galzitskaya

    2015-01-01

    Full Text Available How is it possible to find good traits for phylogenetic reconstructions? Here, we present a new phyloproteomic criterion that is an occurrence of simple motifs which can be imprints of evolution history. We studied the occurrences of 11780 six-residue-long motifs consisting of two randomly located amino acids in 97 eukaryotic and 25 bacterial proteomes. For all eukaryotic proteomes, with the exception of the Amoebozoa, Stramenopiles, and Diplomonadida kingdoms, the number of proteins containing the motifs from the first group (one of the two amino acids occurs once at the terminal position) made up about 20%; in the case of motifs from the second (one of two amino acids occurs one time within the pattern) and third (the two amino acids occur randomly) groups, 30% and 50%, respectively. For bacterial proteomes, this relationship was 10%, 27%, and 63%, respectively. The matrices of correlation coefficients between numbers of proteins where a motif from the set of 11780 motifs appears at least once in 9 kingdoms and 5 phyla of bacteria were calculated. Among the correlation coefficients for eukaryotic proteomes, the correlation between the animal and fungi kingdoms (0.62) is higher than between fungi and plants (0.54). Our study provides support that animals and fungi are sibling kingdoms. Comparison of the frequencies of six-residue-long motifs in different proteomes allows obtaining phylogenetic relationships based on similarities between these frequencies: the Diplomonadida kingdoms are closer to Bacteria than to Eukaryota; Stramenopiles and Amoebozoa are closer to each other than to other kingdoms of Eukaryota.

  1. Interaction of Cu(+) with cytosine and formation of i-motif-like C-M(+)-C complexes: alkali versus coinage metals.

    Science.gov (United States)

    Gao, Juehan; Berden, Giel; Rodgers, M T; Oomens, Jos

    2016-03-14

    The Watson-Crick structure of DNA is among the most well-known molecular structures of our time. However, alternative base-pairing motifs are also known to occur, often depending on base sequence, pH, or the presence of cations. Pairing of cytosine (C) bases induced by the sharing of a single proton (C-H(+)-C) may give rise to the so-called i-motif, which occurs primarily in expanded trinucleotide repeats and the telomeric region of DNA, particularly at low pH. At physiological pH, silver cations were recently found to stabilize C dimers in a C-Ag(+)-C structure analogous to the hemiprotonated C-dimer. Here we use infrared ion spectroscopy in combination with density functional theory calculations at the B3LYP/6-311G+(2df,2p) level to show that copper in the 1+ oxidation state induces an analogous formation of C-Cu(+)-C structures. In contrast to protons and these transition metal ions, alkali metal ions induce a different dimer structure, where each ligand coordinates the alkali metal ion in a bidentate fashion in which the N3 and O2 atoms of both cytosine ligands coordinate to the metal ion, sacrificing hydrogen-bonding interactions between the ligands for improved chelation of the metal cation.

  2. Deciphering RNA Regulatory Elements Involved in the Developmental and Environmental Gene Regulation of Trypanosoma brucei.

    Science.gov (United States)

    Gazestani, Vahid H; Salavati, Reza

    2015-01-01

    Trypanosoma brucei is a vector-borne parasite with an intricate life cycle that can cause serious diseases in humans and animals. This pathogen relies on fine regulation of gene expression to respond and adapt to variable environments, with implications in transmission and infectivity. However, the regulatory elements involved and their mechanisms of action are largely unknown. Here, benefiting from a new graph-based approach for finding functional regulatory elements in RNA (GRAFFER), we have predicted 88 new RNA regulatory elements that are potentially involved in the gene regulatory network of T. brucei. We show that many of these newly predicted elements are responsive to both transcriptomic and proteomic changes during the life cycle of the parasite. Moreover, we found that 11 of the predicted elements strikingly resemble previously identified regulatory elements for the parasite. Additionally, comparison with previously predicted motifs on T. brucei suggested the superior performance of our approach based on the current limited knowledge of regulatory elements in T. brucei.

  3. Bi-directional routing of DNA mismatch repair protein human exonuclease 1 to replication foci and DNA double strand breaks

    DEFF Research Database (Denmark)

    Liberti, Sascha E; Andersen, Sofie Dabros; Wang, Jing

    2011-01-01

    (PIP-box) region on hEXO1 located in its COOH-terminal ((788)QIKLNELW(795)). This motif is essential for PCNA binding and co-localization during S-phase. Recruitment of hEXO1 to DNA DSB sites is dependent on the MMR protein hMLH1. We show that two distinct hMLH1 interaction regions of hEXO1 (residues...

  4. Two sequence motifs from HIF-1α bind to the DNA-binding site of p53

    OpenAIRE

    Hansson, Lars O.; Friedler, Assaf; Freund, Stefan; Rüdiger, Stefan; Fersht, Alan R.

    2002-01-01

    There is evidence that hypoxia-inducible factor-1α (HIF-1α) interacts with the tumor suppressor p53. To characterize the putative interaction, we mapped the binding of the core domain of p53 (p53c) to an array of immobilized HIF-1α-derived peptides and found two peptide-sequence motifs that bound to p53c with micromolar affinity in solution. One sequence was adjacent to and the other coincided with the two proline residues of the oxygen-dependent degradation domain (P402 and P564) that act as...

  5. Methods and statistics for combining motif match scores.

    Science.gov (United States)

    Bailey, T L; Gribskov, M

    1998-01-01

    Position-specific scoring matrices are useful for representing and searching for protein sequence motifs. A sequence family can often be described by a group of one or more motifs, and an effective search must combine the scores for matching a sequence to each of the motifs in the group. We describe three methods for combining match scores and estimating the statistical significance of the combined scores and evaluate the search quality (classification accuracy) and the accuracy of the estimate of statistical significance of each. The three methods are: 1) sum of scores, 2) sum of reduced variates, 3) product of score p-values. We show that method 3) is superior to the other two methods in both regards, and that combining motif scores indeed gives better search accuracy. The MAST sequence homology search algorithm utilizing the product of p-values scoring method is available for interactive use and downloading at URL http://www.sdsc.edu/MEME.
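
    Method 3, the product of score p-values, can be illustrated with a short sketch. It relies on a standard result: if n p-values are independent and uniformly distributed under the null hypothesis, the probability that their product is at most P equals P multiplied by the sum of (-ln P)^i / i! for i from 0 to n-1. The function name `combined_p_value` is hypothetical, and the code is an illustration of that formula under those independence assumptions, not an excerpt from MAST.

```python
import math


def combined_p_value(p_values):
    """P-value of the product of n independent p-values.

    Uses Pr(product <= P) = P * sum_{i=0}^{n-1} (-ln P)^i / i!,
    which assumes each p-value is uniform on (0, 1] under the null hypothesis.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must be a non-empty sequence in (0, 1]")
    # Work in log space so that products of many small p-values do not underflow early.
    log_product = sum(math.log(p) for p in p_values)
    x = -log_product
    term = 1.0   # (-ln P)^i / i! for i = 0
    total = 1.0
    for i in range(1, len(p_values)):
        term *= x / i
        total += term
    return math.exp(log_product) * total
```

    With a single motif the function returns the p-value unchanged; with several motifs the correction term grows with n, which is why the raw product cannot be read directly as a significance level.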

  6. ChIPBase v2.0: decoding transcriptional regulatory networks of non-coding RNAs and protein-coding genes from ChIP-seq data.

    Science.gov (United States)

    Zhou, Ke-Ren; Liu, Shun; Sun, Wen-Ju; Zheng, Ling-Ling; Zhou, Hui; Yang, Jian-Hua; Qu, Liang-Hu

    2017-01-04

    The abnormal transcriptional regulation of non-coding RNAs (ncRNAs) and protein-coding genes (PCGs) contributes to various biological processes and is linked with human diseases, but the underlying mechanisms remain elusive. In this study, we developed ChIPBase v2.0 (http://rna.sysu.edu.cn/chipbase/) to explore the transcriptional regulatory networks of ncRNAs and PCGs. ChIPBase v2.0 has been expanded with ∼10 200 curated ChIP-seq datasets, which represents an approximately 20-fold expansion over the previously released version. We identified thousands of binding motif matrices and their binding sites from ChIP-seq data of DNA-binding proteins and predicted millions of transcriptional regulatory relationships between transcription factors (TFs) and genes. We constructed the 'Regulator' module to predict hundreds of TFs and histone modifications that were involved in or affected transcription of ncRNAs and PCGs. Moreover, we built a web-based tool, Co-Expression, to explore the co-expression patterns between DNA-binding proteins and various types of genes by integrating the gene expression profiles of ∼10 000 tumor samples and ∼9100 normal tissues and cell lines. ChIPBase also provides a ChIP-Function tool and a genome browser to predict functions of diverse genes and visualize various ChIP-seq data. This study will greatly expand our understanding of the transcriptional regulations of ncRNAs and PCGs. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Identification of the Raptor-binding motif on Arabidopsis S6 kinase and its use as a TOR signaling suppressor

    Energy Technology Data Exchange (ETDEWEB)

    Son, Ora; Kim, Sunghan; Hur, Yoon-Sun; Cheon, Choong-Ill, E-mail: ccheon@sookmyung.ac.kr

    2016-03-25

    TOR (target of rapamycin) kinase signaling plays a central role as a regulator of growth and proliferation in all eukaryotic cells and its key signaling components and effectors are also conserved in plants. Unlike the mammalian and yeast counterparts, however, we found through yeast two-hybrid analysis that multiple regions of the Arabidopsis Raptor (regulatory associated protein of TOR) are required for binding to its substrate. We also identified that a 44-amino acid region at the N-terminal end of Arabidopsis ribosomal S6 kinase 1 (AtS6K1) specifically interacted with AtRaptor1, indicating that this region may contain a functional equivalent of the TOS (TOR-Signaling) motif present in the mammalian TOR substrates. Transient over-expression of this 44-amino acid fragment in Arabidopsis protoplasts resulted in a significant decrease in rDNA transcription, demonstrating the feasibility of developing a new plant-specific TOR signaling inhibitor based upon perturbation of the Raptor-substrate interaction. - Highlights: • Multiple regions on the Arabidopsis Raptor protein were found to be involved in substrate binding. • N-terminal end of the Arabidopsis ribosomal S6 kinase 1 (AtS6K1) was responsible for interacting with AtRaptor1. • The Raptor-interacting fragment of AtS6K1 could be utilized as an effective inhibitor of plant TOR signaling.

  8. Identification of the Raptor-binding motif on Arabidopsis S6 kinase and its use as a TOR signaling suppressor

    International Nuclear Information System (INIS)

    Son, Ora; Kim, Sunghan; Hur, Yoon-Sun; Cheon, Choong-Ill

    2016-01-01

    TOR (target of rapamycin) kinase signaling plays a central role as a regulator of growth and proliferation in all eukaryotic cells and its key signaling components and effectors are also conserved in plants. Unlike the mammalian and yeast counterparts, however, we found through yeast two-hybrid analysis that multiple regions of the Arabidopsis Raptor (regulatory associated protein of TOR) are required for binding to its substrate. We also identified that a 44-amino acid region at the N-terminal end of Arabidopsis ribosomal S6 kinase 1 (AtS6K1) specifically interacted with AtRaptor1, indicating that this region may contain a functional equivalent of the TOS (TOR-Signaling) motif present in the mammalian TOR substrates. Transient over-expression of this 44-amino acid fragment in Arabidopsis protoplasts resulted in a significant decrease in rDNA transcription, demonstrating the feasibility of developing a new plant-specific TOR signaling inhibitor based upon perturbation of the Raptor-substrate interaction. - Highlights: • Multiple regions on the Arabidopsis Raptor protein were found to be involved in substrate binding. • N-terminal end of the Arabidopsis ribosomal S6 kinase 1 (AtS6K1) was responsible for interacting with AtRaptor1. • The Raptor-interacting fragment of AtS6K1 could be utilized as an effective inhibitor of plant TOR signaling.

  9. Analysis of the DNA-Binding Activities of the Arabidopsis R2R3-MYB Transcription Factor Family by One-Hybrid Experiments in Yeast.

    Directory of Open Access Journals (Sweden)

    Zsolt Kelemen

    Full Text Available The control of growth and development of all living organisms is a complex and dynamic process that requires the harmonious expression of numerous genes. Gene expression is mainly controlled by the activity of sequence-specific DNA binding proteins called transcription factors (TFs). Amongst the various classes of eukaryotic TFs, the MYB superfamily is one of the largest and most diverse, and it has considerably expanded in the plant kingdom. R2R3-MYBs have been extensively studied over the last 15 years. However, DNA-binding specificity has been characterized for only a small subset of these proteins. Therefore, one of the remaining challenges is the exhaustive characterization of the DNA-binding specificity of all R2R3-MYB proteins. In this study, we have developed a library of Arabidopsis thaliana R2R3-MYB open reading frames, whose DNA-binding activities were assayed in vivo (yeast one-hybrid experiments) with a pool of selected cis-regulatory elements. Altogether 1904 interactions were assayed, leading to the discovery of specific patterns of interactions between the various R2R3-MYB subgroups and their DNA target sequences and to the identification of key features that govern these interactions. The present work provides a comprehensive in vivo analysis of R2R3-MYB binding activities that should help in predicting new DNA motifs and identifying new putative target genes for each member of this very large family of TFs. In a broader perspective, the generated data will help to better understand how TFs interact with their target DNA sequences.

  10. Bacterial identification and subtyping using DNA microarray and DNA sequencing.

    Science.gov (United States)

    Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D

    2012-01-01

    The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies not only will identify, isotype, and serotype pathogenic bacteria, but also it will aid in the discovery of new gene functions by detecting gene expressions in different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing by using 454 pyrosequencing is so cost effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing technology is a simple to use technique that can produce accurate and quantitative analysis of DNA sequences with a great speed. The deposition of massive amounts of bacterial genomic information in databanks is creating fingerprint phylogenetic analysis that will ultimately replace several technologies such as Pulsed Field Gel Electrophoresis. In this chapter, we will review (1) the use of DNA microarray using fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.