infer cis-regulatory motifs: Topics by WorldWideScience.org

Sample records for infer cis-regulatory motifs

Comparative genomics of metabolic capacities of regulons controlled by cis-regulatory RNA motifs in bacteria.

Science.gov (United States)

Sun, Eric I; Leyn, Semen A; Kazanov, Marat D; Saier, Milton H; Novichkov, Pavel S; Rodionov, Dmitry A

2013-09-02

In silico comparative genomics approaches have been efficiently used for functional prediction and reconstruction of metabolic and regulatory networks. Riboswitches are metabolite-sensing structures often found in bacterial mRNA leaders controlling gene expression on transcriptional or translational levels.An increasing number of riboswitches and other cis-regulatory RNAs have been recently classified into numerous RNA families in the Rfam database. High conservation of these RNA motifs provides a unique advantage for their genomic identification and comparative analysis. A comparative genomics approach implemented in the RegPredict tool was used for reconstruction and functional annotation of regulons controlled by RNAs from 43 Rfam families in diverse taxonomic groups of Bacteria. The inferred regulons include ~5200 cis-regulatory RNAs and more than 12000 target genes in 255 microbial genomes. All predicted RNA-regulated genes were classified into specific and overall functional categories. Analysis of taxonomic distribution of these categories allowed us to establish major functional preferences for each analyzed cis-regulatory RNA motif family. Overall, most RNA motif regulons showed predictable functional content in accordance with their experimentally established effector ligands. Our results suggest that some RNA motifs (including thiamin pyrophosphate and cobalamin riboswitches that control the cofactor metabolism) are widespread and likely originated from the last common ancestor of all bacteria. However, many more analyzed RNA motifs are restricted to a narrow taxonomic group of bacteria and likely represent more recent evolutionary innovations. The reconstructed regulatory networks for major known RNA motifs substantially expand the existing knowledge of transcriptional regulation in bacteria. The inferred regulons can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and evolutionary analysis. The obtained genome
On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

Science.gov (United States)

Tarpine, Ryan; Istrail, Sorin

The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.
An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes.

Science.gov (United States)

Liu, Bingqiang; Zhang, Hanyuan; Zhou, Chuan; Li, Guojun; Fennell, Anne; Wang, Guanghui; Kang, Yu; Liu, Qi; Ma, Qin

2016-08-09

Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, has several difficulties for optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to Escherichia coli k12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP(3) consistently outperformed other popular motif finding tools. We have integrated MP(3) into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. The performance evaluation indicated that MP(3) is effective for predicting regulatory motifs in prokaryotic genomes. Its application may enhance
LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms.

Science.gov (United States)

Yang, Peng; Wu, Min; Guo, Jing; Kwoh, Chee Keong; Przytycka, Teresa M; Zheng, Jie

2014-02-17

As a fundamental genomic element, meiotic recombination hotspot plays important roles in life sciences. Thus uncovering its regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators for meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignore the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Recently, an algorithm called "LDsplit" has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit on SNPs inside an established 7-mer motif bound by PRDM9 we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm of MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs. It is the first computational method that utilizes the genetic variation of
Bounded search for de novo identification of degenerate cis-regulatory elements

Directory of Open Access Journals (Sweden)

Khetani Radhika S

2006-05-01

Full Text Available Abstract Background The identification of statistically overrepresented sequences in the upstream regions of coregulated genes should theoretically permit the identification of potential cis-regulatory elements. However, in practice many cis-regulatory elements are highly degenerate, precluding the use of an exhaustive word-counting strategy for their identification. While numerous methods exist for inferring base distributions using a position weight matrix, recent studies suggest that the independence assumptions inherent in the model, as well as the inability to reach a global optimum, limit this approach. Results In this paper, we report PRISM, a degenerate motif finder that leverages the relationship between the statistical significance of a set of binding sites and that of the individual binding sites. PRISM first identifies overrepresented, non-degenerate consensus motifs, then iteratively relaxes each one into a high-scoring degenerate motif. This approach requires no tunable parameters, thereby lending itself to unbiased performance comparisons. We therefore compare PRISM's performance against nine popular motif finders on 28 well-characterized S. cerevisiae regulons. PRISM consistently outperforms all other programs. Finally, we use PRISM to predict the binding sites of uncharacterized regulons. Our results support a proposed mechanism of action for the yeast cell-cycle transcription factor Stb1, whose binding site has not been determined experimentally. Conclusion The relationship between statistical measures of the binding sites and the set as a whole leads to a simple means of identifying the diverse range of cis-regulatory elements to which a protein binds. This approach leverages the advantages of word-counting, in that position dependencies are implicitly accounted for and local optima are more easily avoided. While we sacrifice guaranteed optimality to prevent the exponential blowup of exhaustive search, we prove that the error
Statistical significance of cis-regulatory modules

Directory of Open Access Journals (Sweden)

Smith Andrew D

2007-01-01

Full Text Available Abstract Background It is becoming increasingly important for researchers to be able to scan through large genomic regions for transcription factor binding sites or clusters of binding sites forming cis-regulatory modules. Correspondingly, there has been a push to develop algorithms for the rapid detection and assessment of cis-regulatory modules. While various algorithms for this purpose have been introduced, most are not well suited for rapid, genome scale scanning. Results We introduce methods designed for the detection and statistical evaluation of cis-regulatory modules, modeled as either clusters of individual binding sites or as combinations of sites with constrained organization. In order to determine the statistical significance of module sites, we first need a method to determine the statistical significance of single transcription factor binding site matches. We introduce a straightforward method of estimating the statistical significance of single site matches using a database of known promoters to produce data structures that can be used to estimate p-values for binding site matches. We next introduce a technique to calculate the statistical significance of the arrangement of binding sites within a module using a max-gap model. If the module scanned for has defined organizational parameters, the probability of the module is corrected to account for organizational constraints. The statistical significance of single site matches and the architecture of sites within the module can be combined to provide an overall estimation of statistical significance of cis-regulatory module sites. Conclusion The methods introduced in this paper allow for the detection and statistical evaluation of single transcription factor binding sites and cis-regulatory modules. The features described are implemented in the Search Tool for Occurrences of Regulatory Motifs (STORM and MODSTORM software.
PlantCARE, a plant cis-acting regulatory element database

OpenAIRE

Rombauts, Stephane; Déhais, Patrice; Van Montagu, Marc; Rouzé, Pierre

1999-01-01

PlantCARE is a database of plant cis- acting regulatory elements, enhancers and repressors. Besides the transcription motifs found on a sequence, it also offers a link to the EMBL entry that contains the full gene sequence as well as a description of the conditions in which a motif becomes functional. The information on these sites is given by matrices, consensus and individual site sequences on particular genes, depending on the available information. PlantCARE is a relational database avail...
Using hexamers to predict cis-regulatory motifs in Drosophila

Directory of Open Access Journals (Sweden)

Kibler Dennis

2005-10-01

Full Text Available Abstract Background Cis-regulatory modules (CRMs are short stretches of DNA that help regulate gene expression in higher eukaryotes. They have been found up to 1 megabase away from the genes they regulate and can be located upstream, downstream, and even within their target genes. Due to the difficulty of finding CRMs using biological and computational techniques, even well-studied regulatory systems may contain CRMs that have not yet been discovered. Results We present a simple, efficient method (HexDiff based only on hexamer frequencies of known CRMs and non-CRM sequence to predict novel CRMs in regulatory systems. On a data set of 16 gap and pair-rule genes containing 52 known CRMs, predictions made by HexDiff had a higher correlation with the known CRMs than several existing CRM prediction algorithms: Ahab, Cluster Buster, MSCAN, MCAST, and LWF. After combining the results of the different algorithms, 10 putative CRMs were identified and are strong candidates for future study. The hexamers used by HexDiff to distinguish between CRMs and non-CRM sequence were also analyzed and were shown to be enriched in regulatory elements. Conclusion HexDiff provides an efficient and effective means for finding new CRMs based on known CRMs, rather than known binding sites.
Identification of a cis-regulatory element by transient analysis of co-ordinately regulated genes

Directory of Open Access Journals (Sweden)

Allan Andrew C

2008-07-01

Full Text Available Abstract Background Transcription factors (TFs co-ordinately regulate target genes that are dispersed throughout the genome. This co-ordinate regulation is achieved, in part, through the interaction of transcription factors with conserved cis-regulatory motifs that are in close proximity to the target genes. While much is known about the families of transcription factors that regulate gene expression in plants, there are few well characterised cis-regulatory motifs. In Arabidopsis, over-expression of the MYB transcription factor PAP1 (PRODUCTION OF ANTHOCYANIN PIGMENT 1 leads to transgenic plants with elevated anthocyanin levels due to the co-ordinated up-regulation of genes in the anthocyanin biosynthetic pathway. In addition to the anthocyanin biosynthetic genes, there are a number of un-associated genes that also change in expression level. This may be a direct or indirect consequence of the over-expression of PAP1. Results Oligo array analysis of PAP1 over-expression Arabidopsis plants identified genes co-ordinately up-regulated in response to the elevated expression of this transcription factor. Transient assays on the promoter regions of 33 of these up-regulated genes identified eight promoter fragments that were transactivated by PAP1. Bioinformatic analysis on these promoters revealed a common cis-regulatory motif that we showed is required for PAP1 dependent transactivation. Conclusion Co-ordinated gene regulation by individual transcription factors is a complex collection of both direct and indirect effects. Transient transactivation assays provide a rapid method to identify direct target genes from indirect target genes. Bioinformatic analysis of the promoters of these direct target genes is able to locate motifs that are common to this sub-set of promoters, which is impossible to identify with the larger set of direct and indirect target genes. While this type of analysis does not prove a direct interaction between protein and DNA
Computational and molecular dissection of an X-box cis-Regulatory module

OpenAIRE

Warrington, Timothy Burton

2015-01-01

Ciliopathies are a class of human diseases marked by dysfunction of the cellular organelle, cilia. While many of the molecular components that make up cilia have been identified and studied, comparatively little is understood about the transcriptional regulation of genes encoding these components. The conserved transcription factor Regulatory Factor X (RFX)/DAF-19, which acts through binding to the cis-regulatory motif known as X-box, has been shown to regulate ciliary genes in many animals f...
Evolution of New cis-Regulatory Motifs Required for Cell-Specific Gene Expression in Caenorhabditis.

Directory of Open Access Journals (Sweden)

Michalis Barkoulas

2016-09-01

Full Text Available Patterning of C. elegans vulval cell fates relies on inductive signaling. In this induction event, a single cell, the gonadal anchor cell, secretes LIN-3/EGF and induces three out of six competent precursor cells to acquire a vulval fate. We previously showed that this developmental system is robust to a four-fold variation in lin-3/EGF genetic dose. Here using single-molecule FISH, we find that the mean level of expression of lin-3 in the anchor cell is remarkably conserved. No change in lin-3 expression level could be detected among C. elegans wild isolates and only a low level of change-less than 30%-in the Caenorhabditis genus and in Oscheius tipulae. In C. elegans, lin-3 expression in the anchor cell is known to require three transcription factor binding sites, specifically two E-boxes and a nuclear-hormone-receptor (NHR binding site. Mutation of any of these three elements in C. elegans results in a dramatic decrease in lin-3 expression. Yet only a single E-box is found in the Drosophilae supergroup of Caenorhabditis species, including C. angaria, while the NHR-binding site likely only evolved at the base of the Elegans group. We find that a transgene from C. angaria bearing a single E-box is sufficient for normal expression in C. elegans. Even a short 58 bp cis-regulatory fragment from C. angaria with this single E-box is able to replace the three transcription factor binding sites at the endogenous C. elegans lin-3 locus, resulting in the wild-type expression level. Thus, regulatory evolution occurring in cis within a 58 bp lin-3 fragment, results in a strict requirement for the NHR binding site and a second E-box in C. elegans. This single-cell, single-molecule, quantitative and functional evo-devo study demonstrates that conserved expression levels can hide extensive change in cis-regulatory site requirements and highlights the evolution of new cis-regulatory elements required for cell-specific gene expression.
In silico analysis of cis-acting regulatory elements in 5' regulatory regions of sucrose transporter gene families in rice (Oryza sativa Japonica) and Arabidopsis thaliana.

Science.gov (United States)

Ibraheem, Omodele; Botha, Christiaan E J; Bradley, Graeme

2010-12-01

The regulation of gene expression involves a multifarious regulatory system. Each gene contains a unique combination of cis-acting regulatory sequence elements in the 5' regulatory region that determines its temporal and spatial expression. Cis-acting regulatory elements are essential transcriptional gene regulatory units; they control many biological processes and stress responses. Thus a full understanding of the transcriptional gene regulation system will depend on successful functional analyses of cis-acting elements. Cis-acting regulatory elements present within the 5' regulatory region of the sucrose transporter gene families in rice (Oryza sativa Japonica cultivar-group) and Arabidopsis thaliana, were identified using a bioinformatics approach. The possible cis-acting regulatory elements were predicted by scanning 1.5kbp of 5' regulatory regions of the sucrose transporter genes translational start sites, using Plant CARE, PLACE and Genomatix Matinspector professional databases. Several cis-acting regulatory elements that are associated with plant development, plant hormonal regulation and stress response were identified, and were present in varying frequencies within the 1.5kbp of 5' regulatory region, among which are; A-box, RY, CAT, Pyrimidine-box, Sucrose-box, ABRE, ARF, ERE, GARE, Me-JA, ARE, DRE, GA-motif, GATA, GT-1, MYC, MYB, W-box, and I-box. This result reveals the probable cis-acting regulatory elements that possibly are involved in the expression and regulation of sucrose transporter gene families in rice and Arabidopsis thaliana during cellular development or environmental stress conditions. Copyright © 2010 Elsevier Ltd. All rights reserved.
Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks

Science.gov (United States)

Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E.; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A.; Kellis, Manolis

2012-01-01

Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level. PMID:22456606
Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks.

Science.gov (United States)

Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A; Kellis, Manolis

2012-07-01

Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein-protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level.
Quantitative statistical analysis of cis-regulatory sequences in ABA/VP1- and CBF/DREB1-regulated genes of Arabidopsis.

Science.gov (United States)

Suzuki, Masaharu; Ketterling, Matthew G; McCarty, Donald R

2005-09-01

We have developed a simple quantitative computational approach for objective analysis of cis-regulatory sequences in promoters of coregulated genes. The program, designated MotifFinder, identifies oligo sequences that are overrepresented in promoters of coregulated genes. We used this approach to analyze promoter sequences of Viviparous1 (VP1)/abscisic acid (ABA)-regulated genes and cold-regulated genes, respectively, of Arabidopsis (Arabidopsis thaliana). We detected significantly enriched sequences in up-regulated genes but not in down-regulated genes. This result suggests that gene activation but not repression is mediated by specific and common sequence elements in promoters. The enriched motifs include several known cis-regulatory sequences as well as previously unidentified motifs. With respect to known cis-elements, we dissected the flanking nucleotides of the core sequences of Sph element, ABA response elements (ABREs), and the C repeat/dehydration-responsive element. This analysis identified the motif variants that may correlate with qualitative and quantitative differences in gene expression. While both VP1 and cold responses are mediated in part by ABA signaling via ABREs, these responses correlate with unique ABRE variants distinguished by nucleotides flanking the ACGT core. ABRE and Sph motifs are tightly associated uniquely in the coregulated set of genes showing a strict dependence on VP1 and ABA signaling. Finally, analysis of distribution of the enriched sequences revealed a striking concentration of enriched motifs in a proximal 200-base region of VP1/ABA and cold-regulated promoters. Overall, each class of coregulated genes possesses a discrete set of the enriched motifs with unique distributions in their promoters that may account for the specificity of gene regulation.
Creating and validating cis-regulatory maps of tissue-specific gene expression regulation

Science.gov (United States)

O'Connor, Timothy R.; Bailey, Timothy L.

2014-01-01

Predicting which genomic regions control the transcription of a given gene is a challenge. We present a novel computational approach for creating and validating maps that associate genomic regions (cis-regulatory modules–CRMs) with genes. The method infers regulatory relationships that explain gene expression observed in a test tissue using widely available genomic data for ‘other’ tissues. To predict the regulatory targets of a CRM, we use cross-tissue correlation between histone modifications present at the CRM and expression at genes within 1 Mbp of it. To validate cis-regulatory maps, we show that they yield more accurate models of gene expression than carefully constructed control maps. These gene expression models predict observed gene expression from transcription factor binding in the CRMs linked to that gene. We show that our maps are able to identify long-range regulatory interactions and improve substantially over maps linking genes and CRMs based on either the control maps or a ‘nearest neighbor’ heuristic. Our results also show that it is essential to include CRMs predicted in multiple tissues during map-building, that H3K27ac is the most informative histone modification, and that CAGE is the most informative measure of gene expression for creating cis-regulatory maps. PMID:25200088
Identifying cis-regulatory modules by combining comparative and compositional analysis of DNA.

Science.gov (United States)

Pierstorff, Nora; Bergman, Casey M; Wiehe, Thomas

2006-12-01

Predicting cis-regulatory modules (CRMs) in higher eukaryotes is a challenging computational task. Commonly used methods to predict CRMs based on the signal of transcription factor binding sites (TFBS) are limited by prior information about transcription factor specificity. More general methods that bypass the reliance on TFBS models are needed for comprehensive CRM prediction. We have developed a method to predict CRMs called CisPlusFinder that identifies high density regions of perfect local ungapped sequences (PLUSs) based on multiple species conservation. By assuming that PLUSs contain core TFBS motifs that are locally overrepresented, the method attempts to capture the expected features of CRM structure and evolution. Applied to a benchmark dataset of CRMs involved in early Drosophila development, CisPlusFinder predicts more annotated CRMs than all other methods tested. Using the REDfly database, we find that some 'false positive' predictions in the benchmark dataset correspond to recently annotated CRMs. Our work demonstrates that CRM prediction methods that combine comparative genomic data with statistical properties of DNA may achieve reasonable performance when applied genome-wide in the absence of an a priori set of known TFBS motifs. The program CisPlusFinder can be downloaded at http://jakob.genetik.uni-koeln.de/bioinformatik/people/nora/nora.html. All software is licensed under the Lesser GNU Public License (LGPL).
A New Algorithm for Identifying Cis-Regulatory Modules Based on Hidden Markov Model

Directory of Open Access Journals (Sweden)

Haitao Guo

2017-01-01

Full Text Available The discovery of cis-regulatory modules (CRMs is the key to understanding mechanisms of transcription regulation. Since CRMs have specific regulatory structures that are the basis for the regulation of gene expression, how to model the regulatory structure of CRMs has a considerable impact on the performance of CRM identification. The paper proposes a CRM discovery algorithm called ComSPS. ComSPS builds a regulatory structure model of CRMs based on HMM by exploring the rules of CRM transcriptional grammar that governs the internal motif site arrangement of CRMs. We test ComSPS on three benchmark datasets and compare it with five existing methods. Experimental results show that ComSPS performs better than them.
A New Algorithm for Identifying Cis-Regulatory Modules Based on Hidden Markov Model

Science.gov (United States)

2017-01-01

The discovery of cis-regulatory modules (CRMs) is the key to understanding mechanisms of transcription regulation. Since CRMs have specific regulatory structures that are the basis for the regulation of gene expression, how to model the regulatory structure of CRMs has a considerable impact on the performance of CRM identification. The paper proposes a CRM discovery algorithm called ComSPS. ComSPS builds a regulatory structure model of CRMs based on HMM by exploring the rules of CRM transcriptional grammar that governs the internal motif site arrangement of CRMs. We test ComSPS on three benchmark datasets and compare it with five existing methods. Experimental results show that ComSPS performs better than them. PMID:28497059
BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements.

Science.gov (United States)

De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

2015-12-01

The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O.sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z.mays. BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

A method for selecting cis-acting regulatory sequences that respond to small molecule effectors

Directory of Open Access Journals (Sweden)

Allas Ülar

2010-08-01

Full Text Available Abstract Background Several cis-acting regulatory sequences functioning at the level of mRNA or nascent peptide and specifically influencing transcription or translation have been described. These regulatory elements often respond to specific chemicals. Results We have developed a method that allows us to select cis-acting regulatory sequences that respond to diverse chemicals. The method is based on the β-lactamase gene containing a random sequence inserted into the beginning of the ORF. Several rounds of selection are used to isolate sequences that suppress β-lactamase expression in response to the compound under study. We have isolated sequences that respond to erythromycin, troleandomycin, chloramphenicol, meta-toluate and homoserine lactone. By introducing synonymous and non-synonymous mutations we have shown that at least in the case of erythromycin the sequences act at the peptide level. We have also tested the cross-activities of the constructs and found that in most cases the sequences respond most strongly to the compound on which they were isolated. Conclusions Several selected peptides showed ligand-specific changes in amino acid frequencies, but no consensus motif could be identified. This is consistent with previous observations on natural cis-acting peptides, showing that it is often impossible to demonstrate a consensus. Applying the currently developed method on a larger scale, by selecting and comparing an extended set of sequences, might allow the sequence rules underlying the activity of cis-acting regulatory peptides to be identified.
Identification of putative cis-regulatory elements in Cryptosporidium parvum by de novo pattern finding

Directory of Open Access Journals (Sweden)

Kissinger Jessica C

2007-01-01

Full Text Available Abstract Background Cryptosporidium parvum is a unicellular eukaryote in the phylum Apicomplexa. It is an obligate intracellular parasite that causes diarrhea and is a significant AIDS-related pathogen. Cryptosporidium parvum is not amenable to long-term laboratory cultivation or classical molecular genetic analysis. The parasite exhibits a complex life cycle, a broad host range, and fundamental mechanisms of gene regulation remain unknown. We have used data from the recently sequenced genome of this organism to uncover clues about gene regulation in C. parvum. We have applied two pattern finding algorithms MEME and AlignACE to identify conserved, over-represented motifs in the 5' upstream regions of genes in C. parvum. To support our findings, we have established comparative real-time -PCR expression profiles for the groups of genes examined computationally. Results We find that groups of genes that share a function or belong to a common pathway share upstream motifs. Different motifs are conserved upstream of different groups of genes. Comparative real-time PCR studies show co-expression of genes within each group (in sub-sets during the life cycle of the parasite, suggesting co-regulation of these genes may be driven by the use of conserved upstream motifs. Conclusion This is one of the first attempts to characterize cis-regulatory elements in the absence of any previously characterized elements and with very limited expression data (seven genes only. Using de novo pattern finding algorithms, we have identified specific DNA motifs that are conserved upstream of genes belonging to the same metabolic pathway or gene family. We have demonstrated the co-expression of these genes (often in subsets using comparative real-time-PCR experiments thus establishing evidence for these conserved motifs as putative cis-regulatory elements. Given the lack of prior information concerning expression patterns and organization of promoters in C. parvum we
Predicting tissue specific cis-regulatory modules in the human genome using pairs of co-occurring motifs

Directory of Open Access Journals (Sweden)

Girgis Hani Z

2012-02-01

Full Text Available Abstract Background Researchers seeking to unlock the genetic basis of human physiology and diseases have been studying gene transcription regulation. The temporal and spatial patterns of gene expression are controlled by mainly non-coding elements known as cis-regulatory modules (CRMs and epigenetic factors. CRMs modulating related genes share the regulatory signature which consists of transcription factor (TF binding sites (TFBSs. Identifying such CRMs is a challenging problem due to the prohibitive number of sequence sets that need to be analyzed. Results We formulated the challenge as a supervised classification problem even though experimentally validated CRMs were not required. Our efforts resulted in a software system named CrmMiner. The system mines for CRMs in the vicinity of related genes. CrmMiner requires two sets of sequences: a mixed set and a control set. Sequences in the vicinity of the related genes comprise the mixed set, whereas the control set includes random genomic sequences. CrmMiner assumes that a large percentage of the mixed set is made of background sequences that do not include CRMs. The system identifies pairs of closely located motifs representing vertebrate TFBSs that are enriched in the training mixed set consisting of 50% of the gene loci. In addition, CrmMiner selects a group of the enriched pairs to represent the tissue-specific regulatory signature. The mixed and the control sets are searched for candidate sequences that include any of the selected pairs. Next, an optimal Bayesian classifier is used to distinguish candidates found in the mixed set from their control counterparts. Our study proposes 62 tissue-specific regulatory signatures and putative CRMs for different human tissues and cell types. These signatures consist of assortments of ubiquitously expressed TFs and tissue-specific TFs. Under controlled settings, CrmMiner identified known CRMs in noisy sets up to 1:25 signal-to-noise ratio. CrmMiner was
Seed storage protein gene promoters contain conserved DNA motifs in Brassicaceae, Fabaceae and Poaceae

Science.gov (United States)

Fauteux, François; Strömvik, Martina V

2009-01-01

Background Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses) using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. Conclusion Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination of conserved motifs
Seed storage protein gene promoters contain conserved DNA motifs in Brassicaceae, Fabaceae and Poaceae

Directory of Open Access Journals (Sweden)

Fauteux François

2009-10-01

Full Text Available Abstract Background Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP gene promoters from three plant families, namely Brassicaceae (mustards, Fabaceae (legumes and Poaceae (grasses using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L. Heynh., soybean (Glycine max (L. Merr. and rice (Oryza sativa L. respectively. We have identified three conserved motifs (two RY-like and one ACGT-like in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. Conclusion Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination
RMOD: a tool for regulatory motif detection in signaling network.

Directory of Open Access Journals (Sweden)

Jinki Kim

Full Text Available Regulatory motifs are patterns of activation and inhibition that appear repeatedly in various signaling networks and that show specific regulatory properties. However, the network structures of regulatory motifs are highly diverse and complex, rendering their identification difficult. Here, we present a RMOD, a web-based system for the identification of regulatory motifs and their properties in signaling networks. RMOD finds various network structures of regulatory motifs by compressing the signaling network and detecting the compressed forms of regulatory motifs. To apply it into a large-scale signaling network, it adopts a new subgraph search algorithm using a novel data structure called path-tree, which is a tree structure composed of isomorphic graphs of query regulatory motifs. This algorithm was evaluated using various sizes of signaling networks generated from the integration of various human signaling pathways and it showed that the speed and scalability of this algorithm outperforms those of other algorithms. RMOD includes interactive analysis and auxiliary tools that make it possible to manipulate the whole processes from building signaling network and query regulatory motifs to analyzing regulatory motifs with graphical illustration and summarized descriptions. As a result, RMOD provides an integrated view of the regulatory motifs and mechanism underlying their regulatory motif activities within the signaling network. RMOD is freely accessible online at the following URL: http://pks.kaist.ac.kr/rmod.
Genome-wide targeted prediction of ABA responsive genes in rice based on over-represented cis-motif in co-expressed genes.

Science.gov (United States)

Lenka, Sangram K; Lohia, Bikash; Kumar, Abhay; Chinnusamy, Viswanathan; Bansal, Kailash C

2009-02-01

Abscisic acid (ABA), the popular plant stress hormone, plays a key role in regulation of sub-set of stress responsive genes. These genes respond to ABA through specific transcription factors which bind to cis-regulatory elements present in their promoters. We discovered the ABA Responsive Element (ABRE) core (ACGT) containing CGMCACGTGB motif as over-represented motif among the promoters of ABA responsive co-expressed genes in rice. Targeted gene prediction strategy using this motif led to the identification of 402 protein coding genes potentially regulated by ABA-dependent molecular genetic network. RT-PCR analysis of arbitrarily chosen 45 genes from the predicted 402 genes confirmed 80% accuracy of our prediction. Plant Gene Ontology (GO) analysis of ABA responsive genes showed enrichment of signal transduction and stress related genes among diverse functional categories.
Evolution of Cis-Regulatory Elements and Regulatory Networks in Duplicated Genes of Arabidopsis.

Science.gov (United States)

Arsovski, Andrej A; Pradinuk, Julian; Guo, Xu Qiu; Wang, Sishuo; Adams, Keith L

2015-12-01

Plant genomes contain large numbers of duplicated genes that contribute to the evolution of new functions. Following duplication, genes can exhibit divergence in their coding sequence and their expression patterns. Changes in the cis-regulatory element landscape can result in changes in gene expression patterns. High-throughput methods developed recently can identify potential cis-regulatory elements on a genome-wide scale. Here, we use a recent comprehensive data set of DNase I sequencing-identified cis-regulatory binding sites (footprints) at single-base-pair resolution to compare binding sites and network connectivity in duplicated gene pairs in Arabidopsis (Arabidopsis thaliana). We found that duplicated gene pairs vary greatly in their cis-regulatory element architecture, resulting in changes in regulatory network connectivity. Whole-genome duplicates (WGDs) have approximately twice as many footprints in their promoters left by potential regulatory proteins than do tandem duplicates (TDs). The WGDs have a greater average number of footprint differences between paralogs than TDs. The footprints, in turn, result in more regulatory network connections between WGDs and other genes, forming denser, more complex regulatory networks than shown by TDs. When comparing regulatory connections between duplicates, WGDs had more pairs in which the two genes are either partially or fully diverged in their network connections, but fewer genes with no network connections than the TDs. There is evidence of younger TDs and WGDs having fewer unique connections compared with older duplicates. This study provides insights into cis-regulatory element evolution and network divergence in duplicated genes. © 2015 American Society of Plant Biologists. All Rights Reserved.
Characterization of Putative cis-Regulatory Elements in Genes Preferentially Expressed in Arabidopsis Male Meiocytes

Directory of Open Access Journals (Sweden)

Junhua Li

2014-01-01

Full Text Available Meiosis is essential for plant reproduction because it is the process during which homologous chromosome pairing, synapsis, and meiotic recombination occur. The meiotic transcriptome is difficult to investigate because of the size of meiocytes and the confines of anther lobes. The recent development of isolation techniques has enabled the characterization of transcriptional profiles in male meiocytes of Arabidopsis. Gene expression in male meiocytes shows unique features. The direct interaction of transcription factors (TFs with DNA regulatory sequences forms the basis for the specificity of transcriptional regulation. Here, we identified putative cis-regulatory elements (CREs associated with male meiocyte-expressed genes using in silico tools. The upstream regions (1 kb of the top 50 genes preferentially expressed in Arabidopsis meiocytes possessed conserved motifs. These motifs are putative binding sites of TFs, some of which share common functions, such as roles in cell division. In combination with cell-type-specific analysis, our findings could be a substantial aid for the identification and experimental verification of the protein-DNA interactions for the specific TFs that drive gene expression in meiocytes.
DMINDA: an integrated web server for DNA motif identification and analyses.

Science.gov (United States)

Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

2014-07-01

DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Inference of gene regulatory networks with sparse structural equation models exploiting genetic perturbations.

Directory of Open Access Journals (Sweden)

Xiaodong Cai

Full Text Available Integrating genetic perturbations with gene expression data not only improves accuracy of regulatory network topology inference, but also enables learning of causal regulatory relations between genes. Although a number of methods have been developed to integrate both types of data, the desiderata of efficient and powerful algorithms still remains. In this paper, sparse structural equation models (SEMs are employed to integrate both gene expression data and cis-expression quantitative trait loci (cis-eQTL, for modeling gene regulatory networks in accordance with biological evidence about genes regulating or being regulated by a small number of genes. A systematic inference method named sparsity-aware maximum likelihood (SML is developed for SEM estimation. Using simulated directed acyclic or cyclic networks, the SML performance is compared with that of two state-of-the-art algorithms: the adaptive Lasso (AL based scheme, and the QTL-directed dependency graph (QDG method. Computer simulations demonstrate that the novel SML algorithm offers significantly better performance than the AL-based and QDG algorithms across all sample sizes from 100 to 1,000, in terms of detection power and false discovery rate, in all the cases tested that include acyclic or cyclic networks of 10, 30 and 300 genes. The SML method is further applied to infer a network of 39 human genes that are related to the immune function and are chosen to have a reliable eQTL per gene. The resulting network consists of 9 genes and 13 edges. Most of the edges represent interactions reasonably expected from experimental evidence, while the remaining may just indicate the emergence of new interactions. The sparse SEM and efficient SML algorithm provide an effective means of exploiting both gene expression and perturbation data to infer gene regulatory networks. An open-source computer program implementing the SML algorithm is freely available upon request.
Barcoded DNA-tag reporters for multiplex cis-regulatory analysis.

Directory of Open Access Journals (Sweden)

Jongmin Nam

Full Text Available Cis-regulatory DNA sequences causally mediate patterns of gene expression, but efficient experimental analysis of these control systems has remained challenging. Here we develop a new version of "barcoded" DNA-tag reporters, "Nanotags" that permit simultaneous quantitative analysis of up to 130 distinct cis-regulatory modules (CRMs. The activities of these reporters are measured in single experiments by the NanoString RNA counting method and other quantitative procedures. We demonstrate the efficiency of the Nanotag method by simultaneously measuring hourly temporal activities of 126 CRMs from 46 genes in the developing sea urchin embryo, otherwise a virtually impossible task. Nanotags are also used in gene perturbation experiments to reveal cis-regulatory responses of many CRMs at once. Nanotag methodology can be applied to many research areas, ranging from gene regulatory networks to functional and evolutionary genomics.
RegRNA: an integrated web server for identifying regulatory RNA motifs and elements

OpenAIRE

Huang, Hsi-Yuan; Chien, Chia-Hung; Jen, Kuan-Hua; Huang, Hsien-Da

2006-01-01

Numerous regulatory structural motifs have been identified as playing essential roles in transcriptional and post-transcriptional regulation of gene expression. RegRNA is an integrated web server for identifying the homologs of regulatory RNA motifs and elements against an input mRNA sequence. Both sequence homologs and structural homologs of regulatory RNA motifs can be recognized. The regulatory RNA motifs supported in RegRNA are categorized into several classes: (i) motifs in mRNA 5′-untra...
XcisClique: analysis of regulatory bicliques

Directory of Open Access Journals (Sweden)

Grene Ruth

2006-04-01

Full Text Available Abstract Background Modeling of cis-elements or regulatory motifs in promoter (upstream regions of genes is a challenging computational problem. In this work, set of regulatory motifs simultaneously present in the promoters of a set of genes is modeled as a biclique in a suitably defined bipartite graph. A biologically meaningful co-occurrence of multiple cis-elements in a gene promoter is assessed by the combined analysis of genomic and gene expression data. Greater statistical significance is associated with a set of genes that shares a common set of regulatory motifs, while simultaneously exhibiting highly correlated gene expression under given experimental conditions. Methods XcisClique, the system developed in this work, is a comprehensive infrastructure that associates annotated genome and gene expression data, models known cis-elements as regular expressions, identifies maximal bicliques in a bipartite gene-motif graph; and ranks bicliques based on their computed statistical significance. Significance is a function of the probability of occurrence of those motifs in a biclique (a hypergeometric distribution, and on the new sum of absolute values statistic (SAV that uses Spearman correlations of gene expression vectors. SAV is a statistic well-suited for this purpose as described in the discussion. Results XcisClique identifies new motif and gene combinations that might indicate as yet unidentified involvement of sets of genes in biological functions and processes. It currently supports Arabidopsis thaliana and can be adapted to other organisms, assuming the existence of annotated genomic sequences, suitable gene expression data, and identified regulatory motifs. A subset of Xcis Clique functionalities, including the motif visualization component MotifSee, source code, and supplementary material are available at https://bioinformatics.cs.vt.edu/xcisclique/.
Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas.

Science.gov (United States)

Mathelier, Anthony; Lefebvre, Calvin; Zhang, Allen W; Arenillas, David J; Ding, Jiarui; Wasserman, Wyeth W; Shah, Sohrab P

2015-04-23

With the rapid increase of whole-genome sequencing of human cancers, an important opportunity to analyze and characterize somatic mutations lying within cis-regulatory regions has emerged. A focus on protein-coding regions to identify nonsense or missense mutations disruptive to protein structure and/or function has led to important insights; however, the impact on gene expression of mutations lying within cis-regulatory regions remains under-explored. We analyzed somatic mutations from 84 matched tumor-normal whole genomes from B-cell lymphomas with accompanying gene expression measurements to elucidate the extent to which these cancers are disrupted by cis-regulatory mutations. We characterize mutations overlapping a high quality set of well-annotated transcription factor binding sites (TFBSs), covering a similar portion of the genome as protein-coding exons. Our results indicate that cis-regulatory mutations overlapping predicted TFBSs are enriched in promoter regions of genes involved in apoptosis or growth/proliferation. By integrating gene expression data with mutation data, our computational approach culminates with identification of cis-regulatory mutations most likely to participate in dysregulation of the gene expression program. The impact can be measured along with protein-coding mutations to highlight key mutations disrupting gene expression and pathways in cancer. Our study yields specific genes with disrupted expression triggered by genomic mutations in either the coding or the regulatory space. It implies that mutated regulatory components of the genome contribute substantially to cancer pathways. Our analyses demonstrate that identifying genomically altered cis-regulatory elements coupled with analysis of gene expression data will augment biological interpretation of mutational landscapes of cancers.
Gene regulatory and signaling networks exhibit distinct topological distributions of motifs

Science.gov (United States)

Ferreira, Gustavo Rodrigues; Nakaya, Helder Imoto; Costa, Luciano da Fontoura

2018-04-01

The biological processes of cellular decision making and differentiation involve a plethora of signaling pathways and gene regulatory circuits. These networks in turn exhibit a multitude of motifs playing crucial parts in regulating network activity. Here we compare the topological placement of motifs in gene regulatory and signaling networks and observe that it suggests different evolutionary strategies in motif distribution for distinct cellular subnetworks.
cisMEP: an integrated repository of genomic epigenetic profiles and cis-regulatory modules in Drosophila.

Science.gov (United States)

Yang, Tzu-Hsien; Wang, Chung-Ching; Hung, Po-Cheng; Wu, Wei-Sheng

2014-01-01

Cis-regulatory modules (CRMs), or the DNA sequences required for regulating gene expression, play the central role in biological researches on transcriptional regulation in metazoan species. Nowadays, the systematic understanding of CRMs still mainly resorts to computational methods due to the time-consuming and small-scale nature of experimental methods. But the accuracy and reliability of different CRM prediction tools are still unclear. Without comparative cross-analysis of the results and combinatorial consideration with extra experimental information, there is no easy way to assess the confidence of the predicted CRMs. This limits the genome-wide understanding of CRMs. It is known that transcription factor binding and epigenetic profiles tend to determine functions of CRMs in gene transcriptional regulation. Thus integration of the genome-wide epigenetic profiles with systematically predicted CRMs can greatly help researchers evaluate and decipher the prediction confidence and possible transcriptional regulatory functions of these potential CRMs. However, these data are still fragmentary in the literatures. Here we performed the computational genome-wide screening for potential CRMs using different prediction tools and constructed the pioneer database, cisMEP (cis-regulatory module epigenetic profile database), to integrate these computationally identified CRMs with genomic epigenetic profile data. cisMEP collects the literature-curated TFBS location data and nine genres of epigenetic data for assessing the confidence of these potential CRMs and deciphering the possible CRM functionality. cisMEP aims to provide a user-friendly interface for researchers to assess the confidence of different potential CRMs and to understand the functions of CRMs through experimentally-identified epigenetic profiles. The deposited potential CRMs and experimental epigenetic profiles for confidence assessment provide experimentally testable hypotheses for the molecular mechanisms
Two negative cis-regulatory regions involved in fruit-specific promoter activity from watermelon (Citrullus vulgaris S.).

Science.gov (United States)

Yin, Tao; Wu, Hanying; Zhang, Shanglong; Lu, Hongyu; Zhang, Lingxiao; Xu, Yong; Chen, Daming; Liu, Jingmei

2009-01-01

A 1.8 kb 5'-flanking region of the large subunit of ADP-glucose pyrophosphorylase, isolated from watermelon (Citrullus vulgaris S.), has fruit-specific promoter activity in transgenic tomato plants. Two negative regulatory regions, from -986 to -959 and from -472 to -424, were identified in this promoter region by fine deletion analyses. Removal of both regions led to constitutive expression in epidermal cells. Gain-of-function experiments showed that these two regions were sufficient to inhibit RFP (red fluorescent protein) expression in transformed epidermal cells when fused to the cauliflower mosaic virus (CaMV) 35S minimal promoter. Gel mobility shift experiments demonstrated the presence of leaf nuclear factors that interact with these two elements. A TCCAAAA motif was identified in these two regions, as well as one in the reverse orientation, which was confirmed to be a novel specific cis-element. A quantitative beta-glucuronidase (GUS) activity assay of stable transgenic tomato plants showed that the activities of chimeric promoters harbouring only one of the two cis-elements, or both, were approximately 10-fold higher in fruits than in leaves. These data confirm that the TCCAAAA motif functions as a fruit-specific element by inhibiting gene expression in leaves.
Identification, occurrence, and validation of DRE and ABRE Cis-regulatory motifs in the promoter regions of genes of Arabidopsis thaliana.

Science.gov (United States)

Mishra, Sonal; Shukla, Aparna; Upadhyay, Swati; Sanchita; Sharma, Pooja; Singh, Seema; Phukan, Ujjal J; Meena, Abha; Khan, Feroz; Tripathi, Vineeta; Shukla, Rakesh Kumar; Shrama, Ashok

2014-04-01

Plants posses a complex co-regulatory network which helps them to elicit a response under diverse adverse conditions. We used an in silico approach to identify the genes with both DRE and ABRE motifs in their promoter regions in Arabidopsis thaliana. Our results showed that Arabidopsis contains a set of 2,052 genes with ABRE and DRE motifs in their promoter regions. Approximately 72% or more of the total predicted 2,052 genes had a gap distance of less than 400 bp between DRE and ABRE motifs. For positional orientation of the DRE and ABRE motifs, we found that the DR form (one in direct and the other one in reverse orientation) was more prevalent than other forms. These predicted 2,052 genes include 155 transcription factors. Using microarray data from The Arabidopsis Information Resource (TAIR) database, we present 44 transcription factors out of 155 which are upregulated by more than twofold in response to osmotic stress and ABA treatment. Fifty-one transcripts from the one predicted above were validated using semiquantitative expression analysis to support the microarray data in TAIR. Taken together, we report a set of genes containing both DRE and ABRE motifs in their promoter regions in A. thaliana, which can be useful to understand the role of ABA under osmotic stress condition. © 2013 Institute of Botany, Chinese Academy of Sciences.
Aggregation of topological motifs in the Escherichia coli transcriptional regulatory network

Directory of Open Access Journals (Sweden)

Barabási Albert-László

2004-01-01

Full Text Available Abstract Background Transcriptional regulation of cellular functions is carried out through a complex network of interactions among transcription factors and the promoter regions of genes and operons regulated by them.To better understand the system-level function of such networks simplification of their architecture was previously achieved by identifying the motifs present in the network, which are small, overrepresented, topologically distinct regulatory interaction patterns (subgraphs. However, the interaction of such motifs with each other, and their form of integration into the full network has not been previously examined. Results By studying the transcriptional regulatory network of the bacterium, Escherichia coli, we demonstrate that the two previously identified motif types in the network (i.e., feed-forward loops and bi-fan motifs do not exist in isolation, but rather aggregate into homologous motif clusters that largely overlap with known biological functions. Moreover, these clusters further coalesce into a supercluster, thus establishing distinct topological hierarchies that show global statistical properties similar to the whole network. Targeted removal of motif links disintegrates the network into small, isolated clusters, while random disruptions of equal number of links do not cause such an effect. Conclusion Individual motifs aggregate into homologous motif clusters and a supercluster forming the backbone of the E. coli transcriptional regulatory network and play a central role in defining its global topological organization.

Discovery of cell-type specific DNA motif grammar in cis-regulatory elements using random Forest.

Science.gov (United States)

Wang, Xin; Lin, Peijie; Ho, Joshua W K

2018-01-19

It has been observed that many transcription factors (TFs) can bind to different genomic loci depending on the cell type in which a TF is expressed in, even though the individual TF usually binds to the same core motif in different cell types. How a TF can bind to the genome in such a highly cell-type specific manner, is a critical research question. One hypothesis is that a TF requires co-binding of different TFs in different cell types. If this is the case, it may be possible to observe different combinations of TF motifs - a motif grammar - located at the TF binding sites in different cell types. In this study, we develop a bioinformatics method to systematically identify DNA motifs in TF binding sites across multiple cell types based on published ChIP-seq data, and address two questions: (1) can we build a machine learning classifier to predict cell-type specificity based on motif combinations alone, and (2) can we extract meaningful cell-type specific motif grammars from this classifier model. We present a Random Forest (RF) based approach to build a multi-class classifier to predict the cell-type specificity of a TF binding site given its motif content. We applied this RF classifier to two published ChIP-seq datasets of TF (TCF7L2 and MAX) across multiple cell types. Using cross-validation, we show that motif combinations alone are indeed predictive of cell types. Furthermore, we present a rule mining approach to extract the most discriminatory rules in the RF classifier, thus allowing us to discover the underlying cell-type specific motif grammar. Our bioinformatics analysis supports the hypothesis that combinatorial TF motif patterns are cell-type specific.
Pathogenic adaptation of intracellular bacteria by rewiring a cis-regulatory input function.

Science.gov (United States)

Osborne, Suzanne E; Walthers, Don; Tomljenovic, Ana M; Mulder, David T; Silphaduang, Uma; Duong, Nancy; Lowden, Michael J; Wickham, Mark E; Waller, Ross F; Kenney, Linda J; Coombes, Brian K

2009-03-10

The acquisition of DNA by horizontal gene transfer enables bacteria to adapt to previously unexploited ecological niches. Although horizontal gene transfer and mutation of protein-coding sequences are well-recognized forms of pathogen evolution, the evolutionary significance of cis-regulatory mutations in creating phenotypic diversity through altered transcriptional outputs is not known. We show the significance of regulatory mutation for pathogen evolution by mapping and then rewiring a cis-regulatory module controlling a gene required for murine typhoid. Acquisition of a binding site for the Salmonella pathogenicity island-2 regulator, SsrB, enabled the srfN gene, ancestral to the Salmonella genus, to play a role in pathoadaptation of S. typhimurium to a host animal. We identified the evolved cis-regulatory module and quantified the fitness gain that this regulatory output accrues for the bacterium using competitive infections of host animals. Our findings highlight a mechanism of pathogen evolution involving regulatory mutation that is selected because of the fitness advantage the new regulatory output provides the incipient clones.
Discovery of cis-elements between sorghum and rice using co-expression and evolutionary conservation

Directory of Open Access Journals (Sweden)

Haberer Georg

2009-06-01

Full Text Available Abstract Background The spatiotemporal regulation of gene expression largely depends on the presence and absence of cis-regulatory sites in the promoter. In the economically highly important grass family, our knowledge of transcription factor binding sites and transcriptional networks is still very limited. With the completion of the sorghum genome and the available rice genome sequence, comparative promoter analyses now allow genome-scale detection of conserved cis-elements. Results In this study, we identified thousands of phylogenetic footprints conserved between orthologous rice and sorghum upstream regions that are supported by co-expression information derived from three different rice expression data sets. In a complementary approach, cis-motifs were discovered by their highly conserved co-occurrence in syntenic promoter pairs. Sequence conservation and matches to known plant motifs support our findings. Expression similarities of gene pairs positively correlate with the number of motifs that are shared by gene pairs and corroborate the importance of similar promoter architectures for concerted regulation. This strongly suggests that these motifs function in the regulation of transcript levels in rice and, presumably also in sorghum. Conclusion Our work provides the first large-scale collection of cis-elements for rice and sorghum and can serve as a paradigm for cis-element analysis through comparative genomics in grasses in general.
Validation of Skeletal Muscle cis-Regulatory Module Predictions Reveals Nucleotide Composition Bias in Functional Enhancers

Science.gov (United States)

Kwon, Andrew T.; Chou, Alice Yi; Arenillas, David J.; Wasserman, Wyeth W.

2011-01-01

We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences identifies nucleotide sequence composition can be an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions. PMID:22144875
Validation of skeletal muscle cis-regulatory module predictions reveals nucleotide composition bias in functional enhancers.

Directory of Open Access Journals (Sweden)

Andrew T Kwon

2011-12-01

Full Text Available We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences identifies nucleotide sequence composition can be an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions.
A cis-regulatory sequence driving metabolic insecticide resistance in mosquitoes: functional characterisation and signatures of selection.

Science.gov (United States)

Wilding, Craig S; Smith, Ian; Lynd, Amy; Yawson, Alexander Egyir; Weetman, David; Paine, Mark J I; Donnelly, Martin J

2012-09-01

Although cytochrome P450 (CYP450) enzymes are frequently up-regulated in mosquitoes resistant to insecticides, no regulatory motifs driving these expression differences with relevance to wild populations have been identified. Transposable elements (TEs) are often enriched upstream of those CYP450s involved in insecticide resistance, leading to the assumption that they contribute regulatory motifs that directly underlie the resistance phenotype. A partial CuRE1 (Culex Repetitive Element 1) transposable element is found directly upstream of CYP9M10, a cytochrome P450 implicated previously in larval resistance to permethrin in the ISOP450 strain of Culex quinquefasciatus, but is absent from the equivalent genomic region of a susceptible strain. Via expression of CYP9M10 in Escherichia coli we have now demonstrated time- and NADPH-dependant permethrin metabolism, prerequisites for confirmation of a role in metabolic resistance, and through qPCR shown that CYP9M10 is >20-fold over-expressed in ISOP450 compared to a susceptible strain. In a fluorescent reporter assay the region upstream of CYP9M10 from ISOP450 drove 10× expression compared to the equivalent region (lacking CuRE1) from the susceptible strain. Close correspondence with the gene expression fold-change implicates the upstream region including CuRE1 as a cis-regulatory element involved in resistance. Only a single CuRE1 bearing allele, identical to the CuRE1 bearing allele in the resistant strain, is found throughout Sub-Saharan Africa, in contrast to the diversity encountered in non-CuRE1 alleles. This suggests a single origin and subsequent spread due to selective advantage. CuRE1 is detectable using a simple diagnostic. When applied to C. quinquefasciatus larvae from Ghana we have demonstrated a significant association with permethrin resistance in multiple field sites (mean Odds Ratio = 3.86) suggesting this marker has relevance to natural populations of vector mosquitoes. However, when CuRE1 was excised
Identifying Cis-Regulatory Changes Involved in the Evolution of Aerobic Fermentation in Yeasts

Science.gov (United States)

Lin, Zhenguo; Wang, Tzi-Yuan; Tsai, Bing-Shi; Wu, Fang-Ting; Yu, Fu-Jung; Tseng, Yu-Jung; Sung, Huang-Mo; Li, Wen-Hsiung

2013-01-01

Gene regulation change has long been recognized as an important mechanism for phenotypic evolution. We used the evolution of yeast aerobic fermentation as a model to explore how gene regulation has evolved and how this process has contributed to phenotypic evolution and adaptation. Most eukaryotes fully oxidize glucose to CO2 and H2O in mitochondria to maximize energy yield, whereas some yeasts, such as Saccharomyces cerevisiae and its relatives, predominantly ferment glucose into ethanol even in the presence of oxygen, a phenomenon known as aerobic fermentation. We examined the genome-wide gene expression levels among 12 different yeasts and found that a group of genes involved in the mitochondrial respiration process showed the largest reduction in gene expression level during the evolution of aerobic fermentation. Our analysis revealed that the downregulation of these genes was significantly associated with massive loss of binding motifs of Cbf1p in the fermentative yeasts. Our experimental assays confirmed the binding of Cbf1p to the predicted motif and the activator role of Cbf1p. In summary, our study laid a foundation to unravel the long-time mystery about the genetic basis of evolution of aerobic fermentation, providing new insights into understanding the role of cis-regulatory changes in phenotypic evolution. PMID:23650209
The identification of functional motifs in temporal gene expression analysis

Directory of Open Access Journals (Sweden)

Michael G. Surette

2005-01-01

Full Text Available The identification of transcription factor binding sites is essential to the understanding of the regulation of gene expression and the reconstruction of genetic regulatory networks. The in silico identification of cis-regulatory motifs is challenging due to sequence variability and lack of sufficient data to generate consensus motifs that are of quantitative or even qualitative predictive value. To determine functional motifs in gene expression, we propose a strategy to adopt false discovery rate (FDR and estimate motif effects to evaluate combinatorial analysis of motif candidates and temporal gene expression data. The method decreases the number of predicted motifs, which can then be confirmed by genetic analysis. To assess the method we used simulated motif/expression data to evaluate parameters. We applied this approach to experimental data for a group of iron responsive genes in Salmonella typhimurium 14028S. The method identified known and potentially new ferric-uptake regulator (Fur binding sites. In addition, we identified uncharacterized functional motif candidates that correlated with specific patterns of expression. A SAS code for the simulation and analysis gene expression data is available from the first author upon request.
Genome-wide prediction of cis-regulatory regions using supervised deep learning methods.

Science.gov (United States)

Li, Yifeng; Shi, Wenqiang; Wasserman, Wyeth W

2018-05-31

In the human genome, 98% of DNA sequences are non-protein-coding regions that were previously disregarded as junk DNA. In fact, non-coding regions host a variety of cis-regulatory regions which precisely control the expression of genes. Thus, Identifying active cis-regulatory regions in the human genome is critical for understanding gene regulation and assessing the impact of genetic variation on phenotype. The developments of high-throughput sequencing and machine learning technologies make it possible to predict cis-regulatory regions genome wide. Based on rich data resources such as the Encyclopedia of DNA Elements (ENCODE) and the Functional Annotation of the Mammalian Genome (FANTOM) projects, we introduce DECRES based on supervised deep learning approaches for the identification of enhancer and promoter regions in the human genome. Due to their ability to discover patterns in large and complex data, the introduction of deep learning methods enables a significant advance in our knowledge of the genomic locations of cis-regulatory regions. Using models for well-characterized cell lines, we identify key experimental features that contribute to the predictive performance. Applying DECRES, we delineate locations of 300,000 candidate enhancers genome wide (6.8% of the genome, of which 40,000 are supported by bidirectional transcription data), and 26,000 candidate promoters (0.6% of the genome). The predicted annotations of cis-regulatory regions will provide broad utility for genome interpretation from functional genomics to clinical applications. The DECRES model demonstrates potentials of deep learning technologies when combined with high-throughput sequencing data, and inspires the development of other advanced neural network models for further improvement of genome annotations.
A minimal murine Msx-1 gene promoter. Organization of its cis-regulatory motifs and their role in transcriptional activation in cells in culture and in transgenic mice.

Science.gov (United States)

Takahashi, T; Guron, C; Shetty, S; Matsui, H; Raghow, R

1997-09-05

To dissect the cis-regulatory elements of the murine Msx-1 promoter, which lacks a conventional TATA element, a putative Msx-1 promoter DNA fragment (from -1282 to +106 base pairs (bp)) or its congeners containing site-specific alterations were fused to luciferase reporter and introduced into NIH3T3 and C2C12 cells, and the expression of luciferase was assessed in transient expression assays. The functional consequences of the sequential 5' deletions of the promotor revealed that multiple positive and negative regulatory elements participate in regulating transcription of the Msx-1 gene. Surprisingly, however, the optimal expression of Msx-1 promoter in either NIH3T3 or C2C12 cells required only 165 bp of the upstream sequence to warrant detailed examination of its structure. Therefore, the functional consequences of site-specific deletions and point mutations of the cis-acting elements of the minimal Msx-1 promoter were systematically examined. Concomitantly, potential transcriptional factor(s) interacting with the cis-acting elements of the minimal promoter were also studied by gel electrophoretic mobility shift assays and DNase I footprinting. Combined analyses of the minimal promoter by DNase I footprinting, electrophoretic mobility shift assays, and super shift assays with specific antibodies revealed that 5'-flanking regions from -161 to -154 and from -26 to -13 of the Msx-1 promoter contains an authentic E box (proximal E box), capable of binding a protein immunologically related to the upstream stimulating factor 1 (USF-1) and a GC-rich sequence motif which can bind to Sp1 (proximal Sp1), respectively. Additionally, we observed that the promoter activation was seriously hampered if the proximal E box was removed or mutated, and the promoter activity was eliminated completely if the proximal Sp1 site was similarly altered. Absolute dependence of the Msx-1 minimal promoter on Sp1 could be demonstrated by transient expression assays in the Sp1-deficient
Positional bias of general and tissue-specific regulatory motifs in mouse gene promoters

Directory of Open Access Journals (Sweden)

Farré Domènec

2007-12-01

Full Text Available Abstract Background The arrangement of regulatory motifs in gene promoters, or promoter architecture, is the result of mutation and selection processes that have operated over many millions of years. In mammals, tissue-specific transcriptional regulation is related to the presence of specific protein-interacting DNA motifs in gene promoters. However, little is known about the relative location and spacing of these motifs. To fill this gap, we have performed a systematic search for motifs that show significant bias at specific promoter locations in a large collection of housekeeping and tissue-specific genes. Results We observe that promoters driving housekeeping gene expression are enriched in particular motifs with strong positional bias, such as YY1, which are of little relevance in promoters driving tissue-specific expression. We also identify a large number of motifs that show positional bias in genes expressed in a highly tissue-specific manner. They include well-known tissue-specific motifs, such as HNF1 and HNF4 motifs in liver, kidney and small intestine, or RFX motifs in testis, as well as many potentially novel regulatory motifs. Based on this analysis, we provide predictions for 559 tissue-specific motifs in mouse gene promoters. Conclusion The study shows that motif positional bias is an important feature of mammalian proximal promoters and that it affects both general and tissue-specific motifs. Motif positional constraints define very distinct promoter architectures depending on breadth of expression and type of tissue.
Organization of cis-acting regulatory elements in osmotic- and cold-stress-responsive promoters.

Science.gov (United States)

Yamaguchi-Shinozaki, Kazuko; Shinozaki, Kazuo

2005-02-01

cis-Acting regulatory elements are important molecular switches involved in the transcriptional regulation of a dynamic network of gene activities controlling various biological processes, including abiotic stress responses, hormone responses and developmental processes. In particular, understanding regulatory gene networks in stress response cascades depends on successful functional analyses of cis-acting elements. The ever-improving accuracy of transcriptome expression profiling has led to the identification of various combinations of cis-acting elements in the promoter regions of stress-inducible genes involved in stress and hormone responses. Here we discuss major cis-acting elements, such as the ABA-responsive element (ABRE) and the dehydration-responsive element/C-repeat (DRE/CRT), that are a vital part of ABA-dependent and ABA-independent gene expression in osmotic and cold stress responses.
Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

Science.gov (United States)

Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

2015-01-01

Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930
Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence.

Directory of Open Access Journals (Sweden)

Kacy L Gordon

2015-05-01

Full Text Available Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2 from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.
Systematic discovery of regulatory motifs in Fusarium graminearum by comparing four Fusarium genomes

Directory of Open Access Journals (Sweden)

Kistler Corby

2010-03-01

Full Text Available Abstract Background Fusarium graminearum (Fg, a major fungal pathogen of cultivated cereals, is responsible for billions of dollars in agriculture losses. There is a growing interest in understanding the transcriptional regulation of this organism, especially the regulation of genes underlying its pathogenicity. The generation of whole genome sequence assemblies for Fg and three closely related Fusarium species provides a unique opportunity for such a study. Results Applying comparative genomics approaches, we developed a computational pipeline to systematically discover evolutionarily conserved regulatory motifs in the promoter, downstream and the intronic regions of Fg genes, based on the multiple alignments of sequenced Fusarium genomes. Using this method, we discovered 73 candidate regulatory motifs in the promoter regions. Nearly 30% of these motifs are highly enriched in promoter regions of Fg genes that are associated with a specific functional category. Through comparison to Saccharomyces cerevisiae (Sc and Schizosaccharomyces pombe (Sp, we observed conservation of transcription factors (TFs, their binding sites and the target genes regulated by these TFs related to pathways known to respond to stress conditions or phosphate metabolism. In addition, this study revealed 69 and 39 conserved motifs in the downstream regions and the intronic regions, respectively, of Fg genes. The top intronic motif is the splice donor site. For the downstream regions, we noticed an intriguing absence of the mammalian and Sc poly-adenylation signals among the list of conserved motifs. Conclusion This study provides the first comprehensive list of candidate regulatory motifs in Fg, and underscores the power of comparative genomics in revealing functional elements among related genomes. The conservation of regulatory pathways among the Fusarium genomes and the two yeast species reveals their functional significance, and provides new insights in their
A saturation screen for cis-acting regulatory DNA in the Hox genes of Ciona intestinalis

Energy Technology Data Exchange (ETDEWEB)

Keys, David N.; Lee, Byung-in; Di Gregorio, Anna; Harafuji, Naoe; Detter, Chris; Wang, Mei; Kahsai, Orsalem; Ahn, Sylvia; Arellano, Andre; Zhang, Quin; Trong, Stephan; Doyle, Sharon A.; Satoh, Noriyuki; Satou, Yutaka; Saiga, Hidetoshi; Christian, Allen; Rokhsar, Dan; Hawkins, Trevor L.; Levine, Mike; Richardson, Paul

2005-01-05

A screen for the systematic identification of cis-regulatory elements within large (>100 kb) genomic domains containing Hox genes was performed by using the basal chordate Ciona intestinalis. Randomly generated DNA fragments from bacterial artificial chromosomes containing two clusters of Hox genes were inserted into a vector upstream of a minimal promoter and lacZ reporter gene. A total of 222 resultant fusion genes were separately electroporated into fertilized eggs, and their regulatory activities were monitored in larvae. In sum, 21 separable cis-regulatory elements were found. These include eight Hox linked domains that drive expression in nested anterior-posterior domains of ectodermally derived tissues. In addition to vertebrate-like CNS regulation, the discovery of cis-regulatory domains that drive epidermal transcription suggests that C. intestinalis has arthropod-like Hox patterning in the epidermis.
Regulatory elements of the floral homeotic gene AGAMOUS identified by phylogenetic footprinting and shadowing.

Energy Technology Data Exchange (ETDEWEB)

Hong, R. L., Hamaguchi, L., Busch, M. A., and Weigel, D.

2003-06-01

OAK-B135 In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.
Coevolution within a transcriptional network by compensatory trans and cis mutations

KAUST Repository

Kuo, D.

2010-10-26

Transcriptional networks have been shown to evolve very rapidly, prompting questions as to how such changes arise and are tolerated. Recent comparisons of transcriptional networks across species have implicated variations in the cis-acting DNA sequences near genes as the main cause of divergence. What is less clear is how these changes interact with trans-acting changes occurring elsewhere in the genetic circuit. Here, we report the discovery of a system of compensatory trans and cis mutations in the yeast AP-1 transcriptional network that allows for conserved transcriptional regulation despite continued genetic change. We pinpoint a single species, the fungal pathogen Candida glabrata, in which a trans mutation has occurred very recently in a single AP-1 family member, distinguishing it from its Saccharomyces ortholog. Comparison of chromatin immunoprecipitation profiles between Candida and Saccharomyces shows that, despite their different DNA-binding domains, the AP-1 orthologs regulate a conserved block of genes. This conservation is enabled by concomitant changes in the cis-regulatory motifs upstream of each gene. Thus, both trans and cis mutations have perturbed the yeast AP-1 regulatory system in such a way as to compensate for one another. This demonstrates an example of “coevolution” between a DNA-binding transcription factor and its cis-regulatory site, reminiscent of the coevolution of protein binding partners.
Combinatorial Cis-regulation in Saccharomyces Species

Directory of Open Access Journals (Sweden)

Aaron T. Spivak

2016-03-01

Full Text Available Transcriptional control of gene expression requires interactions between the cis-regulatory elements (CREs controlling gene promoters. We developed a sensitive computational method to identify CRE combinations with conserved spacing that does not require genome alignments. When applied to seven sensu stricto and sensu lato Saccharomyces species, 80% of the predicted interactions displayed some evidence of combinatorial transcriptional behavior in several existing datasets including: (1 chromatin immunoprecipitation data for colocalization of transcription factors, (2 gene expression data for coexpression of predicted regulatory targets, and (3 gene ontology databases for common pathway membership of predicted regulatory targets. We tested several predicted CRE interactions with chromatin immunoprecipitation experiments in a wild-type strain and strains in which a predicted cofactor was deleted. Our experiments confirmed that transcription factor (TF occupancy at the promoters of the CRE combination target genes depends on the predicted cofactor while occupancy of other promoters is independent of the predicted cofactor. Our method has the additional advantage of identifying regulatory differences between species. By analyzing the S. cerevisiae and S. bayanus genomes, we identified differences in combinatorial cis-regulation between the species and showed that the predicted changes in gene regulation explain several of the species-specific differences seen in gene expression datasets. In some instances, the same CRE combinations appear to regulate genes involved in distinct biological processes in the two different species. The results of this research demonstrate that (1 combinatorial cis-regulation can be inferred by multi-genome analysis and (2 combinatorial cis-regulation can explain differences in gene expression between species.
Identification of putative regulatory motifs in the upstream regions of co-expressed functional groups of genes in Plasmodium falciparum

Directory of Open Access Journals (Sweden)

Joshi NV

2009-01-01

Full Text Available Abstract Background Regulation of gene expression in Plasmodium falciparum (Pf remains poorly understood. While over half the genes are estimated to be regulated at the transcriptional level, few regulatory motifs and transcription regulators have been found. Results The study seeks to identify putative regulatory motifs in the upstream regions of 13 functional groups of genes expressed in the intraerythrocytic developmental cycle of Pf. Three motif-discovery programs were used for the purpose, and motifs were searched for only on the gene coding strand. Four motifs – the 'G-rich', the 'C-rich', the 'TGTG' and the 'CACA' motifs – were identified, and zero to all four of these occur in the 13 sets of upstream regions. The 'CACA motif' was absent in functional groups expressed during the ring to early trophozoite transition. For functional groups expressed in each transition, the motifs tended to be similar. Upstream motifs in some functional groups showed 'positional conservation' by occurring at similar positions relative to the translational start site (TLS; this increases their significance as regulatory motifs. In the ribonucleotide synthesis, mitochondrial, proteasome and organellar translation machinery genes, G-rich, C-rich, CACA and TGTG motifs, respectively, occur with striking positional conservation. In the organellar translation machinery group, G-rich motifs occur close to the TLS. The same motifs were sometimes identified for multiple functional groups; differences in location and abundance of the motifs appear to ensure different modes of action. Conclusion The identification of positionally conserved over-represented upstream motifs throws light on putative regulatory elements for transcription in Pf.

iFORM: Incorporating Find Occurrence of Regulatory Motifs.

Science.gov (United States)

Ren, Chao; Chen, Hebing; Yang, Bite; Liu, Feng; Ouyang, Zhangyi; Bo, Xiaochen; Shu, Wenjie

2016-01-01

Accurately identifying the binding sites of transcription factors (TFs) is crucial to understanding the mechanisms of transcriptional regulation and human disease. We present incorporating Find Occurrence of Regulatory Motifs (iFORM), an easy-to-use and efficient tool for scanning DNA sequences with TF motifs described as position weight matrices (PWMs). Both performance assessment with a receiver operating characteristic (ROC) curve and a correlation-based approach demonstrated that iFORM achieves higher accuracy and sensitivity by integrating five classical motif discovery programs using Fisher's combined probability test. We have used iFORM to provide accurate results on a variety of data in the ENCODE Project and the NIH Roadmap Epigenomics Project, and the tool has demonstrated its utility in further elucidating individual roles of functional elements. Both the source and binary codes for iFORM can be freely accessed at https://github.com/wenjiegroup/iFORM. The identified TF binding sites across human cell and tissue types using iFORM have been deposited in the Gene Expression Omnibus under the accession ID GSE53962.
ChIP-Seq-Annotated Heliconius erato Genome Highlights Patterns of cis-Regulatory Evolution in Lepidoptera

Directory of Open Access Journals (Sweden)

James J. Lewis

2016-09-01

Full Text Available Uncovering phylogenetic patterns of cis-regulatory evolution remains a fundamental goal for evolutionary and developmental biology. Here, we characterize the evolution of regulatory loci in butterflies and moths using chromatin immunoprecipitation sequencing (ChIP-seq annotation of regulatory elements across three stages of head development. In the process we provide a high-quality, functionally annotated genome assembly for the butterfly, Heliconius erato. Comparing cis-regulatory element conservation across six lepidopteran genomes, we find that regulatory sequences evolve at a pace similar to that of protein-coding regions. We also observe that elements active at multiple developmental stages are markedly more conserved than elements with stage-specific activity. Surprisingly, we also find that stage-specific proximal and distal regulatory elements evolve at nearly identical rates. Our study provides a benchmark for genome-wide patterns of regulatory element evolution in insects, and it shows that developmental timing of activity strongly predicts patterns of regulatory sequence evolution.
A novel k-mer set memory (KSM) motif representation improves regulatory variant prediction.

Science.gov (United States)

Guo, Yuchun; Tian, Kevin; Zeng, Haoyang; Guo, Xiaoyun; Gifford, David Kenneth

2018-04-13

The representation and discovery of transcription factor (TF) sequence binding specificities is critical for understanding gene regulatory networks and interpreting the impact of disease-associated noncoding genetic variants. We present a novel TF binding motif representation, the k -mer set memory (KSM), which consists of a set of aligned k -mers that are overrepresented at TF binding sites, and a new method called KMAC for de novo discovery of KSMs. We find that KSMs more accurately predict in vivo binding sites than position weight matrix (PWM) models and other more complex motif models across a large set of ChIP-seq experiments. Furthermore, KSMs outperform PWMs and more complex motif models in predicting in vitro binding sites. KMAC also identifies correct motifs in more experiments than five state-of-the-art motif discovery methods. In addition, KSM-derived features outperform both PWM and deep learning model derived sequence features in predicting differential regulatory activities of expression quantitative trait loci (eQTL) alleles. Finally, we have applied KMAC to 1600 ENCODE TF ChIP-seq data sets and created a public resource of KSM and PWM motifs. We expect that the KSM representation and KMAC method will be valuable in characterizing TF binding specificities and in interpreting the effects of noncoding genetic variations. © 2018 Guo et al.; Published by Cold Spring Harbor Laboratory Press.
Nomadic enhancers: tissue-specific cis-regulatory elements of yellow have divergent genomic positions among Drosophila species.

Directory of Open Access Journals (Sweden)

Gizem Kalay

2010-11-01

Full Text Available cis-regulatory DNA sequences known as enhancers control gene expression in space and time. They are central to metazoan development and are often responsible for changes in gene regulation that contribute to phenotypic evolution. Here, we examine the sequence, function, and genomic location of enhancers controlling tissue- and cell-type specific expression of the yellow gene in six Drosophila species. yellow is required for the production of dark pigment, and its expression has evolved largely in concert with divergent pigment patterns. Using Drosophila melanogaster as a transgenic host, we examined the expression of reporter genes in which either 5' intergenic or intronic sequences of yellow from each species controlled the expression of Green Fluorescent Protein. Surprisingly, we found that sequences controlling expression in the wing veins, as well as sequences controlling expression in epidermal cells of the abdomen, thorax, and wing, were located in different genomic regions in different species. By contrast, sequences controlling expression in bristle-associated cells were located in the intron of all species. Differences in the precise pattern of spatial expression within the developing epidermis of D. melanogaster transformants usually correlated with adult pigmentation in the species from which the cis-regulatory sequences were derived, which is consistent with cis-regulatory evolution affecting yellow expression playing a central role in Drosophila pigmentation divergence. Sequence comparisons among species favored a model in which sequential nucleotide substitutions were responsible for the observed changes in cis-regulatory architecture. Taken together, these data demonstrate frequent changes in yellow cis-regulatory architecture among Drosophila species. Similar analyses of other genes, combining in vivo functional tests of enhancer activity with in silico comparative genomics, are needed to determine whether the pattern of
Nutritional control of gene expression in Drosophila larvae via TOR, Myc and a novel cis-regulatory element

Directory of Open Access Journals (Sweden)

Grewal Savraj S

2010-01-01

Full Text Available Abstract Background Nutrient availability is a key determinant of eukaryotic cell growth. In unicellular organisms many signaling and transcriptional networks link nutrient availability to the expression of metabolic genes required for growth. However, less is known about the corresponding mechanisms that operate in metazoans. We used gene expression profiling to explore this issue in developing Drosophila larvae. Results We found that starvation for dietary amino acids (AA's leads to dynamic changes in transcript levels of many metabolic genes. The conserved insulin/PI3K and TOR signaling pathways mediate nutrition-dependent growth in Drosophila and other animals. We found that many AA starvation-responsive transcripts were also altered in TOR mutants. In contrast, although PI3K overexpression induced robust changes in the expression of many metabolic genes, these changes showed limited overlap with the AA starvation expression profile. We did however identify a strong overlap between genes regulated by the transcription factor, Myc, and AA starvation-responsive genes, particularly those involved in ribosome biogenesis, protein synthesis and mitochondrial function. The consensus Myc DNA binding site is enriched in promoters of these AA starvation genes, and we found that Myc overexpression could bypass dietary AA to induce expression of these genes. We also identified another sequence motif (Motif 1 enriched in the promoters of AA starvation-responsive genes. We showed that Motif 1 was both necessary and sufficient to mediate transcriptional responses to dietary AA in larvae. Conclusions Our data suggest that many of the transcriptional effects of amino acids are mediated via signaling through the TOR pathway in Drosophila larvae. We also find that these transcriptional effects are mediated through at least two mechanisms: via the transcription factor Myc, and via the Motif 1 cis-regulatory element. These studies begin to elucidate a nutrient
Large-scale discovery of promoter motifs in Drosophila melanogaster.

Directory of Open Access Journals (Sweden)

Thomas A Down

2007-01-01

Full Text Available A key step in understanding gene regulation is to identify the repertoire of transcription factor binding motifs (TFBMs that form the building blocks of promoters and other regulatory elements. Identifying these experimentally is very laborious, and the number of TFBMs discovered remains relatively small, especially when compared with the hundreds of transcription factor genes predicted in metazoan genomes. We have used a recently developed statistical motif discovery approach, NestedMICA, to detect candidate TFBMs from a large set of Drosophila melanogaster promoter regions. Of the 120 motifs inferred in our initial analysis, 25 were statistically significant matches to previously reported motifs, while 87 appeared to be novel. Analysis of sequence conservation and motif positioning suggested that the great majority of these discovered motifs are predictive of functional elements in the genome. Many motifs showed associations with specific patterns of gene expression in the D. melanogaster embryo, and we were able to obtain confident annotation of expression patterns for 25 of our motifs, including eight of the novel motifs. The motifs are available through Tiffin, a new database of DNA sequence motifs. We have discovered many new motifs that are overrepresented in D. melanogaster promoter regions, and offer several independent lines of evidence that these are novel TFBMs. Our motif dictionary provides a solid foundation for further investigation of regulatory elements in Drosophila, and demonstrates techniques that should be applicable in other species. We suggest that further improvements in computational motif discovery should narrow the gap between the set of known motifs and the total number of transcription factors in metazoan genomes.
MotifMark: Finding regulatory motifs in DNA sequences.

Science.gov (United States)

Hassanzadeh, Hamid Reza; Kolhe, Pushkar; Isbell, Charles L; Wang, May D

2017-07-01

The interaction between proteins and DNA is a key driving force in a significant number of biological processes such as transcriptional regulation, repair, recombination, splicing, and DNA modification. The identification of DNA-binding sites and the specificity of target proteins in binding to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity between proteins and DNA motifs. Despite their success, these technologies have their own limitations and fall short in precise characterization of motifs, and as a result, require further downstream analysis to extract useful and interpretable information from a haystack of noisy and inaccurate data. Here we propose MotifMark, a new algorithm based on graph theory and machine learning, that can find binding sites on candidate probes and rank their specificity in regard to the underlying transcription factor. We developed a pipeline to analyze experimental data derived from compact universal protein binding microarrays and benchmarked it against two of the most accurate motif search methods. Our results indicate that MotifMark can be a viable alternative technique for prediction of motif from protein binding microarrays and possibly other related high-throughput techniques.
Phylogeny based discovery of regulatory elements

Directory of Open Access Journals (Sweden)

Cohen Barak A

2006-05-01

Full Text Available Abstract Background Algorithms that locate evolutionarily conserved sequences have become powerful tools for finding functional DNA elements, including transcription factor binding sites; however, most methods do not take advantage of an explicit model for the constrained evolution of functional DNA sequences. Results We developed a probabilistic framework that combines an HKY85 model, which assigns probabilities to different base substitutions between species, and weight matrix models of transcription factor binding sites, which describe the probabilities of observing particular nucleotides at specific positions in the binding site. The method incorporates the phylogenies of the species under consideration and takes into account the position specific variation of transcription factor binding sites. Using our framework we assessed the suitability of alignments of genomic sequences from commonly used species as substrates for comparative genomic approaches to regulatory motif finding. We then applied this technique to Saccharomyces cerevisiae and related species by examining all possible six base pair DNA sequences (hexamers and identifying sequences that are conserved in a significant number of promoters. By combining similar conserved hexamers we reconstructed known cis-regulatory motifs and made predictions of previously unidentified motifs. We tested one prediction experimentally, finding it to be a regulatory element involved in the transcriptional response to glucose. Conclusion The experimental validation of a regulatory element prediction missed by other large-scale motif finding studies demonstrates that our approach is a useful addition to the current suite of tools for finding regulatory motifs.
Dynamic SPR monitoring of yeast nuclear protein binding to a cis-regulatory element

International Nuclear Information System (INIS)

Mao, Grace; Brody, James P.

2007-01-01

Gene expression is controlled by protein complexes binding to short specific sequences of DNA, called cis-regulatory elements. Expression of most eukaryotic genes is controlled by dozens of these elements. Comprehensive identification and monitoring of these elements is a major goal of genomics. In pursuit of this goal, we are developing a surface plasmon resonance (SPR) based assay to identify and monitor cis-regulatory elements. To test whether we could reliably monitor protein binding to a regulatory element, we immobilized a 16 bp region of Saccharomyces cerevisiae chromosome 5 onto a gold surface. This 16 bp region of DNA is known to bind several proteins and thought to control expression of the gene RNR1, which varies through the cell cycle. We synchronized yeast cell cultures, and then sampled these cultures at a regular interval. These samples were processed to purify nuclear lysate, which was then exposed to the sensor. We found that nuclear protein binds this particular element of DNA at a significantly higher rate (as compared to unsynchronized cells) during G1 phase. Other time points show levels of DNA-nuclear protein binding similar to the unsynchronized control. We also measured the apparent association complex of the binding to be 0.014 s -1 . We conclude that (1) SPR-based assays can monitor DNA-nuclear protein binding and that (2) for this particular cis-regulatory element, maximum DNA-nuclear protein binding occurs during G1 phase
MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data.

Science.gov (United States)

Ozaki, Haruka; Iwasaki, Wataru

2016-08-01

As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif. Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs. By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions. Copyright © 2016 Elsevier Ltd. All rights reserved.
Promzea: a pipeline for discovery of co-regulatory motifs in maize and other plant species and its application to the anthocyanin and phlobaphene biosynthetic pathways and the Maize Development Atlas.

Science.gov (United States)

Liseron-Monfils, Christophe; Lewis, Tim; Ashlock, Daniel; McNicholas, Paul D; Fauteux, François; Strömvik, Martina; Raizada, Manish N

2013-03-15

The discovery of genetic networks and cis-acting DNA motifs underlying their regulation is a major objective of transcriptome studies. The recent release of the maize genome (Zea mays L.) has facilitated in silico searches for regulatory motifs. Several algorithms exist to predict cis-acting elements, but none have been adapted for maize. A benchmark data set was used to evaluate the accuracy of three motif discovery programs: BioProspector, Weeder and MEME. Analysis showed that each motif discovery tool had limited accuracy and appeared to retrieve a distinct set of motifs. Therefore, using the benchmark, statistical filters were optimized to reduce the false discovery ratio, and then remaining motifs from all programs were combined to improve motif prediction. These principles were integrated into a user-friendly pipeline for motif discovery in maize called Promzea, available at http://www.promzea.org and on the Discovery Environment of the iPlant Collaborative website. Promzea was subsequently expanded to include rice and Arabidopsis. Within Promzea, a user enters cDNA sequences or gene IDs; corresponding upstream sequences are retrieved from the maize genome. Predicted motifs are filtered, combined and ranked. Promzea searches the chosen plant genome for genes containing each candidate motif, providing the user with the gene list and corresponding gene annotations. Promzea was validated in silico using a benchmark data set: the Promzea pipeline showed a 22% increase in nucleotide sensitivity compared to the best standalone program tool, Weeder, with equivalent nucleotide specificity. Promzea was also validated by its ability to retrieve the experimentally defined binding sites of transcription factors that regulate the maize anthocyanin and phlobaphene biosynthetic pathways. Promzea predicted additional promoter motifs, and genome-wide motif searches by Promzea identified 127 non-anthocyanin/phlobaphene genes that each contained all five predicted promoter
Annotating RNA motifs in sequences and alignments.

Science.gov (United States)

Gardner, Paul P; Eldai, Hisham

2015-01-01

RNA performs a diverse array of important functions across all cellular life. These functions include important roles in translation, building translational machinery and maturing messenger RNA. More recent discoveries include the miRNAs and bacterial sRNAs that regulate gene expression, the thermosensors, riboswitches and other cis-regulatory elements that help prokaryotes sense their environment and eukaryotic piRNAs that suppress transposition. However, there can be a long period between the initial discovery of a RNA and determining its function. We present a bioinformatic approach to characterize RNA motifs, which are critical components of many RNA structure-function relationships. These motifs can, in some instances, provide researchers with functional hypotheses for uncharacterized RNAs. Moreover, we introduce a new profile-based database of RNA motifs--RMfam--and illustrate some applications for investigating the evolution and functional characterization of RNA. All the data and scripts associated with this work are available from: https://github.com/ppgardne/RMfam. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Spatiotemporal network motif reveals the biological traits of developmental gene regulatory networks in Drosophila melanogaster

Directory of Open Access Journals (Sweden)

Kim Man-Sun

2012-05-01

Full Text Available Abstract Background Network motifs provided a “conceptual tool” for understanding the functional principles of biological networks, but such motifs have primarily been used to consider static network structures. Static networks, however, cannot be used to reveal time- and region-specific traits of biological systems. To overcome this limitation, we proposed the concept of a “spatiotemporal network motif,” a spatiotemporal sequence of network motifs of sub-networks which are active only at specific time points and body parts. Results On the basis of this concept, we analyzed the developmental gene regulatory network of the Drosophila melanogaster embryo. We identified spatiotemporal network motifs and investigated their distribution pattern in time and space. As a result, we found how key developmental processes are temporally and spatially regulated by the gene network. In particular, we found that nested feedback loops appeared frequently throughout the entire developmental process. From mathematical simulations, we found that mutual inhibition in the nested feedback loops contributes to the formation of spatial expression patterns. Conclusions Taken together, the proposed concept and the simulations can be used to unravel the design principle of developmental gene regulatory networks.
An enhanced computational platform for investigating the roles of regulatory RNA and for identifying functional RNA motifs

OpenAIRE

Chang, Tzu-Hao; Huang, Hsi-Yuan; Hsu, Justin Bo-Kai; Weng, Shun-Long; Horng, Jorng-Tzong; Huang, Hsien-Da

2013-01-01

Background Functional RNA molecules participate in numerous biological processes, ranging from gene regulation to protein synthesis. Analysis of functional RNA motifs and elements in RNA sequences can obtain useful information for deciphering RNA regulatory mechanisms. Our previous work, RegRNA, is widely used in the identification of regulatory motifs, and this work extends it by incorporating more comprehensive and updated data sources and analytical approaches into a new platform. Methods ...
CRX ChIP-seq reveals the cis-regulatory architecture of mouse photoreceptors

NARCIS (Netherlands)

J.C. Corbo (Joseph); K.A. Lawrence (Karen); M. Karlstetter (Marcus); C.A. Myers (Connie); M. Abdelaziz (Musa); W. Dirkes (William); K. Weigelt (Karin); M. Seifert (Martin); V. Benes (Vladimir); L.G. Fritsche (Lars); B.H.F. Weber (Bernhard); T. Langmann (Thomas)

2010-01-01

textabstractApproximately 98% of mammalian DNA is noncoding, yet we understand relatively little about the function of this enigmatic portion of the genome. The cis-regulatory elements that control gene expression reside in noncoding regions and can be identified by mapping the binding sites of
Implications of duplicated cis-regulatory elements in the evolution of metazoans: the DDI model or how simplicity begets novelty.

Science.gov (United States)

Jiménez-Delgado, Senda; Pascual-Anaya, Juan; Garcia-Fernàndez, Jordi

2009-07-01

The discovery that most regulatory genes were conserved among animals from distant phyla challenged the ideas that gene duplication and divergence of homologous coding sequences were the basis for major morphological changes in metazoan evolution. In recent years, however, the interest for the roles, conservation and changes of non-coding sequences grew-up in parallel with genome sequencing projects. Presently, many independent studies are highlighting the importance that subtle changes in cis-regulatory regions had in the evolution of morphology trough the Animal Kingdom. Here we will show and discuss some of these studies, and underscore the future of cis-Evo-Devo research. Nevertheless, we would also explore how gene duplication, which includes duplication of regulatory regions, may have been critical for spatial or temporal co-option of new regulatory networks, causing the deployment of new transcriptome scenarios, and how these induced morphological changes were critical for the evolution of new forms. Forty years after Susumu Ohno famous sentence 'natural selection merely modifies, while redundancy creates', we suggest the alternative: 'natural selection modifies, while redundancy of cis-regulatory elements innovates', and propose the Duplication-Degeneration-Innovation model to explain the increased evolvability of duplicated cis-regulatory regions. Paradoxically, making regulation simpler by subfunctionalization paved the path for future complexity or, in other words, 'to make it simple to make it complex'.
Cis-regulatory elements in the primate brain: from functional specialization to neurodegeneration

NARCIS (Netherlands)

Vermunt, Marit W.

2017-01-01

Over the last decade, the noncoding part of the genome has been shown to harbour thousands of cis-regulatory elements, such as enhancers, that activate well-defined gene expression programs. Here, we charted active enhancers in a multiplicity of human brain regions to understand the role of
Brachyury, Foxa2 and the cis-Regulatory Origins of the Notochord.

Directory of Open Access Journals (Sweden)

Diana S José-Edwards

2015-12-01

Full Text Available A main challenge of modern biology is to understand how specific constellations of genes are activated to differentiate cells and give rise to distinct tissues. This study focuses on elucidating how gene expression is initiated in the notochord, an axial structure that provides support and patterning signals to embryos of humans and all other chordates. Although numerous notochord genes have been identified, the regulatory DNAs that orchestrate development and propel evolution of this structure by eliciting notochord gene expression remain mostly uncharted, and the information on their configuration and recurrence is still quite fragmentary. Here we used the simple chordate Ciona for a systematic analysis of notochord cis-regulatory modules (CRMs, and investigated their composition, architectural constraints, predictive ability and evolutionary conservation. We found that most Ciona notochord CRMs relied upon variable combinations of binding sites for the transcription factors Brachyury and/or Foxa2, which can act either synergistically or independently from one another. Notably, one of these CRMs contains a Brachyury binding site juxtaposed to an (AC microsatellite, an unusual arrangement also found in Brachyury-bound regulatory regions in mouse. In contrast, different subsets of CRMs relied upon binding sites for transcription factors of widely diverse families. Surprisingly, we found that neither intra-genomic nor interspecific conservation of binding sites were reliably predictive hallmarks of notochord CRMs. We propose that rather than obeying a rigid sequence-based cis-regulatory code, most notochord CRMs are rather unique. Yet, this study uncovered essential elements recurrently used by divergent chordates as basic building blocks for notochord CRMs.
Brachyury, Foxa2 and the cis-Regulatory Origins of the Notochord.

Science.gov (United States)

José-Edwards, Diana S; Oda-Ishii, Izumi; Kugler, Jamie E; Passamaneck, Yale J; Katikala, Lavanya; Nibu, Yutaka; Di Gregorio, Anna

2015-12-01

A main challenge of modern biology is to understand how specific constellations of genes are activated to differentiate cells and give rise to distinct tissues. This study focuses on elucidating how gene expression is initiated in the notochord, an axial structure that provides support and patterning signals to embryos of humans and all other chordates. Although numerous notochord genes have been identified, the regulatory DNAs that orchestrate development and propel evolution of this structure by eliciting notochord gene expression remain mostly uncharted, and the information on their configuration and recurrence is still quite fragmentary. Here we used the simple chordate Ciona for a systematic analysis of notochord cis-regulatory modules (CRMs), and investigated their composition, architectural constraints, predictive ability and evolutionary conservation. We found that most Ciona notochord CRMs relied upon variable combinations of binding sites for the transcription factors Brachyury and/or Foxa2, which can act either synergistically or independently from one another. Notably, one of these CRMs contains a Brachyury binding site juxtaposed to an (AC) microsatellite, an unusual arrangement also found in Brachyury-bound regulatory regions in mouse. In contrast, different subsets of CRMs relied upon binding sites for transcription factors of widely diverse families. Surprisingly, we found that neither intra-genomic nor interspecific conservation of binding sites were reliably predictive hallmarks of notochord CRMs. We propose that rather than obeying a rigid sequence-based cis-regulatory code, most notochord CRMs are rather unique. Yet, this study uncovered essential elements recurrently used by divergent chordates as basic building blocks for notochord CRMs.
Inference of cancer-specific gene regulatory networks using soft computing rules.

Science.gov (United States)

Wang, Xiaosheng; Gotoh, Osamu

2010-03-24

Perturbations of gene regulatory networks are essentially responsible for oncogenesis. Therefore, inferring the gene regulatory networks is a key step to overcoming cancer. In this work, we propose a method for inferring directed gene regulatory networks based on soft computing rules, which can identify important cause-effect regulatory relations of gene expression. First, we identify important genes associated with a specific cancer (colon cancer) using a supervised learning approach. Next, we reconstruct the gene regulatory networks by inferring the regulatory relations among the identified genes, and their regulated relations by other genes within the genome. We obtain two meaningful findings. One is that upregulated genes are regulated by more genes than downregulated ones, while downregulated genes regulate more genes than upregulated ones. The other one is that tumor suppressors suppress tumor activators and activate other tumor suppressors strongly, while tumor activators activate other tumor activators and suppress tumor suppressors weakly, indicating the robustness of biological systems. These findings provide valuable insights into the pathogenesis of cancer.

Fused Regression for Multi-source Gene Regulatory Network Inference.

Directory of Open Access Journals (Sweden)

Kari Y Lam

2016-12-01

Full Text Available Understanding gene regulatory networks is critical to understanding cellular differentiation and response to external stimuli. Methods for global network inference have been developed and applied to a variety of species. Most approaches consider the problem of network inference independently in each species, despite evidence that gene regulation can be conserved even in distantly related species. Further, network inference is often confined to single data-types (single platforms and single cell types. We introduce a method for multi-source network inference that allows simultaneous estimation of gene regulatory networks in multiple species or biological processes through the introduction of priors based on known gene relationships such as orthology incorporated using fused regression. This approach improves network inference performance even when orthology mapping and conservation are incomplete. We refine this method by presenting an algorithm that extracts the true conserved subnetwork from a larger set of potentially conserved interactions and demonstrate the utility of our method in cross species network inference. Last, we demonstrate our method's utility in learning from data collected on different experimental platforms.
Systematic comparison of the response properties of protein and RNA mediated gene regulatory motifs.

Science.gov (United States)

Iyengar, Bharat Ravi; Pillai, Beena; Venkatesh, K V; Gadgil, Chetan J

2017-05-30

We present a framework enabling the dissection of the effects of motif structure (feedback or feedforward), the nature of the controller (RNA or protein), and the regulation mode (transcriptional, post-transcriptional or translational) on the response to a step change in the input. We have used a common model framework for gene expression where both motif structures have an activating input and repressing regulator, with the same set of parameters, to enable a comparison of the responses. We studied the global sensitivity of the system properties, such as steady-state gain, overshoot, peak time, and peak duration, to parameters. We find that, in all motifs, overshoot correlated negatively whereas peak duration varied concavely with peak time. Differences in the other system properties were found to be mainly dependent on the nature of the controller rather than the motif structure. Protein mediated motifs showed a higher degree of adaptation i.e. a tendency to return to baseline levels; in particular, feedforward motifs exhibited perfect adaptation. RNA mediated motifs had a mild regulatory effect; they also exhibited a lower peaking tendency and mean overshoot. Protein mediated feedforward motifs showed higher overshoot and lower peak time compared to the corresponding feedback motifs.
Plasticity of the cis-regulatory input function of a gene.

Directory of Open Access Journals (Sweden)

Avraham E Mayo

2006-04-01

Full Text Available The transcription rate of a gene is often controlled by several regulators that bind specific sites in the gene's cis-regulatory region. The combined effect of these regulators is described by a cis-regulatory input function. What determines the form of an input function, and how variable is it with respect to mutations? To address this, we employ the well-characterized lac operon of Escherichia coli, which has an elaborate input function, intermediate between Boolean AND-gate and OR-gate logic. We mapped in detail the input function of 12 variants of the lac promoter, each with different point mutations in the regulator binding sites, by means of accurate expression measurements from living cells. We find that even a few mutations can significantly change the input function, resulting in functions that resemble Pure AND gates, OR gates, or single-input switches. Other types of gates were not found. The variant input functions can be described in a unified manner by a mathematical model. The model also lets us predict which functions cannot be reached by point mutations. The input function that we studied thus appears to be plastic, in the sense that many of the mutations do not ruin the regulation completely but rather result in new ways to integrate the inputs.
Bioinformatic analysis of cis-regulatory interactions between progesterone and estrogen receptors in breast cancer

Directory of Open Access Journals (Sweden)

Matloob Khushi

2014-11-01

Full Text Available Chromatin factors interact with each other in a cell and sequence-specific manner in order to regulate transcription and a wealth of publically available datasets exists describing the genomic locations of these interactions. Our recently published BiSA (Binding Sites Analyser database contains transcription factor binding locations and epigenetic modifications collected from published studies and provides tools to analyse stored and imported data. Using BiSA we investigated the overlapping cis-regulatory role of estrogen receptor alpha (ERα and progesterone receptor (PR in the T-47D breast cancer cell line. We found that ERα binding sites overlap with a subset of PR binding sites. To investigate further, we re-analysed raw data to remove any biases introduced by the use of distinct tools in the original publications. We identified 22,152 PR and 18,560 ERα binding sites (<5% false discovery rate with 4,358 overlapping regions among the two datasets. BiSA statistical analysis revealed a non-significant overall overlap correlation between the two factors, suggesting that ERα and PR are not partner factors and do not require each other for binding to occur. However, Monte Carlo simulation by Binary Interval Search (BITS, Relevant Distance, Absolute Distance, Jaccard and Projection tests by Genometricorr revealed a statistically significant spatial correlation of binding regions on chromosome between the two factors. Motif analysis revealed that the shared binding regions were enriched with binding motifs for ERα, PR and a number of other transcription and pioneer factors. Some of these factors are known to co-locate with ERα and PR binding. Therefore spatially close proximity of ERα binding sites with PR binding sites suggests that ERα and PR, in general function independently at the molecular level, but that their activities converge on a specific subset of transcriptional targets.
Inference of Cancer-specific Gene Regulatory Networks Using Soft Computing Rules

Directory of Open Access Journals (Sweden)

Xiaosheng Wang

2010-03-01

Full Text Available Perturbations of gene regulatory networks are essentially responsible for oncogenesis. Therefore, inferring the gene regulatory networks is a key step to overcoming cancer. In this work, we propose a method for inferring directed gene regulatory networks based on soft computing rules, which can identify important cause-effect regulatory relations of gene expression. First, we identify important genes associated with a specific cancer (colon cancer using a supervised learning approach. Next, we reconstruct the gene regulatory networks by inferring the regulatory relations among the identified genes, and their regulated relations by other genes within the genome. We obtain two meaningful findings. One is that upregulated genes are regulated by more genes than downregulated ones, while downregulated genes regulate more genes than upregulated ones. The other one is that tumor suppressors suppress tumor activators and activate other tumor suppressors strongly, while tumor activators activate other tumor activators and suppress tumor suppressors weakly, indicating the robustness of biological systems. These findings provide valuable insights into the pathogenesis of cancer.
In silico modeling of epigenetic-induced changes in photoreceptor cis-regulatory elements.

Science.gov (United States)

Hossain, Reafa A; Dunham, Nicholas R; Enke, Raymond A; Berndsen, Christopher E

2018-01-01

DNA methylation is a well-characterized epigenetic repressor of mRNA transcription in many plant and vertebrate systems. However, the mechanism of this repression is not fully understood. The process of transcription is controlled by proteins that regulate recruitment and activity of RNA polymerase by binding to specific cis-regulatory sequences. Cone-rod homeobox (CRX) is a well-characterized mammalian transcription factor that controls photoreceptor cell-specific gene expression. Although much is known about the functions and DNA binding specificity of CRX, little is known about how DNA methylation modulates CRX binding affinity to genomic cis-regulatory elements. We used bisulfite pyrosequencing of human ocular tissues to measure DNA methylation levels of the regulatory regions of RHO , PDE6B, PAX6 , and LINE1 retrotransposon repeats. To describe the molecular mechanism of repression, we used molecular modeling to illustrate the effect of DNA methylation on human RHO regulatory sequences. In this study, we demonstrate an inverse correlation between DNA methylation in regulatory regions adjacent to the human RHO and PDE6B genes and their subsequent transcription in human ocular tissues. Docking of CRX to the DNA models shows that CRX interacts with the grooves of these sequences, suggesting changes in groove structure could regulate binding. Molecular dynamics simulations of the RHO promoter and enhancer regions show changes in the flexibility and groove width upon epigenetic modification. Models also demonstrate changes in the local dynamics of CRX binding sites within RHO regulatory sequences which may account for the repression of CRX-dependent transcription. Collectively, these data demonstrate epigenetic regulation of CRX binding sites in human retinal tissue and provide insight into the mechanism of this mode of epigenetic regulation to be tested in future experiments.
The limits of de novo DNA motif discovery.

Directory of Open Access Journals (Sweden)

David Simcha

Full Text Available A major challenge in molecular biology is reverse-engineering the cis-regulatory logic that plays a major role in the control of gene expression. This program includes searching through DNA sequences to identify "motifs" that serve as the binding sites for transcription factors or, more generally, are predictive of gene expression across cellular conditions. Several approaches have been proposed for de novo motif discovery-searching sequences without prior knowledge of binding sites or nucleotide patterns. However, unbiased validation is not straightforward. We consider two approaches to unbiased validation of discovered motifs: testing the statistical significance of a motif using a DNA "background" sequence model to represent the null hypothesis and measuring performance in predicting membership in gene clusters. We demonstrate that the background models typically used are "too null," resulting in overly optimistic assessments of significance, and argue that performance in predicting TF binding or expression patterns from DNA motifs should be assessed by held-out data, as in predictive learning. Applying this criterion to common motif discovery methods resulted in universally poor performance, although there is a marked improvement when motifs are statistically significant against real background sequences. Moreover, on synthetic data where "ground truth" is known, discriminative performance of all algorithms is far below the theoretical upper bound, with pronounced "over-fitting" in training. A key conclusion from this work is that the failure of de novo discovery approaches to accurately identify motifs is basically due to statistical intractability resulting from the fixed size of co-regulated gene clusters, and thus such failures do not necessarily provide evidence that unfound motifs are not active biologically. Consequently, the use of prior knowledge to enhance motif discovery is not just advantageous but necessary. An implementation of
In silico discovery of transcription regulatory elements in Plasmodium falciparum

Directory of Open Access Journals (Sweden)

Le Roch Karine G

2008-02-01

Full Text Available Abstract Background With the sequence of the Plasmodium falciparum genome and several global mRNA and protein life cycle expression profiling projects now completed, elucidating the underlying networks of transcriptional control important for the progression of the parasite life cycle is highly pertinent to the development of new anti-malarials. To date, relatively little is known regarding the specific mechanisms the parasite employs to regulate gene expression at the mRNA level, with studies of the P. falciparum genome sequence having revealed few cis-regulatory elements and associated transcription factors. Although it is possible the parasite may evoke mechanisms of transcriptional control drastically different from those used by other eukaryotic organisms, the extreme AT-rich nature of P. falciparum intergenic regions (~90% AT presents significant challenges to in silico cis-regulatory element discovery. Results We have developed an algorithm called Gene Enrichment Motif Searching (GEMS that uses a hypergeometric-based scoring function and a position-weight matrix optimization routine to identify with high-confidence regulatory elements in the nucleotide-biased and repeat sequence-rich P. falciparum genome. When applied to promoter regions of genes contained within 21 co-expression gene clusters generated from P. falciparum life cycle microarray data using the semi-supervised clustering algorithm Ontology-based Pattern Identification, GEMS identified 34 putative cis-regulatory elements associated with a variety of parasite processes including sexual development, cell invasion, antigenic variation and protein biosynthesis. Among these candidates were novel motifs, as well as many of the elements for which biological experimental evidence already exists in the Plasmodium literature. To provide evidence for the biological relevance of a cell invasion-related element predicted by GEMS, reporter gene and electrophoretic mobility shift assays
Prediction of tissue-specific cis-regulatory modules using Bayesian networks and regression trees

Directory of Open Access Journals (Sweden)

Chen Xiaoyu

2007-12-01

Full Text Available Abstract Background In vertebrates, a large part of gene transcriptional regulation is operated by cis-regulatory modules. These modules are believed to be regulating much of the tissue-specificity of gene expression. Results We develop a Bayesian network approach for identifying cis-regulatory modules likely to regulate tissue-specific expression. The network integrates predicted transcription factor binding site information, transcription factor expression data, and target gene expression data. At its core is a regression tree modeling the effect of combinations of transcription factors bound to a module. A new unsupervised EM-like algorithm is developed to learn the parameters of the network, including the regression tree structure. Conclusion Our approach is shown to accurately identify known human liver and erythroid-specific modules. When applied to the prediction of tissue-specific modules in 10 different tissues, the network predicts a number of important transcription factor combinations whose concerted binding is associated to specific expression.
Changes in cis-regulatory elements of a key floral regulator are associated with divergence of inflorescence architectures.

Science.gov (United States)

Kusters, Elske; Della Pina, Serena; Castel, Rob; Souer, Erik; Koes, Ronald

2015-08-15

Higher plant species diverged extensively with regard to the moment (flowering time) and position (inflorescence architecture) at which flowers are formed. This seems largely caused by variation in the expression patterns of conserved genes that specify floral meristem identity (FMI), rather than changes in the encoded proteins. Here, we report a functional comparison of the promoters of homologous FMI genes from Arabidopsis, petunia, tomato and Antirrhinum. Analysis of promoter-reporter constructs in petunia and Arabidopsis, as well as complementation experiments, showed that the divergent expression of leafy (LFY) and the petunia homolog aberrant leaf and flower (ALF) results from alterations in the upstream regulatory network rather than cis-regulatory changes. The divergent expression of unusual floral organs (UFO) from Arabidopsis, and the petunia homolog double top (DOT), however, is caused by the loss or gain of cis-regulatory promoter elements, which respond to trans-acting factors that are expressed in similar patterns in both species. Introduction of pUFO:UFO causes no obvious defects in Arabidopsis, but in petunia it causes the precocious and ectopic formation of flowers. This provides an example of how a change in a cis-regulatory region can account for a change in the plant body plan. © 2015. Published by The Company of Biologists Ltd.
Inferring dynamic gene regulatory networks in cardiac differentiation through the integration of multi-dimensional data.

Science.gov (United States)

Gong, Wuming; Koyano-Nakagawa, Naoko; Li, Tongbin; Garry, Daniel J

2015-03-07

Decoding the temporal control of gene expression patterns is key to the understanding of the complex mechanisms that govern developmental decisions during heart development. High-throughput methods have been employed to systematically study the dynamic and coordinated nature of cardiac differentiation at the global level with multiple dimensions. Therefore, there is a pressing need to develop a systems approach to integrate these data from individual studies and infer the dynamic regulatory networks in an unbiased fashion. We developed a two-step strategy to integrate data from (1) temporal RNA-seq, (2) temporal histone modification ChIP-seq, (3) transcription factor (TF) ChIP-seq and (4) gene perturbation experiments to reconstruct the dynamic network during heart development. First, we trained a logistic regression model to predict the probability (LR score) of any base being bound by 543 TFs with known positional weight matrices. Second, four dimensions of data were combined using a time-varying dynamic Bayesian network model to infer the dynamic networks at four developmental stages in the mouse [mouse embryonic stem cells (ESCs), mesoderm (MES), cardiac progenitors (CP) and cardiomyocytes (CM)]. Our method not only infers the time-varying networks between different stages of heart development, but it also identifies the TF binding sites associated with promoter or enhancers of downstream genes. The LR scores of experimentally verified ESCs and heart enhancers were significantly higher than random regions (p network inference model identified a region with an elevated LR score approximately -9400 bp upstream of the transcriptional start site of Nkx2-5, which overlapped with a previously reported enhancer region (-9435 to -8922 bp). TFs such as Tead1, Gata4, Msx2, and Tgif1 were predicted to bind to this region and participate in the regulation of Nkx2-5 gene expression. Our model also predicted the key regulatory networks for the ESC-MES, MES-CP and CP
Patterns of cis regulatory variation in diverse human populations.

Directory of Open Access Journals (Sweden)

Barbara E Stranger

Full Text Available The genetic basis of gene expression variation has long been studied with the aim to understand the landscape of regulatory variants, but also more recently to assist in the interpretation and elucidation of disease signals. To date, many studies have looked in specific tissues and population-based samples, but there has been limited assessment of the degree of inter-population variability in regulatory variation. We analyzed genome-wide gene expression in lymphoblastoid cell lines from a total of 726 individuals from 8 global populations from the HapMap3 project and correlated gene expression levels with HapMap3 SNPs located in cis to the genes. We describe the influence of ancestry on gene expression levels within and between these diverse human populations and uncover a non-negligible impact on global patterns of gene expression. We further dissect the specific functional pathways differentiated between populations. We also identify 5,691 expression quantitative trait loci (eQTLs after controlling for both non-genetic factors and population admixture and observe that half of the cis-eQTLs are replicated in one or more of the populations. We highlight patterns of eQTL-sharing between populations, which are partially determined by population genetic relatedness, and discover significant sharing of eQTL effects between Asians, European-admixed, and African subpopulations. Specifically, we observe that both the effect size and the direction of effect for eQTLs are highly conserved across populations. We observe an increasing proximity of eQTLs toward the transcription start site as sharing of eQTLs among populations increases, highlighting that variants close to TSS have stronger effects and therefore are more likely to be detected across a wider panel of populations. Together these results offer a unique picture and resource of the degree of differentiation among human populations in functional regulatory variation and provide an estimate for
A novel method for predicting activity of cis-regulatory modules, based on a diverse training set.

Science.gov (United States)

Yang, Wei; Sinha, Saurabh

2017-01-01

With the rapid emergence of technologies for locating cis-regulatory modules (CRMs) genome-wide, the next pressing challenge is to assign precise functions to each CRM, i.e. to determine the spatiotemporal domains or cell-types where it drives expression. A popular approach to this task is to model the typical k-mer composition of a set of CRMs known to drive a common expression pattern, and assign that pattern to other CRMs exhibiting a similar k-mer composition. This approach does not rely on prior knowledge of transcription factors relevant to the CRM or their binding motifs, and is thus more widely applicable than motif-based methods for predicting CRM activity, but is also prone to false positive predictions. We present a novel strategy to improve the above-mentioned approach: to predict if a CRM drives a specific gene expression pattern, assess not only how similar the CRM is to other CRMs with similar activity but also to CRMs with distinct activities. We use a state-of-the-art statistical method to quantify a CRM's sequence similarity to many different training sets of CRMs, and employ a classification algorithm to integrate these similarity scores into a single prediction of the CRM's activity. This strategy is shown to significantly improve CRM activity prediction over current approaches. Our implementation of the new method, called IMMBoost, is freely available as source code, at https://github.com/weiyangedward/IMMBoost CONTACT: sinhas@illinois.eduSupplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Assessment of algorithms for inferring positional weight matrix motifs of transcription factor binding sites using protein binding microarray data.

Directory of Open Access Journals (Sweden)

Yaron Orenstein

Full Text Available The new technology of protein binding microarrays (PBMs allows simultaneous measurement of the binding intensities of a transcription factor to tens of thousands of synthetic double-stranded DNA probes, covering all possible 10-mers. A key computational challenge is inferring the binding motif from these data. We present a systematic comparison of four methods developed specifically for reconstructing a binding site motif represented as a positional weight matrix from PBM data. The reconstructed motifs were evaluated in terms of three criteria: concordance with reference motifs from the literature and ability to predict in vivo and in vitro bindings. The evaluation encompassed over 200 transcription factors and some 300 assays. The results show a tradeoff between how the methods perform according to the different criteria, and a dichotomy of method types. Algorithms that construct motifs with low information content predict PBM probe ranking more faithfully, while methods that produce highly informative motifs match reference motifs better. Interestingly, in predicting high-affinity binding, all methods give far poorer results for in vivo assays compared to in vitro assays.
Using reporter gene assays to identify cis regulatory differences between humans and chimpanzees.

Science.gov (United States)

Chabot, Adrien; Shrit, Ralla A; Blekhman, Ran; Gilad, Yoav

2007-08-01

Most phenotypic differences between human and chimpanzee are likely to result from differences in gene regulation, rather than changes to protein-coding regions. To date, however, only a handful of human-chimpanzee nucleotide differences leading to changes in gene regulation have been identified. To hone in on differences in regulatory elements between human and chimpanzee, we focused on 10 genes that were previously found to be differentially expressed between the two species. We then designed reporter gene assays for the putative human and chimpanzee promoters of the 10 genes. Of seven promoters that we found to be active in human liver cell lines, human and chimpanzee promoters had significantly different activity in four cases, three of which recapitulated the gene expression difference seen in the microarray experiment. For these three genes, we were therefore able to demonstrate that a change in cis influences expression differences between humans and chimpanzees. Moreover, using site-directed mutagenesis on one construct, the promoter for the DDA3 gene, we were able to identify three nucleotides that together lead to a cis regulatory difference between the species. High-throughput application of this approach can provide a map of regulatory element differences between humans and our close evolutionary relatives.
DNA motif alignment by evolving a population of Markov chains.

Science.gov (United States)

Bi, Chengpeng

2009-01-30

Deciphering cis-regulatory elements or de novo motif-finding in genomes still remains elusive although much algorithmic effort has been expended. The Markov chain Monte Carlo (MCMC) method such as Gibbs motif samplers has been widely employed to solve the de novo motif-finding problem through sequence local alignment. Nonetheless, the MCMC-based motif samplers still suffer from local maxima like EM. Therefore, as a prerequisite for finding good local alignments, these motif algorithms are often independently run a multitude of times, but without information exchange between different chains. Hence it would be worth a new algorithm design enabling such information exchange. This paper presents a novel motif-finding algorithm by evolving a population of Markov chains with information exchange (PMC), each of which is initialized as a random alignment and run by the Metropolis-Hastings sampler (MHS). It is progressively updated through a series of local alignments stochastically sampled. Explicitly, the PMC motif algorithm performs stochastic sampling as specified by a population-based proposal distribution rather than individual ones, and adaptively evolves the population as a whole towards a global maximum. The alignment information exchange is accomplished by taking advantage of the pooled motif site distributions. A distinct method for running multiple independent Markov chains (IMC) without information exchange, or dubbed as the IMC motif algorithm, is also devised to compare with its PMC counterpart. Experimental studies demonstrate that the performance could be improved if pooled information were used to run a population of motif samplers. The new PMC algorithm was able to improve the convergence and outperformed other popular algorithms tested using simulated and biological motif sequences.
The MTP1 promoters from Arabidopsis halleri reveal cis-regulating elements for the evolution of metal tolerance.

Science.gov (United States)

Fasani, Elisa; DalCorso, Giovanni; Varotto, Claudio; Li, Mingai; Visioli, Giovanna; Mattarozzi, Monica; Furini, Antonella

2017-06-01

In the hyperaccumulator Arabidopsis halleri, the zinc (Zn) vacuolar transporter MTP1 is a key component of hypertolerance. Because protein sequences and functions are highly conserved between A. halleri and Arabidopsis thaliana, Zn tolerance in A. halleri may reflect the constitutively higher MTP1 expression compared with A. thaliana, based on copy number expansion and different cis regulation. Three MTP1 promoters were characterized in A. halleri ecotype I16. The comparison with the A. thaliana MTP1 promoter revealed different expression profiles correlated with specific cis-acting regulatory elements. The MTP1 5' untranslated region, highly conserved among A. thaliana, Arabidopsis lyrata and A. halleri, contains a dimer of MYB-binding motifs in the A. halleri promoters absent in the A. thaliana and A. lyrata sequences. Site-directed mutagenesis of these motifs revealed their role for expression in trichomes. A. thaliana mtp1 transgenic lines expressing AtMTP1 controlled by the native A. halleri promoter were more Zn-tolerant than lines carrying mutations on MYB-binding motifs. Differences in Zn tolerance were associated with different distribution of Zn among plant organs and in trichomes. The different cis-acting elements in the MTP1 promoters of A. halleri, particularly the MYB-binding sites, are probably involved in the evolution of Zn tolerance. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
CisSERS: Customizable In Silico Sequence Evaluation for Restriction Sites.

Science.gov (United States)

Sharpe, Richard M; Koepke, Tyson; Harper, Artemus; Grimes, John; Galli, Marco; Satoh-Cruz, Mio; Kalyanaraman, Ananth; Evans, Katherine; Kramer, David; Dhingra, Amit

2016-01-01

High-throughput sequencing continues to produce an immense volume of information that is processed and assembled into mature sequence data. Data analysis tools are urgently needed that leverage the embedded DNA sequence polymorphisms and consequent changes to restriction sites or sequence motifs in a high-throughput manner to enable biological experimentation. CisSERS was developed as a standalone open source tool to analyze sequence datasets and provide biologists with individual or comparative genome organization information in terms of presence and frequency of patterns or motifs such as restriction enzymes. Predicted agarose gel visualization of the custom analyses results was also integrated to enhance the usefulness of the software. CisSERS offers several novel functionalities, such as handling of large and multiple datasets in parallel, multiple restriction enzyme site detection and custom motif detection features, which are seamlessly integrated with real time agarose gel visualization. Using a simple fasta-formatted file as input, CisSERS utilizes the REBASE enzyme database. Results from CisSERS enable the user to make decisions for designing genotyping by sequencing experiments, reduced representation sequencing, 3'UTR sequencing, and cleaved amplified polymorphic sequence (CAPS) molecular markers for large sample sets. CisSERS is a java based graphical user interface built around a perl backbone. Several of the applications of CisSERS including CAPS molecular marker development were successfully validated using wet-lab experimentation. Here, we present the tool CisSERS and results from in-silico and corresponding wet-lab analyses demonstrating that CisSERS is a technology platform solution that facilitates efficient data utilization in genomics and genetics studies.
9-cis-retinoic acid represses estrogen-induced expression of the very low density apolipoprotein II gene.

Science.gov (United States)

Schippers, I J; Kloppenburg, M; Snippe, L; Ab, G

1994-11-01

The chicken very low density apolipoprotein II (apoVLDLII) gene is estrogen-inducible and specifically expressed in liver. We examined the possible involvement of the retinoid X receptor (RXR) and its ligand 9-cis-retinoic acid (9-cis-RA) in the activation of the apoVLDLII promoter. We first concentrated on a potential RXR recognition site, which deviates at only one position from a perfect direct A/GGGTCA repeat spaced by one nucleotide (DR-1) and was earlier identified as a common HNF-4/COUP-TF recognition site. However, band shift analysis revealed that this imperfect DR-1 motif does not interact with RXR alpha-homodimers. In accordance with this observation we found that this regulatory element does not mediate transactivation through RXR alpha in the presence of 9-cis-RA. However, our experiments revealed another, unexpected, effect of 9-cis-RA. Instead of stimulating, 9-cis-RA attenuated estrogen-induced expression of transfected estrogen-responsive VLDL-CAT reporter plasmids. This repression appeared to take place through the main estrogen response element (ERE) of the gene. Importantly, 9-cis-RA also strongly repressed the estrogen-induced expression of the endogenous apoVLDLII gene in cultured chicken hepatoma cells.
Inference of Transcription Regulatory Network in Low Phytic Acid Soybean Seeds

Directory of Open Access Journals (Sweden)

Neelam Redekar

2017-11-01

Full Text Available A dominant loss of function mutation in myo-inositol phosphate synthase (MIPS gene and recessive loss of function mutations in two multidrug resistant protein type-ABC transporter genes not only reduce the seed phytic acid levels in soybean, but also affect the pathways associated with seed development, ultimately resulting in low emergence. To understand the regulatory mechanisms and identify key genes that intervene in the seed development process in low phytic acid crops, we performed computational inference of gene regulatory networks in low and normal phytic acid soybeans using a time course transcriptomic data and multiple network inference algorithms. We identified a set of putative candidate transcription factors and their regulatory interactions with genes that have functions in myo-inositol biosynthesis, auxin-ABA signaling, and seed dormancy. We evaluated the performance of our unsupervised network inference method by comparing the predicted regulatory network with published regulatory interactions in Arabidopsis. Some contrasting regulatory interactions were observed in low phytic acid mutants compared to non-mutant lines. These findings provide important hypotheses on expression regulation of myo-inositol metabolism and phytohormone signaling in developing low phytic acid soybeans. The computational pipeline used for unsupervised network learning in this study is provided as open source software and is freely available at https://lilabatvt.github.io/LPANetwork/.

Stanniocalcin 1 binds hemin through a partially conserved heme regulatory motif

International Nuclear Information System (INIS)

Westberg, Johan A.; Jiang, Ji; Andersson, Leif C.

2011-01-01

Highlights: → Stanniocalcin 1 (STC1) binds heme through novel heme binding motif. → Central iron atom of heme and cysteine-114 of STC1 are essential for binding. → STC1 binds Fe 2+ and Fe 3+ heme. → STC1 peptide prevents oxidative decay of heme. -- Abstract: Hemin (iron protoporphyrin IX) is a necessary component of many proteins, functioning either as a cofactor or an intracellular messenger. Hemoproteins have diverse functions, such as transportation of gases, gas detection, chemical catalysis and electron transfer. Stanniocalcin 1 (STC1) is a protein involved in respiratory responses of the cell but whose mechanism of action is still undetermined. We examined the ability of STC1 to bind hemin in both its reduced and oxidized states and located Cys 114 as the axial ligand of the central iron atom of hemin. The amino acid sequence differs from the established (Cys-Pro) heme regulatory motif (HRM) and therefore presents a novel heme binding motif (Cys-Ser). A STC1 peptide containing the heme binding sequence was able to inhibit both spontaneous and H 2 O 2 induced decay of hemin. Binding of hemin does not affect the mitochondrial localization of STC1.
CisSERS: Customizable In Silico Sequence Evaluation for Restriction Sites.

Directory of Open Access Journals (Sweden)

Richard M Sharpe

Full Text Available High-throughput sequencing continues to produce an immense volume of information that is processed and assembled into mature sequence data. Data analysis tools are urgently needed that leverage the embedded DNA sequence polymorphisms and consequent changes to restriction sites or sequence motifs in a high-throughput manner to enable biological experimentation. CisSERS was developed as a standalone open source tool to analyze sequence datasets and provide biologists with individual or comparative genome organization information in terms of presence and frequency of patterns or motifs such as restriction enzymes. Predicted agarose gel visualization of the custom analyses results was also integrated to enhance the usefulness of the software. CisSERS offers several novel functionalities, such as handling of large and multiple datasets in parallel, multiple restriction enzyme site detection and custom motif detection features, which are seamlessly integrated with real time agarose gel visualization. Using a simple fasta-formatted file as input, CisSERS utilizes the REBASE enzyme database. Results from CisSERS enable the user to make decisions for designing genotyping by sequencing experiments, reduced representation sequencing, 3'UTR sequencing, and cleaved amplified polymorphic sequence (CAPS molecular markers for large sample sets. CisSERS is a java based graphical user interface built around a perl backbone. Several of the applications of CisSERS including CAPS molecular marker development were successfully validated using wet-lab experimentation. Here, we present the tool CisSERS and results from in-silico and corresponding wet-lab analyses demonstrating that CisSERS is a technology platform solution that facilitates efficient data utilization in genomics and genetics studies.
Stanniocalcin 1 binds hemin through a partially conserved heme regulatory motif

Energy Technology Data Exchange (ETDEWEB)

Westberg, Johan A., E-mail: johan.westberg@helsinki.fi [Department of Pathology, Haartman Institute, University of Helsinki and HUSLAB, P.O. Box 21, Haartmaninkatu 3, FI-00014 Helsinki (Finland); Jiang, Ji, E-mail: ji.jiang@helsinki.fi [Department of Pathology, Haartman Institute, University of Helsinki and HUSLAB, P.O. Box 21, Haartmaninkatu 3, FI-00014 Helsinki (Finland); Andersson, Leif C., E-mail: leif.andersson@helsinki.fi [Department of Pathology, Haartman Institute, University of Helsinki and HUSLAB, P.O. Box 21, Haartmaninkatu 3, FI-00014 Helsinki (Finland)

2011-06-03

Highlights: {yields} Stanniocalcin 1 (STC1) binds heme through novel heme binding motif. {yields} Central iron atom of heme and cysteine-114 of STC1 are essential for binding. {yields} STC1 binds Fe{sup 2+} and Fe{sup 3+} heme. {yields} STC1 peptide prevents oxidative decay of heme. -- Abstract: Hemin (iron protoporphyrin IX) is a necessary component of many proteins, functioning either as a cofactor or an intracellular messenger. Hemoproteins have diverse functions, such as transportation of gases, gas detection, chemical catalysis and electron transfer. Stanniocalcin 1 (STC1) is a protein involved in respiratory responses of the cell but whose mechanism of action is still undetermined. We examined the ability of STC1 to bind hemin in both its reduced and oxidized states and located Cys{sup 114} as the axial ligand of the central iron atom of hemin. The amino acid sequence differs from the established (Cys-Pro) heme regulatory motif (HRM) and therefore presents a novel heme binding motif (Cys-Ser). A STC1 peptide containing the heme binding sequence was able to inhibit both spontaneous and H{sub 2}O{sub 2} induced decay of hemin. Binding of hemin does not affect the mitochondrial localization of STC1.
Direct AUC optimization of regulatory motifs.

Science.gov (United States)

Zhu, Lin; Zhang, Hong-Bo; Huang, De-Shuang

2017-07-15

The discovery of transcription factor binding site (TFBS) motifs is essential for untangling the complex mechanism of genetic variation under different developmental and environmental conditions. Among the huge amount of computational approaches for de novo identification of TFBS motifs, discriminative motif learning (DML) methods have been proven to be promising for harnessing the discovery power of accumulated huge amount of high-throughput binding data. However, they have to sacrifice accuracy for speed and could fail to fully utilize the information of the input sequences. We propose a novel algorithm called CDAUC for optimizing DML-learned motifs based on the area under the receiver-operating characteristic curve (AUC) criterion, which has been widely used in the literature to evaluate the significance of extracted motifs. We show that when the considered AUC loss function is optimized in a coordinate-wise manner, the cost function of each resultant sub-problem is a piece-wise constant function, whose optimal value can be found exactly and efficiently. Further, a key step of each iteration of CDAUC can be efficiently solved as a computational geometry problem. Experimental results on real world high-throughput datasets illustrate that CDAUC outperforms competing methods for refining DML motifs, while being one order of magnitude faster. Meanwhile, preliminary results also show that CDAUC may also be useful for improving the interpretability of convolutional kernels generated by the emerging deep learning approaches for predicting TF sequences specificities. CDAUC is available at: https://drive.google.com/drive/folders/0BxOW5MtIZbJjNFpCeHlBVWJHeW8 . dshuang@tongji.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
An algebra-based method for inferring gene regulatory networks.

Science.gov (United States)

Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard

2014-03-26

The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the
KIRMES: kernel-based identification of regulatory modules in euchromatic sequences.

Science.gov (United States)

Schultheiss, Sebastian J; Busch, Wolfgang; Lohmann, Jan U; Kohlbacher, Oliver; Rätsch, Gunnar

2009-08-15

Understanding transcriptional regulation is one of the main challenges in computational biology. An important problem is the identification of transcription factor (TF) binding sites in promoter regions of potential TF target genes. It is typically approached by position weight matrix-based motif identification algorithms using Gibbs sampling, or heuristics to extend seed oligos. Such algorithms succeed in identifying single, relatively well-conserved binding sites, but tend to fail when it comes to the identification of combinations of several degenerate binding sites, as those often found in cis-regulatory modules. We propose a new algorithm that combines the benefits of existing motif finding with the ones of support vector machines (SVMs) to find degenerate motifs in order to improve the modeling of regulatory modules. In experiments on microarray data from Arabidopsis thaliana, we were able to show that the newly developed strategy significantly improves the recognition of TF targets. The python source code (open source-licensed under GPL), the data for the experiments and a Galaxy-based web service are available at http://www.fml.mpg.de/raetsch/suppl/kirmes/.
Quantitative inference of dynamic regulatory pathways via microarray data

Directory of Open Access Journals (Sweden)

Chen Bor-Sen

2005-03-01

Full Text Available Abstract Background The cellular signaling pathway (network is one of the main topics of organismic investigations. The intracellular interactions between genes in a signaling pathway are considered as the foundation of functional genomics. Thus, what genes and how much they influence each other through transcriptional binding or physical interactions are essential problems. Under the synchronous measures of gene expression via a microarray chip, an amount of dynamic information is embedded and remains to be discovered. Using a systematically dynamic modeling approach, we explore the causal relationship among genes in cellular signaling pathways from the system biology approach. Results In this study, a second-order dynamic model is developed to describe the regulatory mechanism of a target gene from the upstream causality point of view. From the expression profile and dynamic model of a target gene, we can estimate its upstream regulatory function. According to this upstream regulatory function, we would deduce the upstream regulatory genes with their regulatory abilities and activation delays, and then link up a regulatory pathway. Iteratively, these regulatory genes are considered as target genes to trace back their upstream regulatory genes. Then we could construct the regulatory pathway (or network to the genome wide. In short, we can infer the genetic regulatory pathways from gene-expression profiles quantitatively, which can confirm some doubted paths or seek some unknown paths in a regulatory pathway (network. Finally, the proposed approach is validated by randomly reshuffling the time order of microarray data. Conclusion We focus our algorithm on the inference of regulatory abilities of the identified causal genes, and how much delay before they regulate the downstream genes. With this information, a regulatory pathway would be built up using microarray data. In the present study, two signaling pathways, i.e. circadian regulatory
Gene regulatory network inference by point-based Gaussian approximation filters incorporating the prior information.

Science.gov (United States)

Jia, Bin; Wang, Xiaodong

2013-12-17

: The extended Kalman filter (EKF) has been applied to inferring gene regulatory networks. However, it is well known that the EKF becomes less accurate when the system exhibits high nonlinearity. In addition, certain prior information about the gene regulatory network exists in practice, and no systematic approach has been developed to incorporate such prior information into the Kalman-type filter for inferring the structure of the gene regulatory network. In this paper, an inference framework based on point-based Gaussian approximation filters that can exploit the prior information is developed to solve the gene regulatory network inference problem. Different point-based Gaussian approximation filters, including the unscented Kalman filter (UKF), the third-degree cubature Kalman filter (CKF3), and the fifth-degree cubature Kalman filter (CKF5) are employed. Several types of network prior information, including the existing network structure information, sparsity assumption, and the range constraint of parameters, are considered, and the corresponding filters incorporating the prior information are developed. Experiments on a synthetic network of eight genes and the yeast protein synthesis network of five genes are carried out to demonstrate the performance of the proposed framework. The results show that the proposed methods provide more accurate inference results than existing methods, such as the EKF and the traditional UKF.
Predictive minimum description length principle approach to inferring gene regulatory networks.

Science.gov (United States)

Chaitankar, Vijender; Zhang, Chaoyang; Ghosh, Preetam; Gong, Ping; Perkins, Edward J; Deng, Youping

2011-01-01

Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is to determine the threshold that defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we propose a new inference algorithm that incorporates mutual information (MI), conditional mutual information (CMI), and predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes and the PMDL principle method attempts to determine the best MI threshold without the need of a user-specified fine tuning parameter. The performance of the proposed algorithm is evaluated using both synthetic time series data sets and a biological time series data set (Saccharomyces cerevisiae). The results show that the proposed algorithm produced fewer false edges and significantly improved the precision when compared to existing MDL algorithm.
Systematic identification of cis-regulatory sequences active in mouse and human embryonic stem cells.

Directory of Open Access Journals (Sweden)

Marica Grskovic

2007-08-01

Full Text Available Understanding the transcriptional regulation of pluripotent cells is of fundamental interest and will greatly inform efforts aimed at directing differentiation of embryonic stem (ES cells or reprogramming somatic cells. We first analyzed the transcriptional profiles of mouse ES cells and primordial germ cells and identified genes upregulated in pluripotent cells both in vitro and in vivo. These genes are enriched for roles in transcription, chromatin remodeling, cell cycle, and DNA repair. We developed a novel computational algorithm, CompMoby, which combines analyses of sequences both aligned and non-aligned between different genomes with a probabilistic segmentation model to systematically predict short DNA motifs that regulate gene expression. CompMoby was used to identify conserved overrepresented motifs in genes upregulated in pluripotent cells. We show that the motifs are preferentially active in undifferentiated mouse ES and embryonic germ cells in a sequence-specific manner, and that they can act as enhancers in the context of an endogenous promoter. Importantly, the activity of the motifs is conserved in human ES cells. We further show that the transcription factor NF-Y specifically binds to one of the motifs, is differentially expressed during ES cell differentiation, and is required for ES cell proliferation. This study provides novel insights into the transcriptional regulatory networks of pluripotent cells. Our results suggest that this systematic approach can be broadly applied to understanding transcriptional networks in mammalian species.
Do motifs reflect evolved function?--No convergent evolution of genetic regulatory network subgraph topologies.

Science.gov (United States)

Knabe, Johannes F; Nehaniv, Chrystopher L; Schilstra, Maria J

2008-01-01

Methods that analyse the topological structure of networks have recently become quite popular. Whether motifs (subgraph patterns that occur more often than in randomized networks) have specific functions as elementary computational circuits has been cause for debate. As the question is difficult to resolve with currently available biological data, we approach the issue using networks that abstractly model natural genetic regulatory networks (GRNs) which are evolved to show dynamical behaviors. Specifically one group of networks was evolved to be capable of exhibiting two different behaviors ("differentiation") in contrast to a group with a single target behavior. In both groups we find motif distribution differences within the groups to be larger than differences between them, indicating that evolutionary niches (target functions) do not necessarily mold network structure uniquely. These results show that variability operators can have a stronger influence on network topologies than selection pressures, especially when many topologies can create similar dynamics. Moreover, analysis of motif functional relevance by lesioning did not suggest that motifs were of greater importance to the functioning of the network than arbitrary subgraph patterns. Only when drastically restricting network size, so that one motif corresponds to a whole functionally evolved network, was preference for particular connection patterns found. This suggests that in non-restricted, bigger networks, entanglement with the rest of the network hinders topological subgraph analysis.
Inferring regulatory networks from expression data using tree-based methods.

Directory of Open Access Journals (Sweden)

Vân Anh Huynh-Thu

2010-09-01

Full Text Available One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs using high throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was best performer in the DREAM4 In Silico Multifactorial challenge. GENIE3 decomposes the prediction of a regulatory network between p genes into p different regression problems. In each of the regression problems, the expression pattern of one of the genes (target gene is predicted from the expression patterns of all the other genes (input genes, using tree-based ensemble methods Random Forests or Extra-Trees. The importance of an input gene in the prediction of the target gene expression pattern is taken as an indication of a putative regulatory link. Putative regulatory links are then aggregated over all genes to provide a ranking of interactions from which the whole network is reconstructed. In addition to performing well on the DREAM4 In Silico Multifactorial challenge simulated data, we show that GENIE3 compares favorably with existing algorithms to decipher the genetic regulatory network of Escherichia coli. It doesn't make any assumption about the nature of gene regulation, can deal with combinatorial and non-linear interactions, produces directed GRNs, and is fast and scalable. In conclusion, we propose a new algorithm for GRN inference that performs well on both synthetic and real gene expression data. The algorithm, based on feature selection with tree-based ensemble methods, is simple and generic, making it adaptable to other types of genomic data and interactions.
Network-directed cis-mediator analysis of normal prostate tissue expression profiles reveals downstream regulatory associations of prostate cancer susceptibility loci.

Science.gov (United States)

Larson, Nicholas B; McDonnell, Shannon K; Fogarty, Zach; Larson, Melissa C; Cheville, John; Riska, Shaun; Baheti, Saurabh; Weber, Alexandra M; Nair, Asha A; Wang, Liang; O'Brien, Daniel; Davila, Jaime; Schaid, Daniel J; Thibodeau, Stephen N

2017-10-17

Large-scale genome-wide association studies have identified multiple single-nucleotide polymorphisms associated with risk of prostate cancer. Many of these genetic variants are presumed to be regulatory in nature; however, follow-up expression quantitative trait loci (eQTL) association studies have to-date been restricted largely to cis -acting associations due to study limitations. While trans -eQTL scans suffer from high testing dimensionality, recent evidence indicates most trans -eQTL associations are mediated by cis -regulated genes, such as transcription factors. Leveraging a data-driven gene co-expression network, we conducted a comprehensive cis -mediator analysis using RNA-Seq data from 471 normal prostate tissue samples to identify downstream regulatory associations of previously identified prostate cancer risk variants. We discovered multiple trans -eQTL associations that were significantly mediated by cis -regulated transcripts, four of which involved risk locus 17q12, proximal transcription factor HNF1B , and target trans -genes with known HNF response elements ( MIA2 , SRC , SEMA6A , KIF12 ). We additionally identified evidence of cis -acting down-regulation of MSMB via rs10993994 corresponding to reduced co-expression of NDRG1 . The majority of these cis -mediator relationships demonstrated trans -eQTL replicability in 87 prostate tissue samples from the Gene-Tissue Expression Project. These findings provide further biological context to known risk loci and outline new hypotheses for investigation into the etiology of prostate cancer.
RegPredict: an integrated system for regulon inference in prokaryotes by comparative genomics approach

Energy Technology Data Exchange (ETDEWEB)

Novichkov, Pavel S.; Rodionov, Dmitry A.; Stavrovskaya, Elena D.; Novichkova, Elena S.; Kazakov, Alexey E.; Gelfand, Mikhail S.; Arkin, Adam P.; Mironov, Andrey A.; Dubchak, Inna

2010-05-26

RegPredict web server is designed to provide comparative genomics tools for reconstruction and analysis of microbial regulons using comparative genomics approach. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. Two major workflows currently implemented in RegPredict are: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences, ortholog and operon predictions from the MicrobesOnline. An interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov.
Inferring the conservative causal core of gene regulatory networks

Directory of Open Access Journals (Sweden)

Emmert-Streib Frank

2010-09-01

Full Text Available Abstract Background Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. Results In this paper, we introduce a novel gene regulatory network inference (GRNI algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. Conclusions For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.
Inferring the conservative causal core of gene regulatory networks.

Science.gov (United States)

Altay, Gökmen; Emmert-Streib, Frank

2010-09-28

Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.
Comparative study of discretization methods of microarray data for inferring transcriptional regulatory networks

Directory of Open Access Journals (Sweden)

Ji Wei

2010-10-01

Full Text Available Abstract Background Microarray data discretization is a basic preprocess for many algorithms of gene regulatory network inference. Some common discretization methods in informatics are used to discretize microarray data. Selection of the discretization method is often arbitrary and no systematic comparison of different discretization has been conducted, in the context of gene regulatory network inference from time series gene expression data. Results In this study, we propose a new discretization method "bikmeans", and compare its performance with four other widely-used discretization methods using different datasets, modeling algorithms and number of intervals. Sensitivities, specificities and total accuracies were calculated and statistical analysis was carried out. Bikmeans method always gave high total accuracies. Conclusions Our results indicate that proper discretization methods can consistently improve gene regulatory network inference independent of network modeling algorithms and datasets. Our new method, bikmeans, resulted in significant better total accuracies than other methods.
Multiple cis-regulatory elements are involved in the complex regulation of the sieve element-specific MtSEO-F1 promoter from Medicago truncatula.

Science.gov (United States)

Bucsenez, M; Rüping, B; Behrens, S; Twyman, R M; Noll, G A; Prüfer, D

2012-09-01

The sieve element occlusion (SEO) gene family includes several members that are expressed specifically in immature sieve elements (SEs) in the developing phloem of dicotyledonous plants. To determine how this restricted expression profile is achieved, we analysed the SE-specific Medicago truncatula SEO-F1 promoter (PMtSEO-F1) by constructing deletion, substitution and hybrid constructs and testing them in transgenic tobacco plants using green fluorescent protein as a reporter. This revealed four promoter regions, each containing cis-regulatory elements that activate transcription in SEs. One of these segments also contained sufficient information to suppress PMtSEO-F1 transcription in the phloem companion cells (CCs). Subsequent in silico analysis revealed several candidate cis-regulatory elements that PMtSEO-F1 shares with other SEO promoters. These putative sieve element boxes (PSE boxes) are promising candidates for cis-regulatory elements controlling the SE-specific expression of PMtSEO-F1. © 2012 German Botanical Society and The Royal Botanical Society of the Netherlands.
Properties of non-coding DNA and identification of putative cis-regulatory elements in Theileria parva

Directory of Open Access Journals (Sweden)

Guo Xiang

2008-12-01

regulatory motifs in other species. These results suggest that these two motifs are likely to represent transcription factor binding sites in Theileria. Conclusion Theileria genomes are highly compact, with selection seemingly favoring short introns and intergenic regions. Three over-represented sequence motifs were independently identified in intergenic regions of both Theileria species, and the evidence suggests that at least two of them play a role in transcriptional control in T. parva. These are prime candidates for experimental validation of transcription factor binding sites in this single-celled eukaryotic parasite. Sequences similar to two of these Theileria motifs are conserved in Plasmodium hinting at the possibility of common regulatory machinery across the phylum Apicomplexa.
Genetic interaction motif finding by expectation maximization – a novel statistical model for inferring gene modules from synthetic lethality

Directory of Open Access Journals (Sweden)

Ye Ping

2005-12-01

Full Text Available Abstract Background Synthetic lethality experiments identify pairs of genes with complementary function. More direct functional associations (for example greater probability of membership in a single protein complex may be inferred between genes that share synthetic lethal interaction partners than genes that are directly synthetic lethal. Probabilistic algorithms that identify gene modules based on motif discovery are highly appropriate for the analysis of synthetic lethal genetic interaction data and have great potential in integrative analysis of heterogeneous datasets. Results We have developed Genetic Interaction Motif Finding (GIMF, an algorithm for unsupervised motif discovery from synthetic lethal interaction data. Interaction motifs are characterized by position weight matrices and optimized through expectation maximization. Given a seed gene, GIMF performs a nonlinear transform on the input genetic interaction data and automatically assigns genes to the motif or non-motif category. We demonstrate the capacity to extract known and novel pathways for Saccharomyces cerevisiae (budding yeast. Annotations suggested for several uncharacterized genes are supported by recent experimental evidence. GIMF is efficient in computation, requires no training and automatically down-weights promiscuous genes with high degrees. Conclusion GIMF effectively identifies pathways from synthetic lethality data with several unique features. It is mostly suitable for building gene modules around seed genes. Optimal choice of one single model parameter allows construction of gene networks with different levels of confidence. The impact of hub genes the generic probabilistic framework of GIMF may be used to group other types of biological entities such as proteins based on stochastic motifs. Analysis of the strongest motifs discovered by the algorithm indicates that synthetic lethal interactions are depleted between genes within a motif, suggesting that synthetic

Cis-regulatory element based targeted gene finding: genome-wide identification of abscisic acid- and abiotic stress-responsive genes in Arabidopsis thaliana.

Science.gov (United States)

Zhang, Weixiong; Ruan, Jianhua; Ho, Tuan-Hua David; You, Youngsook; Yu, Taotao; Quatrano, Ralph S

2005-07-15

A fundamental problem of computational genomics is identifying the genes that respond to certain endogenous cues and environmental stimuli. This problem can be referred to as targeted gene finding. Since gene regulation is mainly determined by the binding of transcription factors and cis-regulatory DNA sequences, most existing gene annotation methods, which exploit the conservation of open reading frames, are not effective in finding target genes. A viable approach to targeted gene finding is to exploit the cis-regulatory elements that are known to be responsible for the transcription of target genes. Given such cis-elements, putative target genes whose promoters contain the elements can be identified. As a case study, we apply the above approach to predict the genes in model plant Arabidopsis thaliana which are inducible by a phytohormone, abscisic acid (ABA), and abiotic stress, such as drought, cold and salinity. We first construct and analyze two ABA specific cis-elements, ABA-responsive element (ABRE) and its coupling element (CE), in A.thaliana, based on their conservation in rice and other cereal plants. We then use the ABRE-CE module to identify putative ABA-responsive genes in A.thaliana. Based on RT-PCR verification and the results from literature, this method has an accuracy rate of 67.5% for the top 40 predictions. The cis-element based targeted gene finding approach is expected to be widely applicable since a large number of cis-elements in many species are available.
Inferring Gene Regulatory Networks Using Conditional Regulation Pattern to Guide Candidate Genes.

Directory of Open Access Journals (Sweden)

Fei Xiao

Full Text Available Combining path consistency (PC algorithms with conditional mutual information (CMI are widely used in reconstruction of gene regulatory networks. CMI has many advantages over Pearson correlation coefficient in measuring non-linear dependence to infer gene regulatory networks. It can also discriminate the direct regulations from indirect ones. However, it is still a challenge to select the conditional genes in an optimal way, which affects the performance and computation complexity of the PC algorithm. In this study, we develop a novel conditional mutual information-based algorithm, namely RPNI (Regulation Pattern based Network Inference, to infer gene regulatory networks. For conditional gene selection, we define the co-regulation pattern, indirect-regulation pattern and mixture-regulation pattern as three candidate patterns to guide the selection of candidate genes. To demonstrate the potential of our algorithm, we apply it to gene expression data from DREAM challenge. Experimental results show that RPNI outperforms existing conditional mutual information-based methods in both accuracy and time complexity for different sizes of gene samples. Furthermore, the robustness of our algorithm is demonstrated by noisy interference analysis using different types of noise.
Motif enrichment tool.

Science.gov (United States)

Blatti, Charles; Sinha, Saurabh

2014-07-01

The Motif Enrichment Tool (MET) provides an online interface that enables users to find major transcriptional regulators of their gene sets of interest. MET searches the appropriate regulatory region around each gene and identifies which transcription factor DNA-binding specificities (motifs) are statistically overrepresented. Motif enrichment analysis is currently available for many metazoan species including human, mouse, fruit fly, planaria and flowering plants. MET also leverages high-throughput experimental data such as ChIP-seq and DNase-seq from ENCODE and ModENCODE to identify the regulatory targets of a transcription factor with greater precision. The results from MET are produced in real time and are linked to a genome browser for easy follow-up analysis. Use of the web tool is free and open to all, and there is no login requirement. ADDRESS: http://veda.cs.uiuc.edu/MET/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
A New Approach to Sequence Analysis Exemplified by Identification of cis-Elements in Abscisic Acid Inducible Promoters

DEFF Research Database (Denmark)

Busk, Peter Kamp; Hallin, Peter Fischer; Salomon, Jesper

-regulatory elements. We have developed a method for identifying short, conserved motifs in biological sequences such as proteins, DNA and RNA5. This method was used for analysis of approximately 2000 Arabidopsis thaliana promoters that have been shown by DNA array analysis to be induced by abscisic acid6....... These promoters were compared to 28000 promoters that are not induced by abscisic acid. The analysis identified previously described ABA-inducible promoter elements such as ABRE, CE3 and CRT1 but also new cis-elements were found. Furthermore, the list of DNA elements could be used to predict ABA...
Inference of expanded Lrp-like feast/famine transcription factor targets in a non-model organism using protein structure-based prediction.

Science.gov (United States)

Ashworth, Justin; Plaisier, Christopher L; Lo, Fang Yin; Reiss, David J; Baliga, Nitin S

2014-01-01

Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer.
Direct activation of a notochord cis-regulatory module by Brachyury and FoxA in the ascidian Ciona intestinalis.

Science.gov (United States)

Passamaneck, Yale J; Katikala, Lavanya; Perrone, Lorena; Dunn, Matthew P; Oda-Ishii, Izumi; Di Gregorio, Anna

2009-11-01

The notochord is a defining feature of the chordate body plan. Experiments in ascidian, frog and mouse embryos have shown that co-expression of Brachyury and FoxA class transcription factors is required for notochord development. However, studies on the cis-regulatory sequences mediating the synergistic effects of these transcription factors are complicated by the limited knowledge of notochord genes and cis-regulatory modules (CRMs) that are directly targeted by both. We have identified an easily testable model for such investigations in a 155-bp notochord-specific CRM from the ascidian Ciona intestinalis. This CRM contains functional binding sites for both Ciona Brachyury (Ci-Bra) and FoxA (Ci-FoxA-a). By combining point mutation analysis and misexpression experiments, we demonstrate that binding of both transcription factors to this CRM is necessary and sufficient to activate transcription. To gain insights into the cis-regulatory criteria controlling its activity, we investigated the organization of the transcription factor binding sites within the 155-bp CRM. The 155-bp sequence contains two Ci-Bra binding sites with identical core sequences but opposite orientations, only one of which is required for enhancer activity. Changes in both orientation and spacing of these sites substantially affect the activity of the CRM, as clusters of identical sites found in the Ciona genome with different arrangements are unable to activate transcription in notochord cells. This work presents the first evidence of a synergistic interaction between Brachyury and FoxA in the activation of an individual notochord CRM, and highlights the importance of transcription factor binding site arrangement for its function.
Characterization of a putative cis-regulatory element that controls transcriptional activity of the pig uroplakin II gene promoter

International Nuclear Information System (INIS)

Kwon, Deug-Nam; Park, Mi-Ryung; Park, Jong-Yi; Cho, Ssang-Goo; Park, Chankyu; Oh, Jae-Wook; Song, Hyuk; Kim, Jae-Hwan; Kim, Jin-Hoi

2011-01-01

Highlights: → The sequences of -604 to -84 bp of the pUPII promoter contained the region of a putative negative cis-regulatory element. → The core promoter was located in the 5F-1. → Transcription factor HNF4 can directly bind in the pUPII core promoter region, which plays a critical role in controlling promoter activity. → These features of the pUPII promoter are fundamental to development of a target-specific vector. -- Abstract: Uroplakin II (UPII) is a one of the integral membrane proteins synthesized as a major differentiation product of mammalian urothelium. UPII gene expression is bladder specific and differentiation dependent, but little is known about its transcription response elements and molecular mechanism. To identify the cis-regulatory elements in the pig UPII (pUPII) gene promoter region, we constructed pUPII 5' upstream region deletion mutants and demonstrated that each of the deletion mutants participates in controlling the expression of the pUPII gene in human bladder carcinoma RT4 cells. We also identified a new core promoter region and putative negative cis-regulatory element within a minimal promoter region. In addition, we showed that hepatocyte nuclear factor 4 (HNF4) can directly bind in the pUPII core promoter (5F-1) region, which plays a critical role in controlling promoter activity. Transient cotransfection experiments showed that HNF4 positively regulates pUPII gene promoter activity. Thus, the binding element and its binding protein, HNF4 transcription factor, may be involved in the mechanism that specifically regulates pUPII gene transcription.
MIRA: An R package for DNA methylation-based inference of regulatory activity.

Science.gov (United States)

Lawson, John T; Tomazou, Eleni M; Bock, Christoph; Sheffield, Nathan C

2018-03-01

DNA methylation contains information about the regulatory state of the cell. MIRA aggregates genome-scale DNA methylation data into a DNA methylation profile for independent region sets with shared biological annotation. Using this profile, MIRA infers and scores the collective regulatory activity for each region set. MIRA facilitates regulatory analysis in situations where classical regulatory assays would be difficult and allows public sources of open chromatin and protein binding regions to be leveraged for novel insight into the regulatory state of DNA methylation datasets. R package available on Bioconductor: http://bioconductor.org/packages/release/bioc/html/MIRA.html. nsheffield@virginia.edu.
Identifying TF-MiRNA Regulatory Relationships Using Multiple Features.

Directory of Open Access Journals (Sweden)

Mingyu Shao

Full Text Available MicroRNAs are known to play important roles in the transcriptional and post-transcriptional regulation of gene expression. While intensive research has been conducted to identify miRNAs and their target genes in various genomes, there is only limited knowledge about how microRNAs are regulated. In this study, we construct a pipeline that can infer the regulatory relationships between transcription factors and microRNAs from ChIP-Seq data with high confidence. In particular, after identifying candidate peaks from ChIP-Seq data, we formulate the inference as a PU learning (learning from only positive and unlabeled examples problem. Multiple features including the statistical significance of the peaks, the location of the peaks, the transcription factor binding site motifs, and the evolutionary conservation are derived from peaks for training and prediction. To further improve the accuracy of our inference, we also apply a mean reciprocal rank (MRR-based method to the candidate peaks. We apply our pipeline to infer TF-miRNA regulatory relationships in mouse embryonic stem cells. The experimental results show that our approach provides very specific findings of TF-miRNA regulatory relationships.
Genetic mapping uncovers cis-regulatory landscape of RNA editing.

Science.gov (United States)

Ramaswami, Gokul; Deng, Patricia; Zhang, Rui; Anna Carbone, Mary; Mackay, Trudy F C; Li, Jin Billy

2015-09-16

Adenosine-to-inosine (A-to-I) RNA editing, catalysed by ADAR enzymes conserved in metazoans, plays an important role in neurological functions. Although the fine-tuning mechanism provided by A-to-I RNA editing is important, the underlying rules governing ADAR substrate recognition are not well understood. We apply a quantitative trait loci (QTL) mapping approach to identify genetic variants associated with variability in RNA editing. With very accurate measurement of RNA editing levels at 789 sites in 131 Drosophila melanogaster strains, here we identify 545 editing QTLs (edQTLs) associated with differences in RNA editing. We demonstrate that many edQTLs can act through changes in the local secondary structure for edited dsRNAs. Furthermore, we find that edQTLs located outside of the edited dsRNA duplex are enriched in secondary structure, suggesting that distal dsRNA structure beyond the editing site duplex affects RNA editing efficiency. Our work will facilitate the understanding of the cis-regulatory code of RNA editing.
Conserved cis-regulatory regions in a large genomic landscape control SHH and BMP-regulated Gremlin1 expression in mouse limb buds

Directory of Open Access Journals (Sweden)

Zuniga Aimée

2012-08-01

Full Text Available Abstract Background Mouse limb bud is a prime model to study the regulatory interactions that control vertebrate organogenesis. Major aspects of limb bud development are controlled by feedback loops that define a self-regulatory signalling system. The SHH/GREM1/AER-FGF feedback loop forms the core of this signalling system that operates between the posterior mesenchymal organiser and the ectodermal signalling centre. The BMP antagonist Gremlin1 (GREM1 is a critical node in this system, whose dynamic expression is controlled by BMP, SHH, and FGF signalling and key to normal progression of limb bud development. Previous analysis identified a distant cis-regulatory landscape within the neighbouring Formin1 (Fmn1 locus that is required for Grem1 expression, reminiscent of the genomic landscapes controlling HoxD and Shh expression in limb buds. Results Three highly conserved regions (HMCO1-3 were identified within the previously defined critical genomic region and tested for their ability to regulate Grem1 expression in mouse limb buds. Using a combination of BAC and conventional transgenic approaches, a 9 kb region located ~70 kb downstream of the Grem1 transcription unit was identified. This region, termed Grem1 Regulatory Sequence 1 (GRS1, is able to recapitulate major aspects of Grem1 expression, as it drives expression of a LacZ reporter into the posterior and, to a lesser extent, in the distal-anterior mesenchyme. Crossing the GRS1 transgene into embryos with alterations in the SHH and BMP pathways established that GRS1 depends on SHH and is modulated by BMP signalling, i.e. integrates inputs from these pathways. Chromatin immunoprecipitation revealed interaction of endogenous GLI3 proteins with the core cis-regulatory elements in the GRS1 region. As GLI3 is a mediator of SHH signal transduction, these results indicated that SHH directly controls Grem1 expression through the GRS1 region. Finally, all cis-regulatory regions within the Grem1
Large-scale modeling of condition-specific gene regulatory networks by information integration and inference.

Science.gov (United States)

Ellwanger, Daniel Christian; Leonhardt, Jörn Florian; Mewes, Hans-Werner

2014-12-01

Understanding how regulatory networks globally coordinate the response of a cell to changing conditions, such as perturbations by shifting environments, is an elementary challenge in systems biology which has yet to be met. Genome-wide gene expression measurements are high dimensional as these are reflecting the condition-specific interplay of thousands of cellular components. The integration of prior biological knowledge into the modeling process of systems-wide gene regulation enables the large-scale interpretation of gene expression signals in the context of known regulatory relations. We developed COGERE (http://mips.helmholtz-muenchen.de/cogere), a method for the inference of condition-specific gene regulatory networks in human and mouse. We integrated existing knowledge of regulatory interactions from multiple sources to a comprehensive model of prior information. COGERE infers condition-specific regulation by evaluating the mutual dependency between regulator (transcription factor or miRNA) and target gene expression using prior information. This dependency is scored by the non-parametric, nonlinear correlation coefficient η(2) (eta squared) that is derived by a two-way analysis of variance. We show that COGERE significantly outperforms alternative methods in predicting condition-specific gene regulatory networks on simulated data sets. Furthermore, by inferring the cancer-specific gene regulatory network from the NCI-60 expression study, we demonstrate the utility of COGERE to promote hypothesis-driven clinical research. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Cis-regulatory signatures of orthologous stress-associated bZIP transcription factors from rice, sorghum and Arabidopsis based on phylogenetic footprints

Directory of Open Access Journals (Sweden)

Xu Fuyu

2012-09-01

Full Text Available Abstract Background The potential contribution of upstream sequence variation to the unique features of orthologous genes is just beginning to be unraveled. A core subset of stress-associated bZIP transcription factors from rice (Oryza sativa formed ten clusters of orthologous groups (COG with genes from the monocot sorghum (Sorghum bicolor and dicot Arabidopsis (Arabidopsis thaliana. The total cis-regulatory information content of each stress-associated COG was examined by phylogenetic footprinting to reveal ortholog-specific, lineage-specific and species-specific conservation patterns. Results The most apparent pattern observed was the occurrence of spatially conserved ‘core modules’ among the COGs but not among paralogs. These core modules are comprised of various combinations of two to four putative transcription factor binding site (TFBS classes associated with either developmental or stress-related functions. Outside the core modules are specific stress (ABA, oxidative, abiotic, biotic or organ-associated signals, which may be functioning as ‘regulatory fine-tuners’ and further define lineage-specific and species-specific cis-regulatory signatures. Orthologous monocot and dicot promoters have distinct TFBS classes involved in disease and oxidative-regulated expression, while the orthologous rice and sorghum promoters have distinct combinations of root-specific signals, a pattern that is not particularly conserved in Arabidopsis. Conclusions Patterns of cis-regulatory conservation imply that each ortholog has distinct signatures, further suggesting that they are potentially unique in a regulatory context despite the presumed conservation of broad biological function during speciation. Based on the observed patterns of conservation, we postulate that core modules are likely primary determinants of basal developmental programming, which may be integrated with and further elaborated by additional intrinsic or extrinsic signals in
Hydrogenation of fluoroarenes: Direct access to all-cis-(multi)fluorinated cycloalkanes.

Science.gov (United States)

Wiesenfeldt, Mario P; Nairoukh, Zackaria; Li, Wei; Glorius, Frank

2017-09-01

All-c is -multifluorinated cycloalkanes exhibit intriguing electronic properties. In particular, they display extremely high dipole moments perpendicular to the aliphatic ring, making them highly desired motifs in material science. Very few such motifs have been prepared, as their syntheses require multistep sequences from diastereoselectively prefunctionalized precursors. Herein we report a synthetic strategy to access these valuable materials via the rhodium-cyclic (alkyl)(amino)carbene (CAAC)-catalyzed hydrogenation of readily available fluorinated arenes in hexane. This route enables the scalable single-step preparation of an abundance of multisubstituted and multifluorinated cycloalkanes, including all- cis -1,2,3,4,5,6-hexafluorocyclohexane as well as cis-configured fluorinated aliphatic heterocycles. Copyright © 2017, American Association for the Advancement of Science.
Prenatal exposure of mice to diethylstilbestrol disrupts T-cell differentiation by regulating Fas/Fas ligand expression through estrogen receptor element and nuclear factor-κB motifs.

Science.gov (United States)

Singh, Narendra P; Singh, Udai P; Nagarkatti, Prakash S; Nagarkatti, Mitzi

2012-11-01

Prenatal exposure to diethylstilbestrol (DES) is known to cause altered immune functions and increased susceptibility to autoimmune disease in humans. In the current study, we investigated the effect of prenatal exposure to DES on thymocyte differentiation involving apoptotic pathways. Prenatal DES exposure caused thymic atrophy, apoptosis, and up-regulation of Fas and Fas ligand (FasL) expression in thymocytes. To examine the mechanism underlying DES-mediated regulation of Fas and FasL, we performed luciferase assays using T cells transfected with luciferase reporter constructs containing full-length Fas or FasL promoters. There was significant luciferase induction in the presence of Fas or FasL promoters after DES exposure. Further analysis demonstrated the presence of several cis-regulatory motifs on both Fas and FasL promoters. When DES-induced transcription factors were analyzed, estrogen receptor element (ERE), nuclear factor κB (NF-κB), nuclear factor of activated T cells (NF-AT), and activator protein-1 motifs on the Fas promoter, as well as ERE, NF-κB, and NF-AT motifs on the FasL promoter, showed binding affinity with the transcription factors. Electrophoretic mobility-shift assays were performed to verify the binding affinity of cis-regulatory motifs of Fas or FasL promoters with transcription factors. There was shift in mobility of probes (ERE or NF-κB2) of both Fas and FasL in the presence of nuclear proteins from DES-treated cells, and the shift was specific to DES because these probes failed to shift their mobility in the presence of nuclear proteins from vehicle-treated cells. Together, the current study demonstrates that prenatal exposure to DES triggers significant alterations in apoptotic molecules expressed on thymocytes, which may affect T-cell differentiation and cause long-term effects on the immune functions.
Alignment and prediction of cis-regulatory modules based on a probabilistic model of evolution.

Directory of Open Access Journals (Sweden)

Xin He

2009-03-01

Full Text Available Cross-species comparison has emerged as a powerful paradigm for predicting cis-regulatory modules (CRMs and understanding their evolution. The comparison requires reliable sequence alignment, which remains a challenging task for less conserved noncoding sequences. Furthermore, the existing models of DNA sequence evolution generally do not explicitly treat the special properties of CRM sequences. To address these limitations, we propose a model of CRM evolution that captures different modes of evolution of functional transcription factor binding sites (TFBSs and the background sequences. A particularly novel aspect of our work is a probabilistic model of gains and losses of TFBSs, a process being recognized as an important part of regulatory sequence evolution. We present a computational framework that uses this model to solve the problems of CRM alignment and prediction. Our alignment method is similar to existing methods of statistical alignment but uses the conserved binding sites to improve alignment. Our CRM prediction method deals with the inherent uncertainties of binding site annotations and sequence alignment in a probabilistic framework. In simulated as well as real data, we demonstrate that our program is able to improve both alignment and prediction of CRM sequences over several state-of-the-art methods. Finally, we used alignments produced by our program to study binding site conservation in genome-wide binding data of key transcription factors in the Drosophila blastoderm, with two intriguing results: (i the factor-bound sequences are under strong evolutionary constraints even if their neighboring genes are not expressed in the blastoderm and (ii binding sites in distal bound sequences (relative to transcription start sites tend to be more conserved than those in proximal regions. Our approach is implemented as software, EMMA (Evolutionary Model-based cis-regulatory Module Analysis, ready to be applied in a broad biological context.
BET Bromodomain Inhibition Releases the Mediator Complex from Select cis-Regulatory Elements.

Science.gov (United States)

Bhagwat, Anand S; Roe, Jae-Seok; Mok, Beverly Y L; Hohmann, Anja F; Shi, Junwei; Vakoc, Christopher R

2016-04-19

The bromodomain and extraterminal (BET) protein BRD4 can physically interact with the Mediator complex, but the relevance of this association to the therapeutic effects of BET inhibitors in cancer is unclear. Here, we show that BET inhibition causes a rapid release of Mediator from a subset of cis-regulatory elements in the genome of acute myeloid leukemia (AML) cells. These sites of Mediator eviction were highly correlated with transcriptional suppression of neighboring genes, which are enriched for targets of the transcription factor MYB and for functions related to leukemogenesis. A shRNA screen of Mediator in AML cells identified the MED12, MED13, MED23, and MED24 subunits as performing a similar regulatory function to BRD4 in this context, including a shared role in sustaining a block in myeloid maturation. These findings suggest that the interaction between BRD4 and Mediator has functional importance for gene-specific transcriptional activation and for AML maintenance. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
[Analysis of cis-regulatory element distribution in gene promoters of Gossypium raimondii and Arabidopsis thaliana].

Science.gov (United States)

Sun, Gao-Fei; He, Shou-Pu; Du, Xiong-Ming

2013-10-01

Cotton genomic studies have boomed since the release of Gossypium raimondii draft genome. In this study, cis-regulatory element (CRE) in 1 kb length sequence upstream 5' UTR of annotated genes were selected and scanned in the Arabidopsis thaliana (At) and Gossypium raimondii (Gr) genomes, based on the database of PLACE (Plant cis-acting Regulatory DNA Elements). According to the definition of this study, 44 (12.3%) and 57 (15.5%) CREs presented "peak-like" distribution in the 1 kb selected sequences of both genomes, respectively. Thirty-four of them were peak-like distributed in both genomes, which could be further categorized into 4 types based on their core sequences. The coincidence of TATABOX peak position and their actual position ((-) -30 bp) indicated that the position of a common CRE was conservative in different genes, which suggested that the peak position of these CREs was their possible actual position of transcription factors. The position of a common CRE was also different between the two genomes due to stronger length variation of 5' UTR in Gr than At. Furthermore, most of the peak-like CREs were located in the region of -110 bp-0 bp, which suggested that concentrated distribution might be conductive to the interaction of transcription factors, and then regulate the gene expression in downstream.
Integration of steady-state and temporal gene expression data for the inference of gene regulatory networks.

Science.gov (United States)

Wang, Yi Kan; Hurley, Daniel G; Schnell, Santiago; Print, Cristin G; Crampin, Edmund J

2013-01-01

We develop a new regression algorithm, cMIKANA, for inference of gene regulatory networks from combinations of steady-state and time-series gene expression data. Using simulated gene expression datasets to assess the accuracy of reconstructing gene regulatory networks, we show that steady-state and time-series data sets can successfully be combined to identify gene regulatory interactions using the new algorithm. Inferring gene networks from combined data sets was found to be advantageous when using noisy measurements collected with either lower sampling rates or a limited number of experimental replicates. We illustrate our method by applying it to a microarray gene expression dataset from human umbilical vein endothelial cells (HUVECs) which combines time series data from treatment with growth factor TNF and steady state data from siRNA knockdown treatments. Our results suggest that the combination of steady-state and time-series datasets may provide better prediction of RNA-to-RNA interactions, and may also reveal biological features that cannot be identified from dynamic or steady state information alone. Finally, we consider the experimental design of genomics experiments for gene regulatory network inference and show that network inference can be improved by incorporating steady-state measurements with time-series data.
A generalized allosteric mechanism for cis-regulated cyclic nucleotide binding domains.

Directory of Open Access Journals (Sweden)

Alexandr P Kornev

2008-04-01

Full Text Available Cyclic nucleotides (cAMP and cGMP regulate multiple intracellular processes and are thus of a great general interest for molecular and structural biologists. To study the allosteric mechanism of different cyclic nucleotide binding (CNB domains, we compared cAMP-bound and cAMP-free structures (PKA, Epac, and two ionic channels using a new bioinformatics method: local spatial pattern alignment. Our analysis highlights four major conserved structural motifs: 1 the phosphate binding cassette (PBC, which binds the cAMP ribose-phosphate, 2 the "hinge," a flexible helix, which contacts the PBC, 3 the beta(2,3 loop, which provides precise positioning of an invariant arginine from the PBC, and 4 a conserved structural element consisting of an N-terminal helix, an eight residue loop and the A-helix (N3A-motif. The PBC and the hinge were included in the previously reported allosteric model, whereas the definition of the beta(2,3 loop and the N3A-motif as conserved elements is novel. The N3A-motif is found in all cis-regulated CNB domains, and we present a model for an allosteric mechanism in these domains. Catabolite gene activator protein (CAP represents a trans-regulated CNB domain family: it does not contain the N3A-motif, and its long range allosteric interactions are substantially different from the cis-regulated CNB domains.

Evaluation of artificial time series microarray data for dynamic gene regulatory network inference.

Science.gov (United States)

Xenitidis, P; Seimenis, I; Kakolyris, S; Adamopoulos, A

2017-08-07

High-throughput technology like microarrays is widely used in the inference of gene regulatory networks (GRNs). We focused on time series data since we are interested in the dynamics of GRNs and the identification of dynamic networks. We evaluated the amount of information that exists in artificial time series microarray data and the ability of an inference process to produce accurate models based on them. We used dynamic artificial gene regulatory networks in order to create artificial microarray data. Key features that characterize microarray data such as the time separation of directly triggered genes, the percentage of directly triggered genes and the triggering function type were altered in order to reveal the limits that are imposed by the nature of microarray data on the inference process. We examined the effect of various factors on the inference performance such as the network size, the presence of noise in microarray data, and the network sparseness. We used a system theory approach and examined the relationship between the pole placement of the inferred system and the inference performance. We examined the relationship between the inference performance in the time domain and the true system parameter identification. Simulation results indicated that time separation and the percentage of directly triggered genes are crucial factors. Also, network sparseness, the triggering function type and noise in input data affect the inference performance. When two factors were simultaneously varied, it was found that variation of one parameter significantly affects the dynamic response of the other. Crucial factors were also examined using a real GRN and acquired results confirmed simulation findings with artificial data. Different initial conditions were also used as an alternative triggering approach. Relevant results confirmed that the number of datasets constitutes the most significant parameter with regard to the inference performance. Copyright © 2017 Elsevier
Inferring nonlinear gene regulatory networks from gene expression data based on distance correlation.

Directory of Open Access Journals (Sweden)

Xiaobo Guo

Full Text Available Nonlinear dependence is general in regulation mechanism of gene regulatory networks (GRNs. It is vital to properly measure or test nonlinear dependence from real data for reconstructing GRNs and understanding the complex regulatory mechanisms within the cellular system. A recently developed measurement called the distance correlation (DC has been shown powerful and computationally effective in nonlinear dependence for many situations. In this work, we incorporate the DC into inferring GRNs from the gene expression data without any underling distribution assumptions. We propose three DC-based GRNs inference algorithms: CLR-DC, MRNET-DC and REL-DC, and then compare them with the mutual information (MI-based algorithms by analyzing two simulated data: benchmark GRNs from the DREAM challenge and GRNs generated by SynTReN network generator, and an experimentally determined SOS DNA repair network in Escherichia coli. According to both the receiver operator characteristic (ROC curve and the precision-recall (PR curve, our proposed algorithms significantly outperform the MI-based algorithms in GRNs inference.
Multiple Functional Variants in cis Modulate PDYN Expression.

Science.gov (United States)

Babbitt, Courtney C; Silverman, Jesse S; Haygood, Ralph; Reininga, Jennifer M; Rockman, Matthew V; Wray, Gregory A

2010-02-01

Understanding genetic variation and its functional consequences within cis-regulatory regions remains an important challenge in human genetics and evolution. Here, we present a fine-scale functional analysis of segregating variation within the cis-regulatory region of prodynorphin, a gene that encodes an endogenous opioid precursor with roles in cognition and disease. In order to characterize the functional consequences of segregating variation in cis in a region under balancing selection in different human populations, we examined associations between specific polymorphisms and gene expression in vivo and in vitro. We identified five polymorphisms within the 5' flanking region that affect transcript abundance: a 68-bp repeat recognized in prior studies, as well as two microsatellites and two single nucleotide polymorphisms not previously implicated as functional variants. The impact of these variants on transcription differs by brain region, sex, and cell type, implying interactions between cis genotype and the differentiated state of cells. The effects of individual variants on expression level are not additive in some combinations, implying epistatic interactions between nearby variants. These data reveal an unexpectedly complex relationship between segregating genetic variation and its expression-trait consequences and highlights the importance of close functional scrutiny of natural genetic variation within even relatively well-studied cis-regulatory regions.
I-motif DNA structures are formed in the nuclei of human cells

Science.gov (United States)

Zeraati, Mahdi; Langley, David B.; Schofield, Peter; Moye, Aaron L.; Rouet, Romain; Hughes, William E.; Bryan, Tracy M.; Dinger, Marcel E.; Christ, Daniel

2018-06-01

Human genome function is underpinned by the primary storage of genetic information in canonical B-form DNA, with a second layer of DNA structure providing regulatory control. I-motif structures are thought to form in cytosine-rich regions of the genome and to have regulatory functions; however, in vivo evidence for the existence of such structures has so far remained elusive. Here we report the generation and characterization of an antibody fragment (iMab) that recognizes i-motif structures with high selectivity and affinity, enabling the detection of i-motifs in the nuclei of human cells. We demonstrate that the in vivo formation of such structures is cell-cycle and pH dependent. Furthermore, we provide evidence that i-motif structures are formed in regulatory regions of the human genome, including promoters and telomeric regions. Our results support the notion that i-motif structures provide key regulatory roles in the genome.
Changes in Cis-regulatory Elements during Morphological Evolution

Directory of Open Access Journals (Sweden)

Yu-Lee Paul

2012-10-01

Full Text Available How have animals evolved new body designs (morphological evolution? This requires explanations both for simple morphological changes, such as differences in pigmentation and hair patterns between different Drosophila populations and species, and also for more complex changes, such as differences in the forelimbs of mice and bats, and the necks of amphibians and reptiles. The genetic changes and pathways involved in these evolutionary steps require identification. Many, though not all, of these events occur by changes in cis-regulatory (enhancer elements within developmental genes. Enhancers are modular, each affecting expression in only one or a few tissues. Therefore it is possible to add, remove or alter an enhancer without producing changes in multiple tissues, and thereby avoid widespread (pleiotropic deleterious effects. Ideally, for a given step in morphological evolution it is necessary to identify (i the change in phenotype, (ii the changes in gene expression, (iii the DNA region, enhancer or otherwise, affected, (iv the mutation involved, (v the nature of the transcription or other factors that bind to this site. In practice these data are incomplete for most of the published studies upon morphological evolution. Here, the investigations are categorized according to how far these analyses have proceeded.
Characterization of Cer-1 cis-regulatory region during early Xenopus development.

Science.gov (United States)

Silva, Ana Cristina; Filipe, Mário; Steinbeisser, Herbert; Belo, José António

2011-05-01

Cerberus-related molecules are well-known Wnt, Nodal, and BMP inhibitors that have been implicated in different processes including anterior–posterior patterning and left–right asymmetry. In both mouse and frog, two Cerberus-related genes have been isolated, mCer-1 and mCer-2, and Xcer and Xcoco, respectively. Until now, little is known about the mechanisms involved in their transcriptional regulation. Here, we report a heterologous analysis of the mouse Cerberus-1 gene upstream regulatory regions, responsible for its expression in the visceral endodermal cells. Our analysis showed that the consensus sequences for a TATA, CAAT, or GC boxes were absent but a TGTGG sequence was present at position -172 to -168 bp, relative to the ATG. Using a series of deletion constructs and transient expression in Xenopus embryos, we found that a fragment of 1.4 kb of Cer-1 promoter sequence could reproduce the endogenous expression pattern of Xenopus cerberus. A 0.7-kb mcer-1 upstream region was able to drive reporter expression to the involuting mesendodermal cells, while further deletions abolished reporter gene expression. Our results suggest that although no sequence similarity was found between mouse and Xenopus cerberus cis-regulatory regions, the signaling cascades regulating cerberus expression, during gastrulation, is conserved.
Influence of the experimental design of gene expression studies on the inference of gene regulatory networks: environmental factors

Directory of Open Access Journals (Sweden)

Frank Emmert-Streib

2013-02-01

Full Text Available The inference of gene regulatory networks gained within recent years a considerable interest in the biology and biomedical community. The purpose of this paper is to investigate the influence that environmental conditions can exhibit on the inference performance of network inference algorithms. Specifically, we study five network inference methods, Aracne, BC3NET, CLR, C3NET and MRNET, and compare the results for three different conditions: (I observational gene expression data: normal environmental condition, (II interventional gene expression data: growth in rich media, (III interventional gene expression data: normal environmental condition interrupted by a positive spike-in stimulation. Overall, we find that different statistical inference methods lead to comparable, but condition-specific results. Further, our results suggest that non-steady-state data enhance the inferability of regulatory networks.
NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.

Directory of Open Access Journals (Sweden)

Joeri Ruyssinck

Full Text Available One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made
A HLA class I cis-regulatory element whose activity can be modulated by hormones.

Science.gov (United States)

Sim, B C; Hui, K M

1994-12-01

To elucidate the basis of the down-regulation in major histocompatibility complex (MHC) class I gene expression and to identify possible DNA-binding regulatory elements that have the potential to interact with class I MHC genes, we have studied the transcriptional regulation of class I HLA genes in human breast carcinoma cells. A 9 base pair (bp) negative cis-regulatory element (NRE) has been identified using band-shift assays employing DNA sequences derived from the 5'-flanking region of HLA class I genes. This 9-bp element, GTCATGGCG, located within exon I of the HLA class I gene, can potently inhibit the expression of a heterologous thymidine kinase (TK) gene promoter and the HLA enhancer element. Furthermore, this regulatory element can exert its suppressive function in either the sense or anti-sense orientation. More interestingly, NRE can suppress dexamethasone-mediated gene activation in the context of the reported glucocorticoid-responsive element (GRE) in MCF-7 cells but has no influence on the estrogen-mediated transcriptional activation of MCF-7 cells in the context of the reported estrogen-responsive element (ERE). Furthermore, the presence of such a regulatory element within the HLA class I gene whose activity can be modulated by hormones correlates well with our observation that the level of HLA class I gene expression can be down-regulated by hormones in human breast carcinoma cells. Such interactions between negative regulatory elements and specific hormone trans-activators are novel and suggest a versatile form of transcriptional control.
Comparison of evolutionary algorithms in gene regulatory network model inference.

LENUS (Irish Health Repository)

2010-01-01

ABSTRACT: BACKGROUND: The evolution of high throughput technologies that measure gene expression levels has created a data base for inferring GRNs (a process also known as reverse engineering of GRNs). However, the nature of these data has made this process very difficult. At the moment, several methods of discovering qualitative causal relationships between genes with high accuracy from microarray data exist, but large scale quantitative analysis on real biological datasets cannot be performed, to date, as existing approaches are not suitable for real microarray data which are noisy and insufficient. RESULTS: This paper performs an analysis of several existing evolutionary algorithms for quantitative gene regulatory network modelling. The aim is to present the techniques used and offer a comprehensive comparison of approaches, under a common framework. Algorithms are applied to both synthetic and real gene expression data from DNA microarrays, and ability to reproduce biological behaviour, scalability and robustness to noise are assessed and compared. CONCLUSIONS: Presented is a comparison framework for assessment of evolutionary algorithms, used to infer gene regulatory networks. Promising methods are identified and a platform for development of appropriate model formalisms is established.
An in vivo cis-regulatory screen at the type 2 diabetes associated TCF7L2 locus identifies multiple tissue-specific enhancers.

Directory of Open Access Journals (Sweden)

Daniel Savic

Full Text Available Genome-wide association studies (GWAS have repeatedly shown an association between non-coding variants in the TCF7L2 locus and risk for type 2 diabetes (T2D, implicating a role for cis-regulatory variation within this locus in disease etiology. Supporting this hypothesis, we previously localized complex regulatory activity to the TCF7L2 T2D-associated interval using an in vivo bacterial artificial chromosome (BAC enhancer-trapping reporter strategy. To follow-up on this broad initial survey of the TCF7L2 regulatory landscape, we performed a fine-mapping enhancer scan using in vivo mouse transgenic reporter assays. We functionally interrogated approximately 50% of the sequences within the T2D-associated interval, utilizing sequence conservation within this 92-kb interval to determine the regulatory potential of all evolutionary conserved sequences that exhibited conservation to the non-eutherian mammal opossum. Included in this study was a detailed functional interrogation of sequences spanning both protective and risk alleles of single nucleotide polymorphism (SNP rs7903146, which has exhibited allele-specific enhancer function in pancreatic beta cells. Using these assays, we identified nine segments regulating various aspects of the TCF7L2 expression profile and that constitute nearly 70% of the sequences tested. These results highlight the regulatory complexity of this interval and support the notion that a TCF7L2 cis-regulatory disruption leads to T2D predisposition.
PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences

OpenAIRE

Lescot, Magali; Déhais, Patrice; Thijs, Gert; Marchal, Kathleen; Moreau, Yves; Van de Peer, Yves; Rouzé, Pierre; Rombauts, Stephane

2002-01-01

PlantCARE is a database of plant cis-acting regulatory elements, enhancers and repressors. Regulatory elements are represented by positional matrices, consensus sequences and individual sites on particular promoter sequences. Links to the EMBL, TRANSFAC and MEDLINE databases are provided when available. Data about the transcription sites are extracted mainly from the literature, supplemented with an increasing number of in silico predicted data. Apart from a general description for specific t...
A canonical correlation analysis-based dynamic bayesian network prior to infer gene regulatory networks from multiple types of biological data.

Science.gov (United States)

Baur, Brittany; Bozdag, Serdar

2015-04-01

One of the challenging and important computational problems in systems biology is to infer gene regulatory networks (GRNs) of biological systems. Several methods that exploit gene expression data have been developed to tackle this problem. In this study, we propose the use of copy number and DNA methylation data to infer GRNs. We developed an algorithm that scores regulatory interactions between genes based on canonical correlation analysis. In this algorithm, copy number or DNA methylation variables are treated as potential regulator variables, and expression variables are treated as potential target variables. We first validated that the canonical correlation analysis method is able to infer true interactions in high accuracy. We showed that the use of DNA methylation or copy number datasets leads to improved inference over steady-state expression. Our results also showed that epigenetic and structural information could be used to infer directionality of regulatory interactions. Additional improvements in GRN inference can be gleaned from incorporating the result in an informative prior in a dynamic Bayesian algorithm. This is the first study that incorporates copy number and DNA methylation into an informative prior in dynamic Bayesian framework. By closely examining top-scoring interactions with different sources of epigenetic or structural information, we also identified potential novel regulatory interactions.
A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks

Energy Technology Data Exchange (ETDEWEB)

Santra, Tapesh, E-mail: tapesh.santra@ucd.ie [Systems Biology Ireland, University College Dublin, Dublin (Ireland)

2014-05-20

Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based method in some circumstances.
A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks

International Nuclear Information System (INIS)

Santra, Tapesh

2014-01-01

Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based method in some circumstances.
GRN2SBML: automated encoding and annotation of inferred gene regulatory networks complying with SBML.

Science.gov (United States)

Vlaic, Sebastian; Hoffmann, Bianca; Kupfer, Peter; Weber, Michael; Dräger, Andreas

2013-09-01

GRN2SBML automatically encodes gene regulatory networks derived from several inference tools in systems biology markup language. Providing a graphical user interface, the networks can be annotated via the simple object access protocol (SOAP)-based application programming interface of BioMart Central Portal and minimum information required in the annotation of models registry. Additionally, we provide an R-package, which processes the output of supported inference algorithms and automatically passes all required parameters to GRN2SBML. Therefore, GRN2SBML closes a gap in the processing pipeline between the inference of gene regulatory networks and their subsequent analysis, visualization and storage. GRN2SBML is freely available under the GNU Public License version 3 and can be downloaded from http://www.hki-jena.de/index.php/0/2/490. General information on GRN2SBML, examples and tutorials are available at the tool's web page.
GeneNetWeaver: in silico benchmark generation and performance profiling of network inference methods.

Science.gov (United States)

Schaffter, Thomas; Marbach, Daniel; Floreano, Dario

2011-08-15

Over the last decade, numerous methods have been developed for inference of regulatory networks from gene expression data. However, accurate and systematic evaluation of these methods is hampered by the difficulty of constructing adequate benchmarks and the lack of tools for a differentiated analysis of network predictions on such benchmarks. Here, we describe a novel and comprehensive method for in silico benchmark generation and performance profiling of network inference methods available to the community as an open-source software called GeneNetWeaver (GNW). In addition to the generation of detailed dynamical models of gene regulatory networks to be used as benchmarks, GNW provides a network motif analysis that reveals systematic prediction errors, thereby indicating potential ways of improving inference methods. The accuracy of network inference methods is evaluated using standard metrics such as precision-recall and receiver operating characteristic curves. We show how GNW can be used to assess the performance and identify the strengths and weaknesses of six inference methods. Furthermore, we used GNW to provide the international Dialogue for Reverse Engineering Assessments and Methods (DREAM) competition with three network inference challenges (DREAM3, DREAM4 and DREAM5). GNW is available at http://gnw.sourceforge.net along with its Java source code, user manual and supporting data. Supplementary data are available at Bioinformatics online. dario.floreano@epfl.ch.
A Meta-Analysis of Multiple Matched Copy Number and Transcriptomics Data Sets for Inferring Gene Regulatory Relationships

Science.gov (United States)

Newton, Richard; Wernisch, Lorenz

2014-01-01

Inferring gene regulatory relationships from observational data is challenging. Manipulation and intervention is often required to unravel causal relationships unambiguously. However, gene copy number changes, as they frequently occur in cancer cells, might be considered natural manipulation experiments on gene expression. An increasing number of data sets on matched array comparative genomic hybridisation and transcriptomics experiments from a variety of cancer pathologies are becoming publicly available. Here we explore the potential of a meta-analysis of thirty such data sets. The aim of our analysis was to assess the potential of in silico inference of trans-acting gene regulatory relationships from this type of data. We found sufficient correlation signal in the data to infer gene regulatory relationships, with interesting similarities between data sets. A number of genes had highly correlated copy number and expression changes in many of the data sets and we present predicted potential trans-acted regulatory relationships for each of these genes. The study also investigates to what extent heterogeneity between cell types and between pathologies determines the number of statistically significant predictions available from a meta-analysis of experiments. PMID:25148247
PAA, WSH, and CIS Overview Self-Study #47656

Energy Technology Data Exchange (ETDEWEB)

Schroeder, Rachel Anne [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

2017-09-14

This course presents an overview of the Department of Energy’s (DOE’s) regulatory requirements relevant to the Price-Anderson Amendments Act (PAAA, also referred to as nuclear safety), worker safety and health (WSH), and classified information security (CIS) that are enforceable under the DOE enforcement program; describes the DOE enforcement process; and provides an overview of Los Alamos National Laboratory’s (LANL’s) internal compliance program relative to these DOE regulatory requirements. The LANL PAAA Program is responsible for maintaining LANL’s internal compliance program, which ensures the prompt identification, screening, and reporting of noncompliances to DOE regulatory requirements pertaining to nuclear safety, WSH, and CIS to build the strongest mitigation position for the Laboratory with respect to civil or other penalties.
A Nuclear Ribosomal DNA Phylogeny of Acer Inferred with Maximum Likelihood, Splits Graphs, and Motif Analysis of 606 Sequences

Directory of Open Access Journals (Sweden)

Guido W. Grimm

2006-01-01

Full Text Available The multi-copy internal transcribed spacer (ITS region of nuclear ribosomal DNA is widely used to infer phylogenetic relationships among closely related taxa. Here we use maximum likelihood (ML and splits graph analyses to extract phylogenetic information from ~ 600 mostly cloned ITS sequences, representing 81 species and subspecies of Acer, and both species of its sister Dipteronia. Additional analyses compared sequence motifs in Acer and several hundred Anacardiaceae, Burseraceae, Meliaceae, Rutaceae, and Sapindaceae ITS sequences in GenBank. We also assessed the effects of using smaller data sets of consensus sequences with ambiguity coding (accounting for within-species variation instead of the full (partly redundant original sequences. Neighbor-nets and bipartition networks were used to visualize conflict among character state patterns. Species clusters observed in the trees and networks largely agree with morphology-based classifications; of de Jong’s (1994 16 sections, nine are supported in neighbor-net and bipartition networks, and ten by sequence motifs and the ML tree; of his 19 series, 14 are supported in networks, motifs, and the ML tree. Most nodes had higher bootstrap support with matrices of 105 or 40 consensus sequences than with the original matrix. Within-taxon ITS divergence did not differ between diploid and polyploid Acer, and there was little evidence of differentiated parental ITS haplotypes, suggesting that concerted evolution in Acer acts rapidly.

A Nuclear Ribosomal DNA Phylogeny of Acer Inferred with Maximum Likelihood, Splits Graphs, and Motif Analysis of 606 Sequences

Science.gov (United States)

Grimm, Guido W.; Renner, Susanne S.; Stamatakis, Alexandros; Hemleben, Vera

2007-01-01

The multi-copy internal transcribed spacer (ITS) region of nuclear ribosomal DNA is widely used to infer phylogenetic relationships among closely related taxa. Here we use maximum likelihood (ML) and splits graph analyses to extract phylogenetic information from ~ 600 mostly cloned ITS sequences, representing 81 species and subspecies of Acer, and both species of its sister Dipteronia. Additional analyses compared sequence motifs in Acer and several hundred Anacardiaceae, Burseraceae, Meliaceae, Rutaceae, and Sapindaceae ITS sequences in GenBank. We also assessed the effects of using smaller data sets of consensus sequences with ambiguity coding (accounting for within-species variation) instead of the full (partly redundant) original sequences. Neighbor-nets and bipartition networks were used to visualize conflict among character state patterns. Species clusters observed in the trees and networks largely agree with morphology-based classifications; of de Jong’s (1994) 16 sections, nine are supported in neighbor-net and bipartition networks, and ten by sequence motifs and the ML tree; of his 19 series, 14 are supported in networks, motifs, and the ML tree. Most nodes had higher bootstrap support with matrices of 105 or 40 consensus sequences than with the original matrix. Within-taxon ITS divergence did not differ between diploid and polyploid Acer, and there was little evidence of differentiated parental ITS haplotypes, suggesting that concerted evolution in Acer acts rapidly. PMID:19455198
Lmx1b-targeted cis-regulatory modules involved in limb dorsalization.

Science.gov (United States)

Haro, Endika; Watson, Billy A; Feenstra, Jennifer M; Tegeler, Luke; Pira, Charmaine U; Mohan, Subburaman; Oberg, Kerby C

2017-06-01

Lmx1b is a homeodomain transcription factor responsible for limb dorsalization. Despite striking double-ventral (loss-of-function) and double-dorsal (gain-of-function) limb phenotypes, no direct gene targets in the limb have been confirmed. To determine direct targets, we performed a chromatin immunoprecipitation against Lmx1b in mouse limbs at embryonic day 12.5 followed by next-generation sequencing (ChIP-seq). Nearly 84% ( n =617) of the Lmx1b-bound genomic intervals (LBIs) identified overlap with chromatin regulatory marks indicative of potential cis -regulatory modules (PCRMs). In addition, 73 LBIs mapped to CRMs that are known to be active during limb development. We compared Lmx1b-bound PCRMs with genes regulated by Lmx1b and found 292 PCRMs within 1 Mb of 254 Lmx1b-regulated genes. Gene ontological analysis suggests that Lmx1b targets extracellular matrix production, bone/joint formation, axonal guidance, vascular development, cell proliferation and cell movement. We validated the functional activity of a PCRM associated with joint-related Gdf5 that provides a mechanism for Lmx1b-mediated joint modification and a PCRM associated with Lmx1b that suggests a role in autoregulation. This is the first report to describe genome-wide Lmx1b binding during limb development, directly linking Lmx1b to targets that accomplish limb dorsalization. © 2017. Published by The Company of Biologists Ltd.
Bayesian centroid estimation for motif discovery.

Science.gov (United States)

Carvalho, Luis

2013-01-01

Biological sequences may contain patterns that signal important biomolecular functions; a classical example is regulation of gene expression by transcription factors that bind to specific patterns in genomic promoter regions. In motif discovery we are given a set of sequences that share a common motif and aim to identify not only the motif composition, but also the binding sites in each sequence of the set. We propose a new centroid estimator that arises from a refined and meaningful loss function for binding site inference. We discuss the main advantages of centroid estimation for motif discovery, including computational convenience, and how its principled derivation offers further insights about the posterior distribution of binding site configurations. We also illustrate, using simulated and real datasets, that the centroid estimator can differ from the traditional maximum a posteriori or maximum likelihood estimators.
Bayesian centroid estimation for motif discovery.

Directory of Open Access Journals (Sweden)

Luis Carvalho

Full Text Available Biological sequences may contain patterns that signal important biomolecular functions; a classical example is regulation of gene expression by transcription factors that bind to specific patterns in genomic promoter regions. In motif discovery we are given a set of sequences that share a common motif and aim to identify not only the motif composition, but also the binding sites in each sequence of the set. We propose a new centroid estimator that arises from a refined and meaningful loss function for binding site inference. We discuss the main advantages of centroid estimation for motif discovery, including computational convenience, and how its principled derivation offers further insights about the posterior distribution of binding site configurations. We also illustrate, using simulated and real datasets, that the centroid estimator can differ from the traditional maximum a posteriori or maximum likelihood estimators.
POWRS: position-sensitive motif discovery.

Directory of Open Access Journals (Sweden)

Ian W Davis

Full Text Available Transcription factors and the short, often degenerate DNA sequences they recognize are central regulators of gene expression, but their regulatory code is challenging to dissect experimentally. Thus, computational approaches have long been used to identify putative regulatory elements from the patterns in promoter sequences. Here we present a new algorithm "POWRS" (POsition-sensitive WoRd Set for identifying regulatory sequence motifs, specifically developed to address two common shortcomings of existing algorithms. First, POWRS uses the position-specific enrichment of regulatory elements near transcription start sites to significantly increase sensitivity, while providing new information about the preferred localization of those elements. Second, POWRS forgoes position weight matrices for a discrete motif representation that appears more resistant to over-generalization. We apply this algorithm to discover sequences related to constitutive, high-level gene expression in the model plant Arabidopsis thaliana, and then experimentally validate the importance of those elements by systematically mutating two endogenous promoters and measuring the effect on gene expression levels. This provides a foundation for future efforts to rationally engineer gene expression in plants, a problem of great importance in developing biotech crop varieties.BSD-licensed Python code at http://grassrootsbio.com/papers/powrs/.
ARG-walker: inference of individual specific strengths of meiotic recombination hotspots by population genomics analysis.

Science.gov (United States)

Chen, Hao; Yang, Peng; Guo, Jing; Kwoh, Chee Keong; Przytycka, Teresa M; Zheng, Jie

2015-01-01

Meiotic recombination hotspots play important roles in various aspects of genomics, but the underlying mechanisms for regulating the locations and strengths of recombination hotspots are not yet fully revealed. Most existing algorithms for estimating recombination rates from sequence polymorphism data can only output average recombination rates of a population, although there is evidence for the heterogeneity in recombination rates among individuals. For genome-wide association studies (GWAS) of recombination hotspots, an efficient algorithm that estimates the individualized strengths of recombination hotspots is highly desirable. In this work, we propose a novel graph mining algorithm named ARG-walker, based on random walks on ancestral recombination graphs (ARG), to estimate individual-specific recombination hotspot strengths. Extensive simulations demonstrate that ARG-walker is able to distinguish the hot allele of a recombination hotspot from the cold allele. Integrated with output of ARG-walker, we performed GWAS on the phased haplotype data of the 22 autosome chromosomes of the HapMap Asian population samples of Chinese and Japanese (JPT+CHB). Significant cis-regulatory signals have been detected, which is corroborated by the enrichment of the well-known 13-mer motif CCNCCNTNNCCNC of PRDM9 protein. Moreover, two new DNA motifs have been identified in the flanking regions of the significantly associated SNPs (single nucleotide polymorphisms), which are likely to be new cis-regulatory elements of meiotic recombination hotspots of the human genome. Our results on both simulated and real data suggest that ARG-walker is a promising new method for estimating the individual recombination variations. In the future, it could be used to uncover the mechanisms of recombination regulation and human diseases related with recombination hotspots.
Thermodynamic state ensemble models of cis-regulation.

Directory of Open Access Journals (Sweden)

Marc S Sherman

Full Text Available A major goal in computational biology is to develop models that accurately predict a gene's expression from its surrounding regulatory DNA. Here we present one class of such models, thermodynamic state ensemble models. We describe the biochemical derivation of the thermodynamic framework in simple terms, and lay out the mathematical components that comprise each model. These components include (1 the possible states of a promoter, where a state is defined as a particular arrangement of transcription factors bound to a DNA promoter, (2 the binding constants that describe the affinity of the protein-protein and protein-DNA interactions that occur in each state, and (3 whether each state is capable of transcribing. Using these components, we demonstrate how to compute a cis-regulatory function that encodes the probability of a promoter being active. Our intention is to provide enough detail so that readers with little background in thermodynamics can compose their own cis-regulatory functions. To facilitate this goal, we also describe a matrix form of the model that can be easily coded in any programming language. This formalism has great flexibility, which we show by illustrating how phenomena such as competition between transcription factors and cooperativity are readily incorporated into these models. Using this framework, we also demonstrate that Michaelis-like functions, another class of cis-regulatory models, are a subset of the thermodynamic framework with specific assumptions. By recasting Michaelis-like functions as thermodynamic functions, we emphasize the relationship between these models and delineate the specific circumstances representable by each approach. Application of thermodynamic state ensemble models is likely to be an important tool in unraveling the physical basis of combinatorial cis-regulation and in generating formalisms that accurately predict gene expression from DNA sequence.
Regulation of TCF ETS-domain transcription factors by helix-loop-helix motifs.

Science.gov (United States)

Stinson, Julie; Inoue, Toshiaki; Yates, Paula; Clancy, Anne; Norton, John D; Sharrocks, Andrew D

2003-08-15

DNA binding by the ternary complex factor (TCF) subfamily of ETS-domain transcription factors is tightly regulated by intramolecular and intermolecular interactions. The helix-loop-helix (HLH)-containing Id proteins are trans-acting negative regulators of DNA binding by the TCFs. In the TCF, SAP-2/Net/ERP, intramolecular inhibition of DNA binding is promoted by the cis-acting NID region that also contains an HLH-like motif. The NID also acts as a transcriptional repression domain. Here, we have studied the role of HLH motifs in regulating DNA binding and transcription by the TCF protein SAP-1 and how Cdk-mediated phosphorylation affects the inhibitory activity of the Id proteins towards the TCFs. We demonstrate that the NID region of SAP-1 is an autoinhibitory motif that acts to inhibit DNA binding and also functions as a transcription repression domain. This region can be functionally replaced by fusion of Id proteins to SAP-1, whereby the Id moiety then acts to repress DNA binding in cis. Phosphorylation of the Ids by cyclin-Cdk complexes results in reduction in protein-protein interactions between the Ids and TCFs and relief of their DNA-binding inhibitory activity. In revealing distinct mechanisms through which HLH motifs modulate the activity of TCFs, our results therefore provide further insight into the role of HLH motifs in regulating TCF function and how the inhibitory properties of the trans-acting Id HLH proteins are themselves regulated by phosphorylation.
Mouse transgenesis identifies conserved functional enhancers and cis-regulatory motif in the vertebrate LIM homeobox gene Lhx2 locus.

Directory of Open Access Journals (Sweden)

Alison P Lee

Full Text Available The vertebrate Lhx2 is a member of the LIM homeobox family of transcription factors. It is essential for the normal development of the forebrain, eye, olfactory system and liver as well for the differentiation of lymphoid cells. However, despite the highly restricted spatio-temporal expression pattern of Lhx2, nothing is known about its transcriptional regulation. In mammals and chicken, Crb2, Dennd1a and Lhx2 constitute a conserved linkage block, while the intervening Dennd1a is lost in the fugu Lhx2 locus. To identify functional enhancers of Lhx2, we predicted conserved noncoding elements (CNEs in the human, mouse and fugu Crb2-Lhx2 loci and assayed their function in transgenic mouse at E11.5. Four of the eight CNE constructs tested functioned as tissue-specific enhancers in specific regions of the central nervous system and the dorsal root ganglia (DRG, recapitulating partial and overlapping expression patterns of Lhx2 and Crb2 genes. There was considerable overlap in the expression domains of the CNEs, which suggests that the CNEs are either redundant enhancers or regulating different genes in the locus. Using a large set of CNEs (810 CNEs associated with transcription factor-encoding genes that express predominantly in the central nervous system, we predicted four over-represented 8-mer motifs that are likely to be associated with expression in the central nervous system. Mutation of one of them in a CNE that drove reporter expression in the neural tube and DRG abolished expression in both domains indicating that this motif is essential for expression in these domains. The failure of the four functional enhancers to recapitulate the complete expression pattern of Lhx2 at E11.5 indicates that there must be other Lhx2 enhancers that are either located outside the region investigated or divergent in mammals and fishes. Other approaches such as sequence comparison between multiple mammals are required to identify and characterize such enhancers.
Transformation of Migration Flows Between the Russian Far East and CIS and non-CIS States

Directory of Open Access Journals (Sweden)

Motrich E. L.

2010-06-01

Full Text Available Basic trends in the migration processes in the Russian Far East are shown. Special emphasis is placed on the transformation of migration interactions with CIS and non-CIS countries both at the level of the region as a whole, and at the level of the Far Eastern territories of the Russian Federation. An extent of using foreign labor in different periods of the Russian Far East socio-economic development and the regulatory support of this process are shown. Prospects for attracting and utilizing foreign labor are stated
Deciphering Cis-Regulatory Element Mediated Combinatorial Regulation in Rice under Blast Infected Condition.

Directory of Open Access Journals (Sweden)

Arindam Deb

Full Text Available Combinations of cis-regulatory elements (CREs present at the promoters facilitate the binding of several transcription factors (TFs, thereby altering the consequent gene expressions. Due to the eminent complexity of the regulatory mechanism, the combinatorics of CRE-mediated transcriptional regulation has been elusive. In this work, we have developed a new methodology that quantifies the co-occurrence tendencies of CREs present in a set of promoter sequences; these co-occurrence scores are filtered in three consecutive steps to test their statistical significance; and the significantly co-occurring CRE pairs are presented as networks. These networks of co-occurring CREs are further transformed to derive higher order of regulatory combinatorics. We have further applied this methodology on the differentially up-regulated gene-sets of rice tissues under fungal (Magnaporthe infected conditions to demonstrate how it helps to understand the CRE-mediated combinatorial gene regulation. Our analysis includes a wide spectrum of biologically important results. The CRE pairs having a strong tendency to co-occur often exhibit very similar joint distribution patterns at the promoters of rice. We couple the network approach with experimental results of plant gene regulation and defense mechanisms and find evidences of auto and cross regulation among TF families, cross-talk among multiple hormone signaling pathways, similarities and dissimilarities in regulatory combinatorics between different tissues, etc. Our analyses have pointed a highly distributed nature of the combinatorial gene regulation facilitating an efficient alteration in response to fungal attack. All together, our proposed methodology could be an important approach in understanding the combinatorial gene regulation. It can be further applied to unravel the tissue and/or condition specific combinatorial gene regulation in other eukaryotic systems with the availability of annotated genomic
Inferring regulatory networks from experimental morphological phenotypes: a computational method reverse-engineers planarian regeneration.

Directory of Open Access Journals (Sweden)

Daniel Lobo

2015-06-01

Full Text Available Transformative applications in biomedicine require the discovery of complex regulatory networks that explain the development and regeneration of anatomical structures, and reveal what external signals will trigger desired changes of large-scale pattern. Despite recent advances in bioinformatics, extracting mechanistic pathway models from experimental morphological data is a key open challenge that has resisted automation. The fundamental difficulty of manually predicting emergent behavior of even simple networks has limited the models invented by human scientists to pathway diagrams that show necessary subunit interactions but do not reveal the dynamics that are sufficient for complex, self-regulating pattern to emerge. To finally bridge the gap between high-resolution genetic data and the ability to understand and control patterning, it is critical to develop computational tools to efficiently extract regulatory pathways from the resultant experimental shape phenotypes. For example, planarian regeneration has been studied for over a century, but despite increasing insight into the pathways that control its stem cells, no constructive, mechanistic model has yet been found by human scientists that explains more than one or two key features of its remarkable ability to regenerate its correct anatomical pattern after drastic perturbations. We present a method to infer the molecular products, topology, and spatial and temporal non-linear dynamics of regulatory networks recapitulating in silico the rich dataset of morphological phenotypes resulting from genetic, surgical, and pharmacological experiments. We demonstrated our approach by inferring complete regulatory networks explaining the outcomes of the main functional regeneration experiments in the planarian literature; By analyzing all the datasets together, our system inferred the first systems-biology comprehensive dynamical model explaining patterning in planarian regeneration. This method
Directed partial correlation: inferring large-scale gene regulatory network through induced topology disruptions.

Directory of Open Access Journals (Sweden)

Yinyin Yuan

Full Text Available Inferring regulatory relationships among many genes based on their temporal variation in transcript abundance has been a popular research topic. Due to the nature of microarray experiments, classical tools for time series analysis lose power since the number of variables far exceeds the number of the samples. In this paper, we describe some of the existing multivariate inference techniques that are applicable to hundreds of variables and show the potential challenges for small-sample, large-scale data. We propose a directed partial correlation (DPC method as an efficient and effective solution to regulatory network inference using these data. Specifically for genomic data, the proposed method is designed to deal with large-scale datasets. It combines the efficiency of partial correlation for setting up network topology by testing conditional independence, and the concept of Granger causality to assess topology change with induced interruptions. The idea is that when a transcription factor is induced artificially within a gene network, the disruption of the network by the induction signifies a genes role in transcriptional regulation. The benchmarking results using GeneNetWeaver, the simulator for the DREAM challenges, provide strong evidence of the outstanding performance of the proposed DPC method. When applied to real biological data, the inferred starch metabolism network in Arabidopsis reveals many biologically meaningful network modules worthy of further investigation. These results collectively suggest DPC is a versatile tool for genomics research. The R package DPC is available for download (http://code.google.com/p/dpcnet/.
Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

DEFF Research Database (Denmark)

Richards, Stephen; Liu, Yue; Bettencourt, Brian R.

2005-01-01

years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences......We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each...... between the species-but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence...
PROSPECT improves cis-acting regulatory element prediction by integrating expression profile data with consensus pattern searches

Science.gov (United States)

Fujibuchi, Wataru; Anderson, John S. J.; Landsman, David

2001-01-01

Consensus pattern and matrix-based searches designed to predict cis-acting transcriptional regulatory sequences have historically been subject to large numbers of false positives. We sought to decrease false positives by incorporating expression profile data into a consensus pattern-based search method. We have systematically analyzed the expression phenotypes of over 6000 yeast genes, across 121 expression profile experiments, and correlated them with the distribution of 14 known regulatory elements over sequences upstream of the genes. Our method is based on a metric we term probabilistic element assessment (PEA), which is a ranking of potential sites based on sequence similarity in the upstream regions of genes with similar expression phenotypes. For eight of the 14 known elements that we examined, our method had a much higher selectivity than a naïve consensus pattern search. Based on our analysis, we have developed a web-based tool called PROSPECT, which allows consensus pattern-based searching of gene clusters obtained from microarray data. PMID:11574681
Functional evolution of cis-regulatory modules at a homeotic gene in Drosophila.

Directory of Open Access Journals (Sweden)

Margaret C W Ho

2009-11-01

Full Text Available It is a long-held belief in evolutionary biology that the rate of molecular evolution for a given DNA sequence is inversely related to the level of functional constraint. This belief holds true for the protein-coding homeotic (Hox genes originally discovered in Drosophila melanogaster. Expression of the Hox genes in Drosophila embryos is essential for body patterning and is controlled by an extensive array of cis-regulatory modules (CRMs. How the regulatory modules functionally evolve in different species is not clear. A comparison of the CRMs for the Abdominal-B gene from different Drosophila species reveals relatively low levels of overall sequence conservation. However, embryonic enhancer CRMs from other Drosophila species direct transgenic reporter gene expression in the same spatial and temporal patterns during development as their D. melanogaster orthologs. Bioinformatic analysis reveals the presence of short conserved sequences within defined CRMs, representing gap and pair-rule transcription factor binding sites. One predicted binding site for the gap transcription factor KRUPPEL in the IAB5 CRM was found to be altered in Superabdominal (Sab mutations. In Sab mutant flies, the third abdominal segment is transformed into a copy of the fifth abdominal segment. A model for KRUPPEL-mediated repression at this binding site is presented. These findings challenge our current understanding of the relationship between sequence evolution at the molecular level and functional activity of a CRM. While the overall sequence conservation at Drosophila CRMs is not distinctive from neighboring genomic regions, functionally critical transcription factor binding sites within embryonic enhancer CRMs are highly conserved. These results have implications for understanding mechanisms of gene expression during embryonic development, enhancer function, and the molecular evolution of eukaryotic regulatory modules.
Functional dissection of the promoter of the pollen-specific gene NTP303 reveals a novel pollen-specific, and conserved cis-regulatory element.

Science.gov (United States)

Weterings, K; Schrauwen, J; Wullems, G; Twell, D

1995-07-01

Regulatory elements within the promoter of the pollen-specific NTP303 gene from tobacco were analysed by transient and stable expression analyses. Analysis of precisely targeted mutations showed that the NTP303 promoter is not regulated by any of the previously described pollen-specific cis-regulatory elements. However, two adjacent regions from -103 to -86 bp and from -86 to -59 bp were shown to contain sequences which positively regulated the NTP303 promoter. Both of these regions were capable of driving pollen-specific expression from a heterologous promoter, independent of orientation and in an additive manner. The boundaries of the minimal, functional NTP303 promoter were determined to lie within the region -86 to -51 bp. The sequence AAATGA localized from -94 to -89 bp was identified as a novel cis-acting element, of which the TGA triplet was shown to comprise an active part. This element was shown to be completely conserved in the similarly regulated promoter of the Bp 10 gene from Brassica napus encoding a homologue of the NTP303 gene.
Learning a Markov Logic network for supervised gene regulatory network inference.

Science.gov (United States)

Brouard, Céline; Vrain, Christel; Dubois, Julie; Castel, David; Debily, Marie-Anne; d'Alché-Buc, Florence

2013-09-12

Gene regulatory network inference remains a challenging problem in systems biology despite the numerous approaches that have been proposed. When substantial knowledge on a gene regulatory network is already available, supervised network inference is appropriate. Such a method builds a binary classifier able to assign a class (Regulation/No regulation) to an ordered pair of genes. Once learnt, the pairwise classifier can be used to predict new regulations. In this work, we explore the framework of Markov Logic Networks (MLN) that combine features of probabilistic graphical models with the expressivity of first-order logic rules. We propose to learn a Markov Logic network, e.g. a set of weighted rules that conclude on the predicate "regulates", starting from a known gene regulatory network involved in the switch proliferation/differentiation of keratinocyte cells, a set of experimental transcriptomic data and various descriptions of genes all encoded into first-order logic. As training data are unbalanced, we use asymmetric bagging to learn a set of MLNs. The prediction of a new regulation can then be obtained by averaging predictions of individual MLNs. As a side contribution, we propose three in silico tests to assess the performance of any pairwise classifier in various network inference tasks on real datasets. A first test consists of measuring the average performance on balanced edge prediction problem; a second one deals with the ability of the classifier, once enhanced by asymmetric bagging, to update a given network. Finally our main result concerns a third test that measures the ability of the method to predict regulations with a new set of genes. As expected, MLN, when provided with only numerical discretized gene expression data, does not perform as well as a pairwise SVM in terms of AUPR. However, when a more complete description of gene properties is provided by heterogeneous sources, MLN achieves the same performance as a black-box model such as a
Inferring the role of transcription factors in regulatory networks

Directory of Open Access Journals (Sweden)

Le Borgne Michel

2008-05-01

Full Text Available Abstract Background Expression profiles obtained from multiple perturbation experiments are increasingly used to reconstruct transcriptional regulatory networks, from well studied, simple organisms up to higher eukaryotes. Admittedly, a key ingredient in developing a reconstruction method is its ability to integrate heterogeneous sources of information, as well as to comply with practical observability issues: measurements can be scarce or noisy. In this work, we show how to combine a network of genetic regulations with a set of expression profiles, in order to infer the functional effect of the regulations, as inducer or repressor. Our approach is based on a consistency rule between a network and the signs of variation given by expression arrays. Results We evaluate our approach in several settings of increasing complexity. First, we generate artificial expression data on a transcriptional network of E. coli extracted from the literature (1529 nodes and 3802 edges, and we estimate that 30% of the regulations can be annotated with about 30 profiles. We additionally prove that at most 40.8% of the network can be inferred using our approach. Second, we use this network in order to validate the predictions obtained with a compendium of real expression profiles. We describe a filtering algorithm that generates particularly reliable predictions. Finally, we apply our inference approach to S. cerevisiae transcriptional network (2419 nodes and 4344 interactions, by combining ChIP-chip data and 15 expression profiles. We are able to detect and isolate inconsistencies between the expression profiles and a significant portion of the model (15% of all the interactions. In addition, we report predictions for 14.5% of all interactions. Conclusion Our approach does not require accurate expression levels nor times series. Nevertheless, we show on both data, real and artificial, that a relatively small number of perturbation experiments are enough to determine
Microevolution of cis-regulatory elements: an example from the pair-rule segmentation gene fushi tarazu in the Drosophila melanogaster subgroup.

Directory of Open Access Journals (Sweden)

Mohammed Bakkali

Full Text Available The importance of non-coding DNAs that control transcription is ever noticeable, but the characterization and analysis of the evolution of such DNAs presents challenges not found in the analysis of coding sequences. In this study of the cis-regulatory elements of the pair rule segmentation gene fushi tarazu (ftz I report the DNA sequences of ftz's zebra element (promoter and a region containing the proximal enhancer from a total of 45 fly lines belonging to several populations of the species Drosophila melanogaster, D. simulans, D. sechellia, D. mauritiana, D. yakuba, D. teissieri, D. orena and D. erecta. Both elements evolve at slower rate than ftz synonymous sites, thus reflecting their functional importance. The promoter evolves more slowly than the average for ftz's coding sequence while, on average, the enhancer evolves more rapidly, suggesting more functional constraint and effective purifying selection on the former. Comparative analysis of the number and nature of base substitutions failed to detect significant evidence for positive/adaptive selection in transcription-factor-binding sites. These seem to evolve at similar rates to regions not known to bind transcription factors. Although this result reflects the evolutionary flexibility of the transcription factor binding sites, it also suggests a complex and still not completely understood nature of even the characterized cis-regulatory sequences. The latter seem to contain more functional parts than those currently identified, some of which probably transcription factor binding. This study illustrates ways in which functional assignments of sequences within cis-acting sequences can be used in the search for adaptive evolution, but also highlights difficulties in how such functional assignment and analysis can be carried out.

The evolutionary capacitor HSP90 buffers the regulatory effects of mammalian endogenous retroviruses.

Science.gov (United States)

Hummel, Barbara; Hansen, Erik C; Yoveva, Aneliya; Aprile-Garcia, Fernando; Hussong, Rebecca; Sawarkar, Ritwick

2017-03-01

Understanding how genotypes are linked to phenotypes is important in biomedical and evolutionary studies. The chaperone heat-shock protein 90 (HSP90) buffers genetic variation by stabilizing proteins with variant sequences, thereby uncoupling phenotypes from genotypes. Here we report an unexpected role of HSP90 in buffering cis-regulatory variation affecting gene expression. By using the tripartite-motif-containing 28 (TRIM28; also known as KAP1)-mediated epigenetic pathway, HSP90 represses the regulatory influence of endogenous retroviruses (ERVs) on neighboring genes that are critical for mouse development. Our data based on natural variations in the mouse genome show that genes respond to HSP90 inhibition in a manner dependent on their genomic location with regard to strain-specific ERV-insertion sites. The evolutionary-capacitor function of HSP90 may thus have facilitated the exaptation of ERVs as key modifiers of gene expression and morphological diversification. Our findings add a new regulatory layer through which HSP90 uncouples phenotypic outcomes from individual genotypes.
RNA-ID, a highly sensitive and robust method to identify cis-regulatory sequences using superfolder GFP and a fluorescence-based assay.

Science.gov (United States)

Dean, Kimberly M; Grayhack, Elizabeth J

2012-12-01

We have developed a robust and sensitive method, called RNA-ID, to screen for cis-regulatory sequences in RNA using fluorescence-activated cell sorting (FACS) of yeast cells bearing a reporter in which expression of both superfolder green fluorescent protein (GFP) and yeast codon-optimized mCherry red fluorescent protein (RFP) is driven by the bidirectional GAL1,10 promoter. This method recapitulates previously reported progressive inhibition of translation mediated by increasing numbers of CGA codon pairs, and restoration of expression by introduction of a tRNA with an anticodon that base pairs exactly with the CGA codon. This method also reproduces effects of paromomycin and context on stop codon read-through. Five key features of this method contribute to its effectiveness as a selection for regulatory sequences: The system exhibits greater than a 250-fold dynamic range, a quantitative and dose-dependent response to known inhibitory sequences, exquisite resolution that allows nearly complete physical separation of distinct populations, and a reproducible signal between different cells transformed with the identical reporter, all of which are coupled with simple methods involving ligation-independent cloning, to create large libraries. Moreover, we provide evidence that there are sequences within a 9-nt library that cause reduced GFP fluorescence, suggesting that there are novel cis-regulatory sequences to be found even in this short sequence space. This method is widely applicable to the study of both RNA-mediated and codon-mediated effects on expression.
In-silico analysis of cis-acting regulatory elements of pathogenesis-related proteins of Arabidopsis thaliana and Oryza sativa.

Science.gov (United States)

Kaur, Amritpreet; Pati, Pratap Kumar; Pati, Aparna Maitra; Nagpal, Avinash Kaur

2017-01-01

Pathogenesis related (PR) proteins are low molecular weight family of proteins induced in plants under various biotic and abiotic stresses. They play an important role in plant-defense mechanism. PRs have wide range of functions, acting as hydrolases, peroxidases, chitinases, anti-fungal, protease inhibitors etc. In the present study, an attempt has been made to analyze promoter regions of PR1, PR2, PR5, PR9, PR10 and PR12 of Arabidopsis thaliana and Oryza sativa. Analysis of cis-element distribution revealed the functional multiplicity of PRs and provides insight into the gene regulation. CpG islands are observed only in rice PRs, which indicates that monocot genome contains more GC rich motifs than dicots. Tandem repeats were also observed in 5' UTR of PR genes. Thus, the present study provides an understanding of regulation of PR genes and their versatile roles in plants.
Retinal Expression of the Drosophila eyes absent Gene Is Controlled by Several Cooperatively Acting Cis-regulatory Elements

Science.gov (United States)

Neuman, Sarah D.; Bashirullah, Arash; Kumar, Justin P.

2016-01-01

The eyes absent (eya) gene of the fruit fly, Drosophila melanogaster, is a member of an evolutionarily conserved gene regulatory network that controls eye formation in all seeing animals. The loss of eya leads to the complete elimination of the compound eye while forced expression of eya in non-retinal tissues is sufficient to induce ectopic eye formation. Within the developing retina eya is expressed in a dynamic pattern and is involved in tissue specification/determination, cell proliferation, apoptosis, and cell fate choice. In this report we explore the mechanisms by which eya expression is spatially and temporally governed in the developing eye. We demonstrate that multiple cis-regulatory elements function cooperatively to control eya transcription and that spacing between a pair of enhancer elements is important for maintaining correct gene expression. Lastly, we show that the loss of eya expression in sine oculis (so) mutants is the result of massive cell death and a progressive homeotic transformation of retinal progenitor cells into head epidermis. PMID:27930646
A Novel Dual-cre Motif Enables Two-Way Autoregulation of CcpA in Clostridium acetobutylicum.

Science.gov (United States)

Zhang, Lu; Liu, Yanqiang; Yang, Yunpeng; Jiang, Weihong; Gu, Yang

2018-04-15

The master regulator CcpA (catabolite control protein A) manages a large and complex regulatory network that is essential for cellular physiology and metabolism in Gram-positive bacteria. Although CcpA can affect the expression of target genes by binding to a cis -acting catabolite-responsive element ( cre ), whether and how the expression of CcpA is regulated remain poorly explored. Here, we report a novel dual- cre motif that is employed by the CcpA in Clostridium acetobutylicum , a typical solventogenic Clostridium species, for autoregulation. Two cre sites are involved in CcpA autoregulation, and they reside in the promoter and coding regions of CcpA. In this dual- cre motif, cre P , in the promoter region, positively regulates ccpA transcription, whereas cre ORF , in the coding region, negatively regulates this transcription, thus enabling two-way autoregulation of CcpA. Although CcpA bound cre P more strongly than cre ORF in vitro , the in vivo assay showed that cre ORF -based repression dominates CcpA autoregulation during the entire fermentation. Finally, a synonymous mutation of cre ORF was made within the coding region, achieving an increased intracellular CcpA expression and improved cellular performance. This study provides new insights into the regulatory role of CcpA in C. acetobutylicum and, moreover, contributes a new engineering strategy for this industrial strain. IMPORTANCE CcpA is known to be a key transcription factor in Gram-positive bacteria. However, it is still unclear whether and how the intracellular CcpA level is regulated, which may be essential for maintaining normal cell physiology and metabolism. We discovered here that CcpA employs a dual- cre motif to autoregulate, enabling dynamic control of its own expression level during the entire fermentation process. This finding answers the questions above and fills a void in our understanding of the regulatory network of CcpA. Interference in CcpA autoregulation leads to improved cellular
NetBenchmark: a bioconductor package for reproducible benchmarks of gene regulatory network inference.

Science.gov (United States)

Bellot, Pau; Olsen, Catharina; Salembier, Philippe; Oliveras-Vergés, Albert; Meyer, Patrick E

2015-09-29

In the last decade, a great number of methods for reconstructing gene regulatory networks from expression data have been proposed. However, very few tools and datasets allow to evaluate accurately and reproducibly those methods. Hence, we propose here a new tool, able to perform a systematic, yet fully reproducible, evaluation of transcriptional network inference methods. Our open-source and freely available Bioconductor package aggregates a large set of tools to assess the robustness of network inference algorithms against different simulators, topologies, sample sizes and noise intensities. The benchmarking framework that uses various datasets highlights the specialization of some methods toward network types and data. As a result, it is possible to identify the techniques that have broad overall performances.
Computational analyses of synergism in small molecular network motifs.

Directory of Open Access Journals (Sweden)

Yili Zhang

2014-03-01

Full Text Available Cellular functions and responses to stimuli are controlled by complex regulatory networks that comprise a large diversity of molecular components and their interactions. However, achieving an intuitive understanding of the dynamical properties and responses to stimuli of these networks is hampered by their large scale and complexity. To address this issue, analyses of regulatory networks often focus on reduced models that depict distinct, reoccurring connectivity patterns referred to as motifs. Previous modeling studies have begun to characterize the dynamics of small motifs, and to describe ways in which variations in parameters affect their responses to stimuli. The present study investigates how variations in pairs of parameters affect responses in a series of ten common network motifs, identifying concurrent variations that act synergistically (or antagonistically to alter the responses of the motifs to stimuli. Synergism (or antagonism was quantified using degrees of nonlinear blending and additive synergism. Simulations identified concurrent variations that maximized synergism, and examined the ways in which it was affected by stimulus protocols and the architecture of a motif. Only a subset of architectures exhibited synergism following paired changes in parameters. The approach was then applied to a model describing interlocked feedback loops governing the synthesis of the CREB1 and CREB2 transcription factors. The effects of motifs on synergism for this biologically realistic model were consistent with those for the abstract models of single motifs. These results have implications for the rational design of combination drug therapies with the potential for synergistic interactions.
Nitrogen transporter and assimilation genes exhibit developmental stage-selective expression in maize (Zea mays L.) associated with distinct cis-acting promoter motifs.

Science.gov (United States)

Liseron-Monfils, Christophe; Bi, Yong-Mei; Downs, Gregory S; Wu, Wenqing; Signorelli, Tara; Lu, Guangwen; Chen, Xi; Bondo, Eddie; Zhu, Tong; Lukens, Lewis N; Colasanti, Joseph; Rothstein, Steven J; Raizada, Manish N

2013-10-01

Nitrogen is considered the most limiting nutrient for maize (Zea mays L.), but there is limited understanding of the regulation of nitrogen-related genes during maize development. An Affymetrix 82K maize array was used to analyze the expression of ≤ 46 unique nitrogen uptake and assimilation probes in 50 maize tissues from seedling emergence to 31 d after pollination. Four nitrogen-related expression clusters were identified in roots and shoots corresponding to, or overlapping, juvenile, adult, and reproductive phases of development. Quantitative real time PCR data was consistent with the existence of these distinct expression clusters. Promoters corresponding to each cluster were screened for over-represented cis-acting elements. The 8-bp distal motif of the Arabidopsis 43-bp nitrogen response element (NRE) was over-represented in nitrogen-related maize gene promoters. This conserved motif, referred to here as NRE43-d8, was previously shown to be critical for nitrate-activated transcription of nitrate reductase (NIA1) and nitrite reductase (NIR1) by the NIN-LIKE PROTEIN 6 (NLP6) in Arabidopsis. Here, NRE43-d8 was over-represented in the promoters of maize nitrate and ammonium transporter genes, specifically those that showed peak expression during early-stage vegetative development. This result predicts an expansion of the NRE-NLP6 regulon and suggests that it may have a developmental component in maize. We also report leaf expression of putative orthologs of nitrite transporters (NiTR1), a transporter not previously reported in maize. We conclude by discussing how each of the four transcriptional modules may be responsible for the different nitrogen uptake and assimilation requirements of leaves and roots at different stages of maize development.
Simultaneous genome-wide inference of physical, genetic, regulatory, and functional pathway components.

Directory of Open Access Journals (Sweden)

Christopher Y Park

2010-11-01

Full Text Available Biomolecular pathways are built from diverse types of pairwise interactions, ranging from physical protein-protein interactions and modifications to indirect regulatory relationships. One goal of systems biology is to bridge three aspects of this complexity: the growing body of high-throughput data assaying these interactions; the specific interactions in which individual genes participate; and the genome-wide patterns of interactions in a system of interest. Here, we describe methodology for simultaneously predicting specific types of biomolecular interactions using high-throughput genomic data. This results in a comprehensive compendium of whole-genome networks for yeast, derived from ∼3,500 experimental conditions and describing 30 interaction types, which range from general (e.g. physical or regulatory to specific (e.g. phosphorylation or transcriptional regulation. We used these networks to investigate molecular pathways in carbon metabolism and cellular transport, proposing a novel connection between glycogen breakdown and glucose utilization supported by recent publications. Additionally, 14 specific predicted interactions in DNA topological change and protein biosynthesis were experimentally validated. We analyzed the systems-level network features within all interactomes, verifying the presence of small-world properties and enrichment for recurring network motifs. This compendium of physical, synthetic, regulatory, and functional interaction networks has been made publicly available through an interactive web interface for investigators to utilize in future research at http://function.princeton.edu/bioweaver/.
A 6-Nucleotide Regulatory Motif within the AbcR Small RNAs of Brucella abortus Mediates Host-Pathogen Interactions.

Science.gov (United States)

Sheehan, Lauren M; Caswell, Clayton C

2017-06-06

In Brucella abortus , two small RNAs (sRNAs), AbcR1 and AbcR2, are responsible for regulating transcripts encoding ABC-type transport systems. AbcR1 and AbcR2 are required for Brucella virulence, as a double chromosomal deletion of both sRNAs results in attenuation in mice. Although these sRNAs are responsible for targeting transcripts for degradation, the mechanism utilized by the AbcR sRNAs to regulate mRNA in Brucella has not been described. Here, two motifs (M1 and M2) were identified in AbcR1 and AbcR2, and complementary motif sequences were defined in AbcR-regulated transcripts. Site-directed mutagenesis of M1 or M2 or of both M1 and M2 in the sRNAs revealed transcripts to be targeted by one or both motifs. Electrophoretic mobility shift assays revealed direct, concentration-dependent binding of both AbcR sRNAs to a target mRNA sequence. These experiments genetically and biochemically characterized two indispensable motifs within the AbcR sRNAs that bind to and regulate transcripts. Additionally, cellular and animal models of infection demonstrated that only M2 in the AbcR sRNAs is required for Brucella virulence. Furthermore, one of the M2-regulated targets, BAB2_0612, was found to be critical for the virulence of B. abortus in a mouse model of infection. Although these sRNAs are highly conserved among Alphaproteobacteria , the present report displays how gene regulation mediated by the AbcR sRNAs has diverged to meet the intricate regulatory requirements of each particular organism and its unique biological niche. IMPORTANCE Small RNAs (sRNAs) are important components of bacterial regulation, allowing organisms to quickly adapt to changes in their environments. The AbcR sRNAs are highly conserved throughout the Alphaproteobacteria and negatively regulate myriad transcripts, many encoding ABC-type transport systems. In Brucella abortus , AbcR1 and AbcR2 are functionally redundant, as only a double abcR1 abcR2 ( abcR1 / 2 ) deletion results in attenuation in
Computational modeling identifies key gene regulatory interactions underlying phenobarbital-mediated tumor promotion

Science.gov (United States)

Luisier, Raphaëlle; Unterberger, Elif B.; Goodman, Jay I.; Schwarz, Michael; Moggs, Jonathan; Terranova, Rémi; van Nimwegen, Erik

2014-01-01

Gene regulatory interactions underlying the early stages of non-genotoxic carcinogenesis are poorly understood. Here, we have identified key candidate regulators of phenobarbital (PB)-mediated mouse liver tumorigenesis, a well-characterized model of non-genotoxic carcinogenesis, by applying a new computational modeling approach to a comprehensive collection of in vivo gene expression studies. We have combined our previously developed motif activity response analysis (MARA), which models gene expression patterns in terms of computationally predicted transcription factor binding sites with singular value decomposition (SVD) of the inferred motif activities, to disentangle the roles that different transcriptional regulators play in specific biological pathways of tumor promotion. Furthermore, transgenic mouse models enabled us to identify which of these regulatory activities was downstream of constitutive androstane receptor and β-catenin signaling, both crucial components of PB-mediated liver tumorigenesis. We propose novel roles for E2F and ZFP161 in PB-mediated hepatocyte proliferation and suggest that PB-mediated suppression of ESR1 activity contributes to the development of a tumor-prone environment. Our study shows that combining MARA with SVD allows for automated identification of independent transcription regulatory programs within a complex in vivo tissue environment and provides novel mechanistic insights into PB-mediated hepatocarcinogenesis. PMID:24464994
Inferring Drosophila gap gene regulatory network: Pattern analysis of simulated gene expression profiles and stability analysis

NARCIS (Netherlands)

Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.

2009-01-01

Background: Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori
On the origin of distribution patterns of motifs in biological networks

Directory of Open Access Journals (Sweden)

Lesk Arthur M

2008-08-01

Full Text Available Abstract Background Inventories of small subgraphs in biological networks have identified commonly-recurring patterns, called motifs. The inference that these motifs have been selected for function rests on the idea that their occurrences are significantly more frequent than random. Results Our analysis of several large biological networks suggests, in contrast, that the frequencies of appearance of common subgraphs are similar in natural and corresponding random networks. Conclusion Indeed, certain topological features of biological networks give rise naturally to the common appearance of the motifs. We therefore question whether frequencies of occurrences are reasonable evidence that the structures of motifs have been selected for their functional contribution to the operation of networks.
A comparative study of covariance selection models for the inference of gene regulatory networks.

Science.gov (United States)

Stifanelli, Patrizia F; Creanza, Teresa M; Anglani, Roberto; Liuzzi, Vania C; Mukherjee, Sayan; Schena, Francesco P; Ancona, Nicola

2013-10-01

The inference, or 'reverse-engineering', of gene regulatory networks from expression data and the description of the complex dependency structures among genes are open issues in modern molecular biology. In this paper we compared three regularized methods of covariance selection for the inference of gene regulatory networks, developed to circumvent the problems raising when the number of observations n is smaller than the number of genes p. The examined approaches provided three alternative estimates of the inverse covariance matrix: (a) the 'PINV' method is based on the Moore-Penrose pseudoinverse, (b) the 'RCM' method performs correlation between regression residuals and (c) 'ℓ(2C)' method maximizes a properly regularized log-likelihood function. Our extensive simulation studies showed that ℓ(2C) outperformed the other two methods having the most predictive partial correlation estimates and the highest values of sensitivity to infer conditional dependencies between genes even when a few number of observations was available. The application of this method for inferring gene networks of the isoprenoid biosynthesis pathways in Arabidopsis thaliana allowed to enlighten a negative partial correlation coefficient between the two hubs in the two isoprenoid pathways and, more importantly, provided an evidence of cross-talk between genes in the plastidial and the cytosolic pathways. When applied to gene expression data relative to a signature of HRAS oncogene in human cell cultures, the method revealed 9 genes (p-value<0.0005) directly interacting with HRAS, sharing the same Ras-responsive binding site for the transcription factor RREB1. This result suggests that the transcriptional activation of these genes is mediated by a common transcription factor downstream of Ras signaling. Software implementing the methods in the form of Matlab scripts are available at: http://users.ba.cnr.it/issia/iesina18/CovSelModelsCodes.zip. Copyright © 2013 The Authors. Published by
MicroRNA signature of cis-platin resistant vs. cis-platin sensitive ovarian cancer cell lines

Directory of Open Access Journals (Sweden)

Kumar Smriti

2011-09-01

Full Text Available Abstract Background Ovarian cancer is the leading cause of death from gynecologic cancer in women worldwide. According to the National Cancer Institute, ovarian cancer has the highest mortality rate among all the reproductive cancers in women. Advanced stage diagnosis and chemo/radio-resistance is a major obstacle in treating advanced ovarian cancer. The most commonly employed chemotherapeutic drug for ovarian cancer treatment is cis-platin. As with most chemotherapeutic drugs, many patients eventually become resistant to cis-platin and therefore, diminishing its effect. The efficacy of current treatments may be improved by increasing the sensitivity of cancer cells to chemo/radiation therapies. Methods The present study is focused on identifying the differential expression of regulatory microRNAs (miRNAs between cis-platin sensitive (A2780, and cis-platin resistant (A2780/CP70 cell lines. Cell proliferation assays were conducted to test the sensitivity of the two cell lines to cis-platin. Differential expression patterns of miRNA between cis-platin sensitive and cis-platin resistant cell lines were analyzed using novel LNA technology. Results Our results revealed changes in expression of 11 miRNAs out of 1,500 miRNAs analyzed. Out of the 11 miRNAs identified, 5 were up-regulated in the A2780/CP70 cell line and 6 were down regulated as compared to cis-platin sensitive A2780 cells. Our microRNA data was further validated by quantitative real-time PCR for these selected miRNAs. Ingenuity Pathway Analysis (IPA and Kyoto Encyclopedia of Genes and Genomes (KEGG analysis was performed for the selected miRNAs and their putative targets to identify the potential pathways and networks involved in cis-platin resistance. Conclusions Our data clearly showed the differential expression of 11 miRNAs in cis-platin resistant cells, which could potentially target many important pathways including MAPK, TGF-β signaling, actin cytoskeleton, ubiquitin mediated
Inference of gene regulatory networks from time series by Tsallis entropy

Directory of Open Access Journals (Sweden)

de Oliveira Evaldo A

2011-05-01

Full Text Available Abstract Background The inference of gene regulatory networks (GRNs from large-scale expression profiles is one of the most challenging problems of Systems Biology nowadays. Many techniques and models have been proposed for this task. However, it is not generally possible to recover the original topology with great accuracy, mainly due to the short time series data in face of the high complexity of the networks and the intrinsic noise of the expression measurements. In order to improve the accuracy of GRNs inference methods based on entropy (mutual information, a new criterion function is here proposed. Results In this paper we introduce the use of generalized entropy proposed by Tsallis, for the inference of GRNs from time series expression profiles. The inference process is based on a feature selection approach and the conditional entropy is applied as criterion function. In order to assess the proposed methodology, the algorithm is applied to recover the network topology from temporal expressions generated by an artificial gene network (AGN model as well as from the DREAM challenge. The adopted AGN is based on theoretical models of complex networks and its gene transference function is obtained from random drawing on the set of possible Boolean functions, thus creating its dynamics. On the other hand, DREAM time series data presents variation of network size and its topologies are based on real networks. The dynamics are generated by continuous differential equations with noise and perturbation. By adopting both data sources, it is possible to estimate the average quality of the inference with respect to different network topologies, transfer functions and network sizes. Conclusions A remarkable improvement of accuracy was observed in the experimental results by reducing the number of false connections in the inferred topology by the non-Shannon entropy. The obtained best free parameter of the Tsallis entropy was on average in the range 2.5 �
DNA regulatory motif selection based on support vector machine ...

African Journals Online (AJOL)

... machine (SVM) and its application in microarray experiment of Kashin-Beck disease. ... speed and amount of the corresponding mRNA in gene replication process. ... and revealed that some motifs may be related to the immune reactions.
Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.

Directory of Open Access Journals (Sweden)

Rahul Karnik

Full Text Available The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs.
Improving the extraction of complex regulatory events from scientific text by using ontology-based inference.

Science.gov (United States)

Kim, Jung-Jae; Rebholz-Schuhmann, Dietrich

2011-10-06

The extraction of complex events from biomedical text is a challenging task and requires in-depth semantic analysis. Previous approaches associate lexical and syntactic resources with ontologies for the semantic analysis, but fall short in testing the benefits from the use of domain knowledge. We developed a system that deduces implicit events from explicitly expressed events by using inference rules that encode domain knowledge. We evaluated the system with the inference module on three tasks: First, when tested against a corpus with manually annotated events, the inference module of our system contributes 53.2% of correct extractions, but does not cause any incorrect results. Second, the system overall reproduces 33.1% of the transcription regulatory events contained in RegulonDB (up to 85.0% precision) and the inference module is required for 93.8% of the reproduced events. Third, we applied the system with minimum adaptations to the identification of cell activity regulation events, confirming that the inference improves the performance of the system also on this task. Our research shows that the inference based on domain knowledge plays a significant role in extracting complex events from text. This approach has great potential in recognizing the complex concepts of such biomedical ontologies as Gene Ontology in the literature.
Improving the extraction of complex regulatory events from scientific text by using ontology-based inference

Directory of Open Access Journals (Sweden)

Kim Jung-jae

2011-10-01

Full Text Available Abstract Background The extraction of complex events from biomedical text is a challenging task and requires in-depth semantic analysis. Previous approaches associate lexical and syntactic resources with ontologies for the semantic analysis, but fall short in testing the benefits from the use of domain knowledge. Results We developed a system that deduces implicit events from explicitly expressed events by using inference rules that encode domain knowledge. We evaluated the system with the inference module on three tasks: First, when tested against a corpus with manually annotated events, the inference module of our system contributes 53.2% of correct extractions, but does not cause any incorrect results. Second, the system overall reproduces 33.1% of the transcription regulatory events contained in RegulonDB (up to 85.0% precision and the inference module is required for 93.8% of the reproduced events. Third, we applied the system with minimum adaptations to the identification of cell activity regulation events, confirming that the inference improves the performance of the system also on this task. Conclusions Our research shows that the inference based on domain knowledge plays a significant role in extracting complex events from text. This approach has great potential in recognizing the complex concepts of such biomedical ontologies as Gene Ontology in the literature.

Connectivity in the yeast cell cycle transcription network: inferences from neural networks.

Directory of Open Access Journals (Sweden)

Christopher E Hart

2006-12-01

Full Text Available A current challenge is to develop computational approaches to infer gene network regulatory relationships based on multiple types of large-scale functional genomic data. We find that single-layer feed-forward artificial neural network (ANN models can effectively discover gene network structure by integrating global in vivo protein:DNA interaction data (ChIP/Array with genome-wide microarray RNA data. We test this on the yeast cell cycle transcription network, which is composed of several hundred genes with phase-specific RNA outputs. These ANNs were robust to noise in data and to a variety of perturbations. They reliably identified and ranked 10 of 12 known major cell cycle factors at the top of a set of 204, based on a sum-of-squared weights metric. Comparative analysis of motif occurrences among multiple yeast species independently confirmed relationships inferred from ANN weights analysis. ANN models can capitalize on properties of biological gene networks that other kinds of models do not. ANNs naturally take advantage of patterns of absence, as well as presence, of factor binding associated with specific expression output; they are easily subjected to in silico "mutation" to uncover biological redundancies; and they can use the full range of factor binding values. A prominent feature of cell cycle ANNs suggested an analogous property might exist in the biological network. This postulated that "network-local discrimination" occurs when regulatory connections (here between MBF and target genes are explicitly disfavored in one network module (G2, relative to others and to the class of genes outside the mitotic network. If correct, this predicts that MBF motifs will be significantly depleted from the discriminated class and that the discrimination will persist through evolution. Analysis of distantly related Schizosaccharomyces pombe confirmed this, suggesting that network-local discrimination is real and complements well-known enrichment of
Network motif-based identification of transcription factor-target gene relationships by integrating multi-source biological data

Directory of Open Access Journals (Sweden)

de los Reyes Benildo G

2008-04-01

Full Text Available Abstract Background Integrating data from multiple global assays and curated databases is essential to understand the spatio-temporal interactions within cells. Different experiments measure cellular processes at various widths and depths, while databases contain biological information based on established facts or published data. Integrating these complementary datasets helps infer a mutually consistent transcriptional regulatory network (TRN with strong similarity to the structure of the underlying genetic regulatory modules. Decomposing the TRN into a small set of recurring regulatory patterns, called network motifs (NM, facilitates the inference. Identifying NMs defined by specific transcription factors (TF establishes the framework structure of a TRN and allows the inference of TF-target gene relationship. This paper introduces a computational framework for utilizing data from multiple sources to infer TF-target gene relationships on the basis of NMs. The data include time course gene expression profiles, genome-wide location analysis data, binding sequence data, and gene ontology (GO information. Results The proposed computational framework was tested using gene expression data associated with cell cycle progression in yeast. Among 800 cell cycle related genes, 85 were identified as candidate TFs and classified into four previously defined NMs. The NMs for a subset of TFs are obtained from literature. Support vector machine (SVM classifiers were used to estimate NMs for the remaining TFs. The potential downstream target genes for the TFs were clustered into 34 biologically significant groups. The relationships between TFs and potential target gene clusters were examined by training recurrent neural networks whose topologies mimic the NMs to which the TFs are classified. The identified relationships between TFs and gene clusters were evaluated using the following biological validation and statistical analyses: (1 Gene set enrichment
State of the Art of Fuzzy Methods for Gene Regulatory Networks Inference

Directory of Open Access Journals (Sweden)

Tuqyah Abdullah Al Qazlan

2015-01-01

Full Text Available To address one of the most challenging issues at the cellular level, this paper surveys the fuzzy methods used in gene regulatory networks (GRNs inference. GRNs represent causal relationships between genes that have a direct influence, trough protein production, on the life and the development of living organisms and provide a useful contribution to the understanding of the cellular functions as well as the mechanisms of diseases. Fuzzy systems are based on handling imprecise knowledge, such as biological information. They provide viable computational tools for inferring GRNs from gene expression data, thus contributing to the discovery of gene interactions responsible for specific diseases and/or ad hoc correcting therapies. Increasing computational power and high throughput technologies have provided powerful means to manage these challenging digital ecosystems at different levels from cell to society globally. The main aim of this paper is to report, present, and discuss the main contributions of this multidisciplinary field in a coherent and structured framework.
Core regulatory network motif underlies the ocellar complex patterning in Drosophila melanogaster

Science.gov (United States)

Aguilar-Hidalgo, D.; Lemos, M. C.; Córdoba, A.

2015-03-01

During organogenesis, developmental programs governed by Gene Regulatory Networks (GRN) define the functionality, size and shape of the different constituents of living organisms. Robustness, thus, is an essential characteristic that GRNs need to fulfill in order to maintain viability and reproducibility in a species. In the present work we analyze the robustness of the patterning for the ocellar complex formation in Drosophila melanogaster fly. We have systematically pruned the GRN that drives the development of this visual system to obtain the minimum pathway able to satisfy this pattern. We found that the mechanism underlying the patterning obeys to the dynamics of a 3-nodes network motif with a double negative feedback loop fed by a morphogenetic gradient that triggers the inhibition in a French flag problem fashion. A Boolean modeling of the GRN confirms robustness in the patterning mechanism showing the same result for different network complexity levels. Interestingly, the network provides a steady state solution in the interocellar part of the patterning and an oscillatory regime in the ocelli. This theoretical result predicts that the ocellar pattern may underlie oscillatory dynamics in its genetic regulation.
Investment in the CEE/CIS region

International Nuclear Information System (INIS)

Lemierre, J.

2002-01-01

The energy investments in the Central and Eastern European region and the Commonwealth of Independent States (CIS) region are discussed in this Keynote Address. The message is addressed to regulators and governments. The restructuring of old industries to save energy is highlighted. The regulatory system must undergo a substantial reform. Another message is placed for investors in the energy field. (R.P.)
Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium.

Science.gov (United States)

Catania, Francesco; Lynch, Michael

2010-05-04

In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.
Metamotifs - a generative model for building families of nucleotide position weight matrices

Directory of Open Access Journals (Sweden)

Down Thomas A

2010-06-01

Full Text Available Abstract Background Development of high-throughput methods for measuring DNA interactions of transcription factors together with computational advances in short motif inference algorithms is expanding our understanding of transcription factor binding site motifs. The consequential growth of sequence motif data sets makes it important to systematically group and categorise regulatory motifs. It has been shown that there are familial tendencies in DNA sequence motifs that are predictive of the family of factors that binds them. Further development of methods that detect and describe familial motif trends has the potential to help in measuring the similarity of novel computational motif predictions to previously known data and sensitively detecting regulatory motifs similar to previously known ones from novel sequence. Results We propose a probabilistic model for position weight matrix (PWM sequence motif families. The model, which we call the 'metamotif' describes recurring familial patterns in a set of motifs. The metamotif framework models variation within a family of sequence motifs. It allows for simultaneous estimation of a series of independent metamotifs from input position weight matrix (PWM motif data and does not assume that all input motif columns contribute to a familial pattern. We describe an algorithm for inferring metamotifs from weight matrix data. We then demonstrate the use of the model in two practical tasks: in the Bayesian NestedMICA model inference algorithm as a PWM prior to enhance motif inference sensitivity, and in a motif classification task where motifs are labelled according to their interacting DNA binding domain. Conclusions We show that metamotifs can be used as PWM priors in the NestedMICA motif inference algorithm to dramatically increase the sensitivity to infer motifs. Metamotifs were also successfully applied to a motif classification problem where sequence motif features were used to predict the family of
Regulatory motifs for CREB-binding protein and Nfe2l2 transcription factors in the upstream enhancer of the mitochondrial uncoupling protein 1 gene.

Science.gov (United States)

Rim, Jong S; Kozak, Leslie P

2002-09-13

Thermogenesis against cold exposure in mammals occurs in brown adipose tissue (BAT) through mitochondrial uncoupling protein (UCP1). Expression of the Ucp1 gene is unique in brown adipocytes and is regulated tightly. The 5'-flanking region of the mouse Ucp1 gene contains cis-acting elements including PPRE, TRE, and four half-site cAMP-responsive elements (CRE) with BAT-specific enhancer elements. In the course of analyzing how these half-site CREs are involved in Ucp1 expression, we found that a DNA regulatory element for NF-E2 overlaps CRE2. Electrophoretic mobility shift assay and competition assays with the CRE2 element indicates that nuclear proteins from BAT, inguinal fat, and retroperitoneal fat tissue interact with the CRE2 motif (CGTCA) in a specific manner. A supershift assay using an antibody against the CRE-binding protein (CREB) shows specific affinity to the complex from CRE2 and nuclear extract of BAT. Additionally, Western blot analysis for phospho-CREB/ATF1 shows an increase in phosphorylation of CREB/ATF1 in HIB-1B cells after norepinephrine treatment. Transient transfection assay using luciferase reporter constructs also indicates that the two half-site CREs are involved in transcriptional regulation of Ucp1 in response to norepinephrine and cAMP. We also show that a second DNA regulatory element for NF-E2 is located upstream of the CRE2 region. This element, which is found in a similar location in the 5'-flanking region of the human and rodent Ucp1 genes, shows specific binding to rat and human NF-E2 by electrophoretic mobility shift assay with nuclear extracts from brown fat. Co-transfections with an Nfe2l2 expression vector and a luciferase reporter construct of the Ucp1 enhancer region provide additional evidence that Nfe2l2 is involved in the regulation of Ucp1 by cAMP-mediated signaling.
Comprehensive human transcription factor binding site map for combinatory binding motifs discovery.

Directory of Open Access Journals (Sweden)

Arnoldo J Müller-Molina

Full Text Available To know the map between transcription factors (TFs and their binding sites is essential to reverse engineer the regulation process. Only about 10%-20% of the transcription factor binding motifs (TFBMs have been reported. This lack of data hinders understanding gene regulation. To address this drawback, we propose a computational method that exploits never used TF properties to discover the missing TFBMs and their sites in all human gene promoters. The method starts by predicting a dictionary of regulatory "DNA words." From this dictionary, it distills 4098 novel predictions. To disclose the crosstalk between motifs, an additional algorithm extracts TF combinatorial binding patterns creating a collection of TF regulatory syntactic rules. Using these rules, we narrowed down a list of 504 novel motifs that appear frequently in syntax patterns. We tested the predictions against 509 known motifs confirming that our system can reliably predict ab initio motifs with an accuracy of 81%-far higher than previous approaches. We found that on average, 90% of the discovered combinatorial binding patterns target at least 10 genes, suggesting that to control in an independent manner smaller gene sets, supplementary regulatory mechanisms are required. Additionally, we discovered that the new TFBMs and their combinatorial patterns convey biological meaning, targeting TFs and genes related to developmental functions. Thus, among all the possible available targets in the genome, the TFs tend to regulate other TFs and genes involved in developmental functions. We provide a comprehensive resource for regulation analysis that includes a dictionary of "DNA words," newly predicted motifs and their corresponding combinatorial patterns. Combinatorial patterns are a useful filter to discover TFBMs that play a major role in orchestrating other factors and thus, are likely to lock/unlock cellular functional clusters.
Convergent evolution and mimicry of protein linear motifs in host-pathogen interactions.

Science.gov (United States)

Chemes, Lucía Beatriz; de Prat-Gay, Gonzalo; Sánchez, Ignacio Enrique

2015-06-01

Pathogen linear motif mimics are highly evolvable elements that facilitate rewiring of host protein interaction networks. Host linear motifs and pathogen mimics differ in sequence, leading to thermodynamic and structural differences in the resulting protein-protein interactions. Moreover, the functional output of a mimic depends on the motif and domain repertoire of the pathogen protein. Regulatory evolution mediated by linear motifs can be understood by measuring evolutionary rates, quantifying positive and negative selection and performing phylogenetic reconstructions of linear motif natural history. Convergent evolution of linear motif mimics is widespread among unrelated proteins from viral, prokaryotic and eukaryotic pathogens and can also take place within individual protein phylogenies. Statistics, biochemistry and laboratory models of infection link pathogen linear motifs to phenotypic traits such as tropism, virulence and oncogenicity. In vitro evolution experiments and analysis of natural sequences suggest that changes in linear motif composition underlie pathogen adaptation to a changing environment. Copyright © 2015 Elsevier Ltd. All rights reserved.
A 20 bp cis-acting element is both necessary and sufficient to mediate elicitor response of a maize PRms gene.

Science.gov (United States)

Raventós, D; Jensen, A B; Rask, M B; Casacuberta, J M; Mundy, J; San Segundo, B

1995-01-01

Transient gene expression assays in barley aleurone protoplasts were used to identify a cis-regulatory element involved in the elicitor-responsive expression of the maize PRms gene. Analysis of transcriptional fusions between PRms 5' upstream sequences and a chloramphenicol acetyltransferase reporter gene, as well as chimeric promoters containing PRms promoter fragments or repeated oligonucleotides fused to a minimal promoter, delineated a 20 bp sequence which functioned as an elicitor-response element (ERE). This sequence contains a motif (-246 AATTGACC) similar to sequences found in promoters of other pathogen-responsive genes. The analysis also indicated that an enhancing sequence(s) between -397 and -296 is required for full PRms activation by elicitors. The protein kinase inhibitor staurosporine was found to completely block the transcriptional activation induced by elicitors. These data indicate that protein phosphorylation is involved in the signal transduction pathway leading to PRms expression.
An Analysis of Multi-type Relational Interactions in FMA Using Graph Motifs with Disjointness Constraints

Science.gov (United States)

Zhang, Guo-Qiang; Luo, Lingyun; Ogbuji, Chime; Joslyn, Cliff; Mejino, Jose; Sahoo, Satya S

2012-01-01

The interaction of multiple types of relationships among anatomical classes in the Foundational Model of Anatomy (FMA) can provide inferred information valuable for quality assurance. This paper introduces a method called Motif Checking (MOCH) to study the effects of such multi-relation type interactions for detecting logical inconsistencies as well as other anomalies represented by the motifs. MOCH represents patterns of multi-type interaction as small labeled (with multiple types of edges) sub-graph motifs, whose nodes represent class variables, and labeled edges represent relational types. By representing FMA as an RDF graph and motifs as SPARQL queries, fragments of FMA are automatically obtained as auditing candidates. Leveraging the scalability and reconfigurability of Semantic Web Technology, we performed exhaustive analyses of a variety of labeled sub-graph motifs. The quality assurance feature of MOCH comes from the distinct use of a subset of the edges of the graph motifs as constraints for disjointness, whereby bringing in rule-based flavor to the approach as well. With possible disjointness implied by antonyms, we performed manual inspection of the resulting FMA fragments and tracked down sources of abnormal inferred conclusions (logical inconsistencies), which are amendable for programmatic revision of the FMA. Our results demonstrate that MOCH provides a unique source of valuable information for quality assurance. Since our approach is general, it is applicable to any ontological system with an OWL representation. PMID:23304382
An analysis of multi-type relational interactions in FMA using graph motifs with disjointness constraints.

Science.gov (United States)

Zhang, Guo-Qiang; Luo, Lingyun; Ogbuji, Chime; Joslyn, Cliff; Mejino, Jose; Sahoo, Satya S

2012-01-01

The interaction of multiple types of relationships among anatomical classes in the Foundational Model of Anatomy (FMA) can provide inferred information valuable for quality assurance. This paper introduces a method called Motif Checking (MOCH) to study the effects of such multi-relation type interactions for detecting logical inconsistencies as well as other anomalies represented by the motifs. MOCH represents patterns of multi-type interaction as small labeled (with multiple types of edges) sub-graph motifs, whose nodes represent class variables, and labeled edges represent relational types. By representing FMA as an RDF graph and motifs as SPARQL queries, fragments of FMA are automatically obtained as auditing candidates. Leveraging the scalability and reconfigurability of Semantic Web Technology, we performed exhaustive analyses of a variety of labeled sub-graph motifs. The quality assurance feature of MOCH comes from the distinct use of a subset of the edges of the graph motifs as constraints for disjointness, whereby bringing in rule-based flavor to the approach as well. With possible disjointness implied by antonyms, we performed manual inspection of the resulting FMA fragments and tracked down sources of abnormal inferred conclusions (logical inconsistencies), which are amendable for programmatic revision of the FMA. Our results demonstrate that MOCH provides a unique source of valuable information for quality assurance. Since our approach is general, it is applicable to any ontological system with an OWL representation.
Composite Structural Motifs of Binding Sites for Delineating Biological Functions of Proteins

Science.gov (United States)

Kinjo, Akira R.; Nakamura, Haruki

2012-01-01

Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs that represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures. PMID:22347478
MYC cis-Elements in PsMPT Promoter Is Involved in Chilling Response of Paeonia suffruticosa.

Directory of Open Access Journals (Sweden)

Yuxi Zhang

Full Text Available The MPT transports Pi to synthesize ATP. PsMPT, a chilling-induced gene, was previously reported to promote energy metabolism during bud dormancy release in tree peony. In this study, the regulatory elements of PsMPT promoter involved in chilling response were further analyzed. The PsMPT transcript was detected in different tree peony tissues and was highly expressed in the flower organs, including petal, stigma and stamen. An 1174 bp of the PsMPT promoter was isolated by TAIL-PCR, and the PsMPT promoter::GUS transgenic Arabidopsis was generated and analyzed. GUS staining and qPCR showed that the promoter was active in mainly the flower stigma and stamen. Moreover, it was found that the promoter activity was enhanced by chilling, NaCl, GA, ACC and NAA, but inhibited by ABA, mannitol and PEG. In transgenic plants harboring 421 bp of the PsMPT promoter, the GUS gene expression and the activity were significantly increased by chilling treatment. When the fragment from -421 to -408 containing a MYC cis-element was deleted, the chilling response could not be observed. Further mutation analysis confirmed that the MYC element was one of the key motifs responding to chilling in the PsMPT promoter. The present study provides useful information for further investigation of the regulatory mechanism of PsMPT during the endo-dormancy release.
Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium

Directory of Open Access Journals (Sweden)

Lynch Michael

2010-05-01

Full Text Available Abstract Background In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa remains a virtually unexplored issue. Results By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Conclusions Our observations 1 shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2 are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3 reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.
Inflammatory gene regulatory networks in amnion cells following cytokine stimulation: translational systems approach to modeling human parturition.

Directory of Open Access Journals (Sweden)

Ruth Li

Full Text Available A majority of the studies examining the molecular regulation of human labor have been conducted using single gene approaches. While the technology to produce multi-dimensional datasets is readily available, the means for facile analysis of such data are limited. The objective of this study was to develop a systems approach to infer regulatory mechanisms governing global gene expression in cytokine-challenged cells in vitro, and to apply these methods to predict gene regulatory networks (GRNs in intrauterine tissues during term parturition. To this end, microarray analysis was applied to human amnion mesenchymal cells (AMCs stimulated with interleukin-1β, and differentially expressed transcripts were subjected to hierarchical clustering, temporal expression profiling, and motif enrichment analysis, from which a GRN was constructed. These methods were then applied to fetal membrane specimens collected in the absence or presence of spontaneous term labor. Analysis of cytokine-responsive genes in AMCs revealed a sterile immune response signature, with promoters enriched in response elements for several inflammation-associated transcription factors. In comparison to the fetal membrane dataset, there were 34 genes commonly upregulated, many of which were part of an acute inflammation gene expression signature. Binding motifs for nuclear factor-κB were prominent in the gene interaction and regulatory networks for both datasets; however, we found little evidence to support the utilization of pathogen-associated molecular pattern (PAMP signaling. The tissue specimens were also enriched for transcripts governed by hypoxia-inducible factor. The approach presented here provides an uncomplicated means to infer global relationships among gene clusters involved in cellular responses to labor-associated signals.
Comparative genome sequencing of drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

Energy Technology Data Exchange (ETDEWEB)

Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

2004-04-01

The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.
Cis-regulatory control of the nuclear receptor Coup-TF gene in the sea urchin Paracentrotus lividus embryo.

Directory of Open Access Journals (Sweden)

Lamprini G Kalampoki

Full Text Available Coup-TF, an orphan member of the nuclear receptor super family, has a fundamental role in the development of metazoan embryos. The study of the gene's regulatory circuit in the sea urchin embryo will facilitate the placement of this transcription factor in the well-studied embryonic Gene Regulatory Network (GRN. The Paracentrotus lividus Coup-TF gene (PlCoup-TF is expressed throughout embryonic development preferentially in the oral ectoderm of the gastrula and the ciliary band of the pluteus stage. Two overlapping λ genomic clones, containing three exons and upstream sequences of PlCoup-TF, were isolated from a genomic library. The transcription initiation site was determined and 5' deletions and individual segments of a 1930 bp upstream region were placed ahead of a GFP reporter cassette and injected into fertilized P.lividus eggs. Module a (-532 to -232, was necessary and sufficient to confer ciliary band expression to the reporter. Comparison of P.lividus and Strongylocentrotus purpuratus upstream Coup-TF sequences, revealed considerable conservation, but none within module a. 5' and internal deletions into module a, defined a smaller region that confers ciliary band specific expression. Putative regulatory cis-acting elements (RE1, RE2 and RE3 within module a, were specifically bound by proteins in sea urchin embryonic nuclear extracts. Site-specific mutagenesis of these elements resulted in loss of reporter activity (RE1 or ectopic expression (RE2, RE3. It is proposed that sea urchin transcription factors, which bind these three regulatory sites, are necessary for spatial and quantitative regulation of the PlCoup-TF gene at pluteus stage sea urchin embryos. These findings lead to the future identification of these factors and to the hierarchical positioning of PlCoup-TF within the embryonic GRN.
Preaxial polydactyly/triphalangeal thumb is associated with changed transcription factor-binding affinity in a family with a novel point mutation in the long-range cis-regulatory element ZRS

DEFF Research Database (Denmark)

Farooq, Muhammad; Troelsen, Jesper T; Boyd, Mette

2010-01-01

A cis-regulatory sequence also known as zone of polarizing activity (ZPA) regulatory sequence (ZRS) located in intron 5 of LMBR1 is essential for expression of sonic hedgehog (SHH) in the developing posterior limb bud mesenchyme. Even though many point mutations causing preaxial duplication defects...... demonstrated a marked difference between wild-type and the mutant probe, which uniquely bound one or several transcription factors extracted from Caco-2 cells. This finding supports a model in which ectopic anterior SHH expression in the developing limb results from abnormal binding of one or more...

Multiple Linear Regression for Reconstruction of Gene Regulatory Networks in Solving Cascade Error Problems.

Science.gov (United States)

Salleh, Faridah Hani Mohamed; Zainudin, Suhaila; Arif, Shereena M

2017-01-01

Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods had been inaccurate prediction of cascade motifs. Cascade error is defined as the wrong prediction of cascade motifs, where an indirect interaction is misinterpreted as a direct interaction. Despite the active research on various GRN prediction methods, the discussion on specific methods to solve problems related to cascade errors is still lacking. In fact, the experiments conducted by the past studies were not specifically geared towards proving the ability of GRN prediction methods in avoiding the occurrences of cascade errors. Hence, this research aims to propose Multiple Linear Regression (MLR) to infer GRN from gene expression data and to avoid wrongly inferring of an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations of the real experiment datasets was far less than the number of predictors, some predictors were eliminated by extracting the random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade error by using a novel experimental procedure that had been proposed in this work. The experiment revealed that the number of cascade errors had been very minimal. Apart from that, the Belsley collinearity test proved that multicollinearity did affect the datasets used in this experiment greatly. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5.
PReMod: a database of genome-wide mammalian cis-regulatory module predictions.

Science.gov (United States)

Ferretti, Vincent; Poitras, Christian; Bergeron, Dominique; Coulombe, Benoit; Robert, François; Blanchette, Mathieu

2007-01-01

We describe PReMod, a new database of genome-wide cis-regulatory module (CRM) predictions for both the human and the mouse genomes. The prediction algorithm, described previously in Blanchette et al. (2006) Genome Res., 16, 656-668, exploits the fact that many known CRMs are made of clusters of phylogenetically conserved and repeated transcription factors (TF) binding sites. Contrary to other existing databases, PReMod is not restricted to modules located proximal to genes, but in fact mostly contains distal predicted CRMs (pCRMs). Through its web interface, PReMod allows users to (i) identify pCRMs around a gene of interest; (ii) identify pCRMs that have binding sites for a given TF (or a set of TFs) or (iii) download the entire dataset for local analyses. Queries can also be refined by filtering for specific chromosomal regions, for specific regions relative to genes or for the presence of CpG islands. The output includes information about the binding sites predicted within the selected pCRMs, and a graphical display of their distribution within the pCRMs. It also provides a visual depiction of the chromosomal context of the selected pCRMs in terms of neighboring pCRMs and genes, all of which are linked to the UCSC Genome Browser and the NCBI. PReMod: http://genomequebec.mcgill.ca/PReMod.
CompariMotif: quick and easy comparisons of sequence motifs.

Science.gov (United States)

Edwards, Richard J; Davey, Norman E; Shields, Denis C

2008-05-15

CompariMotif is a novel tool for making motif-motif comparisons, identifying and describing similarities between regular expression motifs. CompariMotif can identify a number of different relationships between motifs, including exact matches, variants of degenerate motifs and complex overlapping motifs. Motif relationships are scored using shared information content, allowing the best matches to be easily identified in large comparisons. Many input and search options are available, enabling a list of motifs to be compared to itself (to identify recurring motifs) or to datasets of known motifs. CompariMotif can be run online at http://bioware.ucd.ie/ and is freely available for academic use as a set of open source Python modules under a GNU General Public License from http://bioinformatics.ucd.ie/shields/software/comparimotif/
cDNA cloning, genomic organization and expression analysis during somatic embryogenesis of the translationally controlled tumor protein (TCTP) gene from Japanese larch (Larix leptolepis).

Science.gov (United States)

Zhang, Li-Feng; Li, Wan-Feng; Han, Su-Ying; Yang, Wen-Hua; Qi, Li-Wang

2013-10-15

A full-length cDNA and genomic sequences of a translationally controlled tumor protein (TCTP) gene were isolated from Japanese larch (Larix leptolepis) and designated LaTCTP. The length of the cDNA was 1, 043 bp and contained a 504 bp open reading frame that encodes a predicted protein of 167 amino acids, characterized by two signature sequences of the TCTP protein family. Analysis of the LaTCTP gene structure indicated four introns and five exons, and it is the largest of all currently known TCTP genes in plants. The 5'-flanking promoter region of LaTCTP was cloned using an improved TAIL-PCR technique. In this region we identified many important potential cis-acting elements, such as a Box-W1 (fungal elicitor responsive element), a CAT-box (cis-acting regulatory element related to meristem expression), a CGTCA-motif (cis-acting regulatory element involved in MeJA-responsiveness), a GT1-motif (light responsive element), a Skn-1-motif (cis-acting regulatory element required for endosperm expression) and a TGA-element (auxin-responsive element), suggesting that expression of LaTCTP is highly regulated. Expression analysis demonstrated ubiquitous localization of LaTCTP mRNA in the roots, stems and needles, high mRNA levels in the embryonal-suspensor mass (ESM), browning embryogenic cultures and mature somatic embryos, and low levels of mRNA at day five during somatic embryogenesis. We suggest that LaTCTP might participate in the regulation of somatic embryo development. These results provide a theoretical basis for understanding the molecular regulatory mechanism of LaTCTP and lay the foundation for artificial regulation of somatic embryogenesis. © 2013.
Mapping of cis-regulatory sites in the promoter of testis-specific stellate genes of Drosophila melanogaster.

Science.gov (United States)

Olenkina, O M; Egorova, K S; Aravin, A A; Naumova, N M; Gvozdev, V A; Olenina, L V

2012-11-01

Tandem Stellate genes organized into two clusters in heterochromatin and euchromatin of the X-chromosome are part of the Ste-Su(Ste) genetic system required for maintenance of male fertility and reproduction of Drosophila melanogaster. Stellate genes encode a regulatory subunit of protein kinase CK2 and are the main targets of germline-specific piRNA-silencing; their derepression leads to appearance of protein crystals in spermatocytes, meiotic disturbances, and male sterility. A short promoter region of 134 bp appears to be sufficient for testis-specific transcription of Stellate, and it contains three closely located cis-regulatory elements called E-boxes. By using reporter analysis, we confirmed a strong functionality of the E-boxes in the Stellate promoter for in vivo transcription. Using selective mutagenesis, we have shown that the presence of the central E-box 2 is preferable to maintain a high-level testis-specific transcription of the reporter gene under the Stellate promoter. The Stellate promoter provides transcription even in heterochromatin, and corresponding mRNAs are translated with the generation of full-size protein products in case of disturbances in the piRNA-silencing process. We have also shown for the first time that the activity of the Stellate promoter is determined by chromatin context of the X-chromosome in male germinal cells, and it increases at about twofold when relocating in autosomes.
Selection against spurious promoter motifs correlates withtranslational efficiency across bacteria

Energy Technology Data Exchange (ETDEWEB)

Froula, Jeffrey L.; Francino, M. Pilar

2007-05-01

Because binding of RNAP to misplaced sites could compromise the efficiency of transcription, natural selection for the optimization of gene expression should regulate the distribution of DNA motifs capable of RNAP-binding across the genome. Here we analyze the distribution of the -10 promoter motifs that bind the {sigma}{sup 70} subunit of RNAP in 42 bacterial genomes. We show that selection on these motifs operates across the genome, maintaining an over-representation of -10 motifs in regulatory sequences while eliminating them from the nonfunctional and, in most cases, from the protein coding regions. In some genomes, however, -10 sites are over-represented in the coding sequences; these sites could induce pauses effecting regulatory roles throughout the length of a transcriptional unit. For nonfunctional sequences, the extent of motif under-representation varies across genomes in a manner that broadly correlates with the number of tRNA genes, a good indicator of translational speed and growth rate. This suggests that minimizing the time invested in gene transcription is an important selective pressure against spurious binding. However, selection against spurious binding is detectable in the reduced genomes of host-restricted bacteria that grow at slow rates, indicating that components of efficiency other than speed may also be important. Minimizing the number of RNAP molecules per cell required for transcription, and the corresponding energetic expense, may be most relevant in slow growers. These results indicate that genome-level properties affecting the efficiency of transcription and translation can respond in an integrated manner to optimize gene expression. The detection of selection against promoter motifs in nonfunctional regions also implies that no sequence may evolve free of selective constraints, at least in the relatively small and unstructured genomes of bacteria.
Integrative modeling of eQTLs and cis-regulatory elements suggests mechanisms underlying cell type specificity of eQTLs.

Directory of Open Access Journals (Sweden)

Christopher D Brown

Full Text Available Genetic variants in cis-regulatory elements or trans-acting regulators frequently influence the quantity and spatiotemporal distribution of gene transcription. Recent interest in expression quantitative trait locus (eQTL mapping has paralleled the adoption of genome-wide association studies (GWAS for the analysis of complex traits and disease in humans. Under the hypothesis that many GWAS associations tag non-coding SNPs with small effects, and that these SNPs exert phenotypic control by modifying gene expression, it has become common to interpret GWAS associations using eQTL data. To fully exploit the mechanistic interpretability of eQTL-GWAS comparisons, an improved understanding of the genetic architecture and causal mechanisms of cell type specificity of eQTLs is required. We address this need by performing an eQTL analysis in three parts: first we identified eQTLs from eleven studies on seven cell types; then we integrated eQTL data with cis-regulatory element (CRE data from the ENCODE project; finally we built a set of classifiers to predict the cell type specificity of eQTLs. The cell type specificity of eQTLs is associated with eQTL SNP overlap with hundreds of cell type specific CRE classes, including enhancer, promoter, and repressive chromatin marks, regions of open chromatin, and many classes of DNA binding proteins. These associations provide insight into the molecular mechanisms generating the cell type specificity of eQTLs and the mode of regulation of corresponding eQTLs. Using a random forest classifier with cell specific CRE-SNP overlap as features, we demonstrate the feasibility of predicting the cell type specificity of eQTLs. We then demonstrate that CREs from a trait-associated cell type can be used to annotate GWAS associations in the absence of eQTL data for that cell type. We anticipate that such integrative, predictive modeling of cell specificity will improve our ability to understand the mechanistic basis of human
Reconstruction of gene regulatory modules from RNA silencing of IFN-α modulators: experimental set-up and inference method.

Science.gov (United States)

Grassi, Angela; Di Camillo, Barbara; Ciccarese, Francesco; Agnusdei, Valentina; Zanovello, Paola; Amadori, Alberto; Finesso, Lorenzo; Indraccolo, Stefano; Toffolo, Gianna Maria

2016-03-12

Inference of gene regulation from expression data may help to unravel regulatory mechanisms involved in complex diseases or in the action of specific drugs. A challenging task for many researchers working in the field of systems biology is to build up an experiment with a limited budget and produce a dataset suitable to reconstruct putative regulatory modules worth of biological validation. Here, we focus on small-scale gene expression screens and we introduce a novel experimental set-up and a customized method of analysis to make inference on regulatory modules starting from genetic perturbation data, e.g. knockdown and overexpression data. To illustrate the utility of our strategy, it was applied to produce and analyze a dataset of quantitative real-time RT-PCR data, in which interferon-α (IFN-α) transcriptional response in endothelial cells is investigated by RNA silencing of two candidate IFN-α modulators, STAT1 and IFIH1. A putative regulatory module was reconstructed by our method, revealing an intriguing feed-forward loop, in which STAT1 regulates IFIH1 and they both negatively regulate IFNAR1. STAT1 regulation on IFNAR1 was object of experimental validation at the protein level. Detailed description of the experimental set-up and of the analysis procedure is reported, with the intent to be of inspiration for other scientists who want to realize similar experiments to reconstruct gene regulatory modules starting from perturbations of possible regulators. Application of our approach to the study of IFN-α transcriptional response modulators in endothelial cells has led to many interesting novel findings and new biological hypotheses worth of validation.
Identification of cis-regulatory sequences that activate transcription in the suspensor of plant embryos.

Science.gov (United States)

Kawashima, Tomokazu; Wang, Xingjun; Henry, Kelli F; Bi, Yuping; Weterings, Koen; Goldberg, Robert B

2009-03-03

Little is known about the molecular mechanisms by which the embryo proper and suspensor of plant embryos activate specific gene sets shortly after fertilization. We analyzed the upstream region of the scarlet runner bean (Phaseolus coccineus) G564 gene to understand how genes are activated specifically within the suspensor during early embryo development. Previously, we showed that the G564 upstream region has a block of tandem repeats, which contain a conserved 10-bp motif (GAAAAG(C)/(T)GAA), and that deletion of these repeats results in a loss of suspensor transcription. Here, we use gain-of-function (GOF) experiments with transgenic globular-stage tobacco embryos to show that only 1 of the 5 tandem repeats is required to drive suspensor-specific transcription. Fine-scale deletion and scanning mutagenesis experiments with 1 tandem repeat uncovered a 54-bp region that contains all of the sequences required to activate transcription in the suspensor, including the 10-bp motif (GAAAAGCGAA) and a similar 10-bp-like motif (GAAAAACGAA). Site-directed mutagenesis and GOF experiments indicated that both the 10-bp and 10-bp-like motifs are necessary, but not sufficient to activate transcription in the suspensor, and that a sequence (TTGGT) between the 10-bp and the 10-bp-like motifs is also necessary for suspensor transcription. Together, these data identify sequences that are required to activate transcription in the suspensor of a plant embryo after fertilization.
Meta-analysis of breast cancer microarray studies in conjunction with conserved cis-elements suggest patterns for coordinate regulation

Directory of Open Access Journals (Sweden)

Lundberg Cathryn

2008-01-01

Full Text Available Abstract Background Gene expression measurements from breast cancer (BrCa tumors are established clinical predictive tools to identify tumor subtypes, identify patients showing poor/good prognosis, and identify patients likely to have disease recurrence. However, diverse breast cancer datasets in conjunction with diagnostic clinical arrays show little overlap in the sets of genes identified. One approach to identify a set of consistently dysregulated candidate genes in these tumors is to employ meta-analysis of multiple independent microarray datasets. This allows one to compare expression data from a diverse collection of breast tumor array datasets generated on either cDNA or oligonucleotide arrays. Results We gathered expression data from 9 published microarray studies examining estrogen receptor positive (ER+ and estrogen receptor negative (ER- BrCa tumor cases from the Oncomine database. We performed a meta-analysis and identified genes that were universally up or down regulated with respect to ER+ versus ER- tumor status. We surveyed both the proximal promoter and 3' untranslated regions (3'UTR of our top-ranking genes in each expression group to test whether common sequence elements may contribute to the observed expression patterns. Utilizing a combination of known transcription factor binding sites (TFBS, evolutionarily conserved mammalian promoter and 3'UTR motifs, and microRNA (miRNA seed sequences, we identified numerous motifs that were disproportionately represented between the two gene classes suggesting a common regulatory network for the observed gene expression patterns. Conclusion Some of the genes we identified distinguish key transcripts previously seen in array studies, while others are newly defined. Many of the genes identified as overexpressed in ER- tumors were previously identified as expression markers for neoplastic transformation in multiple human cancers. Moreover, our motif analysis identified a collection of
Inferring Drosophila gap gene regulatory network: Pattern analysis of simulated gene expression profiles and stability analysis

OpenAIRE

Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.

2009-01-01

Abstract Background Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori assumptions about the interactions, which all simulate the observed patterns. It is important to analyze the properties of the circuits. Findings We have analyzed the simulated gene expression ...
Multiple Linear Regression for Reconstruction of Gene Regulatory Networks in Solving Cascade Error Problems

Directory of Open Access Journals (Sweden)

Faridah Hani Mohamed Salleh

2017-01-01

Full Text Available Gene regulatory network (GRN reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods had been inaccurate prediction of cascade motifs. Cascade error is defined as the wrong prediction of cascade motifs, where an indirect interaction is misinterpreted as a direct interaction. Despite the active research on various GRN prediction methods, the discussion on specific methods to solve problems related to cascade errors is still lacking. In fact, the experiments conducted by the past studies were not specifically geared towards proving the ability of GRN prediction methods in avoiding the occurrences of cascade errors. Hence, this research aims to propose Multiple Linear Regression (MLR to infer GRN from gene expression data and to avoid wrongly inferring of an indirect interaction (A → B → C as a direct interaction (A → C. Since the number of observations of the real experiment datasets was far less than the number of predictors, some predictors were eliminated by extracting the random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade error by using a novel experimental procedure that had been proposed in this work. The experiment revealed that the number of cascade errors had been very minimal. Apart from that, the Belsley collinearity test proved that multicollinearity did affect the datasets used in this experiment greatly. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5.
Computational exploration of cis-regulatory modules in rhythmic expression data using the "Exploration of Distinctive CREs and CRMs" (EDCC) and "CRM Network Generator" (CNG) programs.

Science.gov (United States)

Bekiaris, Pavlos Stephanos; Tekath, Tobias; Staiger, Dorothee; Danisman, Selahattin

2018-01-01

Understanding the effect of cis-regulatory elements (CRE) and clusters of CREs, which are called cis-regulatory modules (CRM), in eukaryotic gene expression is a challenge of computational biology. We developed two programs that allow simple, fast and reliable analysis of candidate CREs and CRMs that may affect specific gene expression and that determine positional features between individual CREs within a CRM. The first program, "Exploration of Distinctive CREs and CRMs" (EDCC), correlates candidate CREs and CRMs with specific gene expression patterns. For pairs of CREs, EDCC also determines positional preferences of the single CREs in relation to each other and to the transcriptional start site. The second program, "CRM Network Generator" (CNG), prioritizes these positional preferences using a neural network and thus allows unbiased rating of the positional preferences that were determined by EDCC. We tested these programs with data from a microarray study of circadian gene expression in Arabidopsis thaliana. Analyzing more than 1.5 million pairwise CRE combinations, we found 22 candidate combinations, of which several contained known clock promoter elements together with elements that had not been identified as relevant to circadian gene expression before. CNG analysis further identified positional preferences of these CRE pairs, hinting at positional information that may be relevant for circadian gene expression. Future wet lab experiments will have to determine which of these combinations confer daytime specific circadian gene expression.
An approach to evaluate the topological significance of motifs and other patterns in regulatory networks

Directory of Open Access Journals (Sweden)

Wingender Edgar

2009-05-01

that enables to evaluate the topological significance of various connected patterns in a regulatory network. Applying this method onto transcriptional networks of three largely distinct organisms we could prove that it is highly suitable to identify most important pattern instances, but that neither motifs nor any pattern in general appear to play a particularly important role per se. From the results obtained so far, we conclude that the pairwise disconnectivity index will most likely prove useful as well in identifying other (higher-order pattern instances in transcriptional and other networks.
In silico Analysis of osr40c1 Promoter Sequence Isolated from Indica Variety Pokkali

Directory of Open Access Journals (Sweden)

W.S.I. de Silva

2017-07-01

Full Text Available The promoter region of a drought and abscisic acid (ABA inducible gene, osr40c1, was isolated from a salt-tolerant indica rice variety Pokkali, which is 670 bp upstream of the putative translation start codon. In silico promoter analysis of resulted sequence showed that at least 15 types of putative motifs were distributed within the sequence, including two types of common promoter elements, TATA and CAAT boxes. Additionally, several putative cis-acing regulatory elements which may be involved in regulation of osr40c1 expression under different conditions were found in the 5′-upstream region of osr40c1. These are ABA-responsive element, light-responsive elements (ATCT-motif, Box I, G-box, GT1-motif, Gap-box and Sp1, myeloblastosis oncogene response element (CCAAT-box, auxin responsive element (TGA-element, gibberellin-responsive element (GARE-motif and fungal-elicitor responsive elements (Box E and Box-W1. A putative regulatory element, required for endosperm-specific pattern of gene expression designated as Skn-1 motif, was also detected in the Pokkali osr40c1 promoter region. In conclusion, the bioinformatic analysis of osr40c1 promoter region isolated from indica rice variety Pokkali led to the identification of several important stress-responsive cis-acting regulatory elements, and therefore, the isolated promoter sequence could be employed in rice genetic transformation to mediate expression of abiotic stress induced genes.
5' Region of the human interleukin 4 gene: structure and potential regulatory elements

Energy Technology Data Exchange (ETDEWEB)

Eder, A; Krafft-Czepa, H; Krammer, P H

1988-01-25

The lymphokine Interleukin 4 (IL-4) is secreted by antigen or mitogen activated T lymphocytes. IL-4 stimulates activation and differentiation of B lymphocytes and growth of T lymphocytes and mast cells. The authors isolated the human IL-4 gene from a lambda EMBL3 genomic library. As a probe they used a synthetic oligonucleotide spanning position 40 to 79 of the published IL-4 cDNA sequence. The 5' promoter region contains several sequence elements which may have a cis-acting regulatory function for IL-4 gene expression. These elements include a TATA-box, three CCAAT-elements (two are on the non-coding strand) and an octamer motif. A comparison of the 5' flanking region of the human murine IL-4 gene (4) shows that the region between position -306 and +44 is highly conserved (83% homology).
Computational methods to dissect cis-regulatory transcriptional ...

Indian Academy of Sciences (India)

The formation of diverse cell types from an invariant set of genes is governed by biochemical and molecular processes that regulate gene activity. A complete understanding of the regulatory mechanisms of gene expression is the major function of genomics. Computational genomics is a rapidly emerging area for ...
Regulatory patterns of a large family of defensin-like genes expressed in nodules of Medicago truncatula.

Directory of Open Access Journals (Sweden)

Sumitha Nallu

Full Text Available Root nodules are the symbiotic organ of legumes that house nitrogen-fixing bacteria. Many genes are specifically induced in nodules during the interactions between the host plant and symbiotic rhizobia. Information regarding the regulation of expression for most of these genes is lacking. One of the largest gene families expressed in the nodules of the model legume Medicago truncatula is the nodule cysteine-rich (NCR group of defensin-like (DEFL genes. We used a custom Affymetrix microarray to catalog the expression changes of 566 NCRs at different stages of nodule development. Additionally, bacterial mutants were used to understand the importance of the rhizobial partners in induction of NCRs. Expression of early NCRs was detected during the initial infection of rhizobia in nodules and expression continued as nodules became mature. Late NCRs were induced concomitantly with bacteroid development in the nodules. The induction of early and late NCRs was correlated with the number and morphology of rhizobia in the nodule. Conserved 41 to 50 bp motifs identified in the upstream 1,000 bp promoter regions of NCRs were required for promoter activity. These cis-element motifs were found to be unique to the NCR family among all annotated genes in the M. truncatula genome, although they contain sub-regions with clear similarity to known regulatory motifs involved in nodule-specific expression and temporal gene regulation.
Transcriptome landscape of Lactococcus lactis reveals many novel RNAs including a small regulatory RNA involved in carbon uptake and metabolism.

Science.gov (United States)

van der Meulen, Sjoerd B; de Jong, Anne; Kok, Jan

2016-01-01

RNA sequencing has revolutionized genome-wide transcriptome analyses, and the identification of non-coding regulatory RNAs in bacteria has thus increased concurrently. Here we reveal the transcriptome map of the lactic acid bacterial paradigm Lactococcus lactis MG1363 by employing differential RNA sequencing (dRNA-seq) and a combination of manual and automated transcriptome mining. This resulted in a high-resolution genome annotation of L. lactis and the identification of 60 cis-encoded antisense RNAs (asRNAs), 186 trans-encoded putative regulatory RNAs (sRNAs) and 134 novel small ORFs. Based on the putative targets of asRNAs, a novel classification is proposed. Several transcription factor DNA binding motifs were identified in the promoter sequences of (a)sRNAs, providing insight in the interplay between lactococcal regulatory RNAs and transcription factors. The presence and lengths of 14 putative sRNAs were experimentally confirmed by differential Northern hybridization, including the abundant RNA 6S that is differentially expressed depending on the available carbon source. For another sRNA, LLMGnc_147, functional analysis revealed that it is involved in carbon uptake and metabolism. L. lactis contains 13% leaderless mRNAs (lmRNAs) that, from an analysis of overrepresentation in GO classes, seem predominantly involved in nucleotide metabolism and DNA/RNA binding. Moreover, an A-rich sequence motif immediately following the start codon was uncovered, which could provide novel insight in the translation of lmRNAs. Altogether, this first experimental genome-wide assessment of the transcriptome landscape of L. lactis and subsequent sRNA studies provide an extensive basis for the investigation of regulatory RNAs in L. lactis and related lactococcal species.
A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.

Directory of Open Access Journals (Sweden)

Xiangyun Xiao

Full Text Available The reconstruction of gene regulatory networks (GRNs from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM, experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.

A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.

Science.gov (United States)

Xiao, Xiangyun; Zhang, Wei; Zou, Xiufen

2015-01-01

The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM), experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.
Discovery and validation of information theory-based transcription factor and cofactor binding site motifs.

Science.gov (United States)

Lu, Ruipeng; Mucaki, Eliseos J; Rogan, Peter K

2017-03-17

Data from ChIP-seq experiments can derive the genome-wide binding specificities of transcription factors (TFs) and other regulatory proteins. We analyzed 765 ENCODE ChIP-seq peak datasets of 207 human TFs with a novel motif discovery pipeline based on recursive, thresholded entropy minimization. This approach, while obviating the need to compensate for skewed nucleotide composition, distinguishes true binding motifs from noise, quantifies the strengths of individual binding sites based on computed affinity and detects adjacent cofactor binding sites that coordinate with the targets of primary, immunoprecipitated TFs. We obtained contiguous and bipartite information theory-based position weight matrices (iPWMs) for 93 sequence-specific TFs, discovered 23 cofactor motifs for 127 TFs and revealed six high-confidence novel motifs. The reliability and accuracy of these iPWMs were determined via four independent validation methods, including the detection of experimentally proven binding sites, explanation of effects of characterized SNPs, comparison with previously published motifs and statistical analyses. We also predict previously unreported TF coregulatory interactions (e.g. TF complexes). These iPWMs constitute a powerful tool for predicting the effects of sequence variants in known binding sites, performing mutation analysis on regulatory SNPs and predicting previously unrecognized binding sites and target genes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
MotifNet: a web-server for network motif analysis.

Science.gov (United States)

Smoly, Ilan Y; Lerman, Eugene; Ziv-Ukelson, Michal; Yeger-Lotem, Esti

2017-06-15

Network motifs are small topological patterns that recur in a network significantly more often than expected by chance. Their identification emerged as a powerful approach for uncovering the design principles underlying complex networks. However, available tools for network motif analysis typically require download and execution of computationally intensive software on a local computer. We present MotifNet, the first open-access web-server for network motif analysis. MotifNet allows researchers to analyze integrated networks, where nodes and edges may be labeled, and to search for motifs of up to eight nodes. The output motifs are presented graphically and the user can interactively filter them by their significance, number of instances, node and edge labels, and node identities, and view their instances. MotifNet also allows the user to distinguish between motifs that are centered on specific nodes and motifs that recur in distinct parts of the network. MotifNet is freely available at http://netbio.bgu.ac.il/motifnet . The website was implemented using ReactJs and supports all major browsers. The server interface was implemented in Python with data stored on a MySQL database. estiyl@bgu.ac.il or michaluz@cs.bgu.ac.il. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
PlantPAN: Plant promoter analysis navigator, for identifying combinatorial cis-regulatory elements with distance constraint in plant gene groups

Directory of Open Access Journals (Sweden)

Huang Hsien-Da

2008-11-01

Full Text Available Abstract Background The elucidation of transcriptional regulation in plant genes is important area of research for plant scientists, following the mapping of various plant genomes, such as A. thaliana, O. sativa and Z. mays. A variety of bioinformatic servers or databases of plant promoters have been established, although most have been focused only on annotating transcription factor binding sites in a single gene and have neglected some important regulatory elements (tandem repeats and CpG/CpNpG islands in promoter regions. Additionally, the combinatorial interaction of transcription factors (TFs is important in regulating the gene group that is associated with the same expression pattern. Therefore, a tool for detecting the co-regulation of transcription factors in a group of gene promoters is required. Results This study develops a database-assisted system, PlantPAN (Plant Promoter Analysis Navigator, for recognizing combinatorial cis-regulatory elements with a distance constraint in sets of plant genes. The system collects the plant transcription factor binding profiles from PLACE, TRANSFAC (public release 7.0, AGRIS, and JASPER databases and allows users to input a group of gene IDs or promoter sequences, enabling the co-occurrence of combinatorial transcription factor binding sites (TFBSs within a defined distance (20 bp to 200 bp to be identified. Furthermore, the new resource enables other regulatory features in a plant promoter, such as CpG/CpNpG islands and tandem repeats, to be displayed. The regulatory elements in the conserved regions of the promoters across homologous genes are detected and presented. Conclusion In addition to providing a user-friendly input/output interface, PlantPAN has numerous advantages in the analysis of a plant promoter. Several case studies have established the effectiveness of PlantPAN. This novel analytical resource is now freely available at http://PlantPAN.mbc.nctu.edu.tw.
A Network Inference Workflow Applied to Virulence-Related Processes in Salmonella typhimurium

Energy Technology Data Exchange (ETDEWEB)

Taylor, Ronald C.; Singhal, Mudita; Weller, Jennifer B.; Khoshnevis, Saeed; Shi, Liang; McDermott, Jason E.

2009-04-20

Inference of the structure of mRNA transcriptional regulatory networks, protein regulatory or interaction networks, and protein activation/inactivation-based signal transduction networks are critical tasks in systems biology. In this article we discuss a workflow for the reconstruction of parts of the transcriptional regulatory network of the pathogenic bacterium Salmonella typhimurium based on the information contained in sets of microarray gene expression data now available for that organism, and describe our results obtained by following this workflow. The primary tool is one of the network inference algorithms deployed in the Software Environment for BIological Network Inference (SEBINI). Specifically, we selected the algorithm called Context Likelihood of Relatedness (CLR), which uses the mutual information contained in the gene expression data to infer regulatory connections. The associated analysis pipeline automatically stores the inferred edges from the CLR runs within SEBINI and, upon request, transfers the inferred edges into either Cytoscape or the plug-in Collective Analysis of Biological of Biological Interaction Networks (CABIN) tool for further post-analysis of the inferred regulatory edges. The following article presents the outcome of this workflow, as well as the protocols followed for microarray data collection, data cleansing, and network inference. Our analysis revealed several interesting interactions, functional groups, metabolic pathways, and regulons in S. typhimurium.
Toward understanding the evolution of vertebrate gene regulatory networks: comparative genomics and epigenomic approaches.

Science.gov (United States)

Martinez-Morales, Juan R

2016-07-01

Vertebrates, as most animal phyla, originated >500 million years ago during the Cambrian explosion, and progressively radiated into the extant classes. Inferring the evolutionary history of the group requires understanding the architecture of the developmental programs that constrain the vertebrate anatomy. Here, I review recent comparative genomic and epigenomic studies, based on ChIP-seq and chromatin accessibility, which focus on the identification of functionally equivalent cis-regulatory modules among species. This pioneer work, primarily centered in the mammalian lineage, has set the groundwork for further studies in representative vertebrate and chordate species. Mapping of active regulatory regions across lineages will shed new light on the evolutionary forces stabilizing ancestral developmental programs, as well as allowing their variation to sustain morphological adaptations on the inherited vertebrate body plan. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
The Verrucomicrobia LexA-binding Motif: Insights into the Evolutionary Dynamics of the SOS Response

Directory of Open Access Journals (Sweden)

Ivan Erill

2016-07-01

Full Text Available The SOS response is the primary bacterial mechanism to address DNA damage, coordinating multiple cellular processes that include DNA repair, cell division and translesion synthesis. In contrast to other regulatory systems, the composition of the SOS genetic network and the binding motif of its transcriptional repressor, LexA, have been shown to vary greatly across bacterial clades, making it an ideal system to study the co-evolution of transcription factors and their regulons. Leveraging comparative genomics approaches and prior knowledge on the core SOS regulon, here we define the binding motif of the Verrucomicrobia, a recently described phylum of emerging interest due to its association with eukaryotic hosts. Site directed mutagenesis of the Verrucomicrobium spinosum recA promoter confirms that LexA binds a 14 bp palindromic motif with consensus sequence TGTTC-N4-GAACA. Computational analyses suggest that recognition of this novel motif is determined primarily by changes in base-contacting residues of the third alpha helix of the LexA helix-turn-helix DNA binding motif. In conjunction with comparative genomics analysis of the LexA regulon in the Verrucomicrobia phylum, electrophoretic shift assays reveal that LexA binds to operators in the promoter region of DNA repair genes and a mutagenesis cassette in this organism, and identify previously unreported components of the SOS response. The identification of tandem LexA-binding sites generating instances of other LexA-binding motifs in the lexA gene promoter of Verrucomicrobia species leads us to postulate a novel mechanism for LexA-binding motif evolution. This model, based on gene duplication, successfully addresses outstanding questions in the intricate co-evolution of the LexA protein, its binding motif and the regulatory network it controls.
The Verrucomicrobia LexA-Binding Motif: Insights into the Evolutionary Dynamics of the SOS Response.

Science.gov (United States)

Erill, Ivan; Campoy, Susana; Kılıç, Sefa; Barbé, Jordi

2016-01-01

The SOS response is the primary bacterial mechanism to address DNA damage, coordinating multiple cellular processes that include DNA repair, cell division, and translesion synthesis. In contrast to other regulatory systems, the composition of the SOS genetic network and the binding motif of its transcriptional repressor, LexA, have been shown to vary greatly across bacterial clades, making it an ideal system to study the co-evolution of transcription factors and their regulons. Leveraging comparative genomics approaches and prior knowledge on the core SOS regulon, here we define the binding motif of the Verrucomicrobia, a recently described phylum of emerging interest due to its association with eukaryotic hosts. Site directed mutagenesis of the Verrucomicrobium spinosum recA promoter confirms that LexA binds a 14 bp palindromic motif with consensus sequence TGTTC-N4-GAACA. Computational analyses suggest that recognition of this novel motif is determined primarily by changes in base-contacting residues of the third alpha helix of the LexA helix-turn-helix DNA binding motif. In conjunction with comparative genomics analysis of the LexA regulon in the Verrucomicrobia phylum, electrophoretic shift assays reveal that LexA binds to operators in the promoter region of DNA repair genes and a mutagenesis cassette in this organism, and identify previously unreported components of the SOS response. The identification of tandem LexA-binding sites generating instances of other LexA-binding motifs in the lexA gene promoter of Verrucomicrobia species leads us to postulate a novel mechanism for LexA-binding motif evolution. This model, based on gene duplication, successfully addresses outstanding questions in the intricate co-evolution of the LexA protein, its binding motif and the regulatory network it controls.
A Survey of 6,300 Genomic Fragments for cis-Regulatory Activity in the Imaginal Discs of Drosophila melanogaster

Directory of Open Access Journals (Sweden)

Aurélie Jory

2012-10-01

Full Text Available Over 6,000 fragments from the genome of Drosophila melanogaster were analyzed for their ability to drive expression of GAL4 reporter genes in the third-instar larval imaginal discs. About 1,200 reporter genes drove expression in the eye, antenna, leg, wing, haltere, or genital imaginal discs. The patterns ranged from large regions to individual cells. About 75% of the active fragments drove expression in multiple discs; 20% were expressed in ventral, but not dorsal, discs (legs, genital, and antenna, whereas ∼23% were expressed in dorsal but not ventral discs (wing, haltere, and eye. Several patterns, for example, within the leg chordotonal organ, appeared a surprisingly large number of times. Unbiased searches for DNA sequence motifs suggest candidate transcription factors that may regulate enhancers with shared activities. Together, these expression patterns provide a valuable resource to the community and offer a broad overview of how transcriptional regulatory information is distributed in the Drosophila genome.
BayesMotif: de novo protein sorting motif discovery from impure datasets.

Science.gov (United States)

Hu, Jianjun; Zhang, Fan

2010-01-18

Protein sorting is the process that newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals are amino acid sub-sequences usually located at the N-terminals or C-terminals of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals is needed to improve the understanding of protein sorting mechanisms. We formulated the protein sorting motif discovery problem as a classification problem and proposed a Bayesian classifier based algorithm (BayesMotif) for de novo identification of a common type of protein sorting motifs in which a highly conserved anchor is present along with a less conserved motif regions. A false positive removal procedure is developed to iteratively remove sequences that are unlikely to contain true motifs so that the algorithm can identify motifs from impure input sequences. Experiments on both implanted motif datasets and real-world datasets showed that the enhanced BayesMotif algorithm can identify anchored sorting motifs from pure or impure protein sequence dataset. It also shows that the false positive removal procedure can help to identify true motifs even when there is only 20% of the input sequences containing true motif instances. We proposed BayesMotif, a novel Bayesian classification based algorithm for de novo discovery of a special category of anchored protein sorting motifs from impure datasets. Compared to conventional motif discovery algorithms such as MEME, our algorithm can find less-conserved motifs with short highly conserved anchors. Our algorithm also has the advantage of easy incorporation of additional meta-sequence features such as hydrophobicity or charge of the motifs which may help to overcome the limitations of
Distinct configurations of protein complexes and biochemical pathways revealed by epistatic interaction network motifs

LENUS (Irish Health Repository)

Casey, Fergal

2011-08-22

Abstract Background Gene and protein interactions are commonly represented as networks, with the genes or proteins comprising the nodes and the relationship between them as edges. Motifs, or small local configurations of edges and nodes that arise repeatedly, can be used to simplify the interpretation of networks. Results We examined triplet motifs in a network of quantitative epistatic genetic relationships, and found a non-random distribution of particular motif classes. Individual motif classes were found to be associated with different functional properties, suggestive of an underlying biological significance. These associations were apparent not only for motif classes, but for individual positions within the motifs. As expected, NNN (all negative) motifs were strongly associated with previously reported genetic (i.e. synthetic lethal) interactions, while PPP (all positive) motifs were associated with protein complexes. The two other motif classes (NNP: a positive interaction spanned by two negative interactions, and NPP: a negative spanned by two positives) showed very distinct functional associations, with physical interactions dominating for the former but alternative enrichments, typical of biochemical pathways, dominating for the latter. Conclusion We present a model showing how NNP motifs can be used to recognize supportive relationships between protein complexes, while NPP motifs often identify opposing or regulatory behaviour between a gene and an associated pathway. The ability to use motifs to point toward underlying biological organizational themes is likely to be increasingly important as more extensive epistasis mapping projects in higher organisms begin.
A speedup technique for (l, d-motif finding algorithms

Directory of Open Access Journals (Sweden)

Dinh Hieu

2011-03-01

Full Text Available Abstract Background The discovery of patterns in DNA, RNA, and protein sequences has led to the solution of many vital biological problems. For instance, the identification of patterns in nucleic acid sequences has resulted in the determination of open reading frames, identification of promoter elements of genes, identification of intron/exon splicing sites, identification of SH RNAs, location of RNA degradation signals, identification of alternative splicing sites, etc. In protein sequences, patterns have proven to be extremely helpful in domain identification, location of protease cleavage sites, identification of signal peptides, protein interactions, determination of protein degradation elements, identification of protein trafficking elements, etc. Motifs are important patterns that are helpful in finding transcriptional regulatory elements, transcription factor binding sites, functional genomics, drug design, etc. As a result, numerous papers have been written to solve the motif search problem. Results Three versions of the motif search problem have been proposed in the literature: Simple Motif Search (SMS, (l, d-motif search (or Planted Motif Search (PMS, and Edit-distance-based Motif Search (EMS. In this paper we focus on PMS. Two kinds of algorithms can be found in the literature for solving the PMS problem: exact and approximate. An exact algorithm identifies the motifs always and an approximate algorithm may fail to identify some or all of the motifs. The exact version of PMS problem has been shown to be NP-hard. Exact algorithms proposed in the literature for PMS take time that is exponential in some of the underlying parameters. In this paper we propose a generic technique that can be used to speedup PMS algorithms. Conclusions We present a speedup technique that can be used on any PMS algorithm. We have tested our speedup technique on a number of algorithms. These experimental results show that our speedup technique is indeed very
Diverse activities of viral cis-acting RNA regulatory elements revealed using multicolor, long-term, single-cell imaging.

Science.gov (United States)

Pocock, Ginger M; Zimdars, Laraine L; Yuan, Ming; Eliceiri, Kevin W; Ahlquist, Paul; Sherer, Nathan M

2017-02-01

Cis-acting RNA structural elements govern crucial aspects of viral gene expression. How these structures and other posttranscriptional signals affect RNA trafficking and translation in the context of single cells is poorly understood. Herein we describe a multicolor, long-term (>24 h) imaging strategy for measuring integrated aspects of viral RNA regulatory control in individual cells. We apply this strategy to demonstrate differential mRNA trafficking behaviors governed by RNA elements derived from three retroviruses (HIV-1, murine leukemia virus, and Mason-Pfizer monkey virus), two hepadnaviruses (hepatitis B virus and woodchuck hepatitis virus), and an intron-retaining transcript encoded by the cellular NXF1 gene. Striking behaviors include "burst" RNA nuclear export dynamics regulated by HIV-1's Rev response element and the viral Rev protein; transient aggregations of RNAs into discrete foci at or near the nuclear membrane triggered by multiple elements; and a novel, pulsiform RNA export activity regulated by the hepadnaviral posttranscriptional regulatory element. We incorporate single-cell tracking and a data-mining algorithm into our approach to obtain RNA element-specific, high-resolution gene expression signatures. Together these imaging assays constitute a tractable, systems-based platform for studying otherwise difficult to access spatiotemporal features of viral and cellular gene regulation. © 2017 Pocock et al. This article is distributed by The American Society for Cell Biology under license from the author(s). Two months after publication it is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).
The Regulatory Factor ZFHX3 Modifies Circadian Function in SCN via an AT Motif-Driven Axis

Science.gov (United States)

Parsons, Michael J.; Brancaccio, Marco; Sethi, Siddharth; Maywood, Elizabeth S.; Satija, Rahul; Edwards, Jessica K.; Jagannath, Aarti; Couch, Yvonne; Finelli, Mattéa J.; Smyllie, Nicola J.; Esapa, Christopher; Butler, Rachel; Barnard, Alun R.; Chesham, Johanna E.; Saito, Shoko; Joynson, Greg; Wells, Sara; Foster, Russell G.; Oliver, Peter L.; Simon, Michelle M.; Mallon, Ann-Marie; Hastings, Michael H.; Nolan, Patrick M.

2015-01-01

Summary We identified a dominant missense mutation in the SCN transcription factor Zfhx3, termed short circuit (Zfhx3Sci), which accelerates circadian locomotor rhythms in mice. ZFHX3 regulates transcription via direct interaction with predicted AT motifs in target genes. The mutant protein has a decreased ability to activate consensus AT motifs in vitro. Using RNA sequencing, we found minimal effects on core clock genes in Zfhx3Sci/+ SCN, whereas the expression of neuropeptides critical for SCN intercellular signaling was significantly disturbed. Moreover, mutant ZFHX3 had a decreased ability to activate AT motifs in the promoters of these neuropeptide genes. Lentiviral transduction of SCN slices showed that the ZFHX3-mediated activation of AT motifs is circadian, with decreased amplitude and robustness of these oscillations in Zfhx3Sci/+ SCN slices. In conclusion, by cloning Zfhx3Sci, we have uncovered a circadian transcriptional axis that determines the period and robustness of behavioral and SCN molecular rhythms. PMID:26232227
Murine homeobox-containing gene, Msx-1: analysis of genomic organization, promoter structure, and potential autoregulatory cis-acting elements.

Science.gov (United States)

Kuzuoka, M; Takahashi, T; Guron, C; Raghow, R

1994-05-01

Detailed molecular organization of the coding and upstream regulatory regions of the murine homeodomain-containing gene, Msx-1, is reported. The protein-encoding portion of the gene is contained in two exons, 590 and 1214 bp in length, separated by a 2107-bp intron; the homeodomain is located in the second exon. The two-exon organization of the murine Msx-1 gene resembles a number of other homeodomain-containing genes. The 5'-(GTAAGT) and 3'-(CCCTAG) splicing junctions and the mRNA polyadenylation signal (UAUAA) of the murine Msx-1 gene are also characteristic of other vertebrate genes. By nuclease protection and primer extension assays, the start of transcription of the Msx-1 gene was located 256 bp upstream of the first AUG. Computer analysis of the promoter proximal 1280-bp sequence revealed a number of potentially important cis-regulatory sequences; these include the recognition elements for Ap-1, Ap-2, Ap-3, Sp-1, a possible binding site for RAR:RXR, and a number of TCF-1 consensus motifs. Importantly, a perfect reverse complement of (C/G)TTAATTG, which was recently shown to be an optimal binding sequence for the homeodomain of Msx-1 protein (K.M. Catron, N. Iler, and C. Abate (1993) Mol. Cell. Biol. 13:2354-2365), was also located in the murine Msx-1 promoter. Binding of bacterially expressed Msx-1 homeodomain polypeptide to Msx-1-specific oligonucleotide was experimentally demonstrated, raising a distinct possibility of autoregulation of this developmentally regulated gene.
Transcriptional regulation by competing transcription factor modules.

Directory of Open Access Journals (Sweden)

Rutger Hermsen

2006-12-01

Full Text Available Gene regulatory networks lie at the heart of cellular computation. In these networks, intracellular and extracellular signals are integrated by transcription factors, which control the expression of transcription units by binding to cis-regulatory regions on the DNA. The designs of both eukaryotic and prokaryotic cis-regulatory regions are usually highly complex. They frequently consist of both repetitive and overlapping transcription factor binding sites. To unravel the design principles of these promoter architectures, we have designed in silico prokaryotic transcriptional logic gates with predefined input-output relations using an evolutionary algorithm. The resulting cis-regulatory designs are often composed of modules that consist of tandem arrays of binding sites to which the transcription factors bind cooperatively. Moreover, these modules often overlap with each other, leading to competition between them. Our analysis thus identifies a new signal integration motif that is based upon the interplay between intramodular cooperativity and intermodular competition. We show that this signal integration mechanism drastically enhances the capacity of cis-regulatory domains to integrate signals. Our results provide a possible explanation for the complexity of promoter architectures and could be used for the rational design of synthetic gene circuits.
Identification of coupling DNA motif pairs on long-range chromatin interactions in human K562 cells

KAUST Repository

Wong, Ka-Chun; Li, Yue; Peng, Chengbin

2015-01-01

Motivation: The protein-DNA interactions between transcription factors (TFs) and transcription factor binding sites (TFBSs, also known as DNA motifs) are critical activities in gene transcription. The identification of the DNA motifs is a vital task for downstream analysis. Unfortunately, the long-range coupling information between different DNA motifs is still lacking. To fill the void, as the first-of-its-kind study, we have identified the coupling DNA motif pairs on long-range chromatin interactions in human. Results: The coupling DNA motif pairs exhibit substantially higher DNase accessibility than the background sequences. Half of the DNA motifs involved are matched to the existing motif databases, although nearly all of them are enriched with at least one gene ontology term. Their motif instances are also found statistically enriched on the promoter and enhancer regions. Especially, we introduce a novel measurement called motif pairing multiplicity which is defined as the number of motifs that are paired with a given motif on chromatin interactions. Interestingly, we observe that motif pairing multiplicity is linked to several characteristics such as regulatory region type, motif sequence degeneracy, DNase accessibility and pairing genomic distance. Taken into account together, we believe the coupling DNA motif pairs identified in this study can shed lights on the gene transcription mechanism under long-range chromatin interactions. © The Author 2015. Published by Oxford University Press.
Identification of coupling DNA motif pairs on long-range chromatin interactions in human K562 cells

KAUST Repository

Wong, Ka-Chun

2015-09-27

Motivation: The protein-DNA interactions between transcription factors (TFs) and transcription factor binding sites (TFBSs, also known as DNA motifs) are critical activities in gene transcription. The identification of the DNA motifs is a vital task for downstream analysis. Unfortunately, the long-range coupling information between different DNA motifs is still lacking. To fill the void, as the first-of-its-kind study, we have identified the coupling DNA motif pairs on long-range chromatin interactions in human. Results: The coupling DNA motif pairs exhibit substantially higher DNase accessibility than the background sequences. Half of the DNA motifs involved are matched to the existing motif databases, although nearly all of them are enriched with at least one gene ontology term. Their motif instances are also found statistically enriched on the promoter and enhancer regions. Especially, we introduce a novel measurement called motif pairing multiplicity which is defined as the number of motifs that are paired with a given motif on chromatin interactions. Interestingly, we observe that motif pairing multiplicity is linked to several characteristics such as regulatory region type, motif sequence degeneracy, DNase accessibility and pairing genomic distance. Taken into account together, we believe the coupling DNA motif pairs identified in this study can shed lights on the gene transcription mechanism under long-range chromatin interactions. © The Author 2015. Published by Oxford University Press.
Viral infection upregulates myostatin promoter activity in orange-spotted grouper (Epinephelus coioides).

Science.gov (United States)

Chen, Yi-Tien; Lin, Chao-Fen; Chen, Young-Mao; Lo, Chih-En; Chen, Wan-Erh; Chen, Tzong-Yueh

2017-01-01

Myostatin is a negative regulator of myogenesis and has been suggested to be an important factor in the development of muscle wasting during viral infection. The objective of this study was to characterize the main regulatory element of the grouper myostatin promoter and to study changes in promoter activity due to viral stimulation. In vitro and in vivo experiments indicated that the E-box E6 is a positive cis-and trans-regulation motif, and an essential binding site for MyoD. In contrast, the E-box E5 is a dominant negative cis-regulatory. The characteristics of grouper myostatin promoter are similar in regulation of muscle growth to that of other species, but mainly through specific regulatory elements. According to these results, we conducted a study to investigate the effect of viral infection on myostatin promoter activity and its regulation. The nervous necrosis virus (NNV) treatment significantly induced myostatin promoter activity. The present study is the first report describing that specific myostatin motifs regulate promoter activity and response to viral infection.
Viral infection upregulates myostatin promoter activity in orange-spotted grouper (Epinephelus coioides.

Directory of Open Access Journals (Sweden)

Yi-Tien Chen

Full Text Available Myostatin is a negative regulator of myogenesis and has been suggested to be an important factor in the development of muscle wasting during viral infection. The objective of this study was to characterize the main regulatory element of the grouper myostatin promoter and to study changes in promoter activity due to viral stimulation. In vitro and in vivo experiments indicated that the E-box E6 is a positive cis-and trans-regulation motif, and an essential binding site for MyoD. In contrast, the E-box E5 is a dominant negative cis-regulatory. The characteristics of grouper myostatin promoter are similar in regulation of muscle growth to that of other species, but mainly through specific regulatory elements. According to these results, we conducted a study to investigate the effect of viral infection on myostatin promoter activity and its regulation. The nervous necrosis virus (NNV treatment significantly induced myostatin promoter activity. The present study is the first report describing that specific myostatin motifs regulate promoter activity and response to viral infection.

Information-Theoretic Inference of Large Transcriptional Regulatory Networks

Directory of Open Access Journals (Sweden)

Meyer Patrick

2007-01-01

Full Text Available The paper presents MRNET, an original method for inferring genetic networks from microarray data. The method is based on maximum relevance/minimum redundancy (MRMR, an effective information-theoretic technique for feature selection in supervised learning. The MRMR principle consists in selecting among the least redundant variables the ones that have the highest mutual information with the target. MRNET extends this feature selection principle to networks in order to infer gene-dependence relationships from microarray data. The paper assesses MRNET by benchmarking it against RELNET, CLR, and ARACNE, three state-of-the-art information-theoretic methods for large (up to several thousands of genes network inference. Experimental results on thirty synthetically generated microarray datasets show that MRNET is competitive with these methods.
Information-Theoretic Inference of Large Transcriptional Regulatory Networks

Directory of Open Access Journals (Sweden)

Patrick E. Meyer

2007-06-01

Full Text Available The paper presents MRNET, an original method for inferring genetic networks from microarray data. The method is based on maximum relevance/minimum redundancy (MRMR, an effective information-theoretic technique for feature selection in supervised learning. The MRMR principle consists in selecting among the least redundant variables the ones that have the highest mutual information with the target. MRNET extends this feature selection principle to networks in order to infer gene-dependence relationships from microarray data. The paper assesses MRNET by benchmarking it against RELNET, CLR, and ARACNE, three state-of-the-art information-theoretic methods for large (up to several thousands of genes network inference. Experimental results on thirty synthetically generated microarray datasets show that MRNET is competitive with these methods.
Evolution of cichlid vision via trans-regulatory divergence

Directory of Open Access Journals (Sweden)

O’Quin Kelly E

2012-12-01

Full Text Available Abstract Background Phenotypic evolution may occur through mutations that affect either the structure or expression of protein-coding genes. Although the evolution of color vision has historically been attributed to structural mutations within the opsin genes, recent research has shown that opsin regulatory mutations can also tune photoreceptor sensitivity and color vision. Visual sensitivity in African cichlid fishes varies as a result of the differential expression of seven opsin genes. We crossed cichlid species that express different opsin gene sets and scanned their genome for expression Quantitative Trait Loci (eQTL responsible for these differences. Our results shed light on the role that different structural, cis-, and trans-regulatory mutations play in the evolution of color vision. Results We identified 11 eQTL that contribute to the divergent expression of five opsin genes. On three linkage groups, several eQTL formed regulatory “hotspots” associated with the expression of multiple opsins. Importantly, however, the majority of the eQTL we identified (8/11 or 73% occur on linkage groups located trans to the opsin genes, suggesting that cichlid color vision has evolved primarily via trans-regulatory divergence. By modeling the impact of just two of these trans-regulatory eQTL, we show that opsin regulatory mutations can alter cichlid photoreceptor sensitivity and color vision at least as much as opsin structural mutations can. Conclusions Combined with previous work, we demonstrate that the evolution of cichlid color vision results from the interplay of structural, cis-, and especially trans-regulatory loci. Although there are numerous examples of structural and cis-regulatory mutations that contribute to phenotypic evolution, our results suggest that trans-regulatory mutations could contribute to phenotypic divergence more commonly than previously expected, especially in systems like color vision, where compensatory changes in the
Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks

Directory of Open Access Journals (Sweden)

Alina Sîrbu

2015-05-01

Full Text Available Microarray technologies have been the basis of numerous important findings regarding gene expression in the few last decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions. Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, as well as indicating which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.
Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks.

Science.gov (United States)

Sîrbu, Alina; Crane, Martin; Ruskin, Heather J

2015-05-14

Microarray technologies have been the basis of numerous important findings regarding gene expression in the few last decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions). Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, as well as indicating which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.
Hierarchical structure and modules in the Escherichia coli transcriptional regulatory network revealed by a new top-down approach

Directory of Open Access Journals (Sweden)

Buer Jan

2004-12-01

Full Text Available Abstract Background Cellular functions are coordinately carried out by groups of genes forming functional modules. Identifying such modules in the transcriptional regulatory network (TRN of organisms is important for understanding the structure and function of these fundamental cellular networks and essential for the emerging modular biology. So far, the global connectivity structure of TRN has not been well studied and consequently not applied for the identification of functional modules. Moreover, network motifs such as feed forward loop are recently proposed to be basic building blocks of TRN. However, their relationship to functional modules is not clear. Results In this work we proposed a top-down approach to identify modules in the TRN of E. coli. By studying the global connectivity structure of the regulatory network, we first revealed a five-layer hierarchical structure in which all the regulatory relationships are downward. Based on this regulatory hierarchy, we developed a new method to decompose the regulatory network into functional modules and to identify global regulators governing multiple modules. As a result, 10 global regulators and 39 modules were identified and shown to have well defined functions. We then investigated the distribution and composition of the two basic network motifs (feed forward loop and bi-fan motif in the hierarchical structure of TRN. We found that most of these network motifs include global regulators, indicating that these motifs are not basic building blocks of modules since modules should not contain global regulators. Conclusion The transcriptional regulatory network of E. coli possesses a multi-layer hierarchical modular structure without feedback regulation at transcription level. This hierarchical structure builds the basis for a new and simple decomposition method which is suitable for the identification of functional modules and global regulators in the transcriptional regulatory network of E
Eucalyptus ESTs involved in the production of 9-cis epoxycarotenoid dioxygenase, a regulatory enzyme of abscisic acid production

Directory of Open Access Journals (Sweden)

Iraê A. Guerrini

2005-01-01

Full Text Available Abscisic acid (ABA regulates stress responses in plants, and genomic tools can help us to understand the mechanisms involved in that process. FAPESP, a Brazilian research foundation, in association with four private forestry companies, has established the FORESTs database (https://forests.esalq.usp.br. A search was carried out in the Eucalyptus expressed sequence tag database to find ESTs involved with 9-cis epoxycarotenoid dioxygenase (NCED, the regulatory enzyme for ABA biosynthesis, using the basic local BLAST alignment tool. We found four clusters (EGEZLV2206B11.g, EGJMWD2252H08.g, EGBFRT3107F10.g, and EGEQFB1200H10.g, which represent similar sequences of the gene that produces NCED. Data showed that the EGBFRT3107F10.g cluster was similar to the maize (Zea mays NCED enzyme, while EGEZLV2206B11.g and EGJMWD2252H08.g clusters were similar to the avocado (Persea americana NCED enzyme. All Eucalyptus clusters were expressed in several tissues, especially in flower buds, where ABA has a special participation during the floral development process.
Genome-wide decoding of hierarchical modular structure of transcriptional regulation by cis-element and expression clustering.

Science.gov (United States)

Leyfer, Dmitriy; Weng, Zhiping

2005-09-01

A holistic approach to the study of cellular processes is identifying both gene-expression changes and regulatory elements promoting such changes. Cellular regulatory processes can be viewed as transcriptional modules (TMs), groups of coexpressed genes regulated by groups of transcription factors (TFs). We set out to devise a method that would identify TMs while avoiding arbitrary thresholds on TM sizes and number. Assuming that gene expression is determined by TFs that bind to the gene's promoter, clustering of genes based on TF binding sites (cis-elements) should create gene groups similar to those obtained by gene expression clustering. Intersections between the expression and cis-element-based gene clusters reveal TMs. Statistical significance assigned to each TM allows identification of regulatory units of any size. Our method correctly identifies the number and sizes of TMs on simulated datasets. We demonstrate that yeast experimental TMs are biologically relevant by comparing them with MIPS and GO categories. Our modules are in statistically significant agreement with TMs from other research groups. This work suggests that there is no preferential division of biological processes into regulatory units; each degree of partitioning exhibits a slice of biological network revealing hierarchical modular organization of transcriptional regulation.
Neutral forces acting on intragenomic variability shape the Escherichia coli regulatory network topology.

Science.gov (United States)

Ruths, Troy; Nakhleh, Luay

2013-05-07

Cis-regulatory networks (CRNs) play a central role in cellular decision making. Like every other biological system, CRNs undergo evolution, which shapes their properties by a combination of adaptive and nonadaptive evolutionary forces. Teasing apart these forces is an important step toward functional analyses of the different components of CRNs, designing regulatory perturbation experiments, and constructing synthetic networks. Although tests of neutrality and selection based on molecular sequence data exist, no such tests are currently available based on CRNs. In this work, we present a unique genotype model of CRNs that is grounded in a genomic context and demonstrate its use in identifying portions of the CRN with properties explainable by neutral evolutionary forces at the system, subsystem, and operon levels. We leverage our model against experimentally derived data from Escherichia coli. The results of this analysis show statistically significant and substantial neutral trends in properties previously identified as adaptive in origin--degree distribution, clustering coefficient, and motifs--within the E. coli CRN. Our model captures the tightly coupled genome-interactome of an organism and enables analyses of how evolutionary events acting at the genome level, such as mutation, and at the population level, such as genetic drift, give rise to neutral patterns that we can quantify in CRNs.
The heptanucleotide motif GAGACGC is a key component of a cis-acting promoter element that is critical for SnSAG1 expression in Sarcocystis neurona.

Science.gov (United States)

Gaji, Rajshekhar Y; Howe, Daniel K

2009-07-01

The apicomplexan parasite Sarcocystis neurona undergoes a complex process of intracellular development, during which many genes are temporally regulated. The described study was undertaken to begin identifying the basic promoter elements that control gene expression in S. neurona. Sequence analysis of the 5'-flanking region of five S. neurona genes revealed a conserved heptanucleotide motif GAGACGC that is similar to the WGAGACG motif described upstream of multiple genes in Toxoplasma gondii. The promoter region for the major surface antigen gene SnSAG1, which contains three heptanucleotide motifs within 135 bases of the transcription start site, was dissected by functional analysis using a dual luciferase reporter assay. These analyses revealed that a minimal promoter fragment containing all three motifs was sufficient to drive reporter molecule expression, with the presence and orientation of the 5'-most heptanucleotide motif being absolutely critical for promoter function. Further studies should help to identify additional sequence elements important for promoter function and for controlling gene expression during intracellular development by this apicomplexan pathogen.
Small RNAs and the regulation of cis-natural antisense transcripts in Arabidopsis

Directory of Open Access Journals (Sweden)

Lonardi Stefano

2008-01-01

mitochondrion-targeted proteins are over-represented in the Arabidopsis cis-NATs and that 19% of sense and antisense partner genes of cis-NATs share at least one common Gene Ontology term, which suggests that they encode proteins with possible functional connection. Conclusion The negatively correlated expression patterns of sense and antisense genes as well as the presence of siRNAs in many of the cis-NATs suggest that siRNA regulation of cis-NATs via the RNAi pathway is an important gene regulatory mechanism for at least a subgroup of cis-NATs in Arabidopsis.
Manipulation of EphB2 regulatory motifs and SH2 binding sites switches MAPK signaling and biological activity.

Science.gov (United States)

Tong, Jiefei; Elowe, Sabine; Nash, Piers; Pawson, Tony

2003-02-21

Signaling by the Eph family of receptor tyrosine kinases (RTKs) is complex, because they can interact with a variety of intracellular targets, and can potentially induce distinct responses in different cell types. In NG108 neuronal cells, activated EphB2 recruits p120RasGAP, in a fashion that is associated with down-regulation of the Ras-Erk mitogen-activated kinase (MAPK) pathway and neurite retraction. To pursue the role of the Ras-MAPK pathway in EphB2-mediated growth cone collapse, and to explore the biochemical and biological functions of Eph receptors, we sought to re-engineer the signaling properties of EphB2 by manipulating its regulatory motifs and SH2 binding sites. An EphB2 mutant that retained juxtamembrane (JM) RasGAP binding sites but incorporated a Grb2 binding motif at an alternate RasGAP binding site within the kinase domain had little effect on basal Erk MAPK activation. In contrast, elimination of all RasGAP binding sites, accompanied by the addition of a Grb2 binding site within the kinase domain, led to an increase in phospho-Erk levels in NG108 cells following ephrin-B1 stimulation. Functional assays indicated a correlation between neurite retraction and the ability of the EphB2 mutants to down-regulate Ras-Erk MAPK signaling. These data suggest that EphB2 can be designed to repress, stabilize, or activate the Ras-Erk MAPK pathway by the manipulation of RasGAP and Grb2 SH2 domain binding sites and support the notion that Erk MAPK regulation plays a significant role in axon guidance. The behavior of EphB2 variants with mutations in the JM region and kinase domains suggests an intricate pattern of regulation and target recognition by Eph receptors.
Ancient and recent positive selection transformed opioid cis-regulation in humans.

Directory of Open Access Journals (Sweden)

Matthew V Rockman

2005-12-01

Full Text Available Changes in the cis-regulation of neural genes likely contributed to the evolution of our species' unique attributes, but evidence of a role for natural selection has been lacking. We found that positive natural selection altered the cis-regulation of human prodynorphin, the precursor molecule for a suite of endogenous opioids and neuropeptides with critical roles in regulating perception, behavior, and memory. Independent lines of phylogenetic and population genetic evidence support a history of selective sweeps driving the evolution of the human prodynorphin promoter. In experimental assays of chimpanzee-human hybrid promoters, the selected sequence increases transcriptional inducibility. The evidence for a change in the response of the brain's natural opioids to inductive stimuli points to potential human-specific characteristics favored during evolution. In addition, the pattern of linked nucleotide and microsatellite variation among and within modern human populations suggests that recent selection, subsequent to the fixation of the human-specific mutations and the peopling of the globe, has favored different prodynorphin cis-regulatory alleles in different parts of the world.
Potential of the Antibody Against cis-Phosphorylated Tau in the Early Diagnosis, Treatment, and Prevention of Alzheimer Disease and Brain Injury.

Science.gov (United States)

Lu, Kun Ping; Kondo, Asami; Albayram, Onder; Herbert, Megan K; Liu, Hekun; Zhou, Xiao Zhen

2016-11-01

Alzheimer disease (AD) and chronic traumatic encephalopathy (CTE) share a common neuropathologic signature-neurofibrillary tangles made of phosphorylated tau-but do not have the same pathogenesis or symptoms. Although whether traumatic brain injury (TBI) could cause AD has not been established, CTE is shown to be associated with TBI. Until recently, whether and how TBI leads to tau-mediated neurodegeneration was unknown. The unique prolyl isomerase Pin1 protects against the development of tau-mediated neurodegeneration in AD by converting the phosphorylated Thr231-Pro motif in tau (ptau) from the pathogenic cis conformation to the physiologic trans conformation, thereby restoring ptau function. The recent development of antibodies able to distinguish and eliminate both conformations specifically has led to the discovery of cis-ptau as a precursor of tau-induced pathologic change and an early driver of neurodegeneration that directly links TBI to CTE and possibly to AD. Within hours of TBI in mice or neuronal stress in vitro, neurons prominently produce cis-ptau, which causes and spreads cis-ptau pathologic changes, termed cistauosis. Cistauosis eventually leads to widespread tau-mediated neurodegeneration and brain atrophy. Cistauosis is effectively blocked by the cis-ptau antibody, which targets intracellular cis-ptau for proteasome-mediated degradation and prevents extracellular cis-ptau from spreading to other neurons. Treating TBI mice with cis-ptau antibody not only blocks early cistauosis but also prevents development and spreading of tau-mediated neurodegeneration and brain atrophy and restores brain histopathologic features and functional outcomes. Thus, cistauosis is a common early disease mechanism for AD, TBI, and CTE, and cis-ptau and its antibody may be useful for early diagnosis, treatment, and prevention of these devastating diseases.
A Simple Decision Rule for Recognition of Poly(A) Tail Signal Motifs in Human Genome

KAUST Repository

AbouEisha, Hassan M.

2015-05-12

Background is the numerous attempts were made to predict motifs in genomic sequences that correspond to poly (A) tail signals. Vast portion of this effort has been directed to a plethora of nonlinear classification methods. Even when such approaches yield good discriminant results, identifying dominant features of regulatory mechanisms nevertheless remains a challenge. In this work, we look at decision rules that may help identifying such features. Findings are we present a simple decision rule for classification of candidate poly (A) tail signal motifs in human genomic sequence obtained by evaluating features during the construction of gradient boosted trees. We found that values of a single feature based on the frequency of adenine in the genomic sequence surrounding candidate signal and the number of consecutive adenine molecules in a well-defined region immediately following the motif displays good discriminative potential in classification of poly (A) tail motifs for samples covered by the rule. Conclusions is the resulting simple rule can be used as an efficient filter in construction of more complex poly(A) tail motifs classification algorithms.
Confidence intervals permit, but don't guarantee, better inference than statistical significance testing

Directory of Open Access Journals (Sweden)

Melissa Coulson

2010-07-01

Full Text Available A statistically significant result, and a non-significant result may differ little, although significance status may tempt an interpretation of difference. Two studies are reported that compared interpretation of such results presented using null hypothesis significance testing (NHST, or confidence intervals (CIs. Authors of articles published in psychology, behavioural neuroscience, and medical journals were asked, via email, to interpret two fictitious studies that found similar results, one statistically significant, and the other non-significant. Responses from 330 authors varied greatly, but interpretation was generally poor, whether results were presented as CIs or using NHST. However, when interpreting CIs respondents who mentioned NHST were 60% likely to conclude, unjustifiably, the two results conflicted, whereas those who interpreted CIs without reference to NHST were 95% likely to conclude, justifiably, the two results were consistent. Findings were generally similar for all three disciplines. An email survey of academic psychologists confirmed that CIs elicit better interpretations if NHST is not invoked. Improved statistical inference can result from encouragement of meta-analytic thinking and use of CIs but, for full benefit, such highly desirable statistical reform requires also that researchers interpret CIs without recourse to NHST.
Does positive selection drive transcription factor binding site turnover? A test with Drosophila cis-regulatory modules.

Directory of Open Access Journals (Sweden)

Bin Z He

2011-04-01

Full Text Available Transcription factor binding site(s (TFBS gain and loss (i.e., turnover is a well-documented feature of cis-regulatory module (CRM evolution, yet little attention has been paid to the evolutionary force(s driving this turnover process. The predominant view, motivated by its widespread occurrence, emphasizes the importance of compensatory mutation and genetic drift. Positive selection, in contrast, although it has been invoked in specific instances of adaptive gene expression evolution, has not been considered as a general alternative to neutral compensatory evolution. In this study we evaluate the two hypotheses by analyzing patterns of single nucleotide polymorphism in the TFBS of well-characterized CRM in two closely related Drosophila species, Drosophila melanogaster and Drosophila simulans. An important feature of the analysis is classification of TFBS mutations according to the direction of their predicted effect on binding affinity, which allows gains and losses to be evaluated independently along the two phylogenetic lineages. The observed patterns of polymorphism and divergence are not compatible with neutral evolution for either class of mutations. Instead, multiple lines of evidence are consistent with contributions of positive selection to TFBS gain and loss as well as purifying selection in its maintenance. In discussion, we propose a model to reconcile the finding of selection driving TFBS turnover with constrained CRM function over long evolutionary time.
Functional architecture and global properties of the Corynebacterium glutamicum regulatory network: Novel insights from a dataset with a high genomic coverage.

Science.gov (United States)

Freyre-González, Julio A; Tauch, Andreas

2017-09-10

Corynebacterium glutamicum is a Gram-positive, anaerobic, rod-shaped soil bacterium able to grow on a diversity of carbon sources like sugars and organic acids. It is a biotechnological relevant organism because of its highly efficient ability to biosynthesize amino acids, such as l-glutamic acid and l-lysine. Here, we reconstructed the most complete C. glutamicum regulatory network to date and comprehensively analyzed its global organizational properties, systems-level features and functional architecture. Our analyses show the tremendous power of Abasy Atlas to study the functional organization of regulatory networks. We created two models of the C. glutamicum regulatory network: all-evidences (containing both weak and strong supported interactions, genomic coverage=73%) and strongly-supported (only accounting for strongly supported evidences, genomic coverage=71%). Using state-of-the-art methodologies, we prove that power-law behaviors truly govern the connectivity and clustering coefficient distributions. We found a non-previously reported circuit motif that we named complex feed-forward motif. We highlighted the importance of feedback loops for the functional architecture, beyond whether they are statistically over-represented or not in the network. We show that the previously reported top-down approach is inadequate to infer the hierarchy governing a regulatory network because feedback bridges different hierarchical layers, and the top-down approach disregards the presence of intermodular genes shaping the integration layer. Our findings all together further support a diamond-shaped, three-layered hierarchy exhibiting some feedback between processing and coordination layers, which is shaped by four classes of systems-level elements: global regulators, locally autonomous modules, basal machinery and intermodular genes. Copyright © 2016 Elsevier B.V. All rights reserved.
Discovery of a Regulatory Motif for Human Satellite DNA Transcription in Response to BATF2 Overexpression.

Science.gov (United States)

Bai, Xuejia; Huang, Wenqiu; Zhang, Chenguang; Niu, Jing; Ding, Wei

2016-03-01

One of the basic leucine zipper transcription factors, BATF2, has been found to suppress cancer growth and migration. However, little is known about the genes downstream of BATF2. HeLa cells were stably transfected with BATF2, then chromatin immunoprecipitation-sequencing was employed to identify the DNA motifs responsive to BATF2. Comprehensive bioinformatics analyses indicated that the most significant motif discovered as TTCCATT[CT]GATTCCATTC[AG]AT was primarily distributed among the chromosome centromere regions and mostly within human type II satellite DNA. Such motifs were able to prime the transcription of type II satellite DNA in a directional and asymmetrical manner. Consistently, satellite II transcription was up-regulated in BATF2-overexpressing cells. The present study provides insight into understanding the role of BATF2 in tumours and the importance of satellite DNA in the maintenance of genomic stability. Copyright© 2016 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.
Feedback loops and reciprocal regulation: recurring motifs in the systems biology of the cell cycle

OpenAIRE

Ferrell, James E.

2013-01-01

The study of eukaryotic cell cycle regulation over the last several decades has led to a remarkably detailed understanding of the complex regulatory system that drives this fundamental process. This allows us to now look for recurring motifs in the regulatory system. Among these are negative feedback loops, which underpin checkpoints and generate cell cycle oscillations; positive feedback loops, which promote oscillations and make cell cycle transitions switch-like and unidirectional; and rec...

Cis-regulatory RNA elements that regulate specialized ribosome activity.

Science.gov (United States)

Xue, Shifeng; Barna, Maria

2015-01-01

Recent evidence has shown that the ribosome itself can play a highly regulatory role in the specialized translation of specific subpools of mRNAs, in particular at the level of ribosomal proteins (RP). However, the mechanism(s) by which this selection takes place has remained poorly understood. In our recent study, we discovered a combination of unique RNA elements in the 5'UTRs of mRNAs that allows for such control by the ribosome. These mRNAs contain a Translation Inhibitory Element (TIE) that inhibits general cap-dependent translation, and an Internal Ribosome Entry Site (IRES) that relies on a specific RP for activation. The unique combination of an inhibitor of general translation and an activator of specialized translation is key to ribosome-mediated control of gene expression. Here we discuss how these RNA regulatory elements provide a new level of control to protein expression and their implications for gene expression, organismal development and evolution.
Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas

Science.gov (United States)

Petrov, Anton I.; Zirbel, Craig L.; Leontis, Neocles B.

2013-01-01

The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access. PMID:23970545
A Parzen window-based approach for the detection of locally enriched transcription factor binding sites.

Science.gov (United States)

Vandenbon, Alexis; Kumagai, Yutaro; Teraguchi, Shunsuke; Amada, Karlou Mar; Akira, Shizuo; Standley, Daron M

2013-01-21

Identification of cis- and trans-acting factors regulating gene expression remains an important problem in biology. Bioinformatics analyses of regulatory regions are hampered by several difficulties. One is that binding sites for regulatory proteins are often not significantly over-represented in the set of DNA sequences of interest, because of high levels of false positive predictions, and because of positional restrictions on functional binding sites with regard to the transcription start site. We have developed a novel method for the detection of regulatory motifs based on their local over-representation in sets of regulatory regions. The method makes use of a Parzen window-based approach for scoring local enrichment, and during evaluation of significance it takes into account GC content of sequences. We show that the accuracy of our method compares favourably to that of other methods, and that our method is capable of detecting not only generally over-represented regulatory motifs, but also locally over-represented motifs that are often missed by standard motif detection approaches. Using a number of examples we illustrate the validity of our approach and suggest applications, such as the analysis of weaker binding sites. Our approach can be used to suggest testable hypotheses for wet-lab experiments. It has potential for future analyses, such as the prediction of weaker binding sites. An online application of our approach, called LocaMo Finder (Local Motif Finder), is available at http://sysimm.ifrec.osaka-u.ac.jp/tfbs/locamo/.
Overlapping ETS and CRE Motifs (G/CCGGAAGTGACGTCA) Preferentially Bound by GABPα and CREB Proteins

Science.gov (United States)

Chatterjee, Raghunath; Zhao, Jianfei; He, Ximiao; Shlyakhtenko, Andrey; Mann, Ishminder; Waterfall, Joshua J.; Meltzer, Paul; Sathyanarayana, B. K.; FitzGerald, Peter C.; Vinson, Charles

2012-01-01

Previously, we identified 8-bps long DNA sequences (8-mers) that localize in human proximal promoters and grouped them into known transcription factor binding sites (TFBS). We now examine split 8-mers consisting of two 4-mers separated by 1-bp to 30-bps (X4-N1-30-X4) to identify pairs of TFBS that localize in proximal promoters at a precise distance. These include two overlapping TFBS: the ETS⇔ETS motif (C/GCCGGAAGCGGAA) and the ETS⇔CRE motif (C/GCGGAAGTGACGTCAC). The nucleotides in bold are part of both TFBS. Molecular modeling shows that the ETS⇔CRE motif can be bound simultaneously by both the ETS and the B-ZIP domains without protein-protein clashes. The electrophoretic mobility shift assay (EMSA) shows that the ETS protein GABPα and the B-ZIP protein CREB preferentially bind to the ETS⇔CRE motif only when the two TFBS overlap precisely. In contrast, the ETS domain of ETV5 and CREB interfere with each other for binding the ETS⇔CRE. The 11-mer (CGGAAGTGACG), the conserved part of the ETS⇔CRE motif, occurs 226 times in the human genome and 83% are in known regulatory regions. In vivo GABPα and CREB ChIP-seq peaks identified the ETS⇔CRE as the most enriched motif occurring in promoters of genes involved in mRNA processing, cellular catabolic processes, and stress response, suggesting that a specific class of genes is regulated by this composite motif. PMID:23050235
A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data.

Science.gov (United States)

Tran, Ngoc Tam L; Huang, Chun-Hsi

2014-02-20

ChIP-Seq (chromatin immunoprecipitation sequencing) has provided the advantage for finding motifs as ChIP-Seq experiments narrow down the motif finding to binding site locations. Recent motif finding tools facilitate the motif detection by providing user-friendly Web interface. In this work, we reviewed nine motif finding Web tools that are capable for detecting binding site motifs in ChIP-Seq data. We showed each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommended the users to use multiple motif finding Web tools that implement different algorithms for obtaining significant motifs, overlapping resemble motifs, and non-overlapping motifs. Finally, we provided our suggestions for future development of motif finding Web tool that better assists researchers for finding motifs in ChIP-Seq data.
The effects of incomplete protein interaction data on structural and evolutionary inferences

DEFF Research Database (Denmark)

de Silva, E; Thorne, T; Ingram, P

2006-01-01

of the inherent noise in protein interaction data. The effects of the incomplete nature of network data become very noticeable, especially for so-called network motifs. We also consider the effect of incomplete network data on functional and evolutionary inferences. Conclusion Crucially, when only small, partial...
Generation of Chimeric RNAs by cis-splicing of adjacent genes (cis-SAGe) in mammals.

Science.gov (United States)

Zhuo, Jian-Shu; Jing, Xiao-Yan; Du, Xin; Yang, Xiu-Qin

2018-02-20

Chimeric RNA molecules, possessing exons from two or more independent genes, are traditionally believed to be produced by chromosome rearrangement. However, recent studies revealed that cis-splicing of adjacent genes (cis- SAGe) is one of the major mechanisms underlying the formation of chimeric RNAs. cis-SAGe refers to intergenic splicing of directly adjacent genes with the same transcriptional orientation, resulting in read-through transcripts, termed chimeric RNAs, which contain sequences from two or more parental genes. cis-SAGe was first identified in tumor cells, since then its potential in carcinogenesis has attracted extensive attention. More and more scientists are focusing on it. With the development of research, cis-SAGe was found to be ubiquitous in various normal tissues, and might make a crucial contribution to the formation of novel genes in the evolution of genomes. In this review, we summarize the splicing pattern, expression characteristics, possible mechanisms, and significance of cis-SAGe in mammals. This review will be helpful for general understanding of the current status and development tendency of cis-SAGe.
Sequence-based model of gap gene regulatory network.

Science.gov (United States)

Kozlov, Konstantin; Gursky, Vitaly; Kulakovskiy, Ivan; Samsonova, Maria

2014-01-01

The detailed analysis of transcriptional regulation is crucially important for understanding biological processes. The gap gene network in Drosophila attracts large interest among researches studying mechanisms of transcriptional regulation. It implements the most upstream regulatory layer of the segmentation gene network. The knowledge of molecular mechanisms involved in gap gene regulation is far less complete than that of genetics of the system. Mathematical modeling goes beyond insights gained by genetics and molecular approaches. It allows us to reconstruct wild-type gene expression patterns in silico, infer underlying regulatory mechanism and prove its sufficiency. We developed a new model that provides a dynamical description of gap gene regulatory systems, using detailed DNA-based information, as well as spatial transcription factor concentration data at varying time points. We showed that this model correctly reproduces gap gene expression patterns in wild type embryos and is able to predict gap expression patterns in Kr mutants and four reporter constructs. We used four-fold cross validation test and fitting to random dataset to validate the model and proof its sufficiency in data description. The identifiability analysis showed that most model parameters are well identifiable. We reconstructed the gap gene network topology and studied the impact of individual transcription factor binding sites on the model output. We measured this impact by calculating the site regulatory weight as a normalized difference between the residual sum of squares error for the set of all annotated sites and for the set with the site of interest excluded. The reconstructed topology of the gap gene network is in agreement with previous modeling results and data from literature. We showed that 1) the regulatory weights of transcription factor binding sites show very weak correlation with their PWM score; 2) sites with low regulatory weight are important for the model output; 3
The nitrogen responsive transcriptome in potato (Solanum tuberosum L.) reveals significant gene regulatory motifs.

Science.gov (United States)

Gálvez, José Héctor; Tai, Helen H; Lagüe, Martin; Zebarth, Bernie J; Strömvik, Martina V

2016-05-19

Nitrogen (N) is the most important nutrient for the growth of potato (Solanum tuberosum L.). Foliar gene expression in potato plants with and without N supplementation at 180 kg N ha(-1) was compared at mid-season. Genes with consistent differences in foliar expression due to N supplementation over three cultivars and two developmental time points were examined. In total, thirty genes were found to be over-expressed and nine genes were found to be under-expressed with supplemented N. Functional relationships between over-expressed genes were found. The main metabolic pathway represented among differentially expressed genes was amino acid metabolism. The 1000 bp upstream flanking regions of the differentially expressed genes were analysed and nine overrepresented motifs were found using three motif discovery algorithms (Seeder, Weeder and MEME). These results point to coordinated gene regulation at the transcriptional level controlling steady state potato responses to N sufficiency.
Unveiling combinatorial regulation through the combination of ChIP information and in silico cis-regulatory module detection

Science.gov (United States)

Sun, Hong; Guns, Tias; Fierro, Ana Carolina; Thorrez, Lieven; Nijssen, Siegfried; Marchal, Kathleen

2012-01-01

Computationally retrieving biologically relevant cis-regulatory modules (CRMs) is not straightforward. Because of the large number of candidates and the imperfection of the screening methods, many spurious CRMs are detected that are as high scoring as the biologically true ones. Using ChIP-information allows not only to reduce the regions in which the binding sites of the assayed transcription factor (TF) should be located, but also allows restricting the valid CRMs to those that contain the assayed TF (here referred to as applying CRM detection in a query-based mode). In this study, we show that exploiting ChIP-information in a query-based way makes in silico CRM detection a much more feasible endeavor. To be able to handle the large datasets, the query-based setting and other specificities proper to CRM detection on ChIP-Seq based data, we developed a novel powerful CRM detection method ‘CPModule’. By applying it on a well-studied ChIP-Seq data set involved in self-renewal of mouse embryonic stem cells, we demonstrate how our tool can recover combinatorial regulation of five known TFs that are key in the self-renewal of mouse embryonic stem cells. Additionally, we make a number of new predictions on combinatorial regulation of these five key TFs with other TFs documented in TRANSFAC. PMID:22422841
Assessment of network inference methods: how to cope with an underdetermined problem.

Directory of Open Access Journals (Sweden)

Caroline Siegenthaler

Full Text Available The inference of biological networks is an active research area in the field of systems biology. The number of network inference algorithms has grown tremendously in the last decade, underlining the importance of a fair assessment and comparison among these methods. Current assessments of the performance of an inference method typically involve the application of the algorithm to benchmark datasets and the comparison of the network predictions against the gold standard or reference networks. While the network inference problem is often deemed underdetermined, implying that the inference problem does not have a (unique solution, the consequences of such an attribute have not been rigorously taken into consideration. Here, we propose a new procedure for assessing the performance of gene regulatory network (GRN inference methods. The procedure takes into account the underdetermined nature of the inference problem, in which gene regulatory interactions that are inferable or non-inferable are determined based on causal inference. The assessment relies on a new definition of the confusion matrix, which excludes errors associated with non-inferable gene regulations. For demonstration purposes, the proposed assessment procedure is applied to the DREAM 4 In Silico Network Challenge. The results show a marked change in the ranking of participating methods when taking network inferability into account.
Motif analysis unveils the possible co-regulation of chloroplast genes and nuclear genes encoding chloroplast proteins.

Science.gov (United States)

Wang, Ying; Ding, Jun; Daniell, Henry; Hu, Haiyan; Li, Xiaoman

2012-09-01

Chloroplasts play critical roles in land plant cells. Despite their importance and the availability of at least 200 sequenced chloroplast genomes, the number of known DNA regulatory sequences in chloroplast genomes are limited. In this paper, we designed computational methods to systematically study putative DNA regulatory sequences in intergenic regions near chloroplast genes in seven plant species and in promoter sequences of nuclear genes in Arabidopsis and rice. We found that -35/-10 elements alone cannot explain the transcriptional regulation of chloroplast genes. We also concluded that there are unlikely motifs shared by intergenic sequences of most of chloroplast genes, indicating that these genes are regulated differently. Finally and surprisingly, we found five conserved motifs, each of which occurs in no more than six chloroplast intergenic sequences, are significantly shared by promoters of nuclear-genes encoding chloroplast proteins. By integrating information from gene function annotation, protein subcellular localization analyses, protein-protein interaction data, and gene expression data, we further showed support of the functionality of these conserved motifs. Our study implies the existence of unknown nuclear-encoded transcription factors that regulate both chloroplast genes and nuclear genes encoding chloroplast protein, which sheds light on the understanding of the transcriptional regulation of chloroplast genes.
Super-transient scaling in time-delay autonomous Boolean network motifs

Energy Technology Data Exchange (ETDEWEB)

D' Huys, Otti, E-mail: otti.dhuys@phy.duke.edu; Haynes, Nicholas D. [Department of Physics, Duke University, Durham, North Carolina 27708 (United States); Lohmann, Johannes [Department of Physics, Duke University, Durham, North Carolina 27708 (United States); Institut für Theoretische Physik, Technische Universität Berlin, Hardenbergstraße 36, 10623 Berlin (Germany); Gauthier, Daniel J. [Department of Physics, Duke University, Durham, North Carolina 27708 (United States); Department of Physics, The Ohio State University, Columbus, Ohio 43210 (United States)

2016-09-15

Autonomous Boolean networks are commonly used to model the dynamics of gene regulatory networks and allow for the prediction of stable dynamical attractors. However, most models do not account for time delays along the network links and noise, which are crucial features of real biological systems. Concentrating on two paradigmatic motifs, the toggle switch and the repressilator, we develop an experimental testbed that explicitly includes both inter-node time delays and noise using digital logic elements on field-programmable gate arrays. We observe transients that last millions to billions of characteristic time scales and scale exponentially with the amount of time delays between nodes, a phenomenon known as super-transient scaling. We develop a hybrid model that includes time delays along network links and allows for stochastic variation in the delays. Using this model, we explain the observed super-transient scaling of both motifs and recreate the experimentally measured transient distributions.
Epigenetic functions enriched in transcription factors binding to mouse recombination hotspots.

Science.gov (United States)

Wu, Min; Kwoh, Chee-Keong; Przytycka, Teresa M; Li, Jing; Zheng, Jie

2012-06-21

The regulatory mechanism of recombination is a fundamental problem in genomics, with wide applications in genome-wide association studies, birth-defect diseases, molecular evolution, cancer research, etc. In mammalian genomes, recombination events cluster into short genomic regions called "recombination hotspots". Recently, a 13-mer motif enriched in hotspots is identified as a candidate cis-regulatory element of human recombination hotspots; moreover, a zinc finger protein, PRDM9, binds to this motif and is associated with variation of recombination phenotype in human and mouse genomes, thus is a trans-acting regulator of recombination hotspots. However, this pair of cis and trans-regulators covers only a fraction of hotspots, thus other regulators of recombination hotspots remain to be discovered. In this paper, we propose an approach to predicting additional trans-regulators from DNA-binding proteins by comparing their enrichment of binding sites in hotspots. Applying this approach on newly mapped mouse hotspots genome-wide, we confirmed that PRDM9 is a major trans-regulator of hotspots. In addition, a list of top candidate trans-regulators of mouse hotspots is reported. Using GO analysis we observed that the top genes are enriched with function of histone modification, highlighting the epigenetic regulatory mechanisms of recombination hotspots.
Inference of Transcriptional Network for Pluripotency in Mouse Embryonic Stem Cells

International Nuclear Information System (INIS)

Aburatani, S

2015-01-01

In embryonic stem cells, various transcription factors (TFs) maintain pluripotency. To gain insights into the regulatory system controlling pluripotency, I inferred the regulatory relationships between the TFs expressed in ES cells. In this study, I applied a method based on structural equation modeling (SEM), combined with factor analysis, to 649 expression profiles of 19 TF genes measured in mouse Embryonic Stem Cells (ESCs). The factor analysis identified 19 TF genes that were regulated by several unmeasured factors. Since the known cell reprogramming TF genes (Pou5f1, Sox2 and Nanog) are regulated by different factors, each estimated factor is considered to be an input for signal transduction to control pluripotency in mouse ESCs. In the inferred network model, TF proteins were also arranged as unmeasured factors that control other TFs. The interpretation of the inferred network model revealed the regulatory mechanism for controlling pluripotency in ES cells
Strand-specific RNA-seq reveals widespread occurrence of novel cis-natural antisense transcripts in rice

Directory of Open Access Journals (Sweden)

Lu Tingting

2012-12-01

Full Text Available Abstract Background Cis-natural antisense transcripts (cis-NATs are RNAs transcribed from the antisense strand of a gene locus, and are complementary to the RNA transcribed from the sense strand. Common techniques including microarray approach and analysis of transcriptome databases are the major ways to globally identify cis-NATs in various eukaryotic organisms. Genome-wide in silico analysis has identified a large number of cis-NATs that may generate endogenous short interfering RNAs (nat-siRNAs, which participate in important biogenesis mechanisms for transcriptional and post-transcriptional regulation in rice. However, the transcriptomes are yet to be deeply sequenced to comprehensively investigate cis-NATs. Results We applied high-throughput strand-specific complementary DNA sequencing technology (ssRNA-seq to deeply sequence mRNA for assessing sense and antisense transcripts that were derived under salt, drought and cold stresses, and normal conditions, in the model plant rice (Oryza sativa. Combined with RAP-DB genome annotation (the Rice Annotation Project Database build-5 data set, 76,013 transcripts corresponding to 45,844 unique gene loci were assembled, in which 4873 gene loci were newly identified. Of 3819 putative rice cis-NATs, 2292 were detected as expressed and giving rise to small RNAs from their overlapping regions through integrated analysis of ssRNA-seq data and small RNA data. Among them, 503 cis-NATs seemed to be associated with specific conditions. The deep sequence data from isolated epidermal cells of rice seedlings further showed that 54.0% of cis-NATs were expressed simultaneously in a population of homogenous cells. Nearly 9.7% of rice transcripts were involved in one-to-one or many-to-many cis-NATs formation. Furthermore, only 17.4-34.7% of 223 many-to-many cis-NAT groups were all expressed and generated nat-siRNAs, indicating that only some cis-NAT groups may be involved in complex regulatory networks. Conclusions
Space-related pharma-motifs for fast search of protein binding motifs and polypharmacological targets.

Science.gov (United States)

Chiu, Yi-Yuan; Lin, Chun-Yu; Lin, Chih-Ta; Hsu, Kai-Cheng; Chang, Li-Zen; Yang, Jinn-Moon

2012-01-01

To discover a compound inhibiting multiple proteins (i.e. polypharmacological targets) is a new paradigm for the complex diseases (e.g. cancers and diabetes). In general, the polypharmacological proteins often share similar local binding environments and motifs. As the exponential growth of the number of protein structures, to find the similar structural binding motifs (pharma-motifs) is an emergency task for drug discovery (e.g. side effects and new uses for old drugs) and protein functions. We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize the binding motifs by searching against protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% polypharmacological targets of each protein-ligand complex are annotated with same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play the key roles for protein functions. The SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservations of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide the clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and drug discovery.
Motif-role-fingerprints: the building-blocks of motifs, clustering-coefficients and transitivities in directed networks.

Directory of Open Access Journals (Sweden)

Mark D McDonnell

Full Text Available Complex networks are frequently characterized by metrics for which particular subgraphs are counted. One statistic from this category, which we refer to as motif-role fingerprints, differs from global subgraph counts in that the number of subgraphs in which each node participates is counted. As with global subgraph counts, it can be important to distinguish between motif-role fingerprints that are 'structural' (induced subgraphs and 'functional' (partial subgraphs. Here we show mathematically that a vector of all functional motif-role fingerprints can readily be obtained from an arbitrary directed adjacency matrix, and then converted to structural motif-role fingerprints by multiplying that vector by a specific invertible conversion matrix. This result demonstrates that a unique structural motif-role fingerprint exists for any given functional motif-role fingerprint. We demonstrate a similar result for the cases of functional and structural motif-fingerprints without node roles, and global subgraph counts that form the basis of standard motif analysis. We also explicitly highlight that motif-role fingerprints are elemental to several popular metrics for quantifying the subgraph structure of directed complex networks, including motif distributions, directed clustering coefficient, and transitivity. The relationships between each of these metrics and motif-role fingerprints also suggest new subtypes of directed clustering coefficients and transitivities. Our results have potential utility in analyzing directed synaptic networks constructed from neuronal connectome data, such as in terms of centrality. Other potential applications include anomaly detection in networks, identification of similar networks and identification of similar nodes within networks. Matlab code for calculating all stated metrics following calculation of functional motif-role fingerprints is provided as S1 Matlab File.
Human apolipoprotein CIII gene expression is regulated by positive and negative cis-acting elements and tissue-specific protein factors

International Nuclear Information System (INIS)

Reue, K.; Leff, T.; Breslow, J.L.

1988-01-01

Apolipoprotein CIII (apoCIII) is a major protein constituent of triglyceride-rich lipoproteins and is synthesized primarily in the liver. Cis-acting DNA elements required for liver-specific apoCIII gene transcription were identified with transient expression assays in the human hepatoma (HepG2) and epithelial carcinoma (HeLa) cell lines. In liver cells, 821 nucleotides of the human apoCIII gene 5'-flanking sequence were required for maximum levels of gene expression, while the proximal 110 nucleotides alone were sufficient. No expression was observed in similar studies with HeLa cells. The level of expression was modulated by a combination of positive and negative cis-acting sequences, which interact with distinct sets of proteins from liver and HeLa cell nuclear extracts. The proximal positive regulatory region shares homology with similarly located sequences of other genes strongly expressed in the liver, including α 1 -antitrypsin and other apolipoprotein genes. The negative regulatory region is striking homologous to the human β-interferon gene regulatory element. The distal positive region shares homology with some viral enhancers and has properties of a tissue-specific enhancer. The regulation of the apoCIII gene is complex but shares features with other genes, suggesting shuffling of regulatory elements as a common mechanism for cell type-specific gene expression
Structural Diversity in Conserved Regions Like the DRY-Motif among Viral 7TM Receptors-A Consequence of Evolutionary Pressure?

DEFF Research Database (Denmark)

Mølleskov-Jensen, Ann-Sofie; Sparre-Ulrich, Alexander Hovard; Davis-Poynter, Nicholas

2012-01-01

Several herpes- and poxviruses have captured chemokine receptors from their hosts and modified these to their own benefit. The human and viral chemokine receptors belong to class A 7 transmembrane (TM) receptors which are characterized by several structural motifs like the DRY-motif in TM3...... and the C-terminal tail. In the DRY-motif, the arginine residue serves important purposes by being directly involved in G protein coupling. Interestingly, among the viral receptors there is a greater diversity in the DRY-motif compared to their endogenous receptor homologous. The C-terminal receptor tail...... constitutes another regulatory region that through a number of phosphorylation sites is involved in signaling, desensitization, and internalization. Also this region is more variable among virus-encoded 7TM receptors compared to human class A receptors. In this review we will focus on these two structural...

The lncRNA Malat1 Is Dispensable for Mouse Development but Its Transcription Plays a cis-Regulatory Role in the Adult

Directory of Open Access Journals (Sweden)

Bin Zhang

2012-07-01

Full Text Available Genome-wide studies have identified thousands of long noncoding RNAs (lncRNAs lacking protein-coding capacity. However, most lncRNAs are expressed at a very low level, and in most cases there is no genetic evidence to support their in vivo function. Malat1 (metastasis associated lung adenocarcinoma transcript 1 is among the most abundant and highly conserved lncRNAs, and it exhibits an uncommon 3′-end processing mechanism. In addition, its specific nuclear localization, developmental regulation, and dysregulation in cancer are suggestive of it having a critical biological function. We have characterized a Malat1 loss-of-function genetic model that indicates that Malat1 is not essential for mouse pre- and postnatal development. Furthermore, depletion of Malat1 does not affect global gene expression, splicing factor level and phosphorylation status, or alternative pre-mRNA splicing. However, among a small number of genes that were dysregulated in adult Malat1 knockout mice, many were Malat1 neighboring genes, thus indicating a potential cis-regulatory role of Malat1 gene transcription.
Nencki Genomics Database--Ensembl funcgen enhanced with intersections, user data and genome-wide TFBS motifs.

Science.gov (United States)

Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal

2013-01-01

We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface.
qPMS7: a fast algorithm for finding (ℓ, d-motifs in DNA and protein sequences.

Directory of Open Access Journals (Sweden)

Hieu Dinh

Full Text Available Detection of rare events happening in a set of DNA/protein sequences could lead to new biological discoveries. One kind of such rare events is the presence of patterns called motifs in DNA/protein sequences. Finding motifs is a challenging problem since the general version of motif search has been proven to be intractable. Motifs discovery is an important problem in biology. For example, it is useful in the detection of transcription factor binding sites and transcriptional regulatory elements that are very crucial in understanding gene function, human disease, drug design, etc. Many versions of the motif search problem have been proposed in the literature. One such is the (ℓ, d-motif search (or Planted Motif Search (PMS. A generalized version of the PMS problem, namely, Quorum Planted Motif Search (qPMS, is shown to accurately model motifs in real data. However, solving the qPMS problem is an extremely difficult task because a special case of it, the PMS Problem, is already NP-hard, which means that any algorithm solving it can be expected to take exponential time in the worse case scenario. In this paper, we propose a novel algorithm named qPMS7 that tackles the qPMS problem on real data as well as challenging instances. Experimental results show that our Algorithm qPMS7 is on an average 5 times faster than the state-of-art algorithm. The executable program of Algorithm qPMS7 is freely available on the web at http://pms.engr.uconn.edu/downloads/qPMS7.zip. Our online motif discovery tools that use Algorithm qPMS7 are freely available at http://pms.engr.uconn.edu or http://motifsearch.com.
The KYxxL motif in Rad17 protein is essential for the interaction with the 9–1–1 complex

Energy Technology Data Exchange (ETDEWEB)

Fukumoto, Yasunori, E-mail: fukumoto@faculty.chiba-u.jp [Laboratory of Molecular Cell Biology, Graduate School of Pharmaceutical Sciences, Chiba University, Chiba 260-8675 (Japan); Ikeuchi, Masayoshi; Nakayama, Yuji [Department of Biochemistry & Molecular Biology, Kyoto Pharmaceutical University, Kyoto 607-8414 (Japan); Yamaguchi, Naoto, E-mail: nyama@faculty.chiba-u.jp [Laboratory of Molecular Cell Biology, Graduate School of Pharmaceutical Sciences, Chiba University, Chiba 260-8675 (Japan)

2016-09-02

ATR-dependent DNA damage checkpoint is the major DNA damage checkpoint against UV irradiation and DNA replication stress. The Rad17–RFC and Rad9–Rad1–Hus1 (9–1–1) complexes interact with each other to contribute to ATR signaling, however, the precise regulatory mechanism of the interaction has not been established. Here, we identified a conserved sequence motif, KYxxL, in the AAA+ domain of Rad17 protein, and demonstrated that this motif is essential for the interaction with the 9–1–1 complex. We also show that UV-induced Rad17 phosphorylation is increased in the Rad17 KYxxL mutants. These data indicate that the interaction with the 9–1–1 complex is not required for Rad17 protein to be an efficient substrate for the UV-induced phosphorylation. Our data also raise the possibility that the 9–1–1 complex plays a negative regulatory role in the Rad17 phosphorylation. We also show that the nucleotide-binding activity of Rad17 is required for its nuclear localization. - Highlights: • We have identified a conserved KYxxL motif in Rad17 protein. • The KYxxL motif is crucial for the interaction with the 9–1–1 complex. • The KYxxL motif is dispensable or inhibitory for UV-induced Rad17 phosphorylation. • Nucleotide binding of Rad17 is required for its nuclear localization.
Inferring time-varying network topologies from gene expression data.

Science.gov (United States)

Rao, Arvind; Hero, Alfred O; States, David J; Engel, James Douglas

2007-01-01

Most current methods for gene regulatory network identification lead to the inference of steady-state networks, that is, networks prevalent over all times, a hypothesis which has been challenged. There has been a need to infer and represent networks in a dynamic, that is, time-varying fashion, in order to account for different cellular states affecting the interactions amongst genes. In this work, we present an approach, regime-SSM, to understand gene regulatory networks within such a dynamic setting. The approach uses a clustering method based on these underlying dynamics, followed by system identification using a state-space model for each learnt cluster--to infer a network adjacency matrix. We finally indicate our results on the mouse embryonic kidney dataset as well as the T-cell activation-based expression dataset and demonstrate conformity with reported experimental evidence.
Ionizing radiation sources management in the Commonwealth of Independent States - CIS

International Nuclear Information System (INIS)

Iskra, A.; Bufetova, M.

2006-01-01

Ionizing radiation sources cover a broad band of power: from powerful NPP reactors and research reactors to portable radioisotope ionizing radiation sources applied in medicine, agriculture, industry and in the energy supply systems of remote facilities. At present, scales and use field of radionuclide sources in the CIS have the tendency to increase. In this connection, the issues of ionizing radiation sources management safety at all stages of their life cycle, from production to treatment, have been of a great importance. The materials on ionizing radiation sources inventory and treatment in the CIS (Russia, Armenia, Belarus, Georgia, Kazakhstan, Kyrgyzstan, Tajikistan and Ukraine) are presented in the report. It is shown that in some republics, there is difficulty in ionizing radiation sources accounting and control system; the national regulatory and legal framework bases regulating activity on radioactive sources use, localization and treatment require update. Many problems are connected with the sources beyond state accounting. The problem of ionizing radiation sources use safety is complicated by the growing activity of various terrorist groups. The opportunity to use ionizing radiation sources with terrorism goals requires the application of defined systems of security and physical protection at all stages of their management. For this purpose a collective, with all CIS countries, organization of radioactive sources accounting and control as well as countermeasures on their illegal transportation and use are necessary. In this connection, the information collection regarding situation with providing of ionizing radiation sources safety, conditions of equipment and storage facilities, radioactive materials accounting and control system in the CIS countries is vitally needed
Mapping cis-Regulatory Domains in the Human Genome UsingMulti-Species Conservation of Synteny

Energy Technology Data Exchange (ETDEWEB)

Ahituv, Nadav; Prabhakar, Shyam; Poulin, Francis; Rubin, EdwardM.; Couronne, Olivier

2005-06-13

Our inability to associate distant regulatory elements with the genes that they regulate has largely precluded their examination for sequence alterations contributing to human disease. One major obstacle is the large genomic space surrounding targeted genes in which such elements could potentially reside. In order to delineate gene regulatory boundaries we used whole-genome human-mouse-chicken (HMC) and human-mouse-frog (HMF) multiple alignments to compile conserved blocks of synteny (CBS), under the hypothesis that these blocks have been kept intact throughout evolution at least in part by the requirement of regulatory elements to stay linked to the genes that they regulate. A total of 2,116 and 1,942 CBS>200 kb were assembled for HMC and HMF respectively, encompassing 1.53 and 0.86 Gb of human sequence. To support the existence of complex long-range regulatory domains within these CBS we analyzed the prevalence and distribution of chromosomal aberrations leading to position effects (disruption of a genes regulatory environment), observing a clear bias not only for mapping onto CBS but also for longer CBS size. Our results provide a genome wide data set characterizing the regulatory domains of genes and the conserved regulatory elements within them.
Prediction of transcriptional regulatory elements for plant hormone responses based on microarray data

Directory of Open Access Journals (Sweden)

Yamaguchi-Shinozaki Kazuko

2011-02-01

Full Text Available Abstract Background Phytohormones organize plant development and environmental adaptation through cell-to-cell signal transduction, and their action involves transcriptional activation. Recent international efforts to establish and maintain public databases of Arabidopsis microarray data have enabled the utilization of this data in the analysis of various phytohormone responses, providing genome-wide identification of promoters targeted by phytohormones. Results We utilized such microarray data for prediction of cis-regulatory elements with an octamer-based approach. Our test prediction of a drought-responsive RD29A promoter with the aid of microarray data for response to drought, ABA and overexpression of DREB1A, a key regulator of cold and drought response, provided reasonable results that fit with the experimentally identified regulatory elements. With this succession, we expanded the prediction to various phytohormone responses, including those for abscisic acid, auxin, cytokinin, ethylene, brassinosteroid, jasmonic acid, and salicylic acid, as well as for hydrogen peroxide, drought and DREB1A overexpression. Totally 622 promoters that are activated by phytohormones were subjected to the prediction. In addition, we have assigned putative functions to 53 octamers of the Regulatory Element Group (REG that have been extracted as position-dependent cis-regulatory elements with the aid of their feature of preferential appearance in the promoter region. Conclusions Our prediction of Arabidopsis cis-regulatory elements for phytohormone responses provides guidance for experimental analysis of promoters to reveal the basis of the transcriptional network of phytohormone responses.
The cis-regulatory element CCACGTGG is involved in ABA and water-stress responses of the maize gene rab28.

Science.gov (United States)

Pla, M; Vilardell, J; Guiltinan, M J; Marcotte, W R; Niogret, M F; Quatrano, R S; Pagès, M

1993-01-01

The maize gene rab28 has been identified as ABA-inducible in embryos and vegetative tissues. It is also induced by water stress in young leaves. The proximal promoter region contains the conserved cis-acting element CCACGTGG (ABRE) reported for ABA induction in other plant genes. Transient expression assays in rice protoplasts indicate that a 134 bp fragment (-194 to -60 containing the ABRE) fused to a truncated cauliflower mosaic virus promoter (35S) is sufficient to confer ABA-responsiveness upon the GUS reporter gene. Gel retardation experiments indicate that nuclear proteins from tissues in which the rab28 gene is expressed can interact specifically with this 134 bp DNA fragment. Nuclear protein extracts from embryo and water-stressed leaves generate specific complexes of different electrophoretic mobility which are stable in the presence of detergent and high salt. However, by DMS footprinting the same guanine-specific contacts with the ABRE in both the embryo and leaf binding activities were detected. These results indicate that the rab28 promoter sequence CCACGTGG is a functional ABA-responsive element, and suggest that distinct regulatory factors with apparent similar affinity for the ABRE sequence may be involved in the hormone action during embryo development and in vegetative tissues subjected to osmotic stress.
Exposure of maternal mice to cis-bifenthrin enantioselectively disrupts the transcription of genes related to testosterone synthesis in male offspring.

Science.gov (United States)

Jin, Yuanxiang; Wang, Jiangcong; Sun, Xueqing; Ye, Yang; Xu, Minjie; Wang, Jianai; Chen, Shaoping; Fu, Zhengwei

2013-12-01

The commercial bifenthrin (BF) contains two cis isomers. In the present study, a dose of 15mg/kg of 1R-cis-BF or 1S-cis-BF was orally administered for 3 weeks to female mice before or during pregnancy. Then, the expression of steroidogenesis related genes which were considered as effective biomarkers of endocrine disruption were analyzed in the male offspring. Maternal exposure to 1S-cis-BF during pregnancy significantly reduced the mRNA levels of peripheral benzodiazepine receptor (PBR) and steroidogenic acute regulatory protein (StAR) in the testes of 3- or 6-week old male offspring. In addition, a significant decrease of cytochrome P450 17α-hydroxysteroid dehydrogenase (P450-17α) was also observed in the testes of 6-week old male offspring when dams were treated with 1S-cis-BF during pregnancy but not before pregnancy. Moreover, the scavenger receptor class B type 1 (SRB1) and cytochrome P450 cholesterol side-chain cleavage enzyme (P450scc) decreased significantly in the testes of 6-week old male offspring when dams were treated with 1S-cis-BF during and before pregnancy. Thus, oral administration of the maternal mice to cis-BF for 3 weeks, particularly during pregnancy, resulted in endocrine disruption in the male offspring, with the 1S-cis-BF causing more significant alterations than the 1R-cis-BF form. Copyright © 2013 Elsevier Inc. All rights reserved.
Detection of Weakly Conserved Ancestral Mammalian RegulatorySequences by Primate Comparisons

Energy Technology Data Exchange (ETDEWEB)

Wang, Qian-fei; Prabhakar, Shyam; Chanan, Sumita; Cheng,Jan-Fang; Rubin, Edward M.; Boffelli, Dario

2006-06-01

Genomic comparisons between human and distant, non-primatemammals are commonly used to identify cis-regulatory elements based onconstrained sequence evolution. However, these methods fail to detectcryptic functional elements, which are too weakly conserved among mammalsto distinguish from nonfunctional DNA. To address this problem, weexplored the potential of deep intra-primate sequence comparisons. Wesequenced the orthologs of 558 kb of human genomic sequence, coveringmultiple loci involved in cholesterol homeostasis, in 6 nonhumanprimates. Our analysis identified 6 noncoding DNA elements displayingsignificant conservation among primates, but undetectable in more distantcomparisons. In vitro and in vivo tests revealed that at least three ofthese 6 elements have regulatory function. Notably, the mouse orthologsof these three functional human sequences had regulatory activity despitetheir lack of significant sequence conservation, indicating that they arecryptic ancestral cis-regulatory elements. These regulatory elementscould still be detected in a smaller set of three primate speciesincluding human, rhesus and marmoset. Since the human and rhesus genomesequences are already available, and the marmoset genome is activelybeing sequenced, the primate-specific conservation analysis describedhere can be applied in the near future on a whole-genome scale, tocomplement the annotation provided by more distant speciescomparisons.
NF-Y loss triggers p53 stabilization and apoptosis in HPV18-positive cells by affecting E6 transcription

OpenAIRE

Benatti, Paolo; Basile, Valentina; Dolfini, Diletta; Belluti, Silvia; Tomei, Margherita; Imbriano, Carol

2016-01-01

The expression of the high risk HPV18 E6 and E7 oncogenic proteins induces the transformation of epithelial cells, through the disruption of p53 and Rb function. The binding of cellular transcription factors to cis-regulatory elements in the viral Upstream Regulatory Region (URR) stimulates E6/E7 transcription. Here, we demonstrate that the CCAAT-transcription factor NF-Y binds to a non-canonical motif within the URR and activates viral gene expression. In addition, NF-Y indirectly up-regulat...
Identification of choriogenin cis-regulatory elements and production of estrogen-inducible, liver-specific transgenic Medaka.

Science.gov (United States)

Ueno, Tetsuro; Yasumasu, Shigeki; Hayashi, Shinji; Iuchi, Ichiro

2004-07-01

Choriogenins (chg-H, chg-L) are precursor proteins of egg envelope of medaka and synthesized in the spawning female liver in response to estrogen. We linked a gene construct chg-L1.5 kb/GFP (a 1.5 kb 5'-upstream region of the chg-L gene fused with a green fluorescence protein (GFP) gene) to another construct emgb/RFP (a cis-regulatory region of embryonic globin gene fused with an RFP gene), injected the double fusion gene construct into 1- or 2-cell-stage embryos, and selected embryos expressing the RFP in erythroid cells. From the embryos, we established two lines of chg-L1.5 kb/GFP-emgb/RFP-transgenic medaka. The 3-month-old spawning females and estradiol-17beta (E2)-exposed males displayed the liver-specific GFP expression. The E2-dependent GFP expression was detected in the differentiating liver of the stage 37-38 embryos. In addition, RT-PCR and whole-mount in situ hybridization showed that the E2-dependent chg expression was found in the liver of the stage 34 embryos of wild medaka, suggesting that such E2-dependency is achieved shortly after differentiation of the liver. Analysis using serial deletion mutants fused with GFP showed that the region -426 to -284 of the chg-L gene or the region -364 to -265 of the chg-H gene had the ability to promote the E2-dependent liver-specific GFP expression of its downstream gene. Further analyses suggested that an estrogen response element (ERE) at -309, an ERE half-site at -330 and a binding site for C/EBP at -363 of the chg-L gene played important roles in its downstream chg-L gene expression. In addition, this transgenic medaka may be useful as one of the test animals for detecting environmental estrogenic steroids.
Poly(A) motif prediction using spectral latent features from human DNA sequences

KAUST Repository

Xie, Bo; Jankovic, Boris R.; Bajic, Vladimir B.; Song, Le; Gao, Xin

2013-01-01

Motivation: Polyadenylation is the addition of a poly(A) tail to an RNA molecule. Identifying DNA sequence motifs that signal the addition of poly(A) tails is essential to improved genome annotation and better understanding of the regulatory mechanisms and stability of mRNA.Existing poly(A) motif predictors demonstrate that information extracted from the surrounding nucleotide sequences of candidate poly(A) motifs can differentiate true motifs from the false ones to a great extent. A variety of sophisticated features has been explored, including sequential, structural, statistical, thermodynamic and evolutionary properties. However, most of these methods involve extensive manual feature engineering, which can be time-consuming and can require in-depth domain knowledge.Results: We propose a novel machine-learning method for poly(A) motif prediction by marrying generative learning (hidden Markov models) and discriminative learning (support vector machines). Generative learning provides a rich palette on which the uncertainty and diversity of sequence information can be handled, while discriminative learning allows the performance of the classification task to be directly optimized. Here, we used hidden Markov models for fitting the DNA sequence dynamics, and developed an efficient spectral algorithm for extracting latent variable information from these models. These spectral latent features were then fed into support vector machines to fine-tune the classification performance.We evaluated our proposed method on a comprehensive human poly(A) dataset that consists of 14 740 samples from 12 of the most abundant variants of human poly(A) motifs. Compared with one of the previous state-of-the-art methods in the literature (the random forest model with expert-crafted features), our method reduces the average error rate, false-negative rate and false-positive rate by 26, 15 and 35%, respectively. Meanwhile, our method makes ?30% fewer error predictions relative to the other
Poly(A) motif prediction using spectral latent features from human DNA sequences

KAUST Repository

Xie, Bo

2013-06-21

Motivation: Polyadenylation is the addition of a poly(A) tail to an RNA molecule. Identifying DNA sequence motifs that signal the addition of poly(A) tails is essential to improved genome annotation and better understanding of the regulatory mechanisms and stability of mRNA.Existing poly(A) motif predictors demonstrate that information extracted from the surrounding nucleotide sequences of candidate poly(A) motifs can differentiate true motifs from the false ones to a great extent. A variety of sophisticated features has been explored, including sequential, structural, statistical, thermodynamic and evolutionary properties. However, most of these methods involve extensive manual feature engineering, which can be time-consuming and can require in-depth domain knowledge.Results: We propose a novel machine-learning method for poly(A) motif prediction by marrying generative learning (hidden Markov models) and discriminative learning (support vector machines). Generative learning provides a rich palette on which the uncertainty and diversity of sequence information can be handled, while discriminative learning allows the performance of the classification task to be directly optimized. Here, we used hidden Markov models for fitting the DNA sequence dynamics, and developed an efficient spectral algorithm for extracting latent variable information from these models. These spectral latent features were then fed into support vector machines to fine-tune the classification performance.We evaluated our proposed method on a comprehensive human poly(A) dataset that consists of 14 740 samples from 12 of the most abundant variants of human poly(A) motifs. Compared with one of the previous state-of-the-art methods in the literature (the random forest model with expert-crafted features), our method reduces the average error rate, false-negative rate and false-positive rate by 26, 15 and 35%, respectively. Meanwhile, our method makes ?30% fewer error predictions relative to the other
Probabilistic generation of random networks taking into account information on motifs occurrence.

Science.gov (United States)

Bois, Frederic Y; Gayraud, Ghislaine

2015-01-01

Because of the huge number of graphs possible even with a small number of nodes, inference on network structure is known to be a challenging problem. Generating large random directed graphs with prescribed probabilities of occurrences of some meaningful patterns (motifs) is also difficult. We show how to generate such random graphs according to a formal probabilistic representation, using fast Markov chain Monte Carlo methods to sample them. As an illustration, we generate realistic graphs with several hundred nodes mimicking a gene transcription interaction network in Escherichia coli.
A CACGTG motif of the Antirrhinum majus chalcone synthase promoter is recognized by an evolutionarily conserved nuclear protein

International Nuclear Information System (INIS)

Staiger, D.; Kaulen, H.; Schell, J.

1989-01-01

In the chalcone synthase gene of Antirrhinum majus (snapdragon), 150 base pairs of the 5' flanking region contain cis-acting signals for UV light-induced expression. A nuclear factor, designated CG-1, specifically recognizes a hexameric motif with internal dyad symmetry, CACGTG, located within this light-responsive sequence. Binding of CG-1 is influenced by C-methylation of the CpG dinucleotide in the recognition sequence. CG-1 is a factor found in a variety of dicotyledonous plant species including Nicotiana tabacum, A. majus, Petunia hybrida, Arabidopsis thaliana, and Glycine max. CACGTG motifs contained within trans-acting factor recognition sites in various other plant promoters can interact with CG-1. In addition, the binding site of the human adenovirus major late transcription factor USF can compete for CG-1 binding to the chalcone synthase promoter. This suggests an evolutionary conservation of trans-acting factor recognition sites involved in divergent mechanisms of gene control. (author)
Crystallization and preliminary X-ray diffraction analysis of motif N from Saccharomyces cerevisiae Dbf4

International Nuclear Information System (INIS)

Matthews, Lindsay A.; Duong, Andrew; Prasad, Ajai A.; Duncker, Bernard P.; Guarné, Alba

2009-01-01

To understand the role of the Cdc7–Dbf4 complex in checkpoint responses, a fragment of Saccharomyces cerevisiae Dbf4 encompassing motif N was isolated, overproduced and crystallized. The Cdc7–Dbf4 complex plays an instrumental role in the initiation of DNA replication and is a target of replication-checkpoint responses in Saccharomyces cerevisiae. Cdc7 is a conserved serine/threonine kinase whose activity depends on association with its regulatory subunit, Dbf4. A conserved sequence near the N-terminus of Dbf4 (motif N) is necessary for the interaction of Cdc7–Dbf4 with the checkpoint kinase Rad53. To understand the role of the Cdc7–Dbf4 complex in checkpoint responses, a fragment of Saccharomyces cerevisiae Dbf4 encompassing motif N was isolated, overproduced and crystallized. A complete native data set was collected at 100 K from crystals that diffracted X-rays to 2.75 Å resolution and structure determination is currently under way
An essential GT motif in the lamin A promoter mediates activation by CREB-binding protein

International Nuclear Information System (INIS)

Janaki Ramaiah, M.; Parnaik, Veena K.

2006-01-01

Lamin A is an important component of nuclear architecture in mammalian cells. Mutations in the human lamin A gene lead to highly degenerative disorders that affect specific tissues. In studies directed towards understanding the mode of regulation of the lamin A promoter, we have identified an essential GT motif at -55 position by reporter gene assays and mutational analysis. Binding of this sequence to Sp transcription factors has been observed in electrophoretic mobility shift assays and by chromatin immunoprecipitation studies. Further functional analysis by co-expression of recombinant proteins and ChIP assays has shown an important regulatory role for CREB-binding protein in promoter activation, which is mediated by the GT motif
Cis-acting elements in the promoter region of the human aldolase C gene.

Science.gov (United States)

Buono, P; de Conciliis, L; Olivetta, E; Izzo, P; Salvatore, F

1993-08-16

We investigated the cis-acting sequences involved in the expression of the human aldolase C gene by transient transfections into human neuroblastoma cells (SKNBE). We demonstrate that 420 bp of the 5'-flanking DNA direct at high efficiency the transcription of the CAT reporter gene. A deletion between -420 bp and -164 bp causes a 60% decrease of CAT activity. Gel shift and DNase I footprinting analyses revealed four protected elements: A, B, C and D. Competition analyses indicate that Sp1 or factors sharing a similar sequence specificity bind to elements A and B, but not to elements C and D. Sequence analysis shows a half palindromic ERE motif (GGTCA), in elements B and D. Region D binds a transactivating factor which appears also essential to stabilize the initiation complex.

Shared regulatory sites are abundant in the human genome and shed light on genome evolution and disease pleiotropy.

Science.gov (United States)

Tong, Pin; Monahan, Jack; Prendergast, James G D

2017-03-01

Large-scale gene expression datasets are providing an increasing understanding of the location of cis-eQTLs in the human genome and their role in disease. However, little is currently known regarding the extent of regulatory site-sharing between genes. This is despite it having potentially wide-ranging implications, from the determination of the way in which genetic variants may shape multiple phenotypes to the understanding of the evolution of human gene order. By first identifying the location of non-redundant cis-eQTLs, we show that regulatory site-sharing is a relatively common phenomenon in the human genome, with over 10% of non-redundant regulatory variants linked to the expression of multiple nearby genes. We show that these shared, local regulatory sites are linked to high levels of chromatin looping between the regulatory sites and their associated genes. In addition, these co-regulated gene modules are found to be strongly conserved across mammalian species, suggesting that shared regulatory sites have played an important role in shaping human gene order. The association of these shared cis-eQTLs with multiple genes means they also appear to be unusually important in understanding the genetics of human phenotypes and pleiotropy, with shared regulatory sites more often linked to multiple human phenotypes than other regulatory variants. This study shows that regulatory site-sharing is likely an underappreciated aspect of gene regulation and has important implications for the understanding of various biological phenomena, including how the two and three dimensional structures of the genome have been shaped and the potential causes of disease pleiotropy outside coding regions.
Harnessing diversity towards the reconstructing of large scale gene regulatory networks.

Directory of Open Access Journals (Sweden)

Takeshi Hase

Full Text Available Elucidating gene regulatory network (GRN from large scale experimental data remains a central challenge in systems biology. Recently, numerous techniques, particularly consensus driven approaches combining different algorithms, have become a potentially promising strategy to infer accurate GRNs. Here, we develop a novel consensus inference algorithm, TopkNet that can integrate multiple algorithms to infer GRNs. Comprehensive performance benchmarking on a cloud computing framework demonstrated that (i a simple strategy to combine many algorithms does not always lead to performance improvement compared to the cost of consensus and (ii TopkNet integrating only high-performance algorithms provide significant performance improvement compared to the best individual algorithms and community prediction. These results suggest that a priori determination of high-performance algorithms is a key to reconstruct an unknown regulatory network. Similarity among gene-expression datasets can be useful to determine potential optimal algorithms for reconstruction of unknown regulatory networks, i.e., if expression-data associated with known regulatory network is similar to that with unknown regulatory network, optimal algorithms determined for the known regulatory network can be repurposed to infer the unknown regulatory network. Based on this observation, we developed a quantitative measure of similarity among gene-expression datasets and demonstrated that, if similarity between the two expression datasets is high, TopkNet integrating algorithms that are optimal for known dataset perform well on the unknown dataset. The consensus framework, TopkNet, together with the similarity measure proposed in this study provides a powerful strategy towards harnessing the wisdom of the crowds in reconstruction of unknown regulatory networks.
A novel mutual information-based Boolean network inference method from time-series gene expression data.

Directory of Open Access Journals (Sweden)

Shohag Barman

Full Text Available Inferring a gene regulatory network from time-series gene expression data in systems biology is a challenging problem. Many methods have been suggested, most of which have a scalability limitation due to the combinatorial cost of searching a regulatory set of genes. In addition, they have focused on the accurate inference of a network structure only. Therefore, there is a pressing need to develop a network inference method to search regulatory genes efficiently and to predict the network dynamics accurately.In this study, we employed a Boolean network model with a restricted update rule scheme to capture coarse-grained dynamics, and propose a novel mutual information-based Boolean network inference (MIBNI method. Given time-series gene expression data as an input, the method first identifies a set of initial regulatory genes using mutual information-based feature selection, and then improves the dynamics prediction accuracy by iteratively swapping a pair of genes between sets of the selected regulatory genes and the other genes. Through extensive simulations with artificial datasets, MIBNI showed consistently better performance than six well-known existing methods, REVEAL, Best-Fit, RelNet, CST, CLR, and BIBN in terms of both structural and dynamics prediction accuracy. We further tested the proposed method with two real gene expression datasets for an Escherichia coli gene regulatory network and a fission yeast cell cycle network, and also observed better results using MIBNI compared to the six other methods.Taken together, MIBNI is a promising tool for predicting both the structure and the dynamics of a gene regulatory network.
Identifying cis-mediators for trans-eQTLs across many human tissues using genomic mediation analysis.

Science.gov (United States)

Yang, Fan; Wang, Jiebiao; Pierce, Brandon L; Chen, Lin S

2017-11-01

The impact of inherited genetic variation on gene expression in humans is well-established. The majority of known expression quantitative trait loci (eQTLs) impact expression of local genes ( cis -eQTLs). More research is needed to identify effects of genetic variation on distant genes ( trans -eQTLs) and understand their biological mechanisms. One common trans -eQTLs mechanism is "mediation" by a local ( cis ) transcript. Thus, mediation analysis can be applied to genome-wide SNP and expression data in order to identify transcripts that are " cis -mediators" of trans -eQTLs, including those " cis -hubs" involved in regulation of many trans -genes. Identifying such mediators helps us understand regulatory networks and suggests biological mechanisms underlying trans -eQTLs, both of which are relevant for understanding susceptibility to complex diseases. The multitissue expression data from the Genotype-Tissue Expression (GTEx) program provides a unique opportunity to study cis -mediation across human tissue types. However, the presence of complex hidden confounding effects in biological systems can make mediation analyses challenging and prone to confounding bias, particularly when conducted among diverse samples. To address this problem, we propose a new method: Genomic Mediation analysis with Adaptive Confounding adjustment (GMAC). It enables the search of a very large pool of variables, and adaptively selects potential confounding variables for each mediation test. Analyses of simulated data and GTEx data demonstrate that the adaptive selection of confounders by GMAC improves the power and precision of mediation analysis. Application of GMAC to GTEx data provides new insights into the observed patterns of cis -hubs and trans -eQTL regulation across tissue types. © 2017 Yang et al.; Published by Cold Spring Harbor Laboratory Press.
The MHC motif viewer: a visualization tool for MHC binding motifs

DEFF Research Database (Denmark)

Rapin, Nicolas; Hoof, Ilka; Lund, Ole

2010-01-01

is hampered by the lack of tools for browsing and comparing specificity of these molecules. We have developed a Web server, MHC Motif Viewer, which allows the display of the binding motif for MHC class I proteins for human, chimpanzee, rhesus monkey, mouse, and swine, as well as HLA-DR protein sequences...
cis sequence effects on gene expression

Directory of Open Access Journals (Sweden)

Jacobs Kevin

2007-08-01

Full Text Available Abstract Background Sequence and transcriptional variability within and between individuals are typically studied independently. The joint analysis of sequence and gene expression variation (genetical genomics provides insight into the role of linked sequence variation in the regulation of gene expression. We investigated the role of sequence variation in cis on gene expression (cis sequence effects in a group of genes commonly studied in cancer research in lymphoblastoid cell lines. We estimated the proportion of genes exhibiting cis sequence effects and the proportion of gene expression variation explained by cis sequence effects using three different analytical approaches, and compared our results to the literature. Results We generated gene expression profiling data at N = 697 candidate genes from N = 30 lymphoblastoid cell lines for this study and used available candidate gene resequencing data at N = 552 candidate genes to identify N = 30 candidate genes with sufficient variance in both datasets for the investigation of cis sequence effects. We used two additive models and the haplotype phylogeny scanning approach of Templeton (Tree Scanning to evaluate association between individual SNPs, all SNPs at a gene, and diplotypes, with log-transformed gene expression. SNPs and diplotypes at eight candidate genes exhibited statistically significant (p cis sequence effects in our study, respectively. Conclusion Based on analysis of our results and the extant literature, one in four genes exhibits significant cis sequence effects, and for these genes, about 30% of gene expression variation is accounted for by cis sequence variation. Despite diverse experimental approaches, the presence or absence of significant cis sequence effects is largely supported by previously published studies.
iELM—a web server to explore short linear motif-mediated interactions

Science.gov (United States)

Weatheritt, Robert J.; Jehl, Peter; Dinkel, Holger; Gibson, Toby J.

2012-01-01

The recent expansion in our knowledge of protein–protein interactions (PPIs) has allowed the annotation and prediction of hundreds of thousands of interactions. However, the function of many of these interactions remains elusive. The interactions of Eukaryotic Linear Motif (iELM) web server provides a resource for predicting the function and positional interface for a subset of interactions mediated by short linear motifs (SLiMs). The iELM prediction algorithm is based on the annotated SLiM classes from the Eukaryotic Linear Motif (ELM) resource and allows users to explore both annotated and user-generated PPI networks for SLiM-mediated interactions. By incorporating the annotated information from the ELM resource, iELM provides functional details of PPIs. This can be used in proteomic analysis, for example, to infer whether an interaction promotes complex formation or degradation. Furthermore, details of the molecular interface of the SLiM-mediated interactions are also predicted. This information is displayed in a fully searchable table, as well as graphically with the modular architecture of the participating proteins extracted from the UniProt and Phospho.ELM resources. A network figure is also presented to aid the interpretation of results. The iELM server supports single protein queries as well as large-scale proteomic submissions and is freely available at http://i.elm.eu.org. PMID:22638578
Kopi dan Kakao dalam Kreasi Motif Batik Khas Jember

Directory of Open Access Journals (Sweden)

Irfa'ina Rohana Salma

2015-06-01

Full Text Available ABSTRAK Batik Jember selama ini identik dengan motif daun tembakau. Visualisasi daun tembakau dalam motif Batik Jember cukup lemah, yaitu kurang berkarakter karena motif yang muncul adalah seperti gambar daun pada umumnya. Oleh karena itu perlu diciptakan desain motif batik khas Jember yang sumber inspirasinya digali dari kekayaan alam lainnya dari Jember yang mempunyai bentuk spesifik dan karakteristik sehingga identitas motif bisa didapatkan dengan lebih kuat. Hasil alam khas Jember tersebut adalah kopi dan kakao. Tujuan penciptaan seni ini adalah untuk menghasilkan motif batik baru yang mempunyai ciri khas Jember. Metode yang digunakan yaitu pengumpulan data, pengamatan mendalam terhadap objek penciptaan, pengkajian sumber inspirasi, pembuatan desain motif, dan perwujudan menjadi batik. Dari penciptaan seni ini berhasil dikreasikan 6 (enam motif batik yaitu: (1 Motif Uwoh Kopi; (2 Motif Godong Kopi; (3 Motif Ceplok Kakao; (4 Motif Kakao Raja; (5 Motif Kakao Biru; dan (6 Motif Wiji Mukti. Berdasarkan hasil penilaian “Selera Estetika” diketahui bahwa motif yang paling banyak disukai adalah Motif Uwoh Kopi dan Motif Kakao Raja. Kata kunci: Motif Woh Kopi, Motif Godong Kopi, Motif Ceplok Kakao, Motif Kakao Raja, Motif Kakao Biru, Motif Wiji Mukti ABSTRACTBatik Jember is synonymous with tobacco leaf motif. Tobacco leaf shape is quite weak in the visual appearance characterized as that motif emerges like a picture of leaves in general. Therefore, it is necessary to create a distinctive design motif extracted from other natural resources of Jember that have specific shapes and characteristics that can be obtained as the stronger motif identity. The typical natural resources from Jember are coffee and cocoa. The purpose of the creation of this art is to produce the unique, creative and innovative batik and have specific characteristics of Jember. The method used are data collection, observation of the object, reviewing inspiration sources
Robustness and backbone motif of a cancer network regulated by miR-17-92 cluster during the G1/S transition.

Directory of Open Access Journals (Sweden)

Lijian Yang

Full Text Available Based on interactions among transcription factors, oncogenes, tumor suppressors and microRNAs, a Boolean model of cancer network regulated by miR-17-92 cluster is constructed, and the network is associated with the control of G1/S transition in the mammalian cell cycle. The robustness properties of this regulatory network are investigated by virtue of the Boolean network theory. It is found that, during G1/S transition in the cell cycle process, the regulatory networks are robustly constructed, and the robustness property is largely preserved with respect to small perturbations to the network. By using the unique process-based approach, the structure of this network is analyzed. It is shown that the network can be decomposed into a backbone motif which provides the main biological functions, and a remaining motif which makes the regulatory system more stable. The critical role of miR-17-92 in suppressing the G1/S cell cycle checkpoint and increasing the uncontrolled proliferation of the cancer cells by targeting a genetic network of interacting proteins is displayed with our model.
Structure of the central RNA recognition motif of human TIA-1 at 1.95 A resolution

International Nuclear Information System (INIS)

Kumar, Amit O.; Swenson, Matthew C.; Benning, Matthew M.; Kielkopf, Clara L.

2008-01-01

T-cell-restricted intracellular antigen-1 (TIA-1) regulates alternative pre-mRNA splicing in the nucleus, and mRNA translation in the cytoplasm, by recognizing uridine-rich sequences of RNAs. As a step towards understanding RNA recognition by this regulatory factor, the X-ray structure of the central RNA recognition motif (RRM2) of human TIA-1 is presented at 1.95 A resolution. Comparison with structurally homologous RRM-RNA complexes identifies residues at the RNA interfaces that are conserved in TIA-1-RRM2. The versatile capability of RNP motifs to interact with either proteins or RNA is reinforced by symmetry-related protein-protein interactions mediated by the RNP motifs of TIA-1-RRM2. Importantly, the TIA-1-RRM2 structure reveals the locations of mutations responsible for inhibiting nuclear import. In contrast with previous assumptions, the mutated residues are buried within the hydrophobic interior of the domain, where they would be likely to destabilize the RRM fold rather than directly inhibit RNA binding
Finding cis-regulatory modules in Drosophila using phylogenetic hidden Markov models

DEFF Research Database (Denmark)

Wong, Wendy S W; Nielsen, Rasmus

2007-01-01

MOTIVATION: Finding the regulatory modules for transcription factors binding is an important step in elucidating the complex molecular mechanisms underlying regulation of gene expression. There are numerous methods available for solving this problem, however, very few of them take advantage of th...
Generic Properties of Random Gene Regulatory Networks.

Science.gov (United States)

Li, Zhiyuan; Bianco, Simone; Zhang, Zhaoyang; Tang, Chao

2013-12-01

Modeling gene regulatory networks (GRNs) is an important topic in systems biology. Although there has been much work focusing on various specific systems, the generic behavior of GRNs with continuous variables is still elusive. In particular, it is not clear typically how attractors partition among the three types of orbits: steady state, periodic and chaotic, and how the dynamical properties change with network's topological characteristics. In this work, we first investigated these questions in random GRNs with different network sizes, connectivity, fraction of inhibitory links and transcription regulation rules. Then we searched for the core motifs that govern the dynamic behavior of large GRNs. We show that the stability of a random GRN is typically governed by a few embedding motifs of small sizes, and therefore can in general be understood in the context of these short motifs. Our results provide insights for the study and design of genetic networks.
A novel gene network inference algorithm using predictive minimum description length approach.

Science.gov (United States)

Chaitankar, Vijender; Ghosh, Preetam; Perkins, Edward J; Gong, Ping; Deng, Youping; Zhang, Chaoyang

2010-05-28

Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is to determine the threshold which defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we proposed a new inference algorithm which incorporated mutual information (MI), conditional mutual information (CMI) and predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes and the PMDL principle method attempts to determine the best MI threshold without the need of a user-specified fine tuning parameter. The performance of the proposed algorithm was evaluated using both synthetic time series data sets and a biological time series data set for the yeast Saccharomyces cerevisiae. The benchmark quantities precision and recall were used as performance measures. The results show that the proposed algorithm produced less false edges and significantly improved the precision, as compared to the existing algorithm. For further analysis the performance of the algorithms was observed over different sizes of data. We have proposed a new algorithm that implements the PMDL principle for inferring gene regulatory networks from time series DNA microarray data that eliminates the need of a fine tuning parameter. The evaluation results obtained from both synthetic and actual biological data sets show that the
CombiMotif: A new algorithm for network motifs discovery in protein-protein interaction networks

Science.gov (United States)

Luo, Jiawei; Li, Guanghui; Song, Dan; Liang, Cheng

2014-12-01

Discovering motifs in protein-protein interaction networks is becoming a current major challenge in computational biology, since the distribution of the number of network motifs can reveal significant systemic differences among species. However, this task can be computationally expensive because of the involvement of graph isomorphic detection. In this paper, we present a new algorithm (CombiMotif) that incorporates combinatorial techniques to count non-induced occurrences of subgraph topologies in the form of trees. The efficiency of our algorithm is demonstrated by comparing the obtained results with the current state-of-the art subgraph counting algorithms. We also show major differences between unicellular and multicellular organisms. The datasets and source code of CombiMotif are freely available upon request.
MSDmotif: exploring protein sites and motifs

Directory of Open Access Journals (Sweden)

Henrick Kim

2008-07-01

Full Text Available Abstract Background Protein structures have conserved features – motifs, which have a sufficient influence on the protein function. These motifs can be found in sequence as well as in 3D space. Understanding of these fragments is essential for 3D structure prediction, modelling and drug-design. The Protein Data Bank (PDB is the source of this information however present search tools have limited 3D options to integrate protein sequence with its 3D structure. Results We describe here a web application for querying the PDB for ligands, binding sites, small 3D structural and sequence motifs and the underlying database. Novel algorithms for chemical fragments, 3D motifs, ϕ/ψ sequences, super-secondary structure motifs and for small 3D structural motif associations searches are incorporated. The interface provides functionality for visualization, search criteria creation, sequence and 3D multiple alignment options. MSDmotif is an integrated system where a results page is also a search form. A set of motif statistics is available for analysis. This set includes molecule and motif binding statistics, distribution of motif sequences, occurrence of an amino-acid within a motif, correlation of amino-acids side-chain charges within a motif and Ramachandran plots for each residue. The binding statistics are presented in association with properties that include a ligand fragment library. Access is also provided through the distributed Annotation System (DAS protocol. An additional entry point facilitates XML requests with XML responses. Conclusion MSDmotif is unique by combining chemical, sequence and 3D data in a single search engine with a range of search and visualisation options. It provides multiple views of data found in the PDB archive for exploring protein structures.
REDfly: a Regulatory Element Database for Drosophila.

Science.gov (United States)

Gallo, Steven M; Li, Long; Hu, Zihua; Halfon, Marc S

2006-02-01

Bioinformatics studies of transcriptional regulation in the metazoa are significantly hindered by the absence of readily available data on large numbers of transcriptional cis-regulatory modules (CRMs). Even the richly annotated Drosophila melanogaster genome lacks extensive CRM information. We therefore present here a database of Drosophila CRMs curated from the literature complete with both DNA sequence and a searchable description of the gene expression pattern regulated by each CRM. This resource should greatly facilitate the development of computational approaches to CRM discovery as well as bioinformatics analyses of regulatory sequence properties and evolution.
Impact of noise on molecular network inference.

Directory of Open Access Journals (Sweden)

Radhakrishnan Nagarajan

Full Text Available Molecular entities work in concert as a system and mediate phenotypic outcomes and disease states. There has been recent interest in modelling the associations between molecular entities from their observed expression profiles as networks using a battery of algorithms. These networks have proven to be useful abstractions of the underlying pathways and signalling mechanisms. Noise is ubiquitous in molecular data and can have a pronounced effect on the inferred network. Noise can be an outcome of several factors including: inherent stochastic mechanisms at the molecular level, variation in the abundance of molecules, heterogeneity, sensitivity of the biological assay or measurement artefacts prevalent especially in high-throughput settings. The present study investigates the impact of discrepancies in noise variance on pair-wise dependencies, conditional dependencies and constraint-based Bayesian network structure learning algorithms that incorporate conditional independence tests as a part of the learning process. Popular network motifs and fundamental connections, namely: (a common-effect, (b three-chain, and (c coherent type-I feed-forward loop (FFL are investigated. The choice of these elementary networks can be attributed to their prevalence across more complex networks. Analytical expressions elucidating the impact of discrepancies in noise variance on pairwise dependencies and conditional dependencies for special cases of these motifs are presented. Subsequently, the impact of noise on two popular constraint-based Bayesian network structure learning algorithms such as Grow-Shrink (GS and Incremental Association Markov Blanket (IAMB that implicitly incorporate tests for conditional independence is investigated. Finally, the impact of noise on networks inferred from publicly available single cell molecular expression profiles is investigated. While discrepancies in noise variance are overlooked in routine molecular network inference, the
Cis-Lunar Base Camp

Science.gov (United States)

Merrill, Raymond G.; Goodliff, Kandyce E.; Mazanek, Daniel D.; Reeves, John D., Jr.

2012-01-01

Historically, when mounting expeditions into uncharted territories, explorers have established strategically positioned base camps to pre-position required equipment and consumables. These base camps are secure, safe positions from which expeditions can depart when conditions are favorable, at which technology and operations can be tested and validated, and facilitate timely access to more robust facilities in the event of an emergency. For human exploration missions into deep space, cis-lunar space is well suited to serve as such a base camp. The outer regions of cis-lunar space, such as the Earth-Moon Lagrange points, lie near the edge of Earth s gravity well, allowing equipment and consumables to be aggregated with easy access to deep space and to the lunar surface, as well as more distant destinations, such as near-Earth Asteroids (NEAs) and Mars and its moons. Several approaches to utilizing a cis-lunar base camp for sustainable human exploration, as well as some possible future applications are identified. The primary objective of the analysis presented in this paper is to identify options, show the macro trends, and provide information that can be used as a basis for more detailed mission development. Compared within are the high-level performance and cost of 15 preliminary cis-lunar exploration campaigns that establish the capability to conduct crewed missions of up to one year in duration, and then aggregate mass in cis-lunar space to facilitate an expedition from Cis-Lunar Base Camp. Launch vehicles, chemical propulsion stages, and electric propulsion stages are discussed and parametric sizing values are used to create architectures of in-space transportation elements that extend the existing in-space supply chain to cis-lunar space. The transportation options to cis-lunar space assessed vary in efficiency by almost 50%; from 0.16 to 0.68 kg of cargo in cis-lunar space for every kilogram of mass in Low Earth Orbit (LEO). For the 15 cases, 5-year campaign
Characterization of the Promoter Region of an Arabidopsis Gene for 9-cis-Epoxycarotenoid Dioxygenase Involved in Dehydration-Inducible Transcription

Science.gov (United States)

Behnam, Babak; Iuchi, Satoshi; Fujita, Miki; Fujita, Yasunari; Takasaki, Hironori; Osakabe, Yuriko; Yamaguchi-Shinozaki, Kazuko; Kobayashi, Masatomo; Shinozaki, Kazuo

2013-01-01

Plants respond to dehydration stress and tolerate water-deficit status through complex physiological and cellular processes. Many genes are induced by water deficit. Abscisic acid (ABA) plays important roles in tolerance to dehydration stress by inducing many stress genes. ABA is synthesized de novo in response to dehydration. Most of the genes involved in ABA biosynthesis have been identified, and they are expressed mainly in leaf vascular tissues. Of the products of such genes, 9-cis-epoxycarotenoid dioxygenase (NCED) is a key enzyme in ABA biosynthesis. One of the five NCED genes in Arabidopsis, AtNCED3, is significantly induced by dehydration. To understand the regulatory mechanism of the early stages of the dehydration stress response, it is important to analyse the transcriptional regulatory systems of AtNCED3. In the present study, we found that an overlapping G-box recognition sequence (5′-CACGTG-3′) at −2248 bp from the transcriptional start site of AtNCED3 is an important cis-acting element in the induction of the dehydration response. We discuss the possible transcriptional regulatory system of dehydration-responsive AtNCED3 expression, and how this may control the level of ABA under water-deficit conditions. PMID:23604098
A Lettuce (Lactuca sativa) Homolog of Human Nogo-B Receptor Interacts with cis-Prenyltransferase and Is Necessary for Natural Rubber Biosynthesis*

Science.gov (United States)

Qu, Yang; Chakrabarty, Romit; Tran, Hue T.; Kwon, Eun-Joo G.; Kwon, Moonhyuk; Nguyen, Trinh-Don; Ro, Dae-Kyun

2015-01-01

Natural rubber (cis-1,4-polyisoprene) is an indispensable biopolymer used to manufacture diverse consumer products. Although a major source of natural rubber is the rubber tree (Hevea brasiliensis), lettuce (Lactuca sativa) is also known to synthesize natural rubber. Here, we report that an unusual cis-prenyltransferase-like 2 (CPTL2) that lacks the conserved motifs of conventional cis-prenyltransferase is required for natural rubber biosynthesis in lettuce. CPTL2, identified from the lettuce rubber particle proteome, displays homology to a human NogoB receptor and is predominantly expressed in latex. Multiple transgenic lettuces expressing CPTL2-RNAi constructs showed that a decrease of CPTL2 transcripts (3–15% CPTL2 expression relative to controls) coincided with the reduction of natural rubber as low as 5%. We also identified a conventional cis-prenyltransferase 3 (CPT3), exclusively expressed in latex. In subcellular localization studies using fluorescent proteins, cytosolic CPT3 was relocalized to endoplasmic reticulum by co-occurrence of CPTL2 in tobacco and yeast at the log phase. Furthermore, yeast two-hybrid data showed that CPTL2 and CPT3 interact. Yeast microsomes containing CPTL2/CPT3 showed enhanced synthesis of short cis-polyisoprenes, but natural rubber could not be synthesized in vitro. Intriguingly, a homologous pair CPTL1/CPT1, which displays ubiquitous expressions in lettuce, showed a potent dolichol biosynthetic activity in vitro. Taken together, our data suggest that CPTL2 is a scaffolding protein that tethers CPT3 on endoplasmic reticulum and is necessary for natural rubber biosynthesis in planta, but yeast-expressed CPTL2 and CPT3 alone could not synthesize high molecular weight natural rubber in vitro. PMID:25477521

Cis-trans photoisomerization of abscisic acid

International Nuclear Information System (INIS)

Brabham, D.E.; Biggs, R.H.

1981-01-01

An important regulator of numerous physiological processes in higher plants is abscisic acid (ABA), which is photoisomerized from the more biologically active cis isomer to the nearly inactive trans isomer by natural sunlight. It is possible that this photoisomerization is a UV control mechanism in functions regulated by ABA. The quantum yields of both the cis to trans and trans to cis photoisomerizations were measured under various conditions of pH and oxygen concentration at room temperature. The yield for photoisomerization of cis-ABA ranged from 0.25 at pH 3.0 to 0.11 at pH 7.0. Oxygen partially quenched the process. The quantum yield varied only slightly with wavelength. The quantum yield of photolysis of cis-ABA was reported for pH 3.0 as 0.06. This yield also varied slightly with wavelength and was relatively insensitive to oxygen. This relatively high yield explains the loss of potency of ABA during UV irradiation. Phosphorescence of cis- and trans-ABA was observed in methanol at 77 K. Onset of the emission was at 350 nm. The emission spectra were the same for both isomers. From these results a mechanism of UV action on plants based on the photoisomerization of the inactive trans-ABA to the biologically active cis isomer is proposed. (author)
RNA regulatory elements and polyadenylation in plants

Directory of Open Access Journals (Sweden)

Arthur G. Hunt

2012-01-01

Full Text Available Alternative poly(A site choice (also known as alternative polyadenylation, or APA has the potential to affect gene expression in qualitative and quantitative ways. Alternative polyadenylation may affect as many as 82% of all expressed genes in a plant. The consequences of APA include the generation of transcripts with differing 3’-UTRs (and thus differing potential regulatory potential and of transcripts with differing protein-coding potential. Genome-wide studies of possible APA suggest a linkage with pre-mRNA splicing, and indicate a coincidence of and perhaps cooperation between RNA regulatory elements that affect splicing efficiency and the recognition of novel intronic poly(A sites. These studies also raise the possibility of the existence of a novel class of polyadenylation-related cis elements that are distinct from the well-characterized plant polyadenylation signal. Many potential APA events, however, have not been associated with identifiable cis elements. The present state of the field reveals a broad scope of APA, and also numerous opportunities for research into mechanisms that govern both choice and regulation of poly(A sites in plants.
Piecing together cis-regulatory networks: insights from epigenomics studies in plants.

Science.gov (United States)

Huang, Shao-Shan C; Ecker, Joseph R

2018-05-01

5-Methylcytosine, a chemical modification of DNA, is a covalent modification found in the genomes of both plants and animals. Epigenetic inheritance of phenotypes mediated by DNA methylation is well established in plants. Most of the known mechanisms of establishing, maintaining and modifying DNA methylation have been worked out in the reference plant Arabidopsis thaliana. Major functions of DNA methylation in plants include regulation of gene expression and silencing of transposable elements (TEs) and repetitive sequences, both of which have parallels in mammalian biology, involve interaction with the transcriptional machinery, and may have profound effects on the regulatory networks in the cell. Methylome and transcriptome dynamics have been investigated in development and environmental responses in Arabidopsis and agriculturally and ecologically important plants, revealing the interdependent relationship among genomic context, methylation patterns, and expression of TE and protein coding genes. Analyses of methylome variation among plant natural populations and species have begun to quantify the extent of genetic control of methylome variation vs. true epimutation, and model the evolutionary forces driving methylome evolution in both short and long time scales. The ability of DNA methylation to positively or negatively modulate binding affinity of transcription factors (TFs) provides a natural link from genome sequence and methylation changes to transcription. Technologies that allow systematic determination of methylation sensitivities of TFs, in native genomic and methylation context without confounding factors such as histone modifications, will provide baseline datasets for building cell-type- and individual-specific regulatory networks that underlie the establishment and inheritance of complex traits. This article is categorized under: Laboratory Methods and Technologies > Genetic/Genomic Methods Biological Mechanisms > Regulatory Biology. © 2017 Wiley
Evaluating cis-2,6-Dimethylpiperidide (cis-DMP) as a Base Component in Lithium-Mediated Zincation Chemistry

Science.gov (United States)

Armstrong, David R; Garden, Jennifer A; Kennedy, Alan R; Leenhouts, Sarah M; Mulvey, Robert E; O'Keefe, Philip; O'Hara, Charles T; Steven, Alan

2013-01-01

Most recent advances in metallation chemistry have centred on the bulky secondary amide 2,2,6,6-tetramethylpiperidide (TMP) within mixed metal, often ate, compositions. However, the precursor amine TMP(H) is rather expensive so a cheaper substitute would be welcome. Thus this study was aimed towards developing cheaper non-TMP based mixed-metal bases and, as cis-2,6-dimethylpiperidide (cis-DMP) was chosen as the alternative amide, developing cis-DMP zincate chemistry which has received meagre attention compared to that of its methyl-rich counterpart TMP. A new lithium diethylzincate, [(TMEDA)LiZn(cis-DMP)Et2] (TMEDA=N,N,N′,N′-tetramethylethylenediamine) has been synthesised by co-complexation of Li(cis-DMP), Et2Zn and TMEDA, and characterised by NMR (including DOSY) spectroscopy and X-ray crystallography, which revealed a dinuclear contact ion pair arrangement. By using N,N-diisopropylbenzamide as a test aromatic substrate, the deprotonative reactivity of [(TMEDA)LiZn(cis-DMP)Et2] has been probed and contrasted with that of the known but previously uninvestigated di-tert-butylzincate, [(TMEDA)LiZn(cis-DMP)tBu2]. The former was found to be the superior base (for example, producing the ortho-deuteriated product in respective yields of 78 % and 48 % following D2O quenching of zincated benzamide intermediates). An 88 % yield of 2-iodo-N,N-diisopropylbenzamide was obtained on reaction of two equivalents of the diethylzincate with the benzamide followed by iodination. Comparisons are also drawn using 1,1,1,3,3,3-hexamethyldisilazide (HMDS), diisopropylamide and TMP as the amide component in the lithium amide, Et2Zn and TMEDA system. Under certain conditions, the cis-DMP base system was found to give improved results in comparison to HMDS and diisopropylamide (DA), and comparable results to a TMP system. Two novel complexes isolated from reactions of the di-tert-butylzincate and crystallographically characterised, namely the pre-metallation complex [{(iPr)2N(Ph)C=O}LiZn(cis
Helping Students Understand Gene Regulation with Online Tools: A Review of MEME and Melina II, Motif Discovery Tools for Active Learning in Biology

Directory of Open Access Journals (Sweden)

David Treves

2012-08-01

Full Text Available Review of: MEME and Melina II, which are two free and easy-to-use online motif discovery tools that can be employed to actively engage students in learning about gene regulatory elements.
cWords - systematic microRNA regulatory motif discovery from mRNA expression data

DEFF Research Database (Denmark)

Rasmussen, Simon Horskjær; Jacobsen, Anders; Krogh, Anders

2013-01-01

and statistical methods of cWords, resulting in at least a factor 100 speed gain over the previous implementation. On a benchmark dataset of 19 microRNA (miRNA) perturbation experiments cWords showed equal or better performance than two comparable methods, miReduce and Sylamer. We have developed rigorous motif...... that demonstrate comparable or better performance than other existing methods. Rich visualization of results promotes intuitive and efficient interpretation of data. cWords is available as a stand-alone Open Source program at Github https://github.com/simras/cWords webcite and as a web-service at: http...
Comparative evaluation of capillary electrophoresis and high-performance liquid chromatography for the separation of cis-cis, cis-trans, and trans-trans isomers of atracurium besylate.

Science.gov (United States)

de Moraes, M de L; Polakiewicz, B; Mattua, M F; Tavares, M F

1998-01-01

Atracurium besylate is a highly selective nondepolarizing neuromuscular blocking agent routinely used during anesthetic procedures. The commercial presentation of this drug is a mixture of positional isomers, cis-cis, cis-trans, and trans-trans. Reversed-phase high-performance liquid chromatography has been the technique of choice for the analysis of atracurium besylate formulations at the quality control laboratory of Núcleo de Desenvolvimento Cristália (São Paulo, Brazil), a local pharmaceutical company. HPLC analysis is usually conducted under gradient elution using acetonitrile/0.1 M phosphate buffer eluent mixture as mobile phase and an octadecyl silica (ODS)-packed column. The complete elution of the three isomers takes about 1 hr. In this work, an alternative capillary electrophoresis methodology was developed. The complete resolution of all three isomers was accomplished in about 13 min (+20 kV/72 cm, 211 nm direct detection) using a 60-mM phosphate buffer solution (pH 4) containing 20 mM beta-cyclodextrin and 4 M urea. The isomer ratio was found to be 59.1% cis-cis, 35.9% cis-trans, and 5.02% trans-trans (expected ratio: 59:35:6). Laudanosine, a major metabolite of atracurium besylate, was identified in two commercially available formulations, Tracur (Núcleo de Desenvolvimento Cristália) and Tracrium (Glaxo Wellcome, S.A., Rio de Janeiro, Brazil). Its concentration increases considerably during storage of the product, even if the product is stored at low temperatures.
Statistical tests to compare motif count exceptionalities

Directory of Open Access Journals (Sweden)

Vandewalle Vincent

2007-03-01

Full Text Available Abstract Background Finding over- or under-represented motifs in biological sequences is now a common task in genomics. Thanks to p-value calculation for motif counts, exceptional motifs are identified and represent candidate functional motifs. The present work addresses the related question of comparing the exceptionality of one motif in two different sequences. Just comparing the motif count p-values in each sequence is indeed not sufficient to decide if this motif is significantly more exceptional in one sequence compared to the other one. A statistical test is required. Results We develop and analyze two statistical tests, an exact binomial one and an asymptotic likelihood ratio test, to decide whether the exceptionality of a given motif is equivalent or significantly different in two sequences of interest. For that purpose, motif occurrences are modeled by Poisson processes, with a special care for overlapping motifs. Both tests can take the sequence compositions into account. As an illustration, we compare the octamer exceptionalities in the Escherichia coli K-12 backbone versus variable strain-specific loops. Conclusion The exact binomial test is particularly adapted for small counts. For large counts, we advise to use the likelihood ratio test which is asymptotic but strongly correlated with the exact binomial test and very simple to use.
[Regulatory effect and mechanism of RNA binding motif protein 38 on the expression of progesterone receptor in human breast cancer ZR-75-1 cells].

Science.gov (United States)

Lou, P P; Li, C L; Xia, T S; Shi, L; Wu, J; Zhou, X J; Wang, Y; Ding, Q

2016-06-23

To investigate the regulatory mechanism of RNA binding motif protein 38 (RNPC1) on the expression of progesterone receptor (PR) in breast cancer cell line ZR-75-1. Lentiviral vector was used to induce overexpression of RNPC1 in ZR-75-1 cells. qRT-PCR and Western blot were used to assess the regulatory effect of RNPC1 on PR expression. Actinomycin was used to detect the regulatory mechanism involved. Immunohistochemical (IHC) staining was used to determine the protein expression of RNPC1 and PR in 80 breast cancer tissues. IHC staining showed that the expression of RNPC1 was significantly higher in the PR positive breast cancer tissues than that in the PR negative breast cancer tissues (P<0.05). The qRT-PCR results showed that overexpression of RNPC1 in ZR-75-1 cells significantly upregulated the mRNA level of PR (1.764±0.028 vs. 1.001±0.037, P<0.01), whereas knockdown of RNPC1 did the opposite (0.579± 0.007 vs. 1.000±0.002, P<0.01). The Western blot results also showed that overexpression of RNPC1 up-regulated PR levels, while knockdown of RNPC1 resulted in down-regulation of PR levels in the ZR-75-1 cells.The actinomycin assay showed that overexpression of RNPC1 increased the mRNA stability of PR. The half-life of PR mRNA was increased from 4.0 h to 6.5 h. Knockdown of RNPC1 decreased the mRNA stability of PR and the half-life of PR transcript was decreased from 4.1 h to 3.0 h. RNPC1 plays a crucial role in regulating the expression of PR in breast cancer ZR-75-1 cells.
Fluorinated cyclohexanes: Synthesis of amine building blocks of the all-cis 2,3,5,6-tetrafluorocyclohexylamine motif

Directory of Open Access Journals (Sweden)

Tetiana Bykova

2017-04-01

Full Text Available This paper reports the synthesis of three amine stereoisomers 5a–c of the tetrafluorocyclohexyl ring system, as building blocks for discovery chemistry programmes. The synthesis starts from a Birch reduction of benzonitrile, followed by an in situ methyl iodide quench. The resultant 2,5-cyclohexadiene was progressed via double epoxidations and then hydrofluorination ring opening reactions. The resultant fluorohydrin moieties were then converted to different stereoisomers of the tetrafluorocyclohexyl ring system, and then reductive hydrogenation of the nitrile delivered three amine stereoisomers. It proved necessary to place a methyl group on the cyclohexane ring in order to stabilise the compound against subsequent HF elimination. The two all-cis tetrafluorocyclohexyl isomers 5a and 5b constitute facially polarized cyclohexane rings, with fluorines on the electronegative face and hydrogens on the electropositive face.
footprintDB: a database of transcription factors with annotated cis elements and binding interfaces.

Science.gov (United States)

Sebastian, Alvaro; Contreras-Moreira, Bruno

2014-01-15

Traditional and high-throughput techniques for determining transcription factor (TF) binding specificities are generating large volumes of data of uneven quality, which are scattered across individual databases. FootprintDB integrates some of the most comprehensive freely available libraries of curated DNA binding sites and systematically annotates the binding interfaces of the corresponding TFs. The first release contains 2422 unique TF sequences, 10 112 DNA binding sites and 3662 DNA motifs. A survey of the included data sources, organisms and TF families was performed together with proprietary database TRANSFAC, finding that footprintDB has a similar coverage of multicellular organisms, while also containing bacterial regulatory data. A search engine has been designed that drives the prediction of DNA motifs for input TFs, or conversely of TF sequences that might recognize input regulatory sequences, by comparison with database entries. Such predictions can also be extended to a single proteome chosen by the user, and results are ranked in terms of interface similarity. Benchmark experiments with bacterial, plant and human data were performed to measure the predictive power of footprintDB searches, which were able to correctly recover 10, 55 and 90% of the tested sequences, respectively. Correctly predicted TFs had a higher interface similarity than the average, confirming its diagnostic value. Web site implemented in PHP,Perl, MySQL and Apache. Freely available from http://floresta.eead.csic.es/footprintdb.
Identity and functions of CxxC-derived motifs.

Science.gov (United States)

Fomenko, Dmitri E; Gladyshev, Vadim N

2003-09-30

Two cysteines separated by two other residues (the CxxC motif) are employed by many redox proteins for formation, isomerization, and reduction of disulfide bonds and for other redox functions. The place of the C-terminal cysteine in this motif may be occupied by serine (the CxxS motif), modifying the functional repertoire of redox proteins. Here we found that the CxxC motif may also give rise to a motif, in which the C-terminal cysteine is replaced with threonine (the CxxT motif). Moreover, in contrast to a view that the N-terminal cysteine in the CxxC motif always serves as a nucleophilic attacking group, this residue could also be replaced with threonine (the TxxC motif), serine (the SxxC motif), or other residues. In each of these CxxC-derived motifs, the presence of a downstream alpha-helix was strongly favored. A search for conserved CxxC-derived motif/helix patterns in four complete genomes representing bacteria, archaea, and eukaryotes identified known redox proteins and suggested possible redox functions for several additional proteins. Catalytic sites in peroxiredoxins were major representatives of the TxxC motif, whereas those in glutathione peroxidases represented the CxxT motif. Structural assessments indicated that threonines in these enzymes could stabilize catalytic thiolates, suggesting revisions to previously proposed catalytic triads. Each of the CxxC-derived motifs was also observed in natural selenium-containing proteins, in which selenocysteine was present in place of a catalytic cysteine.
Motif decomposition of the phosphotyrosine proteome reveals a new N-terminal binding motif for SHIP2

DEFF Research Database (Denmark)

Miller, Martin Lee; Hanke, S.; Hinsby, A. M.

2008-01-01

set of 481 unique phosphotyrosine (Tyr(P)) peptides by sequence similarity to known ligands of the Src homology 2 (SH2) and the phosphotyrosine binding (PTB) domains. From 20 clusters we extracted 16 known and four new interaction motifs. Using quantitative mass spectrometry we pulled down Tyr......(P)-specific binding partners for peptides corresponding to the extracted motifs. We confirmed numerous previously known interaction motifs and found 15 new interactions mediated by phosphosites not previously known to bind SH2 or PTB. Remarkably, a novel hydrophobic N-terminal motif ((L/V/I)(L/V/I)pY) was identified...
A role for circadian evening elements in cold-regulated gene expression in Arabidopsis.

Science.gov (United States)

Mikkelsen, Michael D; Thomashow, Michael F

2009-10-01

The plant transcriptome is dramatically altered in response to low temperature. The cis-acting DNA regulatory elements and trans-acting factors that regulate the majority of cold-regulated genes are unknown. Previous bioinformatic analysis has indicated that the promoters of cold-induced genes are enriched in the Evening Element (EE), AAAATATCT, a DNA regulatory element that has a role in circadian-regulated gene expression. Here we tested the role of EE and EE-like (EEL) elements in cold-induced expression of two Arabidopsis genes, CONSTANS-like 1 (COL1; At5g54470) and a gene encoding a 27-kDa protein of unknown function that we designated COLD-REGULATED GENE 27 (COR27; At5g42900). Mutational analysis indicated that the EE/EEL elements were required for cold induction of COL1 and COR27, and that their action was amplified through coupling with ABA response element (ABRE)-like (ABREL) motifs. An artificial promoter consisting solely of four EE motifs interspersed with three ABREL motifs was sufficient to impart cold-induced gene expression. Both COL1 and COR27 were found to be regulated by the circadian clock at warm growth temperatures and cold-induction of COR27 was gated by the clock. These results suggest that cold- and clock-regulated gene expression are integrated through regulatory proteins that bind to EE and EEL elements supported by transcription factors acting at ABREL sequences. Bioinformatic analysis indicated that the coupling of EE and EEL motifs with ABREL motifs is highly enriched in cold-induced genes and thus may constitute a DNA regulatory element pair with a significant role in configuring the low-temperature transcriptome.
Interaction between two cis-acting elements, ABRE and DRE, in ABA-dependent expression of Arabidopsis rd29A gene in response to dehydration and high-salinity stresses.

Science.gov (United States)

Narusaka, Yoshihiro; Nakashima, Kazuo; Shinwari, Zabta K; Sakuma, Yoh; Furihata, Takashi; Abe, Hiroshi; Narusaka, Mari; Shinozaki, Kazuo; Yamaguchi-Shinozaki, Kazuko

2003-04-01

Many abiotic stress-inducible genes contain two cis-acting elements, namely a dehydration-responsive element (DRE; TACCGACAT) and an ABA-responsive element (ABRE; ACGTGG/TC), in their promoter regions. We precisely analyzed the 120 bp promoter region (-174 to -55) of the Arabidopsis rd29A gene whose expression is induced by dehydration, high-salinity, low-temperature, and abscisic acid (ABA) treatments and whose 120 bp promoter region contains the DRE, DRE/CRT-core motif (A/GCCGAC), and ABRE sequences. Deletion and base substitution analyses of this region showed that the DRE-core motif functions as DRE and that the DRE/DRE-core motif could be a coupling element of ABRE. Gel mobility shift assays revealed that DRE-binding proteins (DREB1s/CBFs and DREB2s) bind to both DRE and the DRE-core motif and that ABRE-binding proteins (AREBs/ABFs) bind to ABRE in the 120 bp promoter region. In addition, transactivation experiments using Arabidopsis leaf protoplasts showed that DREBs and AREBs cumulatively transactivate the expression of a GUS reporter gene fused to the 120 bp promoter region of rd29A. These results indicate that DRE and ABRE are interdependent in the ABA-responsive expression of the rd29A gene in response to ABA in Arabidopsis.
10 points about buying C.I.S

International Nuclear Information System (INIS)

Anon.

1993-01-01

On October 16, 1992, the U.S. Department of Commerce (DOC) settled the antidumping case against the CIS republics by imposing price and volume quotas on CIS uranium imported into the United States. Bound by a suspension agreement, each of the six uranium-producing CIS republics is responsible for restricting the flow of imports to the US-either directly or indirectly. (As the NUKEM Market Report went to press, the Ukraine government notified the DOC of its intent not to terminate the suspension agreement.) This action is to prevent undercutting price levels in the US domestic uranium markets. What follows are ten points about everything you should know about importing uranium from the uranium-producing CIS republics- Kazakhstan, Kyrgyzstan, Russian Federation, Tajikistan, Ukraine and Uzbekistan. Newcomers to the CIS scene should follow this simple roadmap and be aware of the issues they face as importers in terms of Commerce/Customs requirements and documentation and where to get them, when to buy the material and how to transport it, how to deal effectively with CIS exporters, and how to avoid unnecessary complications when buying CIS
[Personal motif in art].

Science.gov (United States)

Gerevich, József

2015-01-01

One of the basic questions of the art psychology is whether a personal motif is to be found behind works of art and if so, how openly or indirectly it appears in the work itself. Analysis of examples and documents from the fine arts and literature allow us to conclude that the personal motif that can be identified by the viewer through symbols, at times easily at others with more difficulty, gives an emotional plus to the artistic product. The personal motif may be found in traumatic experiences, in communication to the model or with other emotionally important persons (mourning, disappointment, revenge, hatred, rivalry, revolt etc.), in self-searching, or self-analysis. The emotions are expressed in artistic activity either directly or indirectly. The intention nourished by the artist's identity (Kunstwollen) may stand in the way of spontaneous self-expression, channelling it into hidden paths. Under the influence of certain circumstances, the artist may arouse in the viewer, consciously or unconsciously, an illusionary, misleading image of himself. An examination of the personal motif is one of the important research areas of art therapy.
Temporal motifs in time-dependent networks

International Nuclear Information System (INIS)

Kovanen, Lauri; Karsai, Márton; Kaski, Kimmo; Kertész, János; Saramäki, Jari

2011-01-01

Temporal networks are commonly used to represent systems where connections between elements are active only for restricted periods of time, such as telecommunication, neural signal processing, biochemical reaction and human social interaction networks. We introduce the framework of temporal motifs to study the mesoscale topological–temporal structure of temporal networks in which the events of nodes do not overlap in time. Temporal motifs are classes of similar event sequences, where the similarity refers not only to topology but also to the temporal order of the events. We provide a mapping from event sequences to coloured directed graphs that enables an efficient algorithm for identifying temporal motifs. We discuss some aspects of temporal motifs, including causality and null models, and present basic statistics of temporal motifs in a large mobile call network
Estimation and inference in the same-different test

DEFF Research Database (Denmark)

Christensen, Rune Haubo Bojesen; Brockhoff, Per B.

2009-01-01

as well as similarity. We show that the likelihood root statistic is equivalent to the well known G(2) likelihood ratio statistic for tests of no difference. As an additional practical tool, we introduce the profile likelihood curve to provide a convenient graphical summary of the information in the data......Inference for the Thurstonian delta in the same-different protocol via the well known Wald statistic is shown to be inappropriate in a wide range of situations. We introduce the likelihood root statistic as an alternative to the Wald statistic to produce CIs and p-values for assessing difference...
A lettuce (Lactuca sativa) homolog of human Nogo-B receptor interacts with cis-prenyltransferase and is necessary for natural rubber biosynthesis.

Science.gov (United States)

Qu, Yang; Chakrabarty, Romit; Tran, Hue T; Kwon, Eun-Joo G; Kwon, Moonhyuk; Nguyen, Trinh-Don; Ro, Dae-Kyun

2015-01-23

Natural rubber (cis-1,4-polyisoprene) is an indispensable biopolymer used to manufacture diverse consumer products. Although a major source of natural rubber is the rubber tree (Hevea brasiliensis), lettuce (Lactuca sativa) is also known to synthesize natural rubber. Here, we report that an unusual cis-prenyltransferase-like 2 (CPTL2) that lacks the conserved motifs of conventional cis-prenyltransferase is required for natural rubber biosynthesis in lettuce. CPTL2, identified from the lettuce rubber particle proteome, displays homology to a human NogoB receptor and is predominantly expressed in latex. Multiple transgenic lettuces expressing CPTL2-RNAi constructs showed that a decrease of CPTL2 transcripts (3-15% CPTL2 expression relative to controls) coincided with the reduction of natural rubber as low as 5%. We also identified a conventional cis-prenyltransferase 3 (CPT3), exclusively expressed in latex. In subcellular localization studies using fluorescent proteins, cytosolic CPT3 was relocalized to endoplasmic reticulum by co-occurrence of CPTL2 in tobacco and yeast at the log phase. Furthermore, yeast two-hybrid data showed that CPTL2 and CPT3 interact. Yeast microsomes containing CPTL2/CPT3 showed enhanced synthesis of short cis-polyisoprenes, but natural rubber could not be synthesized in vitro. Intriguingly, a homologous pair CPTL1/CPT1, which displays ubiquitous expressions in lettuce, showed a potent dolichol biosynthetic activity in vitro. Taken together, our data suggest that CPTL2 is a scaffolding protein that tethers CPT3 on endoplasmic reticulum and is necessary for natural rubber biosynthesis in planta, but yeast-expressed CPTL2 and CPT3 alone could not synthesize high molecular weight natural rubber in vitro. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

cis-Bifenthrin enantioselectively induces hepatic oxidative stress in mice.

Science.gov (United States)

Jin, Yuanxiang; Wang, Jiangcong; Pan, Xiuhong; Wang, Linggang; Fu, Zhengwei

2013-09-01

Bifenthrin (BF), as a chiral synthetic pyrethroid, is widely used to control field and household pests. In China, the commercial cis-BF contained two enantiomers including 1R-cis-BF and 1S-cis-BF. However, the difference in oxidative stress induced by the two enantiomers in mice still remains unclear. In the present study, 4 week-old adolescent male ICR mice were orally administered cis-BF, 1R-cis-BF or 1S-cis-BF daily for 2, 4 and 6 weeks at doses of 5 mg/kg/day, respectively. We found that the hepatic reactive oxygen species (ROS) levels, as well as the malondialdehyde (MDA) and glutathione (GSH) content both in the serum and liver increased significantly in the 4 or 6 weeks 1S-cis-BF treated groups. The activities of superoxide dismutase (SOD) and catalase (CAT) also changed significantly in the serum and liver of 1S-cis-BF treated mice. More importantly, the significant differences in MDA content and CAT activity both in the serum and liver, and the activities of total antioxidant capacity (T-AOC) and SOD in serum were also observed between the 1S-cis-BF and 1R-cis-BF treated groups. Moreover, the transcription of oxidative stress response related genes including Sod1, Cat and heme oxygenase-1(Ho-1) in the liver of 1S-cis-BF treated groups were also significant higher than those in 1R-cis-BF treated group. Thus, it was concluded that cis-BF induced hepatic oxidative stress in an enantiomer specific manner in mice when exposed during the puberty, and that 1S-cis-BF showed much more toxic in hepatic oxidative stress than 1R-cis-BF. Copyright © 2013 Elsevier Inc. All rights reserved.
CisLunar Habitat Internal Architecture Design Criteria

Science.gov (United States)

Jones, R.; Kennedy, K.; Howard, R.; Whitmore, M.; Martin, C.; Garate, J.

2017-01-01

BACKGROUND: In preparation for human exploration to Mars, there is a need to define the development and test program that will validate deep space operations and systems. In that context, a Proving Grounds CisLunar habitat spacecraft is being defined as the next step towards this goal. This spacecraft will operate differently from the ISS or other spacecraft in human history. The performance envelope of this spacecraft (mass, volume, power, specifications, etc.) is being defined by the Future Capabilities Study Team. This team has recognized the need for a human-centered approach for the internal architecture of this spacecraft and has commissioned a CisLunar Phase-1 Habitat Internal Architecture Study Team to develop a NASA reference configuration, providing the Agency with a "smart buyer" approach for future acquisition. THE CISLUNAR HABITAT INTERNAL ARCHITECTURE STUDY: Overall, the CisLunar Habitat Internal Architecture study will address the most significant questions and risks in the current CisLunar architecture, habitation, and operations concept development. This effort is achieved through definition of design criteria, evaluation criteria and process, design of the CisLunar Habitat Phase-1 internal architecture, and the development and fabrication of internal architecture concepts combined with rigorous and methodical Human-in-the-Loop (HITL) evaluations and testing of the conceptual innovations in a controlled test environment. The vision of the CisLunar Habitat Internal Architecture Study is to design, build, and test a CisLunar Phase-1 Habitat Internal Architecture that will be used for habitation (e.g. habitability and human factors) evaluations. The evaluations will mature CisLunar habitat evaluation tools, guidelines, and standards, and will interface with other projects such as the Advanced Exploration Systems (AES) Program integrated Power, Avionics, Software (iPAS), and Logistics for integrated human-in-the-loop testing. The mission of the Cis
Genome-wide identification and quantification of cis- and trans-regulated genes responding to Marek’s disease virus infection via analysis of allele-specific expression

Directory of Open Access Journals (Sweden)

Sean eMaceachern

2012-01-01

Full Text Available Marek’s disease (MD is a commercially important neoplastic disease of chickens caused by Marek’s disease virus (MDV, an oncogenic alphaherpesvirus. Selecting for increased genetic resistance to MD is a control strategy that can augment vaccinal control measures. To identify high-confidence candidate MD resistance genes, we conducted a genome-wide screen for allele-specific expression (ASE amongst F1 progeny of two inbred chicken lines that differ in MD resistance. High throughput sequencing was used to profile transcriptomes from pools of uninfected and infected individuals at 4 days post-infection to identify any genes showing ASE in response to MDV infection. RNA sequencing identified 22,655 single nucleotide polymorphisms (SNPs of which 5,360 in 3,773 genes exhibited significant allelic imbalance. Illumina GoldenGate assays were subsequently used to quantify regulatory variation controlled at the gene (cis and elsewhere in the genome (trans by examining differences in expression between F1 individuals and artificial F1 RNA pools over 6 time periods in 1,536 of the most significant SNPs identified by RNA sequencing. Allelic imbalance as a result of cis-regulatory changes was confirmed in 861 of the 1,233 GoldenGate assays successfully examined. Furthermore we have identified 7 genes that display trans-regulation only in infected animals and approximately 500 SNP that show a complex interaction between cis- and trans-regulatory changes. Our results indicate ASE analyses are a powerful approach to identify regulatory variation responsible for differences in transcript abundance in genes underlying complex traits. And the genes with SNPs exhibiting ASE provide a strong foundation to further investigate the causative polymorphisms and genetic mechanisms for MD resistance. Finally, the methods used here for identifying specific genes and SNPs may have practical implications for applying marker-assisted selection to complex traits that are
OKVAR-Boost: a novel boosting algorithm to infer nonlinear dynamics and interactions in gene regulatory networks.

Science.gov (United States)

Lim, Néhémy; Senbabaoglu, Yasin; Michailidis, George; d'Alché-Buc, Florence

2013-06-01

Reverse engineering of gene regulatory networks remains a central challenge in computational systems biology, despite recent advances facilitated by benchmark in silico challenges that have aided in calibrating their performance. A number of approaches using either perturbation (knock-out) or wild-type time-series data have appeared in the literature addressing this problem, with the latter using linear temporal models. Nonlinear dynamical models are particularly appropriate for this inference task, given the generation mechanism of the time-series data. In this study, we introduce a novel nonlinear autoregressive model based on operator-valued kernels that simultaneously learns the model parameters, as well as the network structure. A flexible boosting algorithm (OKVAR-Boost) that shares features from L2-boosting and randomization-based algorithms is developed to perform the tasks of parameter learning and network inference for the proposed model. Specifically, at each boosting iteration, a regularized Operator-valued Kernel-based Vector AutoRegressive model (OKVAR) is trained on a random subnetwork. The final model consists of an ensemble of such models. The empirical estimation of the ensemble model's Jacobian matrix provides an estimation of the network structure. The performance of the proposed algorithm is first evaluated on a number of benchmark datasets from the DREAM3 challenge and then on real datasets related to the In vivo Reverse-Engineering and Modeling Assessment (IRMA) and T-cell networks. The high-quality results obtained strongly indicate that it outperforms existing approaches. The OKVAR-Boost Matlab code is available as the archive: http://amis-group.fr/sourcecode-okvar-boost/OKVARBoost-v1.0.zip. Supplementary data are available at Bioinformatics online.
Discovering regulatory motifs in the Plasmodium genome using comparative genomics

OpenAIRE

Wu, Jie; Sieglaff, Douglas H.; Gervin, Joshua; Xie, Xiaohui S.

2008-01-01

Motivation: Understanding gene regulation in Plasmodium, the causative agent of malaria, is an important step in deciphering its complex life cycle as well as leading to possible new targets for therapeutic applications. Very little is known about gene regulation in Plasmodium, and in particular, few regulatory elements have been identified. Such discovery has been significantly hampered by the high A-T content of some of the genomes of Plasmodium species, as well as the challenge in associat...
Transcription regulatory networks analysis using CAGE

KAUST Repository

Tegnér, Jesper N.

2009-10-01

Mapping out cellular networks in general and transcriptional networks in particular has proved to be a bottle-neck hampering our understanding of biological processes. Integrative approaches fusing computational and experimental technologies for decoding transcriptional networks at a high level of resolution is therefore of uttermost importance. Yet, this is challenging since the control of gene expression in eukaryotes is a complex multi-level process influenced by several epigenetic factors and the fine interplay between regulatory proteins and the promoter structure governing the combinatorial regulation of gene expression. In this chapter we review how the CAGE data can be integrated with other measurements such as expression, physical interactions and computational prediction of regulatory motifs, which together can provide a genome-wide picture of eukaryotic transcriptional regulatory networks at a new level of resolution. © 2010 by Pan Stanford Publishing Pte. Ltd. All rights reserved.
oPOSSUM: integrated tools for analysis of regulatory motif over-representation

Science.gov (United States)

Ho Sui, Shannan J.; Fulton, Debra L.; Arenillas, David J.; Kwon, Andrew T.; Wasserman, Wyeth W.

2007-01-01

The identification of over-represented transcription factor binding sites from sets of co-expressed genes provides insights into the mechanisms of regulation for diverse biological contexts. oPOSSUM, an internet-based system for such studies of regulation, has been improved and expanded in this new release. New features include a worm-specific version for investigating binding sites conserved between Caenorhabditis elegans and C. briggsae, as well as a yeast-specific version for the analysis of co-expressed sets of Saccharomyces cerevisiae genes. The human and mouse applications feature improvements in ortholog mapping, sequence alignments and the delineation of multiple alternative promoters. oPOSSUM2, introduced for the analysis of over-represented combinations of motifs in human and mouse genes, has been integrated with the original oPOSSUM system. Analysis using user-defined background gene sets is now supported. The transcription factor binding site models have been updated to include new profiles from the JASPAR database. oPOSSUM is available at http://www.cisreg.ca/oPOSSUM/ PMID:17576675
The regulatory G4 motif of the Kirsten ras (KRAS) gene is sensitive to guanine oxidation

DEFF Research Database (Denmark)

Cogoi, Susanna; Ferino, Annalisa; Miglietta, Giulia

2018-01-01

KRAS is one of the most mutated genes in human cancer. It is controlled by a G4 motif located upstream of the transcription start site. In this paper, we demonstrate that 8-oxoguanine (8-oxoG), being more abundant in G4 than in non-G4 regions, is a new player in the regulation of this oncogene. W...
UKIRAN KERAWANG ACEH GAYO SEBAGAI INSPIRASI PENCIPTAAN MOTIF BATIK KHAS GAYO

Directory of Open Access Journals (Sweden)

Irfa ina Rohana Salma

2016-12-01

Full Text Available ABSTRAK Industri batik mulai berkembang di Gayo, tetapi belum memiliki motif batik khas daerah. Oleh karena itu perlu diciptakan motif batik khas Gayo, dengan mengambil inspirasi dari ukiran yang terdapat pada rumah tradisional yang biasa disebut ukiran kerawang Gayo. Tujuan penciptaan seni ini adalah untuk menciptakan motif batik yang memiliki ciri khas Gayo. Metode yang digunakan yaitu eksplorasi ide, perancangan, dan perwujudan menjadi motif batik. Dalam kegiatan ini telah diciptakan enam motif batik khas Gayo yaitu: (1 Motif Ceplok Gayo; (2 Motif Gayo Tegak; (3 Motif Gayo Lurus; (4 Motif Parang Gayo; (5 Motif Gayo Lembut; dan (6 Motif Geometris Gayo. Hasil uji kesukaan terhadap motif kepada lima puluh responden menunjukkan bahwa Motif Ceplok Gayo paling banyak dipilih oleh responden yaitu sebesar 19%, sedangkan Motif Parang Gayo 18%, Motif Gayo Lembut 17%, Motif Geometris Gayo 17%, Motif Gayo Lurus 15% dan Motif Gayo Tegak 14%. Rata-rata motif yang dihasilkan mendapatkan apresiasi yang baik dari responden, sehingga semua motif layak diproduksi sebagai batik khas Gayo.Kata kunci: batik Gayo, Motif Ceplok Gayo, Motif Parang Gayo.ABSTRACTBatik industry began to develop in Gayo, but have not had a typical batik motif itself. Therefore, it is necessary to create batik motifs of Gayo, by taking inspiration from the carvings found in traditional houses commonly called kerawang Gayo. The purpose of this art is to create motifs those have a Gayo characteristic. The method used are the idea exploration, design, and motifs embodiment. In this activity has created six Gayo batik motifs, namely: (1 Motif Ceplok Gayo; (2 Motif Gayo Tegak; (3 Motif GayoLurus; (4 Motif Parang Gayo; (5 Motif Gayo Lembut; dan (6 Motif Geometris Gayo. The test results fondness of the motives to fifty respondents indicated that the Motif Ceplok Gayo most preferred by respondents ie 19%, while Motif Parang Gayo 18%, Motif Gayo Lembut 17%, Motif Geometris Gayo 17%, Motif Gayo
Dilemmas of compliance in the CIS

International Nuclear Information System (INIS)

Vorobyev, A.

1996-01-01

The objective of this paper is to examine some of the difficulties faced by Russia and other Common Independent States (CIS) in the field of compliance with disarmament treaties and non-proliferation regimes, as well as ways and means, particularly with regard to the legal framework, designed to overcome these difficulties. Naturally, the fate and pace of overcoming the existing problems will depend only partially on development of CIS States. A large variety of international factors and the general security will be essential for progress in resolving disarmament and arms control issues in the CIS
A robust approach to identifying tissue-specific gene expression regulatory variants using personalized human induced pluripotent stem cells.

Directory of Open Access Journals (Sweden)

Je-Hyuk Lee

2009-11-01

Full Text Available Normal variation in gene expression due to regulatory polymorphisms is often masked by biological and experimental noise. In addition, some regulatory polymorphisms may become apparent only in specific tissues. We derived human induced pluripotent stem (iPS cells from adult skin primary fibroblasts and attempted to detect tissue-specific cis-regulatory variants using in vitro cell differentiation. We used padlock probes and high-throughput sequencing for digital RNA allelotyping and measured allele-specific gene expression in primary fibroblasts, lymphoblastoid cells, iPS cells, and their differentiated derivatives. We show that allele-specific expression is both cell type and genotype-dependent, but the majority of detectable allele-specific expression loci remains consistent despite large changes in the cell type or the experimental condition following iPS reprogramming, except on the X-chromosome. We show that our approach to mapping cis-regulatory variants reduces in vitro experimental noise and reveals additional tissue-specific variants using skin-derived human iPS cells.
Role of the Box C/D Motif in Localization of Small Nucleolar RNAs to Coiled Bodies and Nucleoli

Science.gov (United States)

Narayanan, Aarthi; Speckmann, Wayne; Terns, Rebecca; Terns, Michael P.

1999-01-01

Small nucleolar RNAs (snoRNAs) are a large family of eukaryotic RNAs that function within the nucleolus in the biogenesis of ribosomes. One major class of snoRNAs is the box C/D snoRNAs named for their conserved box C and box D sequence elements. We have investigated the involvement of cis-acting sequences and intranuclear structures in the localization of box C/D snoRNAs to the nucleolus by assaying the intranuclear distribution of fluorescently labeled U3, U8, and U14 snoRNAs injected into Xenopus oocyte nuclei. Analysis of an extensive panel of U3 RNA variants showed that the box C/D motif, comprised of box C′, box D, and the 3′ terminal stem of U3, is necessary and sufficient for the nucleolar localization of U3 snoRNA. Disruption of the elements of the box C/D motif of U8 and U14 snoRNAs also prevented nucleolar localization, indicating that all box C/D snoRNAs use a common nucleolar-targeting mechanism. Finally, we found that wild-type box C/D snoRNAs transiently associate with coiled bodies before they localize to nucleoli and that variant RNAs that lack an intact box C/D motif are detained within coiled bodies. These results suggest that coiled bodies play a role in the biogenesis and/or intranuclear transport of box C/D snoRNAs. PMID:10397754
Up front in the CIS

International Nuclear Information System (INIS)

Grey, C.A.

1994-01-01

A picture is drawn of the current supply side of the front-end fuel cycle production capacities in the CIS. Uranium production has been steadily declining, as in the West. Market realities have been reflected in local costs of production since the break-up of the former Soviet Union and some uneconomic mines have been closed. In terms of actual production, Kazakhstan, Russia and Uzbekistan, remain among the top five uranium producers in the world. Western government action has been taken to restrict the market access for natural uranium from the CIS. Reactors in the CIS continue to be supplied with fabricated fuel solely by Russian, though Western fuel fabricators have reduced Russian supplies to Eastern Europe. Russia's current dominance in conversion and enrichment services in both the CIS and Eastern Europe is likely to continue as long as the present surplus low enriched uranium stocks last and surplus production capacity exists. Market penetration in the West has been limited by government action but Russia in 1993 still held about 20% of the world's conversion market and nearly 19% of the enrichment market. (6 figures, 2 tables, 4 references) (UK)
Learning Probabilistic Inference through Spike-Timing-Dependent Plasticity.

Science.gov (United States)

Pecevski, Dejan; Maass, Wolfgang

2016-01-01

Numerous experimental data show that the brain is able to extract information from complex, uncertain, and often ambiguous experiences. Furthermore, it can use such learnt information for decision making through probabilistic inference. Several models have been proposed that aim at explaining how probabilistic inference could be performed by networks of neurons in the brain. We propose here a model that can also explain how such neural network could acquire the necessary information for that from examples. We show that spike-timing-dependent plasticity in combination with intrinsic plasticity generates in ensembles of pyramidal cells with lateral inhibition a fundamental building block for that: probabilistic associations between neurons that represent through their firing current values of random variables. Furthermore, by combining such adaptive network motifs in a recursive manner the resulting network is enabled to extract statistical information from complex input streams, and to build an internal model for the distribution p (*) that generates the examples it receives. This holds even if p (*) contains higher-order moments. The analysis of this learning process is supported by a rigorous theoretical foundation. Furthermore, we show that the network can use the learnt internal model immediately for prediction, decision making, and other types of probabilistic inference.
MHC motif viewer

DEFF Research Database (Denmark)

Rapin, Nicolas Philippe Jean-Pierre; Hoof, Ilka; Lund, Ole

2008-01-01

. Algorithms that predict which peptides MHC molecules bind have recently been developed and cover many different alleles, but the utility of these algorithms is hampered by the lack of tools for browsing and comparing the specificity of these molecules. We have, therefore, developed a web server, MHC motif....... A special viewing feature, MHC fight, allows for display of the specificity of two different MHC molecules side by side. We show how the web server can be used to discover and display surprising similarities as well as differences between MHC molecules within and between different species. The MHC motif...
Genomic Features That Predict Allelic Imbalance in Humans Suggest Patterns of Constraint on Gene Expression Variation

Science.gov (United States)

Fédrigo, Olivier; Haygood, Ralph; Mukherjee, Sayan; Wray, Gregory A.

2009-01-01

Variation in gene expression is an important contributor to phenotypic diversity within and between species. Although this variation often has a genetic component, identification of the genetic variants driving this relationship remains challenging. In particular, measurements of gene expression usually do not reveal whether the genetic basis for any observed variation lies in cis or in trans to the gene, a distinction that has direct relevance to the physical location of the underlying genetic variant, and which may also impact its evolutionary trajectory. Allelic imbalance measurements identify cis-acting genetic effects by assaying the relative contribution of the two alleles of a cis-regulatory region to gene expression within individuals. Identification of patterns that predict commonly imbalanced genes could therefore serve as a useful tool and also shed light on the evolution of cis-regulatory variation itself. Here, we show that sequence motifs, polymorphism levels, and divergence levels around a gene can be used to predict commonly imbalanced genes in a human data set. Reduction of this feature set to four factors revealed that only one factor significantly differentiated between commonly imbalanced and nonimbalanced genes. We demonstrate that these results are consistent between the original data set and a second published data set in humans obtained using different technical and statistical methods. Finally, we show that variation in the single allelic imbalance-associated factor is partially explained by the density of genes in the region of a target gene (allelic imbalance is less probable for genes in gene-dense regions), and, to a lesser extent, the evenness of expression of the gene across tissues and the magnitude of negative selection on putative regulatory regions of the gene. These results suggest that the genomic distribution of functional cis-regulatory variants in the human genome is nonrandom, perhaps due to local differences in evolutionary
Motif discovery in ranked lists of sequences

DEFF Research Database (Denmark)

Nielsen, Morten Muhlig; Tataru, Paula; Madsen, Tobias

2016-01-01

Motif analysis has long been an important method to characterize biological functionality and the current growth of sequencing-based genomics experiments further extends its potential. These diverse experiments often generate sequence lists ranked by some functional property. There is therefore...... advantage of the regular expression feature, including enrichments for combinations of different microRNA seed sites. The method is implemented and made publicly available as an R package and supports high parallelization on multi-core machinery....... a growing need for motif analysis methods that can exploit this coupled data structure and be tailored for specific biological questions. Here, we present an exploratory motif analysis tool, Regmex (REGular expression Motif EXplorer), which offers several methods to evaluate the correlation of motifs...
Deciphering functional glycosaminoglycan motifs in development.

Science.gov (United States)

Townley, Robert A; Bülow, Hannes E

2018-03-23

Glycosaminoglycans (GAGs) such as heparan sulfate, chondroitin/dermatan sulfate, and keratan sulfate are linear glycans, which when attached to protein backbones form proteoglycans. GAGs are essential components of the extracellular space in metazoans. Extensive modifications of the glycans such as sulfation, deacetylation and epimerization create structural GAG motifs. These motifs regulate protein-protein interactions and are thereby repsonsible for many of the essential functions of GAGs. This review focusses on recent genetic approaches to characterize GAG motifs and their function in defined signaling pathways during development. We discuss a coding approach for GAGs that would enable computational analyses of GAG sequences such as alignments and the computation of position weight matrices to describe GAG motifs. Copyright © 2018 Elsevier Ltd. All rights reserved.
Fitness for synchronization of network motifs

DEFF Research Database (Denmark)

Vega, Y.M.; Vázquez-Prada, M.; Pacheco, A.F.

2004-01-01

We study the synchronization of Kuramoto's oscillators in small parts of networks known as motifs. We first report on the system dynamics for the case of a scale-free network and show the existence of a non-trivial critical point. We compute the probability that network motifs synchronize, and fi...... that the fitness for synchronization correlates well with motifs interconnectedness and structural complexity. Possible implications for present debates about network evolution in biological and other systems are discussed....
Characterization of noncoding regulatory DNA in the human genome.

Science.gov (United States)

Elkon, Ran; Agami, Reuven

2017-08-08

Genetic variants associated with common diseases are usually located in noncoding parts of the human genome. Delineation of the full repertoire of functional noncoding elements, together with efficient methods for probing their biological roles, is therefore of crucial importance. Over the past decade, DNA accessibility and various epigenetic modifications have been associated with regulatory functions. Mapping these features across the genome has enabled researchers to begin to document the full complement of putative regulatory elements. High-throughput reporter assays to probe the functions of regulatory regions have also been developed but these methods separate putative regulatory elements from the chromosome so that any effects of chromatin context and long-range regulatory interactions are lost. Definitive assignment of function(s) to putative cis-regulatory elements requires perturbation of these elements. Genome-editing technologies are now transforming our ability to perturb regulatory elements across entire genomes. Interpretation of high-throughput genetic screens that incorporate genome editors might enable the construction of an unbiased map of functional noncoding elements in the human genome.

Comparison of loline alkaloid gene clusters across fungal endophytes: predicting the co-regulatory sequence motifs and the evolutionary history.

Science.gov (United States)

Kutil, Brandi L; Greenwald, Charles; Liu, Gang; Spiering, Martin J; Schardl, Christopher L; Wilkinson, Heather H

2007-10-01

LOL, a fungal secondary metabolite gene cluster found in Epichloë and Neotyphodium species, is responsible for production of insecticidal loline alkaloids. To analyze the genetic architecture and to predict the evolutionary history of LOL, we compared five clusters from four fungal species (single clusters from Epichloë festucae, Neotyphodium sp. PauTG-1, Neotyphodium coenophialum, and two clusters we previously characterized in Neotyphodium uncinatum). Using PhyloCon to compare putative lol gene promoter regions, we have identified four motifs conserved across the lol genes in all five clusters. Each motif has significant similarity to known fungal transcription factor binding sites in the TRANSFAC database. Conservation of these motifs is further support for the hypothesis that the lol genes are co-regulated. Interestingly, the history of asexual Neotyphodium spp. includes multiple interspecific hybridization events. Comparing clusters from three Neotyphodium species and E. festucae allowed us to determine which Epichloë ancestors are the most likely contributors of LOL in these asexual species. For example, while no present day Epichloë typhina isolates are known to produce lolines, our data support the hypothesis that the E. typhina ancestor(s) of three asexual endophyte species contained a LOL gene cluster. Thus, these data support a model of evolution in which the polymorphism in loline alkaloid production phenotypes among endophyte species is likely due to the loss of the trait over time.
A method of predicting changes in human gene splicing induced by genetic variants in context of cis-acting elements

Directory of Open Access Journals (Sweden)

Hicks Chindo

2010-01-01

Full Text Available Abstract Background Polymorphic variants and mutations disrupting canonical splicing isoforms are among the leading causes of human hereditary disorders. While there is a substantial evidence of aberrant splicing causing Mendelian diseases, the implication of such events in multi-genic disorders is yet to be well understood. We have developed a new tool (SpliceScan II for predicting the effects of genetic variants on splicing and cis-regulatory elements. The novel Bayesian non-canonical 5'GC splice site (SS sensor used in our tool allows inference on non-canonical exons. Results Our tool performed favorably when compared with the existing methods in the context of genes linked to the Autism Spectrum Disorder (ASD. SpliceScan II was able to predict more aberrant splicing isoforms triggered by the mutations, as documented in DBASS5 and DBASS3 aberrant splicing databases, than other existing methods. Detrimental effects behind some of the polymorphic variations previously associated with Alzheimer's and breast cancer could be explained by changes in predicted splicing patterns. Conclusions We have developed SpliceScan II, an effective and sensitive tool for predicting the detrimental effects of genomic variants on splicing leading to Mendelian and complex hereditary disorders. The method could potentially be used to screen resequenced patient DNA to identify de novo mutations and polymorphic variants that could contribute to a genetic disorder.
The Evolution of Lineage-Specific Regulatory Activities in the Human Embryonic Limb

OpenAIRE

Cotney, Justin; Leng, Jing; Yin, Jun; Reilly, Steven K.; DeMare, Laura E.; Emera, Deena; Ayoub, Albert E.; Rakic, Pasko; Noonan, James P.

2013-01-01

The evolution of human anatomical features likely involved changes in gene regulation during development. However, the nature and extent of human-specific developmental regulatory functions remain unknown. We obtained a genome-wide view of cis-regulatory evolution in human embryonic tissues by comparing the histone modification H3K27ac, which provides a quantitative readout of promoter and enhancer activity, during human, rhesus, and mouse limb development. Based on increased H3K27ac, we find...
Aplikasi Ornamen Khas Maluku untuk Pengembangan Desain Motif Batik

Directory of Open Access Journals (Sweden)

Masiswo Masiswo

2016-04-01

Full Text Available ABSTRAKMaluku memiliki banyak ragam hias budaya warisan nilai leluhur berupa ornamen etnis yang merupakan kesenian dan keterampilan kerajinan. Hasil warisan tersebut sampai saat ini masih lestari hidup serta dapat dinikmati sebagai konsumsi rohani yang memuaskan manusia. Berkaitan dengan keberlangsungan nilai-nilai tradisi etnis yang berwujud pada ornamen-ornamen daerah Maluku, maka dikembangkan untuk kebutuhan manusia berupa motif batik pada kain. Pengembangan ornamen ini lebih menekankan pada representasi akan bentuk-bentuk ornamen yang diterapkan pada kerajinan batik berupa motif khas Maluku. Pengembangan alternatif desain motif batik dibuat tiga variasi yang bersumber dari ornamen khas Maluku dibuat prototipe produknya dan diuji ketahanan luntur warnanya. Hasil uji ketahanan luntur warna terhadap gosokan basah dari tiga prototipe produk berpredikat baik sekali terdapat pada “Motif Siwa” dan predikat baik pada motif “Siwa Talang” dan motif “Matahari Siwa Talang”.Kata kunci: desain, Maluku, motif batik, ornamenABSTRACTMaluku has much decorative ancestral cultural heritage value in the form of ornament ethnic arts and crafts skills. The result of the legacy is still sustainable living can be enjoyed as well as satisfying spiritual human consumption.Related to the sustainability of traditional values in the form of ethnic ornaments Maluku, it was developed for human needs in the form of batik cloth . The development of these ornaments will be more emphasis on the representation forms of ornamentation that is applied to a batik motif Maluku. Development of alternative design motif made three variations. The development of three alternative design motifs derived from the Maluku ornaments made and tested a prototype product color fastness. The test results of color fastness to wet rubbing of the three prototypes are excellent products predicated on the "Motif Siwa" and a good rating on the motif "Siwa Talang" and motif "Matahari Siwa
Parole, Sintagmatik, dan Paradigmatik Motif Batik Mega Mendung

Directory of Open Access Journals (Sweden)

Rudi - Nababan

2012-04-01

Full Text Available ABSTRACT Discussing traditional batik is related a lot to the organization system of fine arts element ac- companying it, either the pattern of the motif or the technique of the making. In this case, the motif of Mega Mendung Cirebon certainly has patterns and rules which are traditionally different from the other motifs in other areas. Through semiotics analysis especially with Saussure and Pierce concept, it can be traced that batik with Cirebon motif, in this case Mega Mendung motif, has parole and langue system, as unique fine arts language in batik, and structure of visual syntagmatic and paradigmatic. In the context of batik motif as fine arts language, it is surely related to sign system as symbol and icon. Keywords: visual semiotic, Cirebon’s batik.
Ready access to functionally embellished cis-hydrindanes and cis-decalins: protecting group-free total syntheses of (±)-Nootkatone and (±)-Noreremophilane.

Science.gov (United States)

Handore, Kishor L; Seetharamsingh, B; Reddy, D Srinivasa

2013-08-16

A simple and efficient synthesis of functionalized cis-hydrindanes and cis-decalins was achieved using a sequential Diels-Alder/aldol approach in a highly diastereoselective manner. The scope of this method was tested with a variety of substrates and was successfully applied to the synthesis of two natural products in racemic form. The highlights of the present work provide ready access to 13 new cis-hydrindanes/cis-decalins, a protecting group-free total synthesis of an insect repellent Nootkatone, and the first synthesis of a Noreremophilane using the shortest sequence.
A Comparative Study on the Origin and Variety of Motifs in Shahsavan Salt Bags and Caucasian Textiles

Directory of Open Access Journals (Sweden)

Siamak Egharloo

2017-12-01

Full Text Available Shahsavan tribes of Iran and the Caucasus region have had considerable and often inevitable intercourse and associations during the history due to their common borders and special geographical locations. The result of this has been manifested in different forms of intermingled factors and elements, specifically the textiles of tribes and ethnic groups. The interactions of the mentioned realm, i.e. textile industry, have best been appeared in patterns, motifs, colors and compositions and weaving of the hand-woven textiles among which Shahsavan "salt bags" (NAMAKDᾹN are a case in point. According to the facts and the importance of this subject, we can propose some questions as follows: What influences have the field of weaving had in these two regions as a result of their interactions and historical background? What are the motifs and their classifications in these two regions and which ones share common patterns? And which ones abound? Having been done in analytical and comparative method, the present research has examined the field of weaving in Shahsavan tribe with emphasis on its salt bags together with other Caucasian textiles (salt bags, etc.. The objectives of the research have been the study of the influences and interactions between the two regions and the recognition of patterns and motifs on their textiles. Finally, we can infer that the certain location of Iran and its common borders with the Caucasus besides tribal distribution of groups in northern and southern areas could be considered the reasons for cultural influences in the mentioned regions. The dominant motifs to be noticed here are dragons (S shape, diamonds and stars, crab-like and cross motifs as well as negative and positive spaces.
A Regulatory Network Analysis of Orphan Genes in Arabidopsis Thaliana

Science.gov (United States)

Singh, Pramesh; Chen, Tianlong; Arendsee, Zebulun; Wurtele, Eve S.; Bassler, Kevin E.

Orphan genes, which are genes unique to each particular species, have recently drawn significant attention for their potential usefulness for organismal robustness. Their origin and regulatory interaction patterns remain largely undiscovered. Recently, methods that use the context likelihood of relatedness to infer a network followed by modularity maximizing community detection algorithms on the inferred network to find the functional structure of regulatory networks were shown to be effective. We apply improved versions of these methods to gene expression data from Arabidopsis thaliana, identify groups (clusters) of interacting genes with related patterns of expression and analyze the structure within those groups. Focusing on clusters that contain orphan genes, we compare the identified clusters to gene ontology (GO) terms, regulons, and pathway designations and analyze their hierarchical structure. We predict new regulatory interactions and unravel the structure of the regulatory interaction patterns of orphan genes. Work supported by the NSF through Grants DMR-1507371 and IOS-1546858.
Optimization of Pseudomonas putida KT2440 as host for the production of cis, cis-muconate from benzoate

NARCIS (Netherlands)

Duuren, van J.B.J.H.

2011-01-01

Optimization of Pseudomonas putida KT2440 as host for the production of cis, cis-muconate

from benzoate P. putida KT2440 was used as biocatalyst given its versatile and energetically robust metabolism.

Therefore, a mutant was generated and a process developed based on which a
Regulatory elements involved in tax-mediated transactivation of the HTLV-I LTR.

Science.gov (United States)

Seeler, J S; Muchardt, C; Podar, M; Gaynor, R B

1993-10-01

HTLV-I is the etiologic agent of adult T-cell leukemia. In this study, we investigated the regulatory elements and cellular transcription factors which function in modulating HTLV-I gene expression in response to the viral transactivator protein, tax. Transfection experiments into Jurkat cells of a variety of site-directed mutants in the HTLV-1 LTR indicated that each of the three motifs A, B, and C within the 21-bp repeats, the binding sites for the Ets family of proteins, and the TATA box all influenced the degree of tax-mediated activation. Tax is also able to activate gene expression of other viral and cellular promoters. Tax activation of the IL-2 receptor and the HIV-1 LTR is mediated through NF-kappa B motifs. Interestingly, sequences in the 21-bp repeat B and C motifs contain significant homology with NF-kappa B regulatory elements. We demonstrated that an NF-kappa B binding protein, PRDII-BF1, but not the rel protein, bound to the B and C motifs in the 21-bp repeat. PRDII-BF1 was also able to stimulate activation of HTLV-I gene expression by tax. The role of the Ets proteins on modulating tax activation was also studied. Ets 1 but not Ets 2 was capable of increasing the degree of tax activation of the HTLV-I LTR. These results suggest that tax activates gene expression by either direct or indirect interaction with several cellular transcription factors that bind to the HTLV-I LTR.
Identification of a phosphorylation-dependent nuclear localization motif in interferon regulatory factor 2 binding protein 2.

Directory of Open Access Journals (Sweden)

Allen C T Teng

Full Text Available Interferon regulatory factor 2 binding protein 2 (IRF2BP2 is a muscle-enriched transcription factor required to activate vascular endothelial growth factor-A (VEGFA expression in muscle. IRF2BP2 is found in the nucleus of cardiac and skeletal muscle cells. During the process of skeletal muscle differentiation, some IRF2BP2 becomes relocated to the cytoplasm, although the functional significance of this relocation and the mechanisms that control nucleocytoplasmic localization of IRF2BP2 are not yet known.Here, by fusing IRF2BP2 to green fluorescent protein and testing a series of deletion and site-directed mutagenesis constructs, we mapped the nuclear localization signal (NLS to an evolutionarily conserved sequence (354ARKRKPSP(361 in IRF2BP2. This sequence corresponds to a classical nuclear localization motif bearing positively charged arginine and lysine residues. Substitution of arginine and lysine with negatively charged aspartic acid residues blocked nuclear localization. However, these residues were not sufficient because nuclear targeting of IRF2BP2 also required phosphorylation of serine 360 (S360. Many large-scale phosphopeptide proteomic studies had reported previously that serine 360 of IRF2BP2 is phosphorylated in numerous human cell types. Alanine substitution at this site abolished IRF2BP2 nuclear localization in C(2C(12 myoblasts and CV1 cells. In contrast, substituting serine 360 with aspartic acid forced nuclear retention and prevented cytoplasmic redistribution in differentiated C(2C(12 muscle cells. As for the effects of these mutations on VEGFA promoter activity, the S360A mutation interfered with VEGFA activation, as expected. Surprisingly, the S360D mutation also interfered with VEGFA activation, suggesting that this mutation, while enforcing nuclear entry, may disrupt an essential activation function of IRF2BP2.Nuclear localization of IRF2BP2 depends on phosphorylation near a conserved NLS. Changes in phosphorylation status
Nominal Anchors in the CIS

OpenAIRE

Peter M Keller; Thomas J Richardson

2003-01-01

Monetary policy has become increasingly important in the countries of the Commonwealth of Independent States (CIS) as fiscal adjustment and structural reforms have taken root. Inflation has been brought down to relatively low levels in almost all of these countries, raising the question of what should be the appropriate nominal anchor at this stage. Formally, almost all CIS countries have floating exchange rate regimes, yet in practice they manage their exchange rates very heavily, perhaps be...
Learning gene regulatory networks from gene expression data using weighted consensus

KAUST Repository

Fujii, Chisato; Kuwahara, Hiroyuki; Yu, Ge; Guo, Lili; Gao, Xin

2016-01-01

An accurate determination of the network structure of gene regulatory systems from high-throughput gene expression data is an essential yet challenging step in studying how the expression of endogenous genes is controlled through a complex interaction of gene products and DNA. While numerous methods have been proposed to infer the structure of gene regulatory networks, none of them seem to work consistently over different data sets with high accuracy. A recent study to compare gene network inference methods showed that an average-ranking-based consensus method consistently performs well under various settings. Here, we propose a linear programming-based consensus method for the inference of gene regulatory networks. Unlike the average-ranking-based one, which treats the contribution of each individual method equally, our new consensus method assigns a weight to each method based on its credibility. As a case study, we applied the proposed consensus method on synthetic and real microarray data sets, and compared its performance to that of the average-ranking-based consensus and individual inference methods. Our results show that our weighted consensus method achieves superior performance over the unweighted one, suggesting that assigning weights to different individual methods rather than giving them equal weights improves the accuracy. © 2016 Elsevier B.V.
Learning gene regulatory networks from gene expression data using weighted consensus

KAUST Repository

Fujii, Chisato

2016-08-25

An accurate determination of the network structure of gene regulatory systems from high-throughput gene expression data is an essential yet challenging step in studying how the expression of endogenous genes is controlled through a complex interaction of gene products and DNA. While numerous methods have been proposed to infer the structure of gene regulatory networks, none of them seem to work consistently over different data sets with high accuracy. A recent study to compare gene network inference methods showed that an average-ranking-based consensus method consistently performs well under various settings. Here, we propose a linear programming-based consensus method for the inference of gene regulatory networks. Unlike the average-ranking-based one, which treats the contribution of each individual method equally, our new consensus method assigns a weight to each method based on its credibility. As a case study, we applied the proposed consensus method on synthetic and real microarray data sets, and compared its performance to that of the average-ranking-based consensus and individual inference methods. Our results show that our weighted consensus method achieves superior performance over the unweighted one, suggesting that assigning weights to different individual methods rather than giving them equal weights improves the accuracy. © 2016 Elsevier B.V.
Identification of sparsely distributed clusters of cis-regulatory elements in sets of co-expressed genes

OpenAIRE

Kreiman, Gabriel

2004-01-01

Sequence information and high‐throughput methods to measure gene expression levels open the door to explore transcriptional regulation using computational tools. Combinatorial regulation and sparseness of regulatory elements throughout the genome allow organisms to control the spatial and temporal patterns of gene expression. Here we study the organization of cis‐regulatory elements in sets of co‐regulated genes. We build an algorithm to search for combinations of transcription factor binding...
Motif statistics and spike correlations in neuronal networks

International Nuclear Information System (INIS)

Hu, Yu; Shea-Brown, Eric; Trousdale, James; Josić, Krešimir

2013-01-01

Motifs are patterns of subgraphs of complex networks. We studied the impact of such patterns of connectivity on the level of correlated, or synchronized, spiking activity among pairs of cells in a recurrent network of integrate and fire neurons. For a range of network architectures, we find that the pairwise correlation coefficients, averaged across the network, can be closely approximated using only three statistics of network connectivity. These are the overall network connection probability and the frequencies of two second order motifs: diverging motifs, in which one cell provides input to two others, and chain motifs, in which two cells are connected via a third intermediary cell. Specifically, the prevalence of diverging and chain motifs tends to increase correlation. Our method is based on linear response theory, which enables us to express spiking statistics using linear algebra, and a resumming technique, which extrapolates from second order motifs to predict the overall effect of coupling on network correlation. Our motif-based results seek to isolate the effect of network architecture perturbatively from a known network state. (paper)
Learning Probabilistic Inference through Spike-Timing-Dependent Plasticity123

Science.gov (United States)

Pecevski, Dejan

2016-01-01

Abstract Numerous experimental data show that the brain is able to extract information from complex, uncertain, and often ambiguous experiences. Furthermore, it can use such learnt information for decision making through probabilistic inference. Several models have been proposed that aim at explaining how probabilistic inference could be performed by networks of neurons in the brain. We propose here a model that can also explain how such neural network could acquire the necessary information for that from examples. We show that spike-timing-dependent plasticity in combination with intrinsic plasticity generates in ensembles of pyramidal cells with lateral inhibition a fundamental building block for that: probabilistic associations between neurons that represent through their firing current values of random variables. Furthermore, by combining such adaptive network motifs in a recursive manner the resulting network is enabled to extract statistical information from complex input streams, and to build an internal model for the distribution p* that generates the examples it receives. This holds even if p* contains higher-order moments. The analysis of this learning process is supported by a rigorous theoretical foundation. Furthermore, we show that the network can use the learnt internal model immediately for prediction, decision making, and other types of probabilistic inference. PMID:27419214
Development and utilization of complementary communication channels for treatment decision making and survivorship issues among cancer patients: The CIS Research Consortium Experience.

Science.gov (United States)

Fleisher, Linda; Wen, Kuang Yi; Miller, Suzanne M; Diefenbach, Michael; Stanton, Annette L; Ropka, Mary; Morra, Marion; Raich, Peter C

2015-11-01

Cancer patients and survivors are assuming active roles in decision-making and digital patient support tools are widely used to facilitate patient engagement. As part of Cancer Information Service Research Consortium's randomized controlled trials focused on the efficacy of eHealth interventions to promote informed treatment decision-making for newly diagnosed prostate and breast cancer patients, and post-treatment breast cancer, we conducted a rigorous process evaluation to examine the actual use of and perceived benefits of two complementary communication channels -- print and eHealth interventions. The three Virtual Cancer Information Service (V-CIS) interventions were developed through a rigorous developmental process, guided by self-regulatory theory, informed decision-making frameworks, and health communications best practices. Control arm participants received NCI print materials; experimental arm participants received the additional V-CIS patient support tool. Actual usage data from the web-based V-CIS was also obtained and reported. Print materials were highly used by all groups. About 60% of the experimental group reported using the V-CIS. Those who did use the V-CIS rated it highly on improvements in knowledge, patient-provider communication and decision-making. The findings show that how patients actually use eHealth interventions either singularly or within the context of other communication channels is complex. Integrating rigorous best practices and theoretical foundations is essential and multiple communication approaches should be considered to support patient preferences.
Network perturbation by recurrent regulatory variants in cancer.

Directory of Open Access Journals (Sweden)

Kiwon Jang

2017-03-01

Full Text Available Cancer driving genes have been identified as recurrently affected by variants that alter protein-coding sequences. However, a majority of cancer variants arise in noncoding regions, and some of them are thought to play a critical role through transcriptional perturbation. Here we identified putative transcriptional driver genes based on combinatorial variant recurrence in cis-regulatory regions. The identified genes showed high connectivity in the cancer type-specific transcription regulatory network, with high outdegree and many downstream genes, highlighting their causative role during tumorigenesis. In the protein interactome, the identified transcriptional drivers were not as highly connected as coding driver genes but appeared to form a network module centered on the coding drivers. The coding and regulatory variants associated via these interactions between the coding and transcriptional drivers showed exclusive and complementary occurrence patterns across tumor samples. Transcriptional cancer drivers may act through an extensive perturbation of the regulatory network and by altering protein network modules through interactions with coding driver genes.
Novel 9-cis/all-trans β-carotene isomerases from plastidic oil bodies in Dunaliella bardawil catalyze the conversion of all-trans to 9-cis β-carotene.

Science.gov (United States)

Davidi, Lital; Pick, Uri

2017-06-01

We identified and demonstrated the function of 9-cis/all-trans β-carotene isomerases in plastidic globules of Dunaliella bardawil, the species accumulating the highest levels of 9-cis β-carotene that is essential for humans. The halotolerant alga Dunaliella bardawil is unique in that it accumulates under light stress high levels of β-carotene in plastidic lipid globules. The pigment is composed of two major isomers: all-trans β-carotene, the common natural form of this pigment, and 9-cis β-carotene. The biosynthetic pathway of β-carotene is known, but it is not clear how the 9-cis isomer is formed. We identified in plastidic lipid globules that were isolated from D. bardawil two proteins with high sequence homology to the D27 protein-a 9-cis/all-trans β-carotene isomerase from rice (Alder et al. Science 335:1348-1351, 2012). The proteins are enriched in the oil globules by 6- to 17-fold compared to chloroplast proteins. The expression of the corresponding genes, 9-cis-βC-iso1 and 9-cis-βC-iso2, is enhanced under light stress. The synthetic proteins catalyze in vitro conversion of all-trans to 9-cis β-carotene. Expression of the 9-cis-βC-iso1 or of 9-cis-βC-iso2 genes in an E. coli mutant line that harbors β-carotene biosynthesis genes enhanced the conversion of all-trans into 9-cis β-carotene. These results suggest that 9-cis-βC-ISO1 and 9-cis-βC-ISO2 proteins are responsible for the formation of 9-cis β-carotene in D. bardawil under stress conditions.

Studies on radiosensitization of Escherichia coli cells by cis-platinum complexes

International Nuclear Information System (INIS)

Zimbrick, J.D.; Sukrochana, A.; Richmond, R.C.

1979-01-01

We recently reported that the antitumor drug cis-Pt(NH 3 ) 2 Cl 2 (cis-DDP) produces significant radiosensitization of anoxic E coli C cells. We have extended these studies to three other platinum drugs, all of which have been shown to be more effective antitumor drugs than cis-DDP. The drugs are: cis-dichloro bis(ethylene imine) Pt(II) (cis-DEP); cis-dichlorobicyclopentylamine Pt(II) (cis-PAD); and Pt-thymine blue (cis-PTB). Survival curve studies indicate that these drugs all produce greater anoxic radiosensitization of E coli C than cis-DDP at concentrations which are less toxic to the cells than similar concentrations of cis-DDP. If the cells are treated with any one of these drugs for two hours and then washed to remove the drug before irradiation, no detectable radiosensitization is found. We conclude that these drugs have the potential for being useful agents in combined modality therapy and that they warrant further study in mammalian systems
cis-Acting Complex-Trait-Associated lincRNA Expression Correlates with Modulation of Chromosomal Architecture

Directory of Open Access Journals (Sweden)

Jennifer Yihong Tan

2017-02-01

Full Text Available Summary: Intergenic long noncoding RNAs (lincRNAs are the largest class of transcripts in the human genome. Although many have recently been linked to complex human traits, the underlying mechanisms for most of these transcripts remain undetermined. We investigated the regulatory roles of a high-confidence and reproducible set of 69 trait-relevant lincRNAs (TR-lincRNAs in human lymphoblastoid cells whose biological relevance is supported by their evolutionary conservation during recent human history and genetic interactions with other trait-associated loci. Their enrichment in enhancer-like chromatin signatures, interactions with nearby trait-relevant protein-coding loci, and preferential location at topologically associated domain (TAD boundaries provide evidence that TR-lincRNAs likely regulate proximal trait-relevant gene expression in cis by modulating local chromosomal architecture. This is consistent with the positive and significant correlation found between TR-lincRNA abundance and intra-TAD DNA-DNA contacts. Our results provide insights into the molecular mode of action by which TR-lincRNAs contribute to complex human traits. : Tan et al. identify and characterize 69 human complex trait/disease-associated lincRNAs in LCLs. They show that these loci are often associated with cis-regulation of gene expression and tend to be localized at TAD boundaries, suggesting that these lincRNAs may influence chromosomal architecture. Keywords: intergenic long noncoding RNA, lincRNA, GWAS, expression quantitative trait loci, eQTL, complex trait and disease, enhancer, cis-regulation, topologically associated domains, TAD
In silico Analysis of osr40c1 Promoter Sequence Isolated from Indica Variety Pokkali

OpenAIRE

W.S.I. de Silva; M.M.N. Perera; K.L.N.S. Perera; A.M. Wickramasuriya; G.A.U. Jayasekera

2017-01-01

The promoter region of a drought and abscisic acid (ABA) inducible gene, osr40c1, was isolated from a salt-tolerant indica rice variety Pokkali, which is 670 bp upstream of the putative translation start codon. In silico promoter analysis of resulted sequence showed that at least 15 types of putative motifs were distributed within the sequence, including two types of common promoter elements, TATA and CAAT boxes. Additionally, several putative cis-acing regulatory elements which may be involv...
Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis.

Science.gov (United States)

Ma, Chuang; Wang, Xiangfeng

2012-09-01

One of the computational challenges in plant systems biology is to accurately infer transcriptional regulation relationships based on correlation analyses of gene expression patterns. Despite several correlation methods that are applied in biology to analyze microarray data, concerns regarding the compatibility of these methods with the gene expression data profiled by high-throughput RNA transcriptome sequencing (RNA-Seq) technology have been raised. These concerns are mainly due to the fact that the distribution of read counts in RNA-Seq experiments is different from that of fluorescence intensities in microarray experiments. Therefore, a comprehensive evaluation of the existing correlation methods and, if necessary, introduction of novel methods into biology is appropriate. In this study, we compared four existing correlation methods used in microarray analysis and one novel method called the Gini correlation coefficient on previously published microarray-based and sequencing-based gene expression data in Arabidopsis (Arabidopsis thaliana) and maize (Zea mays). The comparisons were performed on more than 11,000 regulatory relationships in Arabidopsis, including 8,929 pairs of transcription factors and target genes. Our analyses pinpointed the strengths and weaknesses of each method and indicated that the Gini correlation can compensate for the shortcomings of the Pearson correlation, the Spearman correlation, the Kendall correlation, and the Tukey's biweight correlation. The Gini correlation method, with the other four evaluated methods in this study, was implemented as an R package named rsgcc that can be utilized as an alternative option for biologists to perform clustering analyses of gene expression patterns or transcriptional network analyses.
Controllability analysis of transcriptional regulatory networks reveals circular control patterns among transcription factors

DEFF Research Database (Denmark)

Österlund, Tobias; Bordel, Sergio; Nielsen, Jens

2015-01-01

% for the human network. The high controllability (low number of drivers needed to control the system) in yeast, mouse and human is due to the presence of internal loops in their regulatory networks where the TFs regulate each other in a circular fashion. We refer to these internal loops as circular control...... motifs (CCM). The E. coli transcriptional regulatory network, which does not have any CCMs, shows a hierarchical structure of the transcriptional regulatory network in contrast to the eukaryal networks. The presence of CCMs also has influence on the stability of these networks, as the presence of cycles...
Probabilistic Inference in General Graphical Models through Sampling in Stochastic Networks of Spiking Neurons

Science.gov (United States)

Pecevski, Dejan; Buesing, Lars; Maass, Wolfgang

2011-01-01

An important open problem of computational neuroscience is the generic organization of computations in networks of neurons in the brain. We show here through rigorous theoretical analysis that inherent stochastic features of spiking neurons, in combination with simple nonlinear computational operations in specific network motifs and dendritic arbors, enable networks of spiking neurons to carry out probabilistic inference through sampling in general graphical models. In particular, it enables them to carry out probabilistic inference in Bayesian networks with converging arrows (“explaining away”) and with undirected loops, that occur in many real-world tasks. Ubiquitous stochastic features of networks of spiking neurons, such as trial-to-trial variability and spontaneous activity, are necessary ingredients of the underlying computational organization. We demonstrate through computer simulations that this approach can be scaled up to neural emulations of probabilistic inference in fairly large graphical models, yielding some of the most complex computations that have been carried out so far in networks of spiking neurons. PMID:22219717
Probabilistic inference in general graphical models through sampling in stochastic networks of spiking neurons.

Directory of Open Access Journals (Sweden)

Dejan Pecevski

2011-12-01

Full Text Available An important open problem of computational neuroscience is the generic organization of computations in networks of neurons in the brain. We show here through rigorous theoretical analysis that inherent stochastic features of spiking neurons, in combination with simple nonlinear computational operations in specific network motifs and dendritic arbors, enable networks of spiking neurons to carry out probabilistic inference through sampling in general graphical models. In particular, it enables them to carry out probabilistic inference in Bayesian networks with converging arrows ("explaining away" and with undirected loops, that occur in many real-world tasks. Ubiquitous stochastic features of networks of spiking neurons, such as trial-to-trial variability and spontaneous activity, are necessary ingredients of the underlying computational organization. We demonstrate through computer simulations that this approach can be scaled up to neural emulations of probabilistic inference in fairly large graphical models, yielding some of the most complex computations that have been carried out so far in networks of spiking neurons.
Probabilistic inference in general graphical models through sampling in stochastic networks of spiking neurons.

Science.gov (United States)

Pecevski, Dejan; Buesing, Lars; Maass, Wolfgang

2011-12-01

An important open problem of computational neuroscience is the generic organization of computations in networks of neurons in the brain. We show here through rigorous theoretical analysis that inherent stochastic features of spiking neurons, in combination with simple nonlinear computational operations in specific network motifs and dendritic arbors, enable networks of spiking neurons to carry out probabilistic inference through sampling in general graphical models. In particular, it enables them to carry out probabilistic inference in Bayesian networks with converging arrows ("explaining away") and with undirected loops, that occur in many real-world tasks. Ubiquitous stochastic features of networks of spiking neurons, such as trial-to-trial variability and spontaneous activity, are necessary ingredients of the underlying computational organization. We demonstrate through computer simulations that this approach can be scaled up to neural emulations of probabilistic inference in fairly large graphical models, yielding some of the most complex computations that have been carried out so far in networks of spiking neurons.
CONTEMPORARY USAGE OF TRADITIONAL TURKISH MOTIFS IN PRODUCT DESIGNS

Directory of Open Access Journals (Sweden)

Tulay Gumuser

2012-12-01

Full Text Available The aim of this study is to identify the traditional Turkish motifs and its relations among present industrial designs. Traditional Turkish motifs played a very important role in 16th century onwards. The arts of the Ottoman Empire were used because of their symbolic meanings and unique styles. When we examine these motifs we encounter; Tiger Stripe, Three Spot (Çintemani, Rumi, Hatayi, Penç, Cloud, Crescent, Star, Crown, Hyacinth, Tulip and Carnation motifs. Nowadays, Turkish designers have begun to use these traditional Turkish motifs in their designs so as to create differences and awareness in the world design. The examples of these industrial designs, using the Turkish motifs, have survived and have Ottoman heritage and historical value. In this study, the Turkish motifs will be examined along with their focus on contemporary Turkish industrial designs used today.
Computational methods in sequence and structure prediction

Science.gov (United States)

Lang, Caiyi

This dissertation is organized into two parts. In the first part, we will discuss three computational methods for cis-regulatory element recognition in three different gene regulatory networks as the following: (a) Using a comprehensive "Phylogenetic Footprinting Comparison" method, we will investigate the promoter sequence structures of three enzymes (PAL, CHS and DFR) that catalyze sequential steps in the pathway from phenylalanine to anthocyanins in plants. Our result shows there exists a putative cis-regulatory element "AC(C/G)TAC(C)" in the upstream of these enzyme genes. We propose this cis-regulatory element to be responsible for the genetic regulation of these three enzymes and this element, might also be the binding site for MYB class transcription factor PAP1. (b) We will investigate the role of the Arabidopsis gene glutamate receptor 1.1 (AtGLR1.1) in C and N metabolism by utilizing the microarray data we obtained from AtGLR1.1 deficient lines (antiAtGLR1.1). We focus our investigation on the putatively co-regulated transcript profile of 876 genes we have collected in antiAtGLR1.1 lines. By (a) scanning the occurrence of several groups of known abscisic acid (ABA) related cisregulatory elements in the upstream regions of 876 Arabidopsis genes; and (b) exhaustive scanning of all possible 6-10 bps motif occurrence in the upstream regions of the same set of genes, we are able to make a quantative estimation on the enrichment level of each of the cis-regulatory element candidates. We finally conclude that one specific cis-regulatory element group, called "ABRE" elements, are statistically highly enriched within the 876-gene group as compared to their occurrence within the genome. (c) We will introduce a new general purpose algorithm, called "fuzzy REDUCE1", which we have developed recently for automated cis-regulatory element identification. In the second part, we will discuss our newly devised protein design framework. With this framework we have developed
RNA motif search with data-driven element ordering.

Science.gov (United States)

Rampášek, Ladislav; Jimenez, Randi M; Lupták, Andrej; Vinař, Tomáš; Brejová, Broňa

2016-05-18

In this paper, we study the problem of RNA motif search in long genomic sequences. This approach uses a combination of sequence and structure constraints to uncover new distant homologs of known functional RNAs. The problem is NP-hard and is traditionally solved by backtracking algorithms. We have designed a new algorithm for RNA motif search and implemented a new motif search tool RNArobo. The tool enhances the RNAbob descriptor language, allowing insertions in helices, which enables better characterization of ribozymes and aptamers. A typical RNA motif consists of multiple elements and the running time of the algorithm is highly dependent on their ordering. By approaching the element ordering problem in a principled way, we demonstrate more than 100-fold speedup of the search for complex motifs compared to previously published tools. We have developed a new method for RNA motif search that allows for a significant speedup of the search of complex motifs that include pseudoknots. Such speed improvements are crucial at a time when the rate of DNA sequencing outpaces growth in computing. RNArobo is available at http://compbio.fmph.uniba.sk/rnarobo .
Uncovering transcriptional interactions via an adaptive fuzzy logic approach

Directory of Open Access Journals (Sweden)

Chen Chung-Ming

2009-12-01

Full Text Available Abstract Background To date, only a limited number of transcriptional regulatory interactions have been uncovered. In a pilot study integrating sequence data with microarray data, a position weight matrix (PWM performed poorly in inferring transcriptional interactions (TIs, which represent physical interactions between transcription factors (TF and upstream sequences of target genes. Inferring a TI means that the promoter sequence of a target is inferred to match the consensus sequence motifs of a potential TF, and their interaction type such as AT or RT is also predicted. Thus, a robust PWM (rPWM was developed to search for consensus sequence motifs. In addition to rPWM, one feature extracted from ChIP-chip data was incorporated to identify potential TIs under specific conditions. An interaction type classifier was assembled to predict activation/repression of potential TIs using microarray data. This approach, combining an adaptive (learning fuzzy inference system and an interaction type classifier to predict transcriptional regulatory networks, was named AdaFuzzy. Results AdaFuzzy was applied to predict TIs using real genomics data from Saccharomyces cerevisiae. Following one of the latest advances in predicting TIs, constrained probabilistic sparse matrix factorization (cPSMF, and using 19 transcription factors (TFs, we compared AdaFuzzy to four well-known approaches using over-representation analysis and gene set enrichment analysis. AdaFuzzy outperformed these four algorithms. Furthermore, AdaFuzzy was shown to perform comparably to 'ChIP-experimental method' in inferring TIs identified by two sets of large scale ChIP-chip data, respectively. AdaFuzzy was also able to classify all predicted TIs into one or more of the four promoter architectures. The results coincided with known promoter architectures in yeast and provided insights into transcriptional regulatory mechanisms. Conclusion AdaFuzzy successfully integrates multiple types of
The human Ago2 MC region does not contain an eIF4E-like mRNA cap binding motif

Directory of Open Access Journals (Sweden)

Grishin Nick V

2009-01-01

Full Text Available Abstract Background Argonaute (Ago proteins interact with small regulatory RNAs to mediate gene regulatory pathways. A recent report by Kiriakidou et al. 1 describes an MC sequence region identified in Ago2 that displays similarity to the cap-binding motif in translation initiation factor 4E (eIF4E. In a cap-bound eIF4E structure, two important aromatic residues of the motif stack on either side of a 7-methylguanosine 5'-triphosphate (m7Gppp base. The corresponding Ago2 aromatic residues (F450 and F505 were hypothesized to perform the same cap-binding function. However, the detected similarity between the MC sequence and the eIF4E cap-binding motif was questionable. Results A number of sequence-based and structure-based bioinformatics methods reveal the reported similarity between the Ago2 MC sequence region and the eIF4E cap-binding motif to be spurious. Alternatively, the MC sequence region is confidently assigned to the N-terminus of the Ago piwi module, within the mid domain of experimentally determined prokaryotic Ago structures. Confident mapping of the Ago2 MC sequence region to the piwi mid domain results in a homology-based structure model that positions the identified aromatic residues over 20 Å apart, with one of the aromatic side chains (F450 contributing instead to the hydrophobic core of the domain. Conclusion Correct functional prediction based on weak sequence similarity requires substantial evolutionary and structural support. The evolutionary context of the Ago mid domain suggested by multiple sequence alignment is limited to a conserved hydrophobicity profile required for the fold and a motif following the MC region that binds guide RNA. Mapping of the MC sequence to the mid domain structure reveals Ago2 aromatics that are incompatible with eIF4E-like mRNA cap-binding, yet display some limited local structure similarities that cause the chance sequence match to eIF4E. Reviewers This article was reviewed by Arcady Mushegian
Close Sequence Comparisons are Sufficient to Identify Humancis-Regulatory Elements

Energy Technology Data Exchange (ETDEWEB)

Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Couronne, Olivier; Pennacchio, Len A.

2005-12-01

Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons, due to the lack of a universal metric for sequence conservation, and also the paucity of empirically defined benchmark sets of cis-regulatory elements. To address this problem, we developed a general-purpose algorithm (Gumby) that detects slowly-evolving regions in primate, mammalian and more distant comparisons without requiring adjustment of parameters, and ranks conserved elements by P-value using Karlin-Altschul statistics. We benchmarked Gumby predictions against previously identified cis-regulatory elements at diverse genomic loci, and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using reporter-gene assays in transgenic mice. Human regulatory elements were identified with acceptable sensitivity and specificity by comparison with 1-5 other eutherian mammals or 6 other simian primates. More distant comparisons (marsupial, avian, amphibian and fish) failed to identify many of the empirically defined functional noncoding elements. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole genome comparative analysis, which explains some of these findings. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for testing at embryonic time points.
Identification of sequence motifs significantly associated with antisense activity

Directory of Open Access Journals (Sweden)

Peek Andrew S

2007-06-01

Full Text Available Abstract Background Predicting the suppression activity of antisense oligonucleotide sequences is the main goal of the rational design of nucleic acids. To create an effective predictive model, it is important to know what properties of an oligonucleotide sequence associate significantly with antisense activity. Also, for the model to be efficient we must know what properties do not associate significantly and can be omitted from the model. This paper will discuss the results of a randomization procedure to find motifs that associate significantly with either high or low antisense suppression activity, analysis of their properties, as well as the results of support vector machine modelling using these significant motifs as features. Results We discovered 155 motifs that associate significantly with high antisense suppression activity and 202 motifs that associate significantly with low suppression activity. The motifs range in length from 2 to 5 bases, contain several motifs that have been previously discovered as associating highly with antisense activity, and have thermodynamic properties consistent with previous work associating thermodynamic properties of sequences with their antisense activity. Statistical analysis revealed no correlation between a motif's position within an antisense sequence and that sequences antisense activity. Also, many significant motifs existed as subwords of other significant motifs. Support vector regression experiments indicated that the feature set of significant motifs increased correlation compared to all possible motifs as well as several subsets of the significant motifs. Conclusion The thermodynamic properties of the significantly associated motifs support existing data correlating the thermodynamic properties of the antisense oligonucleotide with antisense efficiency, reinforcing our hypothesis that antisense suppression is strongly associated with probe/target thermodynamics, as there are no enzymatic
The crystal structure of the regulatory domain of the human sodium-driven chloride/bicarbonate exchanger.

Science.gov (United States)

Alvadia, Carolina M; Sommer, Theis; Bjerregaard-Andersen, Kaare; Damkier, Helle Hasager; Montrasio, Michele; Aalkjaer, Christian; Morth, J Preben

2017-09-21

The sodium-driven chloride/bicarbonate exchanger (NDCBE) is essential for maintaining homeostatic pH in neurons. The crystal structure at 2.8 Å resolution of the regulatory N-terminal domain of human NDCBE represents the first crystal structure of an electroneutral sodium-bicarbonate cotransporter. The crystal structure forms an equivalent dimeric interface as observed for the cytoplasmic domain of Band 3, and thus establishes that the consensus motif VTVLP is the key minimal dimerization motif. The VTVLP motif is highly conserved and likely to be the physiologically relevant interface for all other members of the SLC4 family. A novel conserved Zn 2+ -binding motif present in the N-terminal domain of NDCBE is identified and characterized in vitro. Cellular studies confirm the Zn 2+ dependent transport of two electroneutral bicarbonate transporters, NCBE and NBCn1. The Zn 2+ site is mapped to a cluster of histidines close to the conserved ETARWLKFEE motif and likely plays a role in the regulation of this important motif. The combined structural and bioinformatics analysis provides a model that predicts with additional confidence the physiologically relevant interface between the cytoplasmic domain and the transmembrane domain.
Integrated Approach to Reconstruction of Microbial Regulatory Networks

Energy Technology Data Exchange (ETDEWEB)

Rodionov, Dmitry A [Sanford-Burnham Medical Research Institute; Novichkov, Pavel S [Lawrence Berkeley National Laboratory

2013-11-04

This project had the goal(s) of development of integrated bioinformatics platform for genome-scale inference and visualization of transcriptional regulatory networks (TRNs) in bacterial genomes. The work was done in Sanford-Burnham Medical Research Institute (SBMRI, P.I. D.A. Rodionov) and Lawrence Berkeley National Laboratory (LBNL, co-P.I. P.S. Novichkov). The developed computational resources include: (1) RegPredict web-platform for TRN inference and regulon reconstruction in microbial genomes, and (2) RegPrecise database for collection, visualization and comparative analysis of transcriptional regulons reconstructed by comparative genomics. These analytical resources were selected as key components in the DOE Systems Biology KnowledgeBase (SBKB). The high-quality data accumulated in RegPrecise will provide essential datasets of reference regulons in diverse microbes to enable automatic reconstruction of draft TRNs in newly sequenced genomes. We outline our progress toward the three aims of this grant proposal, which were: Develop integrated platform for genome-scale regulon reconstruction; Infer regulatory annotations in several groups of bacteria and building of reference collections of microbial regulons; and Develop KnowledgeBase on microbial transcriptional regulation.
Analisis Unsur Matematika pada Motif Sulam Usus

Directory of Open Access Journals (Sweden)

Fredi Ganda Putra

2017-12-01

Full Text Available Based on interviews with researchers sources said that the beginning of the intestine embroidery is an art of genuine crafts. Called the intestine embroidery because this technique is a technique of combining a strand of cloth resembling the intestine formed according to the pattern by means of embroidered using a thread. Intestinal embroidery techniques were originally used to create a cover of the women's customary wardrobe of Lampung or often referred to as bebe. But not many people in Lampung, especially people who live in Lampung are still many who do not know and recognize the intestine embroidery because most only know tapis only characteristic of Lampung, besides that there are other cultural results that is embroidered intestine. There are still many who do not know that the intestine motif there is a knowledge of mathematics. The researcher's problem formulation is whether there are mathematical elements contained in the intestine embroidery motif based on the concept of geometry. The purpose of this study is to determine whether there are elements of mathematics contained in the intestine motif based on the concept of geometry. Subjects in this study consisted of 4 people obtained by purposive sampling technique. From the results of data analysis conducted by using descriptive analysis and discussion as follows: (1 Intestinal embroidery motif contains the meaning of mathematics and culture or often called Etnomatematika. On the meaning of culture there is a link between the embroidery intestine with a culture that has been there before as the existence of cultural linkage between Hindu belief Buddhism and there are similarities of motifs and decorative patterns contained in the motif embroidery intestine with ornamental variety in Indonesia. (2 The relationship between the intestine with mathematical motifs there are elements of mathematics such as geometry elements in the form of geometry of dimension one and dimension two, and the
Genetic Network Inference: From Co-Expression Clustering to Reverse Engineering

Science.gov (United States)

Dhaeseleer, Patrik; Liang, Shoudan; Somogyi, Roland

2000-01-01

Advances in molecular biological, analytical, and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review datamining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-duster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e., who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks, to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting, and bioengineering.
Motif signatures of transcribed enhancers

KAUST Repository

Kleftogiannis, Dimitrios

2017-09-14

In mammalian cells, transcribed enhancers (TrEn) play important roles in the initiation of gene expression and maintenance of gene expression levels in spatiotemporal manner. One of the most challenging questions in biology today is how the genomic characteristics of enhancers relate to enhancer activities. This is particularly critical, as several recent studies have linked enhancer sequence motifs to specific functional roles. To date, only a limited number of enhancer sequence characteristics have been investigated, leaving space for exploring the enhancers genomic code in a more systematic way. To address this problem, we developed a novel computational method, TELS, aimed at identifying predictive cell type/tissue specific motif signatures. We used TELS to compile a comprehensive catalog of motif signatures for all known TrEn identified by the FANTOM5 consortium across 112 human primary cells and tissues. Our results confirm that distinct cell type/tissue specific motif signatures characterize TrEn. These signatures allow discriminating successfully a) TrEn from random controls, proxy of non-enhancer activity, and b) cell type/tissue specific TrEn from enhancers expressed and transcribed in different cell types/tissues. TELS codes and datasets are publicly available at http://www.cbrc.kaust.edu.sa/TELS.

Genome-wide analysis and expression profiling of the GRF gene family in oilseed rape (Brassica napus L.).

Science.gov (United States)

Ma, Jin-Qi; Jian, Hong-Ju; Yang, Bo; Lu, Kun; Zhang, Ao-Xiang; Liu, Pu; Li, Jia-Na

2017-07-15

Growth regulating-factors (GRFs) are plant-specific transcription factors that help regulate plant growth and development. Genome-wide identification and evolutionary analyses of GRF gene families have been performed in Arabidopsis thaliana, Zea mays, Oryza sativa, and Brassica rapa, but a comprehensive analysis of the GRF gene family in oilseed rape (Brassica napus) has not yet been reported. In the current study, we identified 35 members of the BnGRF family in B. napus. We analyzed the chromosomal distribution, phylogenetic relationships (Bayesian Inference and Neighbor Joining method), gene structures, and motifs of the BnGRF family members, as well as the cis-acting regulatory elements in their promoters. We also analyzed the expression patterns of 15 randomly selected BnGRF genes in various tissues and in plant varieties with different harvest indices and gibberellic acid (GA) responses. The expression levels of BnGRFs under GA treatment suggested the presence of possible negative feedback regulation. The evolutionary patterns and expression profiles of BnGRFs uncovered in this study increase our understanding of the important roles played by these genes in oilseed rape. Copyright © 2017. Published by Elsevier B.V.
Identifying noncoding risk variants using disease-relevant gene regulatory networks.

Science.gov (United States)

Gao, Long; Uzun, Yasin; Gao, Peng; He, Bing; Ma, Xiaoke; Wang, Jiahui; Han, Shizhong; Tan, Kai

2018-02-16

Identifying noncoding risk variants remains a challenging task. Because noncoding variants exert their effects in the context of a gene regulatory network (GRN), we hypothesize that explicit use of disease-relevant GRNs can significantly improve the inference accuracy of noncoding risk variants. We describe Annotation of Regulatory Variants using Integrated Networks (ARVIN), a general computational framework for predicting causal noncoding variants. It employs a set of novel regulatory network-based features, combined with sequence-based features to infer noncoding risk variants. Using known causal variants in gene promoters and enhancers in a number of diseases, we show ARVIN outperforms state-of-the-art methods that use sequence-based features alone. Additional experimental validation using reporter assay further demonstrates the accuracy of ARVIN. Application of ARVIN to seven autoimmune diseases provides a holistic view of the gene subnetwork perturbed by the combinatorial action of the entire set of risk noncoding mutations.
Triadic motifs in the dependence networks of virtual societies

Science.gov (United States)

Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

2014-06-01

In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.
Triadic motifs in the dependence networks of virtual societies.

Science.gov (United States)

Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

2014-06-10

In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.
Two potential hookworm DAF-16 target genes, SNR-3 and LPP-1: gene structure, expression profile, and implications of a cis-regulatory element in the regulation of gene expression.

Science.gov (United States)

Gao, Xin; Goggin, Kevin; Dowling, Camille; Qian, Jason; Hawdon, John M

2015-01-08

Hookworms infect nearly 700 million people, causing anemia and developmental stunting in heavy infections. Little is known about the genomic structure or gene regulation in hookworms, although recent publication of draft genome assemblies has allowed the first investigations of these topics to be undertaken. The transcription factor DAF-16 mediates multiple developmental pathways in the free living nematode Caenorhabditis elegans, and is involved in the recovery from the developmentally arrested L3 in hookworms. Identification of downstream targets of DAF-16 will provide a better understanding of the molecular mechanism of hookworm infection. Genomic Fragment 2.23 containing a DAF-16 binding element (DBE) was used to identify overlapping complementary expressed sequence tags (ESTs). These sequences were used to search a draft assembly of the Ancylostoma caninum genome, and identified two neighboring genes, snr-3 and lpp-1, in a tail-to-tail orientation. Expression patterns of both genes during parasitic development were determined by qRT-PCR. DAF-16 dependent cis-regulatory activity of fragment 2.23 was investigated using an in vitro reporter system. The snr-3 gene spans approximately 5.6 kb in the genome and contains 3 exons and 2 introns, and contains the DBE in its 3' untranslated region. Downstream from snr-3 in a tail-to-tail arrangement is the gene lpp-1. The lpp-1 gene spans more than 6 kb and contains 10 exons and 9 introns. The A. caninum genome contains 2 apparent splice variants, but there are 7 splice variants in the A. ceylanicum genome. While the gene order is similar, the gene structures of the hookworm genes differ from their C. elegans orthologs. Both genes show peak expression in the late L4 stage. Using a cell culture based expression system, fragment 2.23 was found to have both DAF-16-dependent promoter and enhancer activity that required an intact DBE. Two putative DAF-16 targets were identified by genome wide screening for DAF-16 binding
Stepwise encapsulation and controlled two-stage release system for cis-Diamminediiodoplatinum.

Science.gov (United States)

Chen, Yun; Li, Qian; Wu, Qingsheng

2014-01-01

cis-Diamminediiodoplatinum (cis-DIDP) is a cisplatin-like anticancer drug with higher anticancer activity, but lower stability and price than cisplatin. In this study, a cis-DIDP carrier system based on micro-sized stearic acid was prepared by an emulsion solvent evaporation method. The maximum drug loading capacity of cis-DIDP-loaded solid lipid nanoparticles was 22.03%, and their encapsulation efficiency was 97.24%. In vitro drug release in phosphate-buffered saline (pH =7.4) at 37.5°C exhibited a unique two-stage process, which could prove beneficial for patients with tumors and malignancies. MTT (3-[4,5-dimethylthiazol-2-yl]-2, 5-diphenyltetrazolium bromide) assay results showed that cis-DIDP released from cis-DIDP-loaded solid lipid nanoparticles had better inhibition activity than cis-DIDP that had not been loaded.
Efficient motif finding algorithms for large-alphabet inputs

Directory of Open Access Journals (Sweden)

Pavlovic Vladimir

2010-10-01

Full Text Available Abstract Background We consider the problem of identifying motifs, recurring or conserved patterns, in the biological sequence data sets. To solve this task, we present a new deterministic algorithm for finding patterns that are embedded as exact or inexact instances in all or most of the input strings. Results The proposed algorithm (1 improves search efficiency compared to existing algorithms, and (2 scales well with the size of alphabet. On a synthetic planted DNA motif finding problem our algorithm is over 10× more efficient than MITRA, PMSPrune, and RISOTTO for long motifs. Improvements are orders of magnitude higher in the same setting with large alphabets. On benchmark TF-binding site problems (FNP, CRP, LexA we observed reduction in running time of over 12×, with high detection accuracy. The algorithm was also successful in rapidly identifying protein motifs in Lipocalin, Zinc metallopeptidase, and supersecondary structure motifs for Cadherin and Immunoglobin families. Conclusions Our algorithm reduces computational complexity of the current motif finding algorithms and demonstrate strong running time improvements over existing exact algorithms, especially in important and difficult cases of large-alphabet sequences.
Relative Stability of cis- and trans-Hydrindanones

Directory of Open Access Journals (Sweden)

Motoo Tori

2015-01-01

Full Text Available The relative stabilities of several cis- and trans-hydrindanones were compared using both isomerization experiments and MM2 calculations. The generally believed rule that cis-hydrindanones are more stable than trans-isomers is applicable, but is not always true. This review introduces examples, mainly from studies in our laboratory, to explain these facts.
Discovery of candidate KEN-box motifs using cell cycle keyword enrichment combined with native disorder prediction and motif conservation.

Science.gov (United States)

Michael, Sushama; Travé, Gilles; Ramu, Chenna; Chica, Claudia; Gibson, Toby J

2008-02-15

KEN-box-mediated target selection is one of the mechanisms used in the proteasomal destruction of mitotic cell cycle proteins via the APC/C complex. While annotating the Eukaryotic Linear Motif resource (ELM, http://elm.eu.org/), we found that KEN motifs were significantly enriched in human protein entries with cell cycle keywords in the UniProt/Swiss-Prot database-implying that KEN-boxes might be more common than reported. Matches to short linear motifs in protein database searches are not, per se, significant. KEN-box enrichment with cell cycle Gene Ontology terms suggests that collectively these motifs are functional but does not prove that any given instance is so. Candidates were surveyed for native disorder prediction using GlobPlot and IUPred and for motif conservation in homologues. Among >25 strong new candidates, the most notable are human HIPK2, CHFR, CDC27, Dab2, Upf2, kinesin Eg5, DNA Topoisomerase 1 and yeast Cdc5 and Swi5. A similar number of weaker candidates were present. These proteins have yet to be tested for APC/C targeted destruction, providing potential new avenues of research.
Cloning and functional analysis of the promoters that upregulate carotenogenic gene expression during flower development in Gentiana lutea.

Science.gov (United States)

Zhu, Changfu; Yang, Qingjie; Ni, Xiuzhen; Bai, Chao; Sheng, Yanmin; Shi, Lianxuan; Capell, Teresa; Sandmann, Gerhard; Christou, Paul

2014-04-01

Over the last two decades, many carotenogenic genes have been cloned and used to generate metabolically engineered plants producing higher levels of carotenoids. However, comparatively little is known about the regulation of endogenous carotenogenic genes in higher plants, and this restricts our ability to predict how engineered plants will perform in terms of carotenoid content and composition. During petal development in the Great Yellow Gentian (Gentiana lutea), carotenoid accumulation, the formation of chromoplasts and the upregulation of several carotenogenic genes are temporally coordinated. We investigated the regulatory mechanisms responsible for this coordinated expression by isolating five G. lutea carotenogenic gene (GlPDS, GlZDS, GlLYCB, GlBCH and GlLYCE) promoters by inverse polymerase chain reaction (PCR). Each promoter was sufficient for developmentally regulated expression of the gusA reporter gene following transient expression in tomato (Solanum lycopersicum cv. Micro-Tom). Interestingly, the GlLYCB and GlBCH promoters drove high levels of gusA expression in chromoplast-containing mature green fruits, but low levels in chloroplast-containing immature green fruits, indicating a strict correlation between promoter activity, tomato fruit development and chromoplast differentiation. As well as core promoter elements such as TATA and CAAT boxes, all five promoters together with previously characterized GlZEP promoter contained three common cis-regulatory motifs involved in the response to methyl jasmonate (CGTCA) and ethylene (ATCTA), and required for endosperm expression (Skn-1_motif, GTCAT). These shared common cis-acting elements may represent binding sites for transcription factors responsible for co-regulation. Our data provide insight into the regulatory basis of the coordinated upregulation of carotenogenic gene expression during flower development in G. lutea. © 2013 Scandinavian Plant Physiology Society.
The Community Intercomparison Suite (CIS)

Science.gov (United States)

Watson-Parris, Duncan; Schutgens, Nick; Cook, Nick; Kipling, Zak; Kershaw, Phil; Gryspeerdt, Ed; Lawrence, Bryan; Stier, Philip

2017-04-01

Earth observations (both remote and in-situ) create vast amounts of data providing invaluable constraints for the climate science community. Efficient exploitation of these complex and highly heterogeneous datasets has been limited however by the lack of suitable software tools, particularly for comparison of gridded and ungridded data, thus reducing scientific productivity. CIS (http://cistools.net) is an open-source, command line tool and Python library which allows the straight-forward quantitative analysis, intercomparison and visualisation of remote sensing, in-situ and model data. The CIS can read gridded and ungridded remote sensing, in-situ and model data - and many other data sources 'out-of-the-box', such as ESA Aerosol and Cloud CCI product, MODIS, Cloud CCI, Cloudsat, AERONET. Perhaps most importantly however CIS also employs a modular plugin architecture to allow for the reading of limitless different data types. Users are able to write their own plugins for reading the data sources which they are familiar with, and share them within the community, allowing all to benefit from their expertise. To enable the intercomparison of this data the CIS provides a number of operations including: the aggregation of ungridded and gridded datasets to coarser representations using a number of different built in averaging kernels; the subsetting of data to reduce its extent or dimensionality; the co-location of two distinct datasets onto a single set of co-ordinates; the visualisation of the input or output data through a number of different plots and graphs; the evaluation of arbitrary mathematical expressions against any number of datasets; and a number of other supporting functions such as a statistical comparison of two co-located datasets. These operations can be performed efficiently on local machines or large computing clusters - and is already available on the JASMIN computing facility. A case-study using the GASSP collection of in-situ aerosol observations
Optimization of Pseudomonas putida KT2440 as host for the production of cis, cis-muconate from benzoate

OpenAIRE

Duuren, van, J.B.J.H.

2011-01-01

Optimization of Pseudomonas putida KT2440 as host for the production of cis, cis-muconate from benzoate P. putida KT2440 was used as biocatalyst given its versatile and energetically robust metabolism. Therefore, a mutant was generated and a process developed based on which a life cycle assessment (LCA) was performed. Additionally, the growth related parameters were experimentally obtained to constrain the metabolic model iJP815 further. The mutant Pseudomonas putida KT2440-JD1 was deri...
Activity of the rat osteocalcin basal promoter in osteoblastic cells is dependent upon homeodomain and CP1 binding motifs.

Science.gov (United States)

Towler, D A; Bennett, C D; Rodan, G A

1994-05-01

A detailed analysis of the transcriptional machinery responsible for osteoblast-specific gene expression should provide tools useful for understanding osteoblast commitment and differentiation. We have defined three cis-elements important for basal activity of the rat osteocalcin (OC) promoter, located at about -200 to -180, -170 to -138, and -121 to -64 relative to the transcription initiation site. A motif (TCTGATTGTGT) present in the region between -200 and -170 that binds a multisubunit CP1/NFY/CBF-like CAAT factor complex contributes significantly to high level basal activity and presumably functions as the CAAT box for the rat OC promoter. We show that the region -121 to 32 is sufficient to confer osteoblastic cell type specificity in transient transfection assays of cultured cell lines using luciferase as a reporter. The basal promoter is active in rodent osteoblastic cell lines, but not in rodent fibroblastic or muscle cell lines. Although the rat OC box (-100 to -74) contains a CAAT motif, we could not detect CP1-like CAAT factor binding to this region. In fact, we demonstrate that a Msx-1 (Hox 7.1) homeodomain binding motif (ACTAATTG; bottom strand) in the 3'-end of the rat OC box is necessary for high level activity of the rat OC basal promoter in osteoblastic cells. A nuclear factor that recognizes this motif appears to be present in osteoblastic ROS 17/2.8 cells, which produce OC, but not in fibroblastic ROS 25/1 cells, which fail to express OC. This ROS 17/2.8 nuclear factor also recognizes the A/T-rich DNA cognates of the homeodomain-containing POU family of transcription factors. Taken together, these data suggest that a ubiquitous CP1-like CAAT factor and a cell type-restricted homeodomain containing (Msx or POU family) transcription factor interact with the proximal rat OC promoter to direct appropriate basal OC transcription in osteoblastic cells.
DNA motif elucidation using belief propagation.

Science.gov (United States)

Wong, Ka-Chun; Chan, Tak-Ming; Peng, Chengbin; Li, Yue; Zhang, Zhaolei

2013-09-01

Protein-binding microarray (PBM) is a high-throughout platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k=8∼10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagations. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness. Especially, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source codes are available at the authors' websites: e.g. http://www.cs.toronto.edu/∼wkc/kmerHMM.
DNA motif elucidation using belief propagation

KAUST Repository

Wong, Ka-Chun; Chan, Tak-Ming; Peng, Chengbin; Li, Yue; Zhang, Zhaolei

2013-01-01

Protein-binding microarray (PBM) is a high-throughout platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k = 8 ?10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagations. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness. Especially, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source codes are available at the authors' websites: e.g. http://www.cs.toronto.edu/?wkc/kmerHMM. 2013 The Author(s).
DNA motif elucidation using belief propagation

KAUST Repository

Wong, Ka-Chun

2013-06-29

Protein-binding microarray (PBM) is a high-throughout platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k = 8 ?10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagations. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness. Especially, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source codes are available at the authors\\' websites: e.g. http://www.cs.toronto.edu/?wkc/kmerHMM. 2013 The Author(s).
Principal component analysis for predicting transcription-factor binding motifs from array-derived data

Directory of Open Access Journals (Sweden)

Vincenti Matthew P

2005-11-01

Full Text Available Abstract Background The responses to interleukin 1 (IL-1 in human chondrocytes constitute a complex regulatory mechanism, where multiple transcription factors interact combinatorially to transcription-factor binding motifs (TFBMs. In order to select a critical set of TFBMs from genomic DNA information and an array-derived data, an efficient algorithm to solve a combinatorial optimization problem is required. Although computational approaches based on evolutionary algorithms are commonly employed, an analytical algorithm would be useful to predict TFBMs at nearly no computational cost and evaluate varying modelling conditions. Singular value decomposition (SVD is a powerful method to derive primary components of a given matrix. Applying SVD to a promoter matrix defined from regulatory DNA sequences, we derived a novel method to predict the critical set of TFBMs. Results The promoter matrix was defined to establish a quantitative relationship between the IL-1-driven mRNA alteration and genomic DNA sequences of the IL-1 responsive genes. The matrix was decomposed with SVD, and the effects of 8 potential TFBMs (5'-CAGGC-3', 5'-CGCCC-3', 5'-CCGCC-3', 5'-ATGGG-3', 5'-GGGAA-3', 5'-CGTCC-3', 5'-AAAGG-3', and 5'-ACCCA-3' were predicted from a pool of 512 random DNA sequences. The prediction included matches to the core binding motifs of biologically known TFBMs such as AP2, SP1, EGR1, KROX, GC-BOX, ABI4, ETF, E2F, SRF, STAT, IK-1, PPARγ, STAF, ROAZ, and NFκB, and their significance was evaluated numerically using Monte Carlo simulation and genetic algorithm. Conclusion The described SVD-based prediction is an analytical method to provide a set of potential TFBMs involved in transcriptional regulation. The results would be useful to evaluate analytically a contribution of individual DNA sequences.
Chiral synthesis of (Z)-3-cis-6,7-cis-9,10-diepoxyhenicosenes, sex pheromone components of the satin moth, Leucoma salicis.

Science.gov (United States)

Wimalaratne, Priyantha D C; Slessor, Keith N

2004-06-01

All four isomers of (Z)-3-cis-6,7-cis-9,10-diepoxyhenicosenes, 1-4, have been synthesized using D-xylose as the chirally pure starting material. D-Xylose was first converted to 2-deoxy-4,5-O-isopropylidene-3-t-butyldimethylsilyl-D-threopentose 11, via several steps of selective protection, dehydroxylation, and deprotection. Wittig coupling of 11 with nonyltriphenylphosphonium bromide followed by hydrogenation and acid catalyzed deprotection of hydroxyl groups yielded the chiral (2R,3R)-1,2,3-triol, 14, which was used as the precursor for the C-8 to C-21 unit of the (Z)-3-cis-6,7-cis-9,10-diepoxyhenicosenes. Selective tosylation of 14 followed by stereospecific cyclization yielded (2R,3R)-1,2-epoxytetradecan-3-ol, 16, which was then divergently converted to the t-butyldimethylsilyl ether 17 and tosylate 22, respectively. Establishment of the C-5 through C-7 unit of the target molecules was accomplished via regiospecific coupling of 17 with 1-t-butyldimethylsiloxy-2-propyne to form 18. Stepwise transformation of 18 via the formation of tosylate 19, desilylation, and stereospecific cyclization to form epoxy alcohol 20, followed by P2-Ni reduction yielded a key intermediate, allylic epoxy alcohol (Z)-2-(5S,6R)-cis-5,6-epoxyheptadecen-1-ol, 21. Similarly, the coupling of 22 with 1-t-butyldimethylsiloxy-2-propyne yielded 23, which was stereospecifically cyclized to form 24. Desilylation and P2-Ni reduction of 24 gave the antipodal intermediate, (Z)-2-(5R,6S)-cis-5,6-epoxyheptadecen-1-ol, 26. Asymmetric epoxidation of antipodes 21 and 26 with (L)- or (D)-diethyl tartrates resulted in the formation of diepoxy alcohols 27 and 29 from 21, and 33 and 31 from 26, respectively. Tosylation of these diepoxy alcohols followed by coupling with lithium dibutenyl cuprate yielded the four stereoisomers of (Z)-3-cis-6,7-cis-9,10-diepoxyhenicosenes, 1-4. Analysis of the retention characteristics of these materials revealed that one or both of the S*,R*,S*,R* stereoisomers comprise the
Inference of time-delayed gene regulatory networks based on dynamic Bayesian network hybrid learning method.

Science.gov (United States)

Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui

2017-10-06

Gene regulatory networks (GRNs) research reveals complex life phenomena from the perspective of gene interaction, which is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on dynamic Bayesian network (DBN) to construct the multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. DBNCS algorithm first uses CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profiles learning, namely the construction of search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reduce the false positive rate. Secondly, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of DBNCS algorithm is evaluated by the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in Escherichia coli , and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs.
Causality analysis detects the regulatory role of maternal effect genes in the early Drosophila embryo

Directory of Open Access Journals (Sweden)

Zara Ghodsi

2017-03-01

Full Text Available In developmental studies, inferring regulatory interactions of segmentation genetic network play a vital role in unveiling the mechanism of pattern formation. As such, there exists an opportune demand for theoretical developments and new mathematical models which can result in a more accurate illustration of this genetic network. Accordingly, this paper seeks to extract the meaningful regulatory role of the maternal effect genes using a variety of causality detection techniques and to explore whether these methods can suggest a new analytical view to the gene regulatory networks. We evaluate the use of three different powerful and widely-used models representing time and frequency domain Granger causality and convergent cross mapping technique with the results being thoroughly evaluated for statistical significance. Our findings show that the regulatory role of maternal effect genes is detectable in different time classes and thereby the method is applicable to infer the possible regulatory interactions present among the other genes of this network.

Comprehensive meta-analysis of Signal Transducers and Activators of Transcription (STAT genomic binding patterns discerns cell-specific cis-regulatory modules

Directory of Open Access Journals (Sweden)

Kang Keunsoo

2013-01-01

Full Text Available Abstract Background Cytokine-activated transcription factors from the STAT (Signal Transducers and Activators of Transcription family control common and context-specific genetic programs. It is not clear to what extent cell-specific features determine the binding capacity of seven STAT members and to what degree they share genetic targets. Molecular insight into the biology of STATs was gained from a meta-analysis of 29 available ChIP-seq data sets covering genome-wide occupancy of STATs 1, 3, 4, 5A, 5B and 6 in several cell types. Results We determined that the genomic binding capacity of STATs is primarily defined by the cell type and to a lesser extent by individual family members. For example, the overlap of shared binding sites between STATs 3 and 5 in T cells is greater than that between STAT5 in T cells and non-T cells. Even for the top 1,000 highly enriched STAT binding sites, ~15% of STAT5 binding sites in mouse female liver are shared by other STATs in different cell types while in T cells ~90% of STAT5 binding sites are co-occupied by STAT3, STAT4 and STAT6. In addition, we identified 116 cis-regulatory modules (CRM, which are recognized by all STAT members across cell types defining a common JAK-STAT signature. Lastly, in liver STAT5 binding significantly coincides with binding of the cell-specific transcription factors HNF4A, FOXA1 and FOXA2 and is associated with cell-type specific gene transcription. Conclusions Our results suggest that genomic binding of STATs is primarily determined by the cell type and further specificity is achieved in part by juxtaposed binding of cell-specific transcription factors.
Identification and characterization of promoters and cis-regulatory elements of genes involved in secondary metabolites production in hop (Humulus lupulus. L)

Czech Academy of Sciences Publication Activity Database

Duraisamy, Ganesh Selvaraj; Mishra, Ajay Kumar; Kocábek, Tomáš; Matoušek, Jaroslav

2016-01-01

Roč. 84, October (2016), s. 346-352 ISSN 1476-9271 R&D Projects: GA ČR GA13-03037S Institutional support: RVO:60077344 Keywords : Cis-acting elements * Gene regulation * Humulus lupulus Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 1.331, year: 2016
Hybrid DNA i-motif: Aminoethylprolyl-PNA (pC5) enhance the stability of DNA (dC5) i-motif structure.

Science.gov (United States)

Gade, Chandrasekhar Reddy; Sharma, Nagendra K

2017-12-15

This report describes the synthesis of C-rich sequence, cytosine pentamer, of aep-PNA and its biophysical studies for the formation of hybrid DNA:aep-PNAi-motif structure with DNA cytosine pentamer (dC 5 ) under acidic pH conditions. Herein, the CD/UV/NMR/ESI-Mass studies strongly support the formation of stable hybrid DNA i-motif structure with aep-PNA even near acidic conditions. Hence aep-PNA C-rich sequence cytosine could be considered as potential DNA i-motif stabilizing agents in vivo conditions. Copyright © 2017 Elsevier Ltd. All rights reserved.
MIR@NT@N: a framework integrating transcription factors, microRNAs and their targets to identify sub-network motifs in a meta-regulation network model

Directory of Open Access Journals (Sweden)

Wasserman Wyeth W

2011-03-01

Full Text Available Abstract Background To understand biological processes and diseases, it is crucial to unravel the concerted interplay of transcription factors (TFs, microRNAs (miRNAs and their targets within regulatory networks and fundamental sub-networks. An integrative computational resource generating a comprehensive view of these regulatory molecular interactions at a genome-wide scale would be of great interest to biologists, but is not available to date. Results To identify and analyze molecular interaction networks, we developed MIR@NT@N, an integrative approach based on a meta-regulation network model and a large-scale database. MIR@NT@N uses a graph-based approach to predict novel molecular actors across multiple regulatory processes (i.e. TFs acting on protein-coding or miRNA genes, or miRNAs acting on messenger RNAs. Exploiting these predictions, the user can generate networks and further analyze them to identify sub-networks, including motifs such as feedback and feedforward loops (FBL and FFL. In addition, networks can be built from lists of molecular actors with an a priori role in a given biological process to predict novel and unanticipated interactions. Analyses can be contextualized and filtered by integrating additional information such as microarray expression data. All results, including generated graphs, can be visualized, saved and exported into various formats. MIR@NT@N performances have been evaluated using published data and then applied to the regulatory program underlying epithelium to mesenchyme transition (EMT, an evolutionary-conserved process which is implicated in embryonic development and disease. Conclusions MIR@NT@N is an effective computational approach to identify novel molecular regulations and to predict gene regulatory networks and sub-networks including conserved motifs within a given biological context. Taking advantage of the M@IA environment, MIR@NT@N is a user-friendly web resource freely available at http
Automatic annotation of protein motif function with Gene Ontology terms

Directory of Open Access Journals (Sweden)

Gopalakrishnan Vanathi

2004-09-01

Full Text Available Abstract Background Conserved protein sequence motifs are short stretches of amino acid sequence patterns that potentially encode the function of proteins. Several sequence pattern searching algorithms and programs exist foridentifying candidate protein motifs at the whole genome level. However, amuch needed and importanttask is to determine the functions of the newly identified protein motifs. The Gene Ontology (GO project is an endeavor to annotate the function of genes or protein sequences with terms from a dynamic, controlled vocabulary and these annotations serve well as a knowledge base. Results This paperpresents methods to mine the GO knowledge base and use the association between the GO terms assigned to a sequence and the motifs matched by the same sequence as evidence for predicting the functions of novel protein motifs automatically. The task of assigning GO terms to protein motifsis viewed as both a binary classification and information retrieval problem, where PROSITE motifs are used as samples for mode training and functional prediction. The mutual information of a motif and aGO term association isfound to be a very useful feature. We take advantageof the known motifs to train a logistic regression classifier, which allows us to combine mutual information with other frequency-based features and obtain a probability of correctassociation. The trained logistic regression model has intuitively meaningful and logically plausible parameter values, and performs very well empirically according to our evaluation criteria. Conclusions In this research, different methods for automatic annotation of protein motifs have been investigated. Empirical result demonstrated that the methods have a great potential for detecting and augmenting information about thefunctions of newly discovered candidate protein motifs.
Verification of the MOTIF code version 3.0

International Nuclear Information System (INIS)

Chan, T.; Guvanasen, V.; Nakka, B.W.; Reid, J.A.K.; Scheier, N.W.; Stanchell, F.W.

1996-12-01

As part of the Canadian Nuclear Fuel Waste Management Program (CNFWMP), AECL has developed a three-dimensional finite-element code, MOTIF (Model Of Transport In Fractured/ porous media), for detailed modelling of groundwater flow, heat transport and solute transport in a fractured rock mass. The code solves the transient and steady-state equations of groundwater flow, solute (including one-species radionuclide) transport, and heat transport in variably saturated fractured/porous media. The initial development was completed in 1985 (Guvanasen 1985) and version 3.0 was completed in 1986. This version is documented in detail in Guvanasen and Chan (in preparation). This report describes a series of fourteen verification cases which has been used to test the numerical solution techniques and coding of MOTIF, as well as demonstrate some of the MOTIF analysis capabilities. For each case the MOTIF solution has been compared with a corresponding analytical or independently developed alternate numerical solution. Several of the verification cases were included in Level 1 of the International Hydrologic Code Intercomparison Project (HYDROCOIN). The MOTIF results for these cases were also described in the HYDROCOIN Secretariat's compilation and comparison of results submitted by the various project teams (Swedish Nuclear Power Inspectorate 1988). It is evident from the graphical comparisons presented that the MOTIF solutions for the fourteen verification cases are generally in excellent agreement with known analytical or numerical solutions obtained from independent sources. This series of verification studies has established the ability of the MOTIF finite-element code to accurately model the groundwater flow and solute and heat transport phenomena for which it is intended. (author). 20 refs., 14 tabs., 32 figs
Coordinated transcriptional regulation of two key genes in the lignin branch pathway--CAD and CCR--is mediated through MYB- binding sites.

Science.gov (United States)

Rahantamalala, Anjanirina; Rech, Philippe; Martinez, Yves; Chaubet-Gigot, Nicole; Grima-Pettenati, Jacqueline; Pacquit, Valérie

2010-06-28

Cinnamoyl CoA reductase (CCR) and cinnamyl alcohol dehydrogenase (CAD) catalyze the final steps in the biosynthesis of monolignols, the monomeric units of the phenolic lignin polymers which confer rigidity, imperviousness and resistance to biodegradation to cell walls. We have previously shown that the Eucalyptus gunnii CCR and CAD2 promoters direct similar expression patterns in vascular tissues suggesting that monolignol production is controlled, at least in part, by the coordinated transcriptional regulation of these two genes. Although consensus motifs for MYB transcription factors occur in most gene promoters of the whole phenylpropanoid pathway, functional evidence for their contribution to promoter activity has only been demonstrated for a few of them. Here, in the lignin-specific branch, we studied the functional role of MYB elements as well as other cis-elements identified in the regulatory regions of EgCAD2 and EgCCR promoters, in the transcriptional activity of these gene promoters. By using promoter deletion analysis and in vivo footprinting, we identified an 80 bp regulatory region in the Eucalyptus gunnii EgCAD2 promoter that contains two MYB elements, each arranged in a distinct module with newly identified cis-elements. A directed mutagenesis approach was used to introduce block mutations in all putative cis-elements of the EgCAD2 promoter and in those of the 50 bp regulatory region previously delineated in the EgCCR promoter. We showed that the conserved MYB elements in EgCAD2 and EgCCR promoters are crucial both for the formation of DNA-protein complexes in EMSA experiments and for the transcriptional activation of EgCAD2 and EgCCR promoters in vascular tissues in planta. In addition, a new regulatory cis-element that modulates the balance between two DNA-protein complexes in vitro was found to be important for EgCAD2 expression in the cambial zone. Our assignment of functional roles to the identified cis-elements clearly demonstrates the
Monoclonal antibodies to DNA modified with cis- or trans-diamminedichloroplatinum(II)

International Nuclear Information System (INIS)

Sundquist, W.I.; Lippard, S.J.; Stollar, B.D.

1987-01-01

Murine monoclonal antibodies that bind selectively to adducts formed on DNA by the antitumor drug cis-diamminedichloroplatinum(II), cis-DDP, or to the chemothrapeutically inactive trans isomer trans-DDP were elicited by immunization with calf thymus DNA modified with either cis- or trans-DDP at ratios of bound platinum per nucleotide, (D/N)/sub b/, of 0.06-0.08. The binding of two monoclonal antibodies to cis-DDP-modified DNA was competitively inhibited in an enzyme-linked immunosorbent assay (ELISA) by 4-6 nM concentrations of cis-DDP bound to DNA. Adducts formed by cis-DDP on other synthetic DNA polymers did not inhibit antibody binding to cis-DDP-DNA. The biologically active compounds [Pt(en)Cl 2 ], [Pt(dach)Cl 2 ], and [Pt(NH 3 ) 2 (cbdca)] (carboplatin) all formed antibody-detectable adducts on DNA, whereas the inactive platinum complexes trans-DDP and [Pt(dien)Cl]Cl (dien, diethylenetriamine) did not. The monoclonal antibodies therefore recognize a bifunctional Pt-DNA adduct with cis stereochemistry in which platinum is coordinated by two adjacent guanines or, to a lesser degree, by adjacent adenine and guanine. A monoclonal antibody raised against trans-DDP-DNA was competitively inhibited in an ELISA by 40 nM trans-DDP bound to DNA. This antibody crossreacted with unmodified, denatured DNA. The recognition of cis- or trans-DDP-modified DNAs by monoclonal antibodies thus parallels the known modes of DNA binding of these compounds and may correlate with their biological activities
Stepwise encapsulation and controlled two-stage release system for cis-Diamminediiodoplatinum

Directory of Open Access Journals (Sweden)

Chen Y

2014-06-01

Full Text Available Yun Chen,1,* Qian Li,1,2,* Qingsheng Wu1 1Department of Chemistry, Key Laboratory of Yangtze River Water Environment, Ministry of Education, Tongji University, Shanghai; 2Shanghai Institute of Quality Inspection and Technical Research, Shanghai, People’s Republic of China *These authors contributed equally to this work Abstract: cis-Diamminediiodoplatinum (cis-DIDP is a cisplatin-like anticancer drug with higher anticancer activity, but lower stability and price than cisplatin. In this study, a cis-DIDP carrier system based on micro-sized stearic acid was prepared by an emulsion solvent evaporation method. The maximum drug loading capacity of cis-DIDP-loaded solid lipid nanoparticles was 22.03%, and their encapsulation efficiency was 97.24%. In vitro drug release in phosphate-buffered saline (pH =7.4 at 37.5°C exhibited a unique two-stage process, which could prove beneficial for patients with tumors and malignancies. MTT (3-[4,5-dimethylthiazol-2-yl]-2, 5-diphenyltetrazolium bromide assay results showed that cis-DIDP released from cis-DIDP-loaded solid lipid nanoparticles had better inhibition activity than cis-DIDP that had not been loaded. Keywords: stearic acid, emulsion solvent evaporation method, drug delivery, cis-DIDP, in vitro
Connections between Transcription Downstream of Genes and cis-SAGe Chimeric RNA.

Science.gov (United States)

Chwalenia, Katarzyna; Qin, Fujun; Singh, Sandeep; Tangtrongstittikul, Panjapon; Li, Hui

2017-11-22

cis-Splicing between adjacent genes (cis-SAGe) is being recognized as one way to produce chimeric fusion RNAs. However, its detail mechanism is not clear. Recent study revealed induction of transcriptions downstream of genes (DoGs) under osmotic stress. Here, we investigated the influence of osmotic stress on cis-SAGe chimeric RNAs and their connection to DoGs. We found,the absence of induction of at least some cis-SAGe fusions and/or their corresponding DoGs at early time point(s). In fact, these DoGs and their cis-SAGe fusions are inversely correlated. This negative correlation was changed to positive at a later time point. These results suggest a direct competition between the two categories of transcripts when total pool of readthrough transcripts is limited at an early time point. At a later time point, DoGs and corresponding cis-SAGe fusions are both induced, indicating that total readthrough transcripts become more abundant. Finally, we observed overall enhancement of cis-SAGe chimeric RNAs in KCl-treated samples by RNA-Seq analysis.
LASSIM-A network inference toolbox for genome-wide mechanistic modeling.

Directory of Open Access Journals (Sweden)

Rasmus Magnusson

2017-06-01

Full Text Available Recent technological advancements have made time-resolved, quantitative, multi-omics data available for many model systems, which could be integrated for systems pharmacokinetic use. Here, we present large-scale simulation modeling (LASSIM, which is a novel mathematical tool for performing large-scale inference using mechanistically defined ordinary differential equations (ODE for gene regulatory networks (GRNs. LASSIM integrates structural knowledge about regulatory interactions and non-linear equations with multiple steady state and dynamic response expression datasets. The rationale behind LASSIM is that biological GRNs can be simplified using a limited subset of core genes that are assumed to regulate all other gene transcription events in the network. The LASSIM method is implemented as a general-purpose toolbox using the PyGMO Python package to make the most of multicore computers and high performance clusters, and is available at https://gitlab.com/Gustafsson-lab/lassim. As a method, LASSIM works in two steps, where it first infers a non-linear ODE system of the pre-specified core gene expression. Second, LASSIM in parallel optimizes the parameters that model the regulation of peripheral genes by core system genes. We showed the usefulness of this method by applying LASSIM to infer a large-scale non-linear model of naïve Th2 cell differentiation, made possible by integrating Th2 specific bindings, time-series together with six public and six novel siRNA-mediated knock-down experiments. ChIP-seq showed significant overlap for all tested transcription factors. Next, we performed novel time-series measurements of total T-cells during differentiation towards Th2 and verified that our LASSIM model could monitor those data significantly better than comparable models that used the same Th2 bindings. In summary, the LASSIM toolbox opens the door to a new type of model-based data analysis that combines the strengths of reliable mechanistic models
Enantioselective disruption of the endocrine system by Cis-Bifenthrin in the male mice.

Science.gov (United States)

Jin, Yuanxiang; Wang, Jiangcong; Pan, Xiuhong; Miao, Wenyu; Lin, Xiaojian; Wang, Linggang; Fu, Zhengwei

2015-07-01

Bifenthrin (BF), as a chiral pyrethroid, is widely used to control field and household pests in China. At present, the commercial BF is a mixed compound containing cis isomers (cis-BF) including two enantiomers of 1R-cis-BF and 1S-cis-BF. In the present study, the two individual cis-BF enantiomers were separated by a preparative supercritical fluid chromatography. Then, four week-old adolescent male ICR mice were orally administered 1R-cis-BF and 1S-cis-BF separately daily for 3 weeks at doses of 0, 7.5 and 15 mg/kg/day, respectively. Results showed that the transcription status of some genes involved in cholesterol synthesis and transport as well as testosterone (T) synthesis in the testes were influenced by cis-BF enantiomers. Especially, we observed that the transcription status of key genes on the pathway of T synthesis including cytochrome P450 cholesterol side-chain cleavage enzyme (P450scc) and cytochrome P450 17α-hydroxysteroid dehydrogenase (P45017α)) were selectively altered in the testis of mice when treated with 1S-cis-BF, suggesting that it is the possible reason to explain why the lower serum T concentration in 1S-cis-BF treated group. Taken together, it concluded that both of the cis-BF enantiomers have the endocrine disruption activities, while 1S-cis-BF was higher than 1R-cis-BF in mice when exposed during the puberty. The data was helpful to understand the toxicity of cis-BF in mammals under enantiomeric level. © 2014 Wiley Periodicals, Inc.
Purification and functional motifs of the recombinant ATPase of orf virus.

Science.gov (United States)

Lin, Fong-Yuan; Chan, Kun-Wei; Wang, Chi-Young; Wong, Min-Liang; Hsu, Wei-Li

2011-10-01

Our previous study showed that the recombinant ATPase encoded by the A32L gene of orf virus displayed ATP hydrolysis activity as predicted from its amino acids sequence. This viral ATPase contains four known functional motifs (motifs I-IV) and a novel AYDG motif; they are essential for ATP hydrolysis reaction by binding ATP and magnesium ions. The motifs I and II correspond with the Walker A and B motifs of the typical ATPase, respectively. To examine the biochemical roles of these five conserved motifs, recombinant ATPases of five deletion mutants derived from the Taiping strain were expressed and purified. Their ATPase functions were assayed and compared with those of two wild type strains, Taiping and Nantou isolated in Taiwan. Our results showed that deletions at motifs I-III or IV exhibited lower activity than that of the wild type. Interestingly, deletion of AYDG motif decreased the ATPase activity more significantly than those of motifs I-IV deletions. Divalent ions such as magnesium and calcium were essential for ATPase activity. Moreover, our recombinant proteins of orf virus also demonstrated GTPase activity, though weaker than the original ATPase activity. Copyright © 2011 Elsevier Inc. All rights reserved.
Integrated systems approach identifies risk regulatory pathways and key regulators in coronary artery disease.

Science.gov (United States)

Zhang, Yan; Liu, Dianming; Wang, Lihong; Wang, Shuyuan; Yu, Xuexin; Dai, Enyu; Liu, Xinyi; Luo, Shanshun; Jiang, Wei

2015-12-01

Coronary artery disease (CAD) is the most common type of heart disease. However, the molecular mechanisms of CAD remain elusive. Regulatory pathways are known to play crucial roles in many pathogenic processes. Thus, inferring risk regulatory pathways is an important step toward elucidating the mechanisms underlying CAD. With advances in high-throughput data, we developed an integrated systems approach to identify CAD risk regulatory pathways and key regulators. Firstly, a CAD-related core subnetwork was identified from a curated transcription factor (TF) and microRNA (miRNA) regulatory network based on a random walk algorithm. Secondly, candidate risk regulatory pathways were extracted from the subnetwork by applying a breadth-first search (BFS) algorithm. Then, risk regulatory pathways were prioritized based on multiple CAD-associated data sources. Finally, we also proposed a new measure to prioritize upstream regulators. We inferred that phosphatase and tensin homolog (PTEN) may be a key regulator in the dysregulation of risk regulatory pathways. This study takes a closer step than the identification of disease subnetworks or modules. From the risk regulatory pathways, we could understand the flow of regulatory information in the initiation and progression of the disease. Our approach helps to uncover its potential etiology. We developed an integrated systems approach to identify risk regulatory pathways. We proposed a new measure to prioritize the key regulators in CAD. PTEN may be a key regulator in dysregulation of the risk regulatory pathways.
RegPrecise 3.0--a resource for genome-scale exploration of transcriptional regulation in bacteria.

Science.gov (United States)

Novichkov, Pavel S; Kazakov, Alexey E; Ravcheev, Dmitry A; Leyn, Semen A; Kovaleva, Galina Y; Sutormin, Roman A; Kazanov, Marat D; Riehl, William; Arkin, Adam P; Dubchak, Inna; Rodionov, Dmitry A

2013-11-01

Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. The comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches). RegPrecise (http://regprecise.lbl.gov) is a web resource for collection, visualization and analysis of transcriptional regulons reconstructed by comparative genomics. We significantly expanded a reference collection of manually curated regulons we introduced earlier. RegPrecise 3.0 provides access to inferred regulatory interactions organized by phylogenetic, structural and functional properties. Taxonomy-specific collections include 781 TF regulogs inferred in more than 160 genomes representing 14 taxonomic groups of Bacteria. TF-specific collections include regulogs for a selected subset of 40 TFs reconstructed across more than 30 taxonomic lineages. Novel collections of regulons operated by RNA regulatory elements (riboswitches) include near 400 regulogs inferred in 24 bacterial lineages. RegPrecise 3.0 provides four classifications of the reference regulons implemented as controlled vocabularies: 55 TF protein families; 43 RNA motif families; ~150 biological processes or metabolic pathways; and ~200 effectors or environmental signals. Genome-wide visualization of regulatory networks and metabolic pathways covered by the reference regulons are available for all studied genomes. A separate section of RegPrecise 3.0 contains draft regulatory networks in 640 genomes obtained by an conservative propagation of the reference regulons to closely related genomes. RegPrecise 3.0 gives access to the transcriptional regulons reconstructed in
An experimental test of a fundamental food web motif.

Science.gov (United States)

Rip, Jason M K; McCann, Kevin S; Lynn, Denis H; Fawcett, Sonia

2010-06-07

Large-scale changes to the world's ecosystem are resulting in the deterioration of biostructure-the complex web of species interactions that make up ecological communities. A difficult, yet crucial task is to identify food web structures, or food web motifs, that are the building blocks of this baroque network of interactions. Once identified, these food web motifs can then be examined through experiments and theory to provide mechanistic explanations for how structure governs ecosystem stability. Here, we synthesize recent ecological research to show that generalist consumers coupling resources with different interaction strengths, is one such motif. This motif amazingly occurs across an enormous range of spatial scales, and so acts to distribute coupled weak and strong interactions throughout food webs. We then perform an experiment that illustrates the importance of this motif to ecological stability. We find that weak interactions coupled to strong interactions by generalist consumers dampen strong interaction strengths and increase community stability. This study takes a critical step by isolating a common food web motif and through clear, experimental manipulation, identifies the fundamental stabilizing consequences of this structure for ecological communities.
The gene regulatory network for breast cancer: Integrated regulatory landscape of cancer hallmarks

Directory of Open Access Journals (Sweden)

Frank eEmmert-Streib

2014-02-01

Full Text Available In this study, we infer the breast cancer gene regulatory network from gene expression data. This network is obtained from the application of the BC3Net inference algorithm to a large-scale gene expression data set consisting of $351$ patient samples. In order to elucidate the functional relevance of the inferred network, we are performing a Gene Ontology (GO analysis for its structural components. Our analysis reveals that most significant GO-terms we find for the breast cancer network represent functional modules of biological processes that are described by known cancer hallmarks, including translation, immune response, cell cycle, organelle fission, mitosis, cell adhesion, RNA processing, RNA splicing and response to wounding. Furthermore, by using a curated list of census cancer genes, we find an enrichment in these functional modules. Finally, we study cooperative effects of chromosomes based on information of interacting genes in the beast cancer network. We find that chromosome $21$ is most coactive with other chromosomes. To our knowledge this is the first study investigating the genome-scale breast cancer network.
Sasquatch: predicting the impact of regulatory SNPs on transcription factor binding from cell- and tissue-specific DNase footprints

OpenAIRE

Schwessinger, R; Suciu, MC; McGowan, SJ; Telenius, J; Taylor, S; Higgs, DR; Hughes, JR

2017-01-01

In the era of genome-wide association studies (GWAS) and personalized medicine, predicting the impact of single nucleotide polymorphisms (SNPs) in regulatory elements is an important goal. Current approaches to determine the potential of regulatory SNPs depend on inadequate knowledge of cell-specific DNA binding motifs. Here, we present Sasquatch, a new computational approach that uses DNase footprint data to estimate and visualize the effects of noncoding variants on transcription factor bin...
Highly scalable Ab initio genomic motif identification

KAUST Repository

Marchand, Benoit; Bajic, Vladimir B.; Kaushik, Dinesh

2011-01-01

We present results of scaling an ab initio motif family identification system, Dragon Motif Finder (DMF), to 65,536 processor cores of IBM Blue Gene/P. DMF seeks groups of mutually similar polynucleotide patterns within a set of genomic sequences and builds various motif families from them. Such information is of relevance to many problems in life sciences. Prior attempts to scale such ab initio motif-finding algorithms achieved limited success. We solve the scalability issues using a combination of mixed-mode MPI-OpenMP parallel programming, master-slave work assignment, multi-level workload distribution, multi-level MPI collectives, and serial optimizations. While the scalability of our algorithm was excellent (94% parallel efficiency on 65,536 cores relative to 256 cores on a modest-size problem), the final speedup with respect to the original serial code exceeded 250,000 when serial optimizations are included. This enabled us to carry out many large-scale ab initio motiffinding simulations in a few hours while the original serial code would have needed decades of execution time. Copyright 2011 ACM.
Single nucleotide polymorphism in transcriptional regulatory regions and expression of environmentally responsive genes

International Nuclear Information System (INIS)

Wang, Xuting; Tomso, Daniel J.; Liu Xuemei; Bell, Douglas A.

2005-01-01

Single nucleotide polymorphisms (SNPs) in the human genome are DNA sequence variations that can alter an individual's response to environmental exposure. SNPs in gene coding regions can lead to changes in the biological properties of the encoded protein. In contrast, SNPs in non-coding gene regulatory regions may affect gene expression levels in an allele-specific manner, and these functional polymorphisms represent an important but relatively unexplored class of genetic variation. The main challenge in analyzing these SNPs is a lack of robust computational and experimental methods. Here, we first outline mechanisms by which genetic variation can impact gene regulation, and review recent findings in this area; then, we describe a methodology for bioinformatic discovery and functional analysis of regulatory SNPs in cis-regulatory regions using the assembled human genome sequence and databases on sequence polymorphism and gene expression. Our method integrates SNP and gene databases and uses a set of computer programs that allow us to: (1) select SNPs, from among the >9 million human SNPs in the NCBI dbSNP database, that are similar to cis-regulatory element (RE) consensus sequences; (2) map the selected dbSNP entries to the human genome assembly in order to identify polymorphic REs near gene start sites; (3) prioritize the candidate polymorphic RE containing genes by searching the existing genotype and gene expression data sets. The applicability of this system has been demonstrated through studies on p53 responsive elements and is being extended to additional pathways and environmentally responsive genes

Mechanisms of zero-lag synchronization in cortical motifs.

Directory of Open Access Journals (Sweden)

Leonardo L Gollo

2014-04-01

Full Text Available Zero-lag synchronization between distant cortical areas has been observed in a diversity of experimental data sets and between many different regions of the brain. Several computational mechanisms have been proposed to account for such isochronous synchronization in the presence of long conduction delays: Of these, the phenomenon of "dynamical relaying"--a mechanism that relies on a specific network motif--has proven to be the most robust with respect to parameter mismatch and system noise. Surprisingly, despite a contrary belief in the community, the common driving motif is an unreliable means of establishing zero-lag synchrony. Although dynamical relaying has been validated in empirical and computational studies, the deeper dynamical mechanisms and comparison to dynamics on other motifs is lacking. By systematically comparing synchronization on a variety of small motifs, we establish that the presence of a single reciprocally connected pair--a "resonance pair"--plays a crucial role in disambiguating those motifs that foster zero-lag synchrony in the presence of conduction delays (such as dynamical relaying from those that do not (such as the common driving triad. Remarkably, minor structural changes to the common driving motif that incorporate a reciprocal pair recover robust zero-lag synchrony. The findings are observed in computational models of spiking neurons, populations of spiking neurons and neural mass models, and arise whether the oscillatory systems are periodic, chaotic, noise-free or driven by stochastic inputs. The influence of the resonance pair is also robust to parameter mismatch and asymmetrical time delays amongst the elements of the motif. We call this manner of facilitating zero-lag synchrony resonance-induced synchronization, outline the conditions for its occurrence, and propose that it may be a general mechanism to promote zero-lag synchrony in the brain.
RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.

Science.gov (United States)

Brule, C E; Dean, K M; Grayhack, E J

2016-01-01

The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences. © 2016 Elsevier Inc. All rights reserved.
The efficacy of 9-cis retinoic acid in experimental models of cancer.

Science.gov (United States)

Gottardis, M M; Lamph, W W; Shalinsky, D R; Wellstein, A; Heyman, R A

1996-01-01

9-cis retinoic acid (9-cis RA) is a retinoid receptor pan-agonist that binds with high affinity to both retinoic acid receptors (RARs) and retinoid X receptors (RXRs). Using a variety of in vivo and in vitro cancer models, we present experimental data that 9-cis RA has activity as a potential chemotherapeutic agent. Treatment of the human promyelocytic leukemia cell line HL-60 with 9-cis RA decreases cell proliferation, increases cell differentiation, and increases apoptosis. Induction of apoptosis correlates with an increase in tissue transglutaminase (type II) activity. In vivo, 9-cis RA induces complete tumor regression of an early passage human lip squamous cell carcinoma xenograft. Finally, 9-cis RA inhibits the anchorage-independent growth of the human breast cancer cell lines MCF-7 and LY2 (an antiestrogen-resistant MCF-7 variant). Transient co-transfection assays indicate that 9-cis RA inhibits estrogen receptor transcription of an ERE-tk-LUC reporter through RAR or RXR receptors. These data suggest that retinoid receptors can antagonize estrogen-dependent transcription and provides one possible mechanism for the inhibition of cell growth by 9-cis RA in breast cancer cell lines. In summary, these findings present evidence that 9-cis RA has a wide range of activities in human cancer models.
Transduction motif analysis of gastric cancer based on a human signaling network

Energy Technology Data Exchange (ETDEWEB)

Liu, G.; Li, D.Z.; Jiang, C.S.; Wang, W. [Fuzhou General Hospital of Nanjing Command, Department of Gastroenterology, Fuzhou, China, Department of Gastroenterology, Fuzhou General Hospital of Nanjing Command, Fuzhou (China)

2014-04-04

To investigate signal regulation models of gastric cancer, databases and literature were used to construct the signaling network in humans. Topological characteristics of the network were analyzed by CytoScape. After marking gastric cancer-related genes extracted from the CancerResource, GeneRIF, and COSMIC databases, the FANMOD software was used for the mining of gastric cancer-related motifs in a network with three vertices. The significant motif difference method was adopted to identify significantly different motifs in the normal and cancer states. Finally, we conducted a series of analyses of the significantly different motifs, including gene ontology, function annotation of genes, and model classification. A human signaling network was constructed, with 1643 nodes and 5089 regulating interactions. The network was configured to have the characteristics of other biological networks. There were 57,942 motifs marked with gastric cancer-related genes out of a total of 69,492 motifs, and 264 motifs were selected as significantly different motifs by calculating the significant motif difference (SMD) scores. Genes in significantly different motifs were mainly enriched in functions associated with cancer genesis, such as regulation of cell death, amino acid phosphorylation of proteins, and intracellular signaling cascades. The top five significantly different motifs were mainly cascade and positive feedback types. Almost all genes in the five motifs were cancer related, including EPOR, MAPK14, BCL2L1, KRT18, PTPN6, CASP3, TGFBR2, AR, and CASP7. The development of cancer might be curbed by inhibiting signal transductions upstream and downstream of the selected motifs.
Efficient Reverse-Engineering of a Developmental Gene Regulatory Network

Science.gov (United States)

Cicin-Sain, Damjan; Ashyraliyev, Maksat; Jaeger, Johannes

2012-01-01

Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to
Armadillo motifs involved in vesicular transport.

Directory of Open Access Journals (Sweden)

Harald Striegl

Full Text Available Armadillo (ARM repeat proteins function in various cellular processes including vesicular transport and membrane tethering. They contain an imperfect repeating sequence motif that forms a conserved three-dimensional structure. Recently, structural and functional insight into tethering mediated by the ARM-repeat protein p115 has been provided. Here we describe the p115 ARM-motifs for reasons of clarity and nomenclature and show that both sequence and structure are highly conserved among ARM-repeat proteins. We argue that there is no need to invoke repeat types other than ARM repeats for a proper description of the structure of the p115 globular head region. Additionally, we propose to define a new subfamily of ARM-like proteins and show lack of evidence that the ARM motifs found in p115 are present in other long coiled-coil tethering factors of the golgin family.
A tandem sequence motif acts as a distance-dependent enhancer in a set of genes involved in translation by binding the proteins NonO and SFPQ

Directory of Open Access Journals (Sweden)

Roepcke Stefan

2011-12-01

Full Text Available Abstract Background Bioinformatic analyses of expression control sequences in promoters of co-expressed or functionally related genes enable the discovery of common regulatory sequence motifs that might be involved in co-ordinated gene expression. By studying promoter sequences of the human ribosomal protein genes we recently identified a novel highly specific Localized Tandem Sequence Motif (LTSM. In this work we sought to identify additional genes and LTSM-binding proteins to elucidate potential regulatory mechanisms. Results Genome-wide analyses allowed finding a considerable number of additional LTSM-positive genes, the products of which are involved in translation, among them, translation initiation and elongation factors, and 5S rRNA. Electromobility shift assays then showed specific signals demonstrating the binding of protein complexes to LTSM in ribosomal protein gene promoters. Pull-down assays with LTSM-containing oligonucleotides and subsequent mass spectrometric analysis identified the related multifunctional nucleotide binding proteins NonO and SFPQ in the binding complex. Functional characterization then revealed that LTSM enhances the transcriptional activity of the promoters in dependency of the distance from the transcription start site. Conclusions Our data demonstrate the power of bioinformatic analyses for the identification of biologically relevant sequence motifs. LTSM and the here found LTSM-binding proteins NonO and SFPQ were discovered through a synergistic combination of bioinformatic and biochemical methods and are regulators of the expression of a set of genes of the translational apparatus in a distance-dependent manner.
Effects of cis-9, trans-11 and trans-10, cis-12 conjugated linoleic acid (CLA) isomers on immune function in healthy men

NARCIS (Netherlands)

Albers, R.; Wielen, R.P.J. van der; Brink, E.J.; Hendriks, H.F.J.; Dorovska-Taran, V.N.; Mohede, I.C.M.

2003-01-01

Objectives: To study the effects of two different mixtures of the main conjugated linoleic acid (CLA) isomers cis-9, trans-11 CLA and trans-10, cis-12 CLA on human immune function. Design: Double-blind, randomized, parallel, reference-controlled intervention study. Subjects and intervention:
Insights into the evolution and diversification of the AT-hook Motif Nuclear Localized gene family in land plants.

Science.gov (United States)

Zhao, Jianfei; Favero, David S; Qiu, Jiwen; Roalson, Eric H; Neff, Michael M

2014-10-14

Members of the ancient land-plant-specific transcription factor AT-Hook Motif Nuclear Localized (AHL) gene family regulate various biological processes. However, the relationships among the AHL genes, as well as their evolutionary history, still remain unexplored. We analyzed over 500 AHL genes from 19 land plant species, ranging from the early diverging Physcomitrella patens and Selaginella to a variety of monocot and dicot flowering plants. We classified the AHL proteins into three types (Type-I/-II/-III) based on the number and composition of their functional domains, the AT-hook motif(s) and PPC domain. We further inferred their phylogenies via Bayesian inference analysis and predicted gene gain/loss events throughout their diversification. Our analyses suggested that the AHL gene family emerged in embryophytes and further evolved into two distinct clades, with Type-I AHLs forming one clade (Clade-A), and the other two types together diversifying in another (Clade-B). The two AHL clades likely diverged before the separation of Physcomitrella patens from the vascular plant lineage. In angiosperms, Clade-A AHLs expanded into 5 subfamilies; while, the ones in Clade-B expanded into 4 subfamilies. Examination of their expression patterns suggests that the AHLs within each clade share similar expression patterns with each other; however, AHLs in one monophyletic clade exhibit distinct expression patterns from the ones in the other clade. Over-expression of a Glycine max AHL PPC domain in Arabidopsis thaliana recapitulates the phenotype observed when over-expressing its Arabidopsis thaliana counterpart. This result suggests that the AHL genes from different land plant species may share conserved functions in regulating plant growth and development. Our study further suggests that such functional conservation may be due to conserved physical interactions among the PPC domains of AHL proteins. Our analyses reveal a possible evolutionary scenario for the AHL gene family
Characterizing Motif Dynamics of Electric Brain Activity Using Symbolic Analysis

Directory of Open Access Journals (Sweden)

Massimiliano Zanin

2014-10-01

Full Text Available Motifs are small recurring circuits of interactions which constitute the backbone of networked systems. Characterizing motif dynamics is therefore key to understanding the functioning of such systems. Here we propose a method to define and quantify the temporal variability and time scales of electroencephalogram (EEG motifs of resting brain activity. Given a triplet of EEG sensors, links between them are calculated by means of linear correlation; each pattern of links (i.e., each motif is then associated to a symbol, and its appearance frequency is analyzed by means of Shannon entropy. Our results show that each motif becomes observable with different coupling thresholds and evolves at its own time scale, with fronto-temporal sensors emerging at high thresholds and changing at fast time scales, and parietal ones at low thresholds and changing at slower rates. Finally, while motif dynamics differed across individuals, for each subject, it showed robustness across experimental conditions, indicating that it could represent an individual dynamical signature.
Discriminative motif discovery via simulated evolution and random under-sampling.

Directory of Open Access Journals (Sweden)

Tao Song

Full Text Available Conserved motifs in biological sequences are closely related to their structure and functions. Recently, discriminative motif discovery methods have attracted more and more attention. However, little attention has been devoted to the data imbalance problem, which is one of the main reasons affecting the performance of the discriminative models. In this article, a simulated evolution method is applied to solve the multi-class imbalance problem at the stage of data preprocessing, and at the stage of Hidden Markov Models (HMMs training, a random under-sampling method is introduced for the imbalance between the positive and negative datasets. It is shown that, in the task of discovering targeting motifs of nine subcellular compartments, the motifs found by our method are more conserved than the methods without considering data imbalance problem and recover the most known targeting motifs from Minimotif Miner and InterPro. Meanwhile, we use the found motifs to predict protein subcellular localization and achieve higher prediction precision and recall for the minority classes.
Discriminative motif discovery via simulated evolution and random under-sampling.

Science.gov (United States)

Song, Tao; Gu, Hong

2014-01-01

Conserved motifs in biological sequences are closely related to their structure and functions. Recently, discriminative motif discovery methods have attracted more and more attention. However, little attention has been devoted to the data imbalance problem, which is one of the main reasons affecting the performance of the discriminative models. In this article, a simulated evolution method is applied to solve the multi-class imbalance problem at the stage of data preprocessing, and at the stage of Hidden Markov Models (HMMs) training, a random under-sampling method is introduced for the imbalance between the positive and negative datasets. It is shown that, in the task of discovering targeting motifs of nine subcellular compartments, the motifs found by our method are more conserved than the methods without considering data imbalance problem and recover the most known targeting motifs from Minimotif Miner and InterPro. Meanwhile, we use the found motifs to predict protein subcellular localization and achieve higher prediction precision and recall for the minority classes.
Improved i-motif thermal stability by insertion of anthraquinone monomers

DEFF Research Database (Denmark)

Gouda, Alaa S; Amine, Mahasen S.; Pedersen, Erik Bjerregaard

2017-01-01

In order to gain insight into how to improve thermal stability of i-motifs when used in the context of biomedical and nanotechnological applications, novel anthraquinone-modified i-motifs were synthesized by insertion of 1,8-, 1,4-, 1,5- and 2,6-disubstituted anthraquinone monomers into the TAA...... loops of a 22mer cytosine-rich human telomeric DNA sequence. The influence of the four anthraquinone linkers on the i-motif thermal stability was investigated at 295 nm and pH 5.5. Anthraquinone monomers modulate the i-motif stability in a position-depending manner and the modulation also depends...... unlocked nucleic acid monomers or twisted intercalating nucleic acid. The 2,6-disubstituted anthraquinone linker replacing T10 enabled a significant increase of i-motif thermal melting by 8.2 °C. A substantial increase of 5.0 °C in i-motif thermal melting was recorded when both A6 and T16 were modified...
Identification of a cis-regulatory region of a gene in Arabidopsis thaliana whose induction by dehydration is mediated by abscisic acid and requires protein synthesis.

Science.gov (United States)

Iwasaki, T; Yamaguchi-Shinozaki, K; Shinozaki, K

1995-05-20

In Arabidopsis thaliana, the induction of a dehydration-responsive gene, rd22, is mediated by abscisic acid (ABA) but the gene does not include any sequence corresponding to the consensus ABA-responsive element (ABRE), RYACGTGGYR, in its promoter region. The cis-regulatory region of the rd22 promoter was identified by monitoring the expression of beta-glucuronidase (GUS) activity in leaves of transgenic tobacco plants transformed with chimeric gene fusions constructed between 5'-deleted promoters of rd22 and the coding region of the GUS reporter gene. A 67-bp nucleotide fragment corresponding to positions -207 to -141 of the rd22 promoter conferred responsiveness to dehydration and ABA on a non-responsive promoter. The 67-bp fragment contains the sequences of the recognition sites for some transcription factors, such as MYC, MYB, and GT-1. The fact that accumulation of rd22 mRNA requires protein synthesis raises the possibility that the expression of rd22 might be regulated by one of these trans-acting protein factors whose de novo synthesis is induced by dehydration or ABA. Although the structure of the RD22 protein is very similar to that of a non-storage seed protein, USP, of Vicia faba, the expression of the GUS gene driven by the rd22 promoter in non-stressed transgenic Arabidopsis plants was found mainly in flowers and bolted stems rather than in seeds.
Single nucleotide resolution RNA-seq uncovers new regulatory mechanisms in the opportunistic pathogen Streptococcus agalactiae.

Science.gov (United States)

Rosinski-Chupin, Isabelle; Sauvage, Elisabeth; Sismeiro, Odile; Villain, Adrien; Da Cunha, Violette; Caliot, Marie-Elise; Dillies, Marie-Agnès; Trieu-Cuot, Patrick; Bouloc, Philippe; Lartigue, Marie-Frédérique; Glaser, Philippe

2015-05-30

Streptococcus agalactiae, or Group B Streptococcus, is a leading cause of neonatal infections and an increasing cause of infections in adults with underlying diseases. In an effort to reconstruct the transcriptional networks involved in S. agalactiae physiology and pathogenesis, we performed an extensive and robust characterization of its transcriptome through a combination of differential RNA-sequencing in eight different growth conditions or genetic backgrounds and strand-specific RNA-sequencing. Our study identified 1,210 transcription start sites (TSSs) and 655 transcript ends as well as 39 riboswitches and cis-regulatory regions, 39 cis-antisense non-coding RNAs and 47 small RNAs potentially acting in trans. Among these putative regulatory RNAs, ten were differentially expressed in response to an acid stress and two riboswitches sensed directly or indirectly the pH modification. Strikingly, 15% of the TSSs identified were associated with the incorporation of pseudo-templated nucleotides, showing that reiterative transcription is a pervasive process in S. agalactiae. In particular, 40% of the TSSs upstream genes involved in nucleotide metabolism show reiterative transcription potentially regulating gene expression, as exemplified for pyrG and thyA encoding the CTP synthase and the thymidylate synthase respectively. This comprehensive map of the transcriptome at the single nucleotide resolution led to the discovery of new regulatory mechanisms in S. agalactiae. It also provides the basis for in depth analyses of transcriptional networks in S. agalactiae and of the regulatory role of reiterative transcription following variations of intra-cellular nucleotide pools.
Application of the Gini Correlation Coefficient to Infer Regulatory Relationships in Transcriptome Analysis[W][OA

Science.gov (United States)

Ma, Chuang; Wang, Xiangfeng

2012-01-01

One of the computational challenges in plant systems biology is to accurately infer transcriptional regulation relationships based on correlation analyses of gene expression patterns. Despite several correlation methods that are applied in biology to analyze microarray data, concerns regarding the compatibility of these methods with the gene expression data profiled by high-throughput RNA transcriptome sequencing (RNA-Seq) technology have been raised. These concerns are mainly due to the fact that the distribution of read counts in RNA-Seq experiments is different from that of fluorescence intensities in microarray experiments. Therefore, a comprehensive evaluation of the existing correlation methods and, if necessary, introduction of novel methods into biology is appropriate. In this study, we compared four existing correlation methods used in microarray analysis and one novel method called the Gini correlation coefficient on previously published microarray-based and sequencing-based gene expression data in Arabidopsis (Arabidopsis thaliana) and maize (Zea mays). The comparisons were performed on more than 11,000 regulatory relationships in Arabidopsis, including 8,929 pairs of transcription factors and target genes. Our analyses pinpointed the strengths and weaknesses of each method and indicated that the Gini correlation can compensate for the shortcomings of the Pearson correlation, the Spearman correlation, the Kendall correlation, and the Tukey’s biweight correlation. The Gini correlation method, with the other four evaluated methods in this study, was implemented as an R package named rsgcc that can be utilized as an alternative option for biologists to perform clustering analyses of gene expression patterns or transcriptional network analyses. PMID:22797655
Repetitive Elements in Mycoplasma hyopneumoniae Transcriptional Regulation.

Directory of Open Access Journals (Sweden)

Amanda Malvessi Cattani

Full Text Available Transcriptional regulation, a multiple-step process, is still poorly understood in the important pig pathogen Mycoplasma hyopneumoniae. Basic motifs like promoters and terminators have already been described, but no other cis-regulatory elements have been found. DNA repeat sequences have been shown to be an interesting potential source of cis-regulatory elements. In this work, a genome-wide search for tandem and palindromic repetitive elements was performed in the intergenic regions of all coding sequences from M. hyopneumoniae strain 7448. Computational analysis demonstrated the presence of 144 tandem repeats and 1,171 palindromic elements. The DNA repeat sequences were distributed within the 5' upstream regions of 86% of transcriptional units of M. hyopneumoniae strain 7448. Comparative analysis between distinct repetitive sequences found in related mycoplasma genomes demonstrated different percentages of conservation among pathogenic and nonpathogenic strains. qPCR assays revealed differential expression among genes showing variable numbers of repetitive elements. In addition, repeats found in 206 genes already described to be differentially regulated under different culture conditions of M. hyopneumoniae strain 232 showed almost 80% conservation in relation to M. hyopneumoniae strain 7448 repeats. Altogether, these findings suggest a potential regulatory role of tandem and palindromic DNA repeats in the M. hyopneumoniae transcriptional profile.
Repetitive Elements in Mycoplasma hyopneumoniae Transcriptional Regulation.

Science.gov (United States)

Cattani, Amanda Malvessi; Siqueira, Franciele Maboni; Guedes, Rafael Lucas Muniz; Schrank, Irene Silveira

2016-01-01

Transcriptional regulation, a multiple-step process, is still poorly understood in the important pig pathogen Mycoplasma hyopneumoniae. Basic motifs like promoters and terminators have already been described, but no other cis-regulatory elements have been found. DNA repeat sequences have been shown to be an interesting potential source of cis-regulatory elements. In this work, a genome-wide search for tandem and palindromic repetitive elements was performed in the intergenic regions of all coding sequences from M. hyopneumoniae strain 7448. Computational analysis demonstrated the presence of 144 tandem repeats and 1,171 palindromic elements. The DNA repeat sequences were distributed within the 5' upstream regions of 86% of transcriptional units of M. hyopneumoniae strain 7448. Comparative analysis between distinct repetitive sequences found in related mycoplasma genomes demonstrated different percentages of conservation among pathogenic and nonpathogenic strains. qPCR assays revealed differential expression among genes showing variable numbers of repetitive elements. In addition, repeats found in 206 genes already described to be differentially regulated under different culture conditions of M. hyopneumoniae strain 232 showed almost 80% conservation in relation to M. hyopneumoniae strain 7448 repeats. Altogether, these findings suggest a potential regulatory role of tandem and palindromic DNA repeats in the M. hyopneumoniae transcriptional profile.
Effects of pyrethroid pesticide cis-bifenthrin on lipogenesis in hepatic cell line.

Science.gov (United States)

Xiang, Dandan; Chu, Tianyi; Li, Meng; Wang, Qiangwei; Zhu, Guonian

2018-06-01

Mounting evidence suggests there is a link between exposure to synthetic pyrethroids (SPs) and the development of obesity. The information presented in this study suggests that cis-bifenthrin (cis-BF) could activate pregnane X receptor (PXR) mediated pathway and lead to the lipid accumulation of human hepatoma (HepG2) cells. Cells were incubated in the control or different concentrations of cis-BF for 24 h. The 1 × 10 -7  M and 1 × 10 -6  M cis-BF exposure were found to induce cellular triglyceride (TG) accumulation significantly. This phenomenon was further supported by Oil Red O Staining assay. The cis-BF exposure caused upregulation of PXR gene and protein. Correspondingly, we also observed the increased expression of downstream genes involved in lipid formation and the inhibition of the expression of β-oxidation. As chiral pesticide，cis-BF was further conformed to behave enantioselectivity in the lipid metabolism. Rather than 1R-cis-BF, HepG2 cells incubated with 1S-cis-BF exhibited a significant TG accumulation. 1S-cis-BF also showed a higher binding level, of which the KD value was 9.184 × 10 -8  M in the SPR assay, compared with 1R-cis-BF (3.463 × 10 -6  M). In addition, the molecular docking simulation analyses correlated well with the KD values measured by the SPR, indicating that 1S-cis-BF showed a better binding affinity with PXR. The results in this study also elucidates the differences between the two enantiomers of pyrethroid-induced toxicity in lipid metabolism of non-target organism. Copyright © 2018 Elsevier Ltd. All rights reserved.
Learning gene regulatory networks from only positive and unlabeled data

Directory of Open Access Journals (Sweden)

Elkan Charles

2010-05-01

Full Text Available Abstract Background Recently, supervised learning methods have been exploited to reconstruct gene regulatory networks from gene expression data. The reconstruction of a network is modeled as a binary classification problem for each pair of genes. A statistical classifier is trained to recognize the relationships between the activation profiles of gene pairs. This approach has been proven to outperform previous unsupervised methods. However, the supervised approach raises open questions. In particular, although known regulatory connections can safely be assumed to be positive training examples, obtaining negative examples is not straightforward, because definite knowledge is typically not available that a given pair of genes do not interact. Results A recent advance in research on data mining is a method capable of learning a classifier from only positive and unlabeled examples, that does not need labeled negative examples. Applied to the reconstruction of gene regulatory networks, we show that this method significantly outperforms the current state of the art of machine learning methods. We assess the new method using both simulated and experimental data, and obtain major performance improvement. Conclusions Compared to unsupervised methods for gene network inference, supervised methods are potentially more accurate, but for training they need a complete set of known regulatory connections. A supervised method that can be trained using only positive and unlabeled data, as presented in this paper, is especially beneficial for the task of inferring gene regulatory networks, because only an incomplete set of known regulatory connections is available in public databases such as RegulonDB, TRRD, KEGG, Transfac, and IPA.

Phyloproteomic Analysis of 11780 Six-Residue-Long Motifs Occurrences

Directory of Open Access Journals (Sweden)

O. V. Galzitskaya

2015-01-01

Full Text Available How is it possible to find good traits for phylogenetic reconstructions? Here, we present a new phyloproteomic criterion that is an occurrence of simple motifs which can be imprints of evolution history. We studied the occurrences of 11780 six-residue-long motifs consisting of two randomly located amino acids in 97 eukaryotic and 25 bacterial proteomes. For all eukaryotic proteomes, with the exception of the Amoebozoa, Stramenopiles, and Diplomonadida kingdoms, the number of proteins containing the motifs from the first group (one of the two amino acids occurs once at the terminal position made about 20%; in the case of motifs from the second (one of two amino acids occurs one time within the pattern and third (the two amino acids occur randomly groups, 30% and 50%, respectively. For bacterial proteomes, this relationship was 10%, 27%, and 63%, respectively. The matrices of correlation coefficients between numbers of proteins where a motif from the set of 11780 motifs appears at least once in 9 kingdoms and 5 phyla of bacteria were calculated. Among the correlation coefficients for eukaryotic proteomes, the correlation between the animal and fungi kingdoms (0.62 is higher than between fungi and plants (0.54. Our study provides support that animals and fungi are sibling kingdoms. Comparison of the frequencies of six-residue-long motifs in different proteomes allows obtaining phylogenetic relationships based on similarities between these frequencies: the Diplomonadida kingdoms are more close to Bacteria than to Eukaryota; Stramenopiles and Amoebozoa are more close to each other than to other kingdoms of Eukaryota.
Enzymatic study on AtCCD4 and AtCCD7 and their potential to form acyclic regulatory metabolites

KAUST Repository

Bruno, Mark

2016-09-29

The Arabidopsis carotenoid cleavage dioxygenase 4 (AtCCD4) is a negative regulator of the carotenoid content of seeds and has recently been suggested as a candidate for the generation of retrograde signals that are thought to derive from the cleavage of poly-cis-configured carotene desaturation intermediates. In this work, we investigated the activity of AtCCD4 in vitro and used dynamic modeling to determine its substrate preference. Our results document strict regional specificity for cleavage at the C9–C10 double bond in carotenoids and apocarotenoids, with preference for carotenoid substrates and an obstructing effect on hydroxyl functions, and demonstrate the specificity for all-trans-configured carotenes and xanthophylls. AtCCD4 cleaved substrates with at least one ionone ring and did not convert acyclic carotene desaturation intermediates, independent of their isomeric states. These results do not support a direct involvement of AtCCD4 in generating the supposed regulatory metabolites. In contrast, the strigolactone biosynthetic enzyme AtCCD7 converted 9-cis-configured acyclic carotenes, such as 9-cis-ζ-carotene, 9\\'-cis-neurosporene, and 9-cis-lycopene, yielding 9-cis-configured products and indicating that AtCCD7, rather than AtCCD4, is the candidate for forming acyclic retrograde signals.
Enzymatic study on AtCCD4 and AtCCD7 and their potential to form acyclic regulatory metabolites

KAUST Repository

Bruno, Mark; Koschmieder, Julian; Wuest, Florian; Schaub, Patrick; Fehling-Kaschek, Mirjam; Timmer, Jens; Beyer, Peter; Al-Babili, Salim

2016-01-01

The Arabidopsis carotenoid cleavage dioxygenase 4 (AtCCD4) is a negative regulator of the carotenoid content of seeds and has recently been suggested as a candidate for the generation of retrograde signals that are thought to derive from the cleavage of poly-cis-configured carotene desaturation intermediates. In this work, we investigated the activity of AtCCD4 in vitro and used dynamic modeling to determine its substrate preference. Our results document strict regional specificity for cleavage at the C9–C10 double bond in carotenoids and apocarotenoids, with preference for carotenoid substrates and an obstructing effect on hydroxyl functions, and demonstrate the specificity for all-trans-configured carotenes and xanthophylls. AtCCD4 cleaved substrates with at least one ionone ring and did not convert acyclic carotene desaturation intermediates, independent of their isomeric states. These results do not support a direct involvement of AtCCD4 in generating the supposed regulatory metabolites. In contrast, the strigolactone biosynthetic enzyme AtCCD7 converted 9-cis-configured acyclic carotenes, such as 9-cis-ζ-carotene, 9'-cis-neurosporene, and 9-cis-lycopene, yielding 9-cis-configured products and indicating that AtCCD7, rather than AtCCD4, is the candidate for forming acyclic retrograde signals.
Enzymatic study on AtCCD4 and AtCCD7 and their potential to form acyclic regulatory metabolites

Science.gov (United States)

Bruno, Mark; Koschmieder, Julian; Wuest, Florian; Schaub, Patrick; Fehling-Kaschek, Mirjam; Timmer, Jens; Beyer, Peter; Al-Babili, Salim

2016-01-01

The Arabidopsis carotenoid cleavage dioxygenase 4 (AtCCD4) is a negative regulator of the carotenoid content of seeds and has recently been suggested as a candidate for the generation of retrograde signals that are thought to derive from the cleavage of poly-cis-configured carotene desaturation intermediates. In this work, we investigated the activity of AtCCD4 in vitro and used dynamic modeling to determine its substrate preference. Our results document strict regional specificity for cleavage at the C9–C10 double bond in carotenoids and apocarotenoids, with preference for carotenoid substrates and an obstructing effect on hydroxyl functions, and demonstrate the specificity for all-trans-configured carotenes and xanthophylls. AtCCD4 cleaved substrates with at least one ionone ring and did not convert acyclic carotene desaturation intermediates, independent of their isomeric states. These results do not support a direct involvement of AtCCD4 in generating the supposed regulatory metabolites. In contrast, the strigolactone biosynthetic enzyme AtCCD7 converted 9-cis-configured acyclic carotenes, such as 9-cis-ζ-carotene, 9'-cis-neurosporene, and 9-cis-lycopene, yielding 9-cis-configured products and indicating that AtCCD7, rather than AtCCD4, is the candidate for forming acyclic retrograde signals. PMID:27811075
U.S./CIS eye joint nuclear rocket venture

Science.gov (United States)

Clark, John S.; Mcilwain, Melvin C.; Smetanikov, Vladimir; D'Yakov, Evgenij K.; Pavshuk, Vladimir A.

1993-01-01

An account is given of the significance for U.S. spacecraft development of a nuclear thermal rocket (NTR) reactor concept that has been developed in the (formerly Soviet) Commonwealth of Independent States (CIS). The CIS NTR reactor employs a hydrogen-cooled zirconium hydride moderator and ternary carbide fuels; the comparatively cool operating temperatures associated with this design promise overall robustness.
Deciphering RNA Regulatory Elements Involved in the Developmental and Environmental Gene Regulation of Trypanosoma brucei.

Science.gov (United States)

Gazestani, Vahid H; Salavati, Reza

2015-01-01

Trypanosoma brucei is a vector-borne parasite with intricate life cycle that can cause serious diseases in humans and animals. This pathogen relies on fine regulation of gene expression to respond and adapt to variable environments, with implications in transmission and infectivity. However, the involved regulatory elements and their mechanisms of actions are largely unknown. Here, benefiting from a new graph-based approach for finding functional regulatory elements in RNA (GRAFFER), we have predicted 88 new RNA regulatory elements that are potentially involved in the gene regulatory network of T. brucei. We show that many of these newly predicted elements are responsive to both transcriptomic and proteomic changes during the life cycle of the parasite. Moreover, we found that 11 of predicted elements strikingly resemble previously identified regulatory elements for the parasite. Additionally, comparison with previously predicted motifs on T. brucei suggested the superior performance of our approach based on the current limited knowledge of regulatory elements in T. brucei.
Methods and statistics for combining motif match scores.

Science.gov (United States)

Bailey, T L; Gribskov, M

1998-01-01

Position-specific scoring matrices are useful for representing and searching for protein sequence motifs. A sequence family can often be described by a group of one or more motifs, and an effective search must combine the scores for matching a sequence to each of the motifs in the group. We describe three methods for combining match scores and estimating the statistical significance of the combined scores and evaluate the search quality (classification accuracy) and the accuracy of the estimate of statistical significance of each. The three methods are: 1) sum of scores, 2) sum of reduced variates, 3) product of score p-values. We show that method 3) is superior to the other two methods in both regards, and that combining motif scores indeed gives better search accuracy. The MAST sequence homology search algorithm utilizing the product of p-values scoring method is available for interactive use and downloading at URL http:/(/)www.sdsc.edu/MEME.
Inference algorithms and learning theory for Bayesian sparse factor analysis

International Nuclear Information System (INIS)

Rattray, Magnus; Sharp, Kevin; Stegle, Oliver; Winn, John

2009-01-01

Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
Inference algorithms and learning theory for Bayesian sparse factor analysis

Energy Technology Data Exchange (ETDEWEB)

Rattray, Magnus; Sharp, Kevin [School of Computer Science, University of Manchester, Manchester M13 9PL (United Kingdom); Stegle, Oliver [Max-Planck-Institute for Biological Cybernetics, Tuebingen (Germany); Winn, John, E-mail: magnus.rattray@manchester.ac.u [Microsoft Research Cambridge, Roger Needham Building, Cambridge, CB3 0FB (United Kingdom)

2009-12-01

Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
Cis By Trans

OpenAIRE

Rodovalho,Amara Moira

2017-01-01

Cis, trans: above all, metaphors. Cisjordan, region skirting the Jordan River. Cisplatin, Uruguay’s ancient name, region occupying one of the banks of the Prata River. Trans- Amazonian, that which crosses the Amazon; transatlantic, that which crosses the Atlantic. Cisalpine, transalpine. The geometric isomerism of Organic Chemistry, where “cis” are atoms that, when molecules are divided in half, remain on the same side, and “trans” those remaining on opposite sides. Ev...
Nsite, NsiteH and NsiteM Computer Tools for Studying Tran-scription Regulatory Elements

KAUST Repository

Shahmuradov, Ilham

2015-07-02

Summary: Gene transcription is mostly conducted through interactions of various transcription factors and their binding sites on DNA (regulatory elements, REs). Today, we are still far from understanding the real regulatory content of promoter regions. Computer methods for identification of REs remain a widely used tool for studying and understanding transcriptional regulation mechanisms. The Nsite, NsiteH and NsiteM programs perform searches for statistically significant (non-random) motifs of known human, animal and plant one-box and composite REs in a single genomic sequence, in a pair of aligned homologous sequences and in a set of functionally related sequences, respectively.
Ancient Pbx-Hox signatures define hundreds of vertebrate developmental enhancers

Directory of Open Access Journals (Sweden)

Parker Hugo J

2011-12-01

Full Text Available Abstract Background Gene regulation through cis-regulatory elements plays a crucial role in development and disease. A major aim of the post-genomic era is to be able to read the function of cis-regulatory elements through scrutiny of their DNA sequence. Whilst comparative genomics approaches have identified thousands of putative regulatory elements, our knowledge of their mechanism of action is poor and very little progress has been made in systematically de-coding them. Results Here, we identify ancient functional signatures within vertebrate conserved non-coding elements (CNEs through a combination of phylogenetic footprinting and functional assay, using genomic sequence from the sea lamprey as a reference. We uncover a striking enrichment within vertebrate CNEs for conserved binding-site motifs of the Pbx-Hox hetero-dimer. We further show that these predict reporter gene expression in a segment specific manner in the hindbrain and pharyngeal arches during zebrafish development. Conclusions These findings evoke an evolutionary scenario in which many CNEs evolved early in the vertebrate lineage to co-ordinate Hox-dependent gene-regulatory interactions that pattern the vertebrate head. In a broader context, our evolutionary analyses reveal that CNEs are composed of tightly linked transcription-factor binding-sites (TFBSs, which can be systematically identified through phylogenetic footprinting approaches. By placing a large number of ancient vertebrate CNEs into a developmental context, our findings promise to have a significant impact on efforts toward de-coding gene-regulatory elements that underlie vertebrate development, and will facilitate building general models of regulatory element evolution.
A national prediction model for PM2.5 component exposures and measurement error-corrected health effect inference.

Science.gov (United States)

Bergen, Silas; Sheppard, Lianne; Sampson, Paul D; Kim, Sun-Young; Richards, Mark; Vedal, Sverre; Kaufman, Joel D; Szpiro, Adam A

2013-09-01

Studies estimating health effects of long-term air pollution exposure often use a two-stage approach: building exposure models to assign individual-level exposures, which are then used in regression analyses. This requires accurate exposure modeling and careful treatment of exposure measurement error. To illustrate the importance of accounting for exposure model characteristics in two-stage air pollution studies, we considered a case study based on data from the Multi-Ethnic Study of Atherosclerosis (MESA). We built national spatial exposure models that used partial least squares and universal kriging to estimate annual average concentrations of four PM2.5 components: elemental carbon (EC), organic carbon (OC), silicon (Si), and sulfur (S). We predicted PM2.5 component exposures for the MESA cohort and estimated cross-sectional associations with carotid intima-media thickness (CIMT), adjusting for subject-specific covariates. We corrected for measurement error using recently developed methods that account for the spatial structure of predicted exposures. Our models performed well, with cross-validated R2 values ranging from 0.62 to 0.95. Naïve analyses that did not account for measurement error indicated statistically significant associations between CIMT and exposure to OC, Si, and S. EC and OC exhibited little spatial correlation, and the corrected inference was unchanged from the naïve analysis. The Si and S exposure surfaces displayed notable spatial correlation, resulting in corrected confidence intervals (CIs) that were 50% wider than the naïve CIs, but that were still statistically significant. The impact of correcting for measurement error on health effect inference is concordant with the degree of spatial correlation in the exposure surfaces. Exposure model characteristics must be considered when performing two-stage air pollution epidemiologic analyses because naïve health effect inference may be inappropriate.
Low-dimensional morphospace of topological motifs in human fMRI brain networks

Directory of Open Access Journals (Sweden)

Sarah E. Morgan

2018-06-01

Full Text Available We present a low-dimensional morphospace of fMRI brain networks, where axes are defined in a data-driven manner based on the network motifs. The morphospace allows us to identify the key variations in healthy fMRI networks in terms of their underlying motifs, and we observe that two principal components (PCs can account for 97% of the motif variability. The first PC of the motif distribution is correlated with efficiency and inversely correlated with transitivity. Hence this axis approximately conforms to the well-known economical small-world trade-off between integration and segregation in brain networks. Finally, we show that the economical clustering generative model proposed by Vértes et al. (2012 can approximately reproduce the motif morphospace of the real fMRI brain networks, in contrast to other generative models. Overall, the motif morphospace provides a powerful way to visualize the relationships between network properties and to investigate generative or constraining factors in the formation of complex human brain functional networks. Motifs have been described as the building blocks of complex networks. Meanwhile, a morphospace allows networks to be placed in a common space and can reveal the relationships between different network properties and elucidate the driving forces behind network topology. We combine the concepts of motifs and morphospaces to create the first motif morphospace of fMRI brain networks. Crucially, the morphospace axes are defined by the motifs, in a data-driven manner. We observe strong correlations between the networks’ positions in morphospace and their global topological properties, suggesting that motif morphospaces are a powerful way to capture the topology of networks in a low-dimensional space and to compare generative models of brain networks. Motif morphospaces could also be used to study other complex networks’ topologies.
Communications and Information Sharing (CIS) Laboratory

Data.gov (United States)

Federal Laboratory Consortium — TheCommunications and Information Sharing (CIS) Laboratory is a Public Safety interoperable communications technology laboratory with analog and digital radios, and...
Memetic algorithms for de novo motif-finding in biomedical sequences.

Science.gov (United States)

Bi, Chengpeng

2012-09-01

The objectives of this study are to design and implement a new memetic algorithm for de novo motif discovery, which is then applied to detect important signals hidden in various biomedical molecular sequences. In this paper, memetic algorithms are developed and tested in de novo motif-finding problems. Several strategies in the algorithm design are employed that are to not only efficiently explore the multiple sequence local alignment space, but also effectively uncover the molecular signals. As a result, there are a number of key features in the implementation of the memetic motif-finding algorithm (MaMotif), including a chromosome replacement operator, a chromosome alteration-aware local search operator, a truncated local search strategy, and a stochastic operation of local search imposed on individual learning. To test the new algorithm, we compare MaMotif with a few of other similar algorithms using simulated and experimental data including genomic DNA, primary microRNA sequences (let-7 family), and transmembrane protein sequences. The new memetic motif-finding algorithm is successfully implemented in C++, and exhaustively tested with various simulated and real biological sequences. In the simulation, it shows that MaMotif is the most time-efficient algorithm compared with others, that is, it runs 2 times faster than the expectation maximization (EM) method and 16 times faster than the genetic algorithm-based EM hybrid. In both simulated and experimental testing, results show that the new algorithm is compared favorably or superior to other algorithms. Notably, MaMotif is able to successfully discover the transcription factors' binding sites in the chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) data, correctly uncover the RNA splicing signals in gene expression, and precisely find the highly conserved helix motif in the transmembrane protein sequences, as well as rightly detect the palindromic segments in the primary micro
Discovering Motifs in Biological Sequences Using the Micron Automata Processor.

Science.gov (United States)

Roy, Indranil; Aluru, Srinivas

2016-01-01

Finding approximately conserved sequences, called motifs, across multiple DNA or protein sequences is an important problem in computational biology. In this paper, we consider the (l, d) motif search problem of identifying one or more motifs of length l present in at least q of the n given sequences, with each occurrence differing from the motif in at most d substitutions. The problem is known to be NP-complete, and the largest solved instance reported to date is (26,11). We propose a novel algorithm for the (l,d) motif search problem using streaming execution over a large set of non-deterministic finite automata (NFA). This solution is designed to take advantage of the micron automata processor, a new technology close to deployment that can simultaneously execute multiple NFA in parallel. We demonstrate the capability for solving much larger instances of the (l, d) motif search problem using the resources available within a single automata processor board, by estimating run-times for problem instances (39,18) and (40,17). The paper serves as a useful guide to solving problems using this new accelerator technology.
PISMA: A Visual Representation of Motif Distribution in DNA Sequences

Directory of Open Access Journals (Sweden)

Rogelio Alcántara-Silva

2017-03-01

Full Text Available Background: Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypothesis, PISMA aims at identifying motifs on DNA sequences, counting and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code–like, as a gene-map–like, and as a transcript scheme. Results: We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. Availability and Implementation: PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf .
Parallel motif extraction from very long sequences

KAUST Repository

Sahli, Majed

2013-01-01

Motifs are frequent patterns used to identify biological functionality in genomic sequences, periodicity in time series, or user trends in web logs. In contrast to a lot of existing work that focuses on collections of many short sequences, modern applications require mining of motifs in one very long sequence (i.e., in the order of several gigabytes). For this case, there exist statistical approaches that are fast but inaccurate; or combinatorial methods that are sound and complete. Unfortunately, existing combinatorial methods are serial and very slow. Consequently, they are limited to very short sequences (i.e., a few megabytes), small alphabets (typically 4 symbols for DNA sequences), and restricted types of motifs. This paper presents ACME, a combinatorial method for extracting motifs from a single very long sequence. ACME arranges the search space in contiguous blocks that take advantage of the cache hierarchy in modern architectures, and achieves almost an order of magnitude performance gain in serial execution. It also decomposes the search space in a smart way that allows scalability to thousands of processors with more than 90% speedup. ACME is the only method that: (i) scales to gigabyte-long sequences; (ii) handles large alphabets; (iii) supports interesting types of motifs with minimal additional cost; and (iv) is optimized for a variety of architectures such as multi-core systems, clusters in the cloud, and supercomputers. ACME reduces the extraction time for an exact-length query from 4 hours to 7 minutes on a typical workstation; handles 3 orders of magnitude longer sequences; and scales up to 16, 384 cores on a supercomputer. Copyright is held by the owner/author(s).
EIA for mining projects in the CIS

Energy Technology Data Exchange (ETDEWEB)

Coppin, N.J.; Wheeler, P. [Wardell Armstrong, Newcastle under Lyme (United Kingdom)

1996-12-31

This paper examines the Environmental Impact Assessment (EIA) requirements and procedures encountered during work on gold and coal mining projects in Kazakhstan, Uzbekistan and Mongolia. Observations on the implementation of former-Soviet inspired EIA in the Commonwealth of Independent States (CIS), and the differences with North American and European requirements and procedures are highlighted, particularly where these indicate lessons for the West. The main implications for mining companies considering or developing projects in the CIS are discussed, particularly the procedures that have to be followed for environmental permitting. 2 figs.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.