WorldWideScience

Sample records for conserved regulatory sequences

  1. Detection of Weakly Conserved Ancestral Mammalian Regulatory Sequences by Primate Comparisons

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Qian-fei; Prabhakar, Shyam; Chanan, Sumita; Cheng, Jan-Fang; Rubin, Edward M.; Boffelli, Dario

    2006-06-01

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect cryptic functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  2. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    Science.gov (United States)

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  3. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence.

    Directory of Open Access Journals (Sweden)

    Kacy L Gordon

    2015-05-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.

  4. WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences

    Directory of Open Access Journals (Sweden)

    Pesole Graziano

    2007-02-01

    Background: This work addresses the problem of detecting conserved transcription factor binding sites, and regulatory regions in general, through the analysis of sequences from homologous genes, an approach that is increasingly widely used given the ever-increasing amount of genomic data available. Results: We present an algorithm that identifies conserved transcription factor binding sites in a given sequence by comparing it to one or more homologs, adapting a framework we previously introduced for the discovery of sites in sequences from co-regulated genes. Unlike the most commonly used methods, the approach we present neither needs nor computes an alignment of the sequences investigated, nor does it resort to descriptors of the binding specificity of known transcription factors. The main novel idea we introduce is a relative measure of conservation, assuming that true functional elements should present a higher level of conservation than the rest of the sequence surrounding them. We present tests where we applied the algorithm to the identification of conserved annotated sites in homologous promoters, as well as in distal regions like enhancers. Conclusion: Results of the tests show how the algorithm can provide fast and reliable predictions of conserved transcription factor binding sites regulating the transcription of a gene, with better performance than other available methods for the same task. We also show examples of how the algorithm can be successfully employed when promoter annotations of the genes investigated are missing, or when regulatory sites and regions are located far away from the genes.

  5. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    Science.gov (United States)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  6. Asymmetrical distribution of non-conserved regulatory sequences at PHOX2B is reflected at the ENCODE loci and illuminates a possible genome-wide trend

    Directory of Open Access Journals (Sweden)

    McCallion Andrew S

    2009-01-01

    Background: Transcriptional regulatory elements are central to development and interspecific phenotypic variation. Current regulatory element prediction tools rely heavily upon conservation for prediction of putative elements. Recent in vitro observations from the ENCODE project combined with in vivo analyses at the zebrafish phox2b locus suggest that a significant fraction of regulatory elements may fall below commonly applied metrics of conservation. We propose to explore these observations in vivo at the human PHOX2B locus, and also evaluate the potential evidence for genome-wide applicability of these observations through a novel analysis of extant data. Results: Transposon-based transgenic analysis utilizing a tiling path proximal to human PHOX2B in zebrafish recapitulates the observations at the zebrafish phox2b locus of both conserved and non-conserved regulatory elements. Analysis of human sequences conserved with previously identified zebrafish phox2b regulatory elements demonstrates that the orthologous sequences exhibit overlapping regulatory control. Additionally, analysis of non-conserved sequences scattered over 135 kb 5' to PHOX2B provides evidence of non-conserved regulatory elements positively biased with close proximity to the gene. Furthermore, we provide a novel analysis of data from the ENCODE project, finding a non-uniform distribution of regulatory elements consistent with our in vivo observations at PHOX2B. These observations remain largely unchanged when one accounts for the sequence repeat content of the assayed intervals, when the intervals are sub-classified by biological role (developmental versus non-developmental), or by gene density (gene desert versus non-gene desert). Conclusion: While regulatory elements frequently display evidence of evolutionary conservation, a fraction appears to be undetected by current metrics of conservation. In vivo observations at the PHOX2B locus, supported by our analyses of in...

  7. Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

    Energy Technology Data Exchange (ETDEWEB)

    Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

    2003-12-31

    Comparisons of DNA sequences among evolutionarily distant genomes make it possible to identify functional regions in noncoding DNA, owing to their high degree of conservation. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.

  8. Accelerated Evolution of Conserved Noncoding Sequences in the Human Genome

    Energy Technology Data Exchange (ETDEWEB)

    Prabhakar, Shyam; Noonan, James P.; Paabo, Svante; Rubin, Edward M.

    2006-07-06

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect "cryptic" functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  9. Close Sequence Comparisons are Sufficient to Identify Human cis-Regulatory Elements

    Energy Technology Data Exchange (ETDEWEB)

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Couronne, Olivier; Pennacchio, Len A.

    2005-12-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons, due to the lack of a universal metric for sequence conservation, and also the paucity of empirically defined benchmark sets of cis-regulatory elements. To address this problem, we developed a general-purpose algorithm (Gumby) that detects slowly-evolving regions in primate, mammalian and more distant comparisons without requiring adjustment of parameters, and ranks conserved elements by P-value using Karlin-Altschul statistics. We benchmarked Gumby predictions against previously identified cis-regulatory elements at diverse genomic loci, and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using reporter-gene assays in transgenic mice. Human regulatory elements were identified with acceptable sensitivity and specificity by comparison with 1-5 other eutherian mammals or 6 other simian primates. More distant comparisons (marsupial, avian, amphibian and fish) failed to identify many of the empirically defined functional noncoding elements. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole genome comparative analysis, which explains some of these findings. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for testing at embryonic time points.
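
    The abstract above says conserved elements are ranked by P-value using Karlin-Altschul statistics. A minimal sketch of that ranking step follows, using the standard formulas E = K * N * exp(-lambda * S) and P = 1 - exp(-E). The function names, the `search_space` parameter, and the values of `lam` and `K` are assumptions for illustration only; they are not taken from Gumby, whose scoring scheme is not described here.

```python
import math


def karlin_altschul_pvalue(score, search_space, lam, K):
    """P-value of a local conservation score under Karlin-Altschul statistics.

    expected = K * search_space * exp(-lam * score) is the number of segments
    expected to reach `score` by chance; the P-value is 1 - exp(-expected).
    `lam` and `K` depend on the scoring scheme and must be supplied by the caller.
    """
    expected = K * search_space * math.exp(-lam * score)
    return -math.expm1(-expected)  # 1 - exp(-E), numerically stable for small E


def rank_by_pvalue(elements, search_space, lam, K):
    """Return (name, p_value) pairs sorted from most to least significant."""
    scored = [
        (name, karlin_altschul_pvalue(score, search_space, lam, K))
        for name, score in elements
    ]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical element scores; lam and K are placeholder values.
    candidates = [("elem_a", 35.0), ("elem_b", 60.0), ("elem_c", 48.0)]
    for name, p in rank_by_pvalue(candidates, search_space=100_000, lam=0.3, K=0.1):
        print(f"{name}\t{p:.3g}")
```

    Because the P-value depends on the size of the search space, the same score is less significant in a longer alignment, which is one reason a P-value is a more portable ranking criterion than a raw score.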

  10. Conserved Transcriptional Regulatory Programs Underlying Rice and Barley Germination

    Science.gov (United States)

    Lin, Li; Tian, Shulan; Kaeppler, Shawn; Liu, Zongrang; An, Yong-Qiang (Charles)

    2014-01-01

    Germination is a biological process important to plant development and agricultural production. Barley and rice diverged 50 million years ago, but share a similar germination process. To gain insight into the conservation of their underlying gene regulatory programs, we compared transcriptomes of barley and rice at start, middle and end points of germination, and revealed that germination regulated barley and rice genes (BRs) diverged significantly in expression patterns and/or protein sequences. However, BRs with higher protein sequence similarity tended to have more conserved expression patterns. We identified and characterized 316 sets of conserved barley and rice genes (cBRs) with high similarity in both protein sequences and expression patterns, and provided a comprehensive depiction of the transcriptional regulatory program conserved in barley and rice germination at gene, pathway and systems levels. The cBRs encoded proteins involved in a variety of biological pathways and had a wide range of expression patterns. The cBRs encoding key regulatory components in signaling pathways often had diverse expression patterns. Early germination up-regulation of cell wall metabolic pathway and peroxidases, and late germination up-regulation of chromatin structure and remodeling pathways were conserved in both barley and rice. Protein sequence and expression pattern of a gene change quickly if it is not subjected to a functional constraint. Preserving germination-regulated expression patterns and protein sequences of those cBRs for 50 million years strongly suggests that the cBRs are functionally significant and equivalent in germination, and contribute to the ancient characteristics of germination preserved in barley and rice. The functional significance and equivalence of the cBR genes predicted here can serve as a foundation to further characterize their biological functions and facilitate bridging rice and barley germination research with greater confidence. PMID

  11. Spatially conserved regulatory elements identified within human and mouse Cd247 gene using high-throughput sequencing data from the ENCODE project

    DEFF Research Database (Denmark)

    Pundhir, Sachin; Hannibal, Tine Dahlbæk; Bang-Berthelsen, Claus Heiner

    2014-01-01

    . In this study, we have utilized the wealth of high-throughput sequencing data produced during the Encyclopedia of DNA Elements (ENCODE) project to identify spatially conserved regulatory elements within the Cd247 gene from human and mouse. We show the presence of two transcription factor binding sites...

  12. Pleiotropy constrains the evolution of protein but not regulatory sequences in a transcription regulatory network influencing complex social behaviours

    Directory of Open Access Journals (Sweden)

    Daria Molodtsova

    2014-12-01

    It is increasingly apparent that genes and networks that influence complex behaviour are evolutionarily conserved, which is paradoxical considering that behaviour is labile over evolutionary timescales. How does adaptive change in behaviour arise if behaviour is controlled by conserved, pleiotropic, and likely evolutionarily constrained genes? Pleiotropy and connectedness are known to constrain the general rate of protein evolution, prompting some to suggest that the evolution of complex traits, including behaviour, is fuelled by regulatory sequence evolution. However, we seldom have data on the strength of selection on mutations in coding and regulatory sequences, and this hinders our ability to study how pleiotropy influences coding and regulatory sequence evolution. Here we use population genomics to estimate the strength of selection on coding and regulatory mutations for a transcriptional regulatory network that influences complex behaviour of honey bees. We found that replacement mutations in highly connected transcription factors and target genes experience significantly stronger negative selection relative to weakly connected transcription factors and targets. Adaptively evolving proteins were significantly more likely to reside at the periphery of the regulatory network, while proteins with signs of negative selection were near the core of the network. Interestingly, connectedness and network structure had minimal influence on the strength of selection on putative regulatory sequences for both transcription factors and their targets. Our study indicates that adaptive evolution of complex behaviour can arise because of positive selection on protein-coding mutations in peripheral genes, and on regulatory sequence mutations in both transcription factors and their targets throughout the network.

  13. Conservation patterns in different functional sequence categories of divergent Drosophila species

    Energy Technology Data Exchange (ETDEWEB)

    Papatsenko, Dmitri; Kislyuk, Andrey; Levine, Michael; Dubchak, Inna

    2005-10-01

    We have explored the distributions of fully conserved ungapped blocks in genome-wide pairwise alignments of recently completed species of Drosophila: D. yakuba, D. ananassae, D. pseudoobscura, D. virilis and D. mojavensis. Based on these distributions we have found that nearly every functional sequence category possesses its own distinctive conservation pattern, sometimes independent of the overall sequence conservation level. In the coding and regulatory regions, the ungapped blocks were longer than in introns, UTRs and non-functional sequences. At the same time, the blocks in the coding regions carried the 3N+2 signature characteristic of synonymous substitutions in the 3rd codon positions. Larger block sizes in transcription regulatory regions can be explained by the presence of conserved arrays of binding sites for transcription factors. We also have shown that the longest ungapped blocks, or 'ultraconserved' sequences, are associated with specific gene groups, including those encoding ion channels and components of the cytoskeleton. We discussed how restrained conservation patterns may help in mapping functional sequence categories and improving genome annotation.
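
    The block statistics described in this abstract can be illustrated with a short sketch. It extracts the lengths of fully conserved ungapped blocks from one pairwise alignment and reports the share of blocks whose length has the form 3N+2, which is what results when mismatches fall only on third codon positions. The function names and the toy alignment are assumptions for illustration, not the authors' pipeline.

```python
from collections import Counter


def conserved_block_lengths(aligned_a, aligned_b, gap="-"):
    """Lengths of maximal runs of identical, non-gap columns in a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    lengths = []
    run = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == y and x != gap:
            run += 1          # column is identical and ungapped: extend the block
        else:
            if run:
                lengths.append(run)
            run = 0           # mismatch or gap closes the current block
    if run:
        lengths.append(run)
    return lengths


def block_length_distribution(aligned_a, aligned_b):
    """Histogram mapping block length to number of blocks."""
    return Counter(conserved_block_lengths(aligned_a, aligned_b))


def fraction_3n_plus_2(lengths):
    """Share of blocks whose length leaves remainder 2 when divided by 3."""
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n % 3 == 2) / len(lengths)


if __name__ == "__main__":
    # Toy alignment, not real Drosophila data.
    a = "ACGT-ACGTAC"
    b = "ACGTTACCTAC"
    lengths = conserved_block_lengths(a, b)
    print(lengths, dict(block_length_distribution(a, b)), fraction_3n_plus_2(lengths))
```

    Comparing these histograms between annotated categories (coding, intronic, regulatory) is the kind of analysis the abstract describes; the sketch covers only the per-alignment counting.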

  14. Identification of conserved regulatory elements by comparative genome analysis

    Directory of Open Access Journals (Sweden)

    Jareborg Niclas

    2003-05-01

    Background: For genes that have been successfully delineated within the human genome sequence, most regulatory sequences remain to be elucidated. The annotation and interpretation process requires additional data resources and significant improvements in computational methods for the detection of regulatory regions. One approach of growing popularity is based on the preferential conservation of functional sequences over the course of evolution by selective pressure, termed 'phylogenetic footprinting'. Mutations are more likely to be disruptive if they appear in functional sites, resulting in a measurable difference in evolution rates between functional and non-functional genomic segments. Results: We have devised a flexible suite of methods for the identification and visualization of conserved transcription-factor-binding sites. The system reports those putative transcription-factor-binding sites that are both situated in conserved regions and located as pairs of sites in equivalent positions in alignments between two orthologous sequences. An underlying collection of metazoan transcription-factor-binding profiles was assembled to facilitate the study. This approach results in a significant improvement in the detection of transcription-factor-binding sites because of an increased signal-to-noise ratio, as demonstrated with two sets of promoter sequences. The method is implemented as a graphical web application, ConSite, which is at the disposal of the scientific community at http://www.phylofoot.org/. Conclusions: Phylogenetic footprinting dramatically improves the predictive selectivity of bioinformatic approaches to the analysis of promoter sequences. ConSite delivers unparalleled performance using a novel database of high-quality binding models for metazoan transcription factors. With a dynamic interface, this bioinformatics tool provides broad access to promoter analysis with phylogenetic footprinting.
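
    The filtering rule stated in this abstract (keep sites that lie in conserved regions and occur at equivalent alignment positions in both orthologs) can be sketched as follows. This is an illustration under stated assumptions, not ConSite's implementation: the sliding-window identity measure, the `window` and `cutoff` parameters, the function names, and the representation of sites as `(factor, start_col, end_col)` tuples in alignment coordinates are all hypothetical.

```python
def window_identity(aligned_a, aligned_b, window):
    """Fraction of identical, non-gap columns for every window start position."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    match = [int(x == y and x != "-") for x, y in zip(aligned_a, aligned_b)]
    if window <= 0 or window > len(match):
        return []
    total = sum(match[:window])
    identities = [total / window]
    for i in range(window, len(match)):
        total += match[i] - match[i - window]   # slide the window by one column
        identities.append(total / window)
    return identities


def conserved_columns(aligned_a, aligned_b, window, cutoff):
    """Mark each column covered by at least one window with identity >= cutoff."""
    covered = [False] * len(aligned_a)
    for start, identity in enumerate(window_identity(aligned_a, aligned_b, window)):
        if identity >= cutoff:
            for col in range(start, start + window):
                covered[col] = True
    return covered


def filter_sites(sites_a, sites_b, covered):
    """Keep sites predicted in both sequences at the same alignment columns
    (end_col exclusive) that lie entirely inside conserved columns."""
    shared = set(sites_a) & set(sites_b)
    return sorted(site for site in shared if all(covered[site[1]:site[2]]))


if __name__ == "__main__":
    # Toy alignment and made-up site predictions.
    a = "ACGTACGTTTTT"
    b = "ACGTACGTAAAA"
    covered = conserved_columns(a, b, window=4, cutoff=1.0)
    sites_a = {("SP1", 2, 6), ("GATA", 6, 10), ("AP1", 0, 4)}
    sites_b = {("SP1", 2, 6), ("GATA", 6, 10)}
    print(filter_sites(sites_a, sites_b, covered))
```

    Requiring both conditions removes predictions that appear in only one species or that sit in poorly conserved sequence, which is the source of the improved signal-to-noise ratio the abstract reports.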

  15. Mutations in the newly identified RAX regulatory sequence are not a frequent cause of micro/anophthalmia.

    Science.gov (United States)

    Chassaing, Nicolas; Vigouroux, Adeline; Calvas, Patrick

    2009-06-01

    Microphthalmia and anophthalmia are at the severe end of the spectrum of abnormalities in ocular development. A few genes (SOX2, OTX2, RAX, and CHX10) have been implicated in isolated micro/anophthalmia, but causative mutations of these genes explain less than a quarter of these developmental defects. A specifically conserved SOX2/OTX2-mediated RAX expression regulatory sequence has recently been identified. We postulated that mutations in this sequence could lead to micro/anophthalmia, and thus we performed molecular screening of this regulatory element in patients suffering from micro/anophthalmia. Fifty-one patients suffering from nonsyndromic microphthalmia (n = 40) or anophthalmia (n = 11) were included in this study after negative molecular screening for SOX2, OTX2, RAX, and CHX10 mutations. Mutation screening of the RAX regulatory sequence was performed by direct sequencing for these patients. No mutations were identified in the highly conserved RAX regulatory sequence in any of the 51 patients. Mutations in the newly identified RAX regulatory sequence do not represent a frequent cause of nonsyndromic micro/anophthalmia.

  16. Mapping cis-Regulatory Domains in the Human Genome Using Multi-Species Conservation of Synteny

    Energy Technology Data Exchange (ETDEWEB)

    Ahituv, Nadav; Prabhakar, Shyam; Poulin, Francis; Rubin, Edward M.; Couronne, Olivier

    2005-06-13

    Our inability to associate distant regulatory elements with the genes that they regulate has largely precluded their examination for sequence alterations contributing to human disease. One major obstacle is the large genomic space surrounding targeted genes in which such elements could potentially reside. In order to delineate gene regulatory boundaries we used whole-genome human-mouse-chicken (HMC) and human-mouse-frog (HMF) multiple alignments to compile conserved blocks of synteny (CBS), under the hypothesis that these blocks have been kept intact throughout evolution at least in part by the requirement of regulatory elements to stay linked to the genes that they regulate. A total of 2,116 and 1,942 CBS >200 kb were assembled for HMC and HMF respectively, encompassing 1.53 and 0.86 Gb of human sequence. To support the existence of complex long-range regulatory domains within these CBS we analyzed the prevalence and distribution of chromosomal aberrations leading to position effects (disruption of a gene's regulatory environment), observing a clear bias not only for mapping onto CBS but also for longer CBS size. Our results provide a genome-wide data set characterizing the regulatory domains of genes and the conserved regulatory elements within them.

  17. BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements.

    Science.gov (United States)

    De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

    2015-12-01

    The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays. BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller. Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
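
    The two ingredients named in this abstract, words over the IUPAC alphabet and a branch length score, can be sketched in a few lines. The sketch below is written in Python for illustration (BLSSpeller itself is in Java) and covers only the alignment-free case for a single word; it does not enumerate words exhaustively or use MapReduce. It assumes one common definition of the branch length score: the total branch length of the smallest subtree connecting the species in which the word occurs, divided by the total tree length. The function names, the tree encoding (`child -> (parent, branch_length)`), the species labels, and the sequences are all hypothetical.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches(word, site):
    """True if every base of `site` is allowed by the IUPAC code at that position."""
    return len(word) == len(site) and all(b in IUPAC[w] for w, b in zip(word, site))


def occurs_in(word, sequence):
    """True if the degenerate word matches anywhere in the sequence."""
    k = len(word)
    return any(matches(word, sequence[i:i + k]) for i in range(len(sequence) - k + 1))


def branch_length_score(parent, conserved_leaves):
    """Length of the minimal subtree spanning the given leaves, as a fraction
    of the total tree length. `parent` maps child -> (parent, branch_length)."""
    conserved = set(conserved_leaves)
    below = {}  # node -> number of conserved leaves at or below it
    for leaf in conserved:
        node = leaf
        while node in parent:          # walk up until the root (which has no entry)
            below[node] = below.get(node, 0) + 1
            node = parent[node][0]
    # A branch belongs to the spanning subtree only if conserved leaves
    # lie on both sides of it.
    spanning = sum(parent[n][1] for n, count in below.items() if count < len(conserved))
    total = sum(length for _, length in parent.values())
    return spanning / total


def score_word(word, promoters, parent):
    """Branch length score of a word given one promoter sequence per species."""
    hits = [species for species, seq in promoters.items() if occurs_in(word, seq)]
    return branch_length_score(parent, hits)


if __name__ == "__main__":
    # Hypothetical four-species tree ((A:1,B:1)X:1,(C:2,D:2)Y:1) and toy promoters.
    tree = {"A": ("X", 1.0), "B": ("X", 1.0), "X": ("root", 1.0),
            "C": ("Y", 2.0), "D": ("Y", 2.0), "Y": ("root", 1.0)}
    promoters = {"A": "TTGACGTT", "B": "AAGATGAA", "C": "CCCCCCCC", "D": "GGGGGGGG"}
    print(score_word("GAYG", promoters, tree))
```

    Scoring by branch length rather than by the number of species rewards conservation across distant lineages more than conservation between close relatives.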

  18. Highly conserved non-coding sequences are associated with vertebrate development.

    Directory of Open Access Journals (Sweden)

    Adam Woolfe

    2005-01-01

    In addition to protein coding sequence, the human genome contains a significant amount of regulatory DNA, the identification of which is proving somewhat recalcitrant to both in silico and functional methods. An approach that has been used with some success is comparative sequence analysis, whereby equivalent genomic regions from different organisms are compared in order to identify both similarities and differences. In general, similarities in sequence between highly divergent organisms imply functional constraint. We have used a whole-genome comparison between humans and the pufferfish, Fugu rubripes, to identify nearly 1,400 highly conserved non-coding sequences. Given the evolutionary divergence between these species, it is likely that these sequences are found in, and furthermore are essential to, all vertebrates. Most, and possibly all, of these sequences are located in and around genes that act as developmental regulators. Some of these sequences are over 90% identical across more than 500 bases, being more highly conserved than coding sequence between these two species. Despite this, we cannot find any similar sequences in invertebrate genomes. In order to begin to functionally test this set of sequences, we have used a rapid in vivo assay system using zebrafish embryos that allows tissue-specific enhancer activity to be identified. Functional data is presented for highly conserved non-coding sequences associated with four unrelated developmental regulators (SOX21, PAX6, HLXB9, and SHH), in order to demonstrate the suitability of this screen to a wide range of genes and expression patterns. Of 25 sequence elements tested around these four genes, 23 show significant enhancer activity in one or more tissues. We have identified a set of non-coding sequences that are highly conserved throughout vertebrates. They are found in clusters across the human genome, principally around genes that are implicated in the regulation of development...

  19. Structural and functional analysis of mouse Msx1 gene promoter: sequence conservation with human MSX1 promoter points at potential regulatory elements.

    Science.gov (United States)

    Gonzalez, S M; Ferland, L H; Robert, B; Abdelhay, E

    1998-06-01

    Vertebrate Msx genes are related to one of the most divergent homeobox genes of Drosophila, the muscle segment homeobox (msh) gene, and are expressed in a well-defined pattern at sites of tissue interactions. This pattern of expression is conserved in vertebrates as diverse as quail, zebrafish, and mouse in a range of sites including neural crest, appendages, and craniofacial structures. In the present work, we performed structural and functional analyses in order to identify potential cis-acting elements that may be regulating Msx1 gene expression. To this end, a 4.9-kb segment of the 5'-flanking region was sequenced and analyzed for transcription-factor binding sites. Four regions showing a high concentration of these sites were identified. Transfection assays with fragments of regulatory sequences driving the expression of the bacterial lacZ reporter gene showed that a region of 4 kb upstream of the transcription start site contains positive and negative elements responsible for controlling gene expression. Interestingly, a fragment of 130 bp seems to contain the minimal elements necessary for gene expression, as its removal completely abolishes gene expression in cultured cells. These results are reinforced by comparison of this region with the human Msx1 gene promoter, which shows extensive conservation, including many consensus binding sites, suggesting a regulatory role for them.

  20. Evolutionary growth process of highly conserved sequences in vertebrate genomes.

    Science.gov (United States)

    Ishibashi, Minaka; Noda, Akiko Ogura; Sakate, Ryuichi; Imanishi, Tadashi

    2012-08-01

    Genome sequence comparison between evolutionarily distant species revealed ultraconserved elements (UCEs) among mammals under strong purifying selection. Most of them were also conserved among vertebrates. Because they tend to be located in the flanking regions of developmental genes, they would have fundamental roles in creating vertebrate body plans. However, the evolutionary origin and selection mechanism of these UCEs remain unclear. Here we report that UCEs arose in primitive vertebrates, and gradually grew in vertebrate evolution. We searched for UCEs in two teleost fishes, Tetraodon nigroviridis and Oryzias latipes, and found 554 UCEs with 100% identity over 100 bps. Comparison of teleost and mammalian UCEs revealed 43 pairs of common, jawed-vertebrate UCEs (jUCE) with high sequence identities, ranging from 83.1% to 99.2%. Ten of them retain lower similarities to the Petromyzon marinus genome, and the substitution rates of four non-exonic jUCEs were reduced after the teleost-mammal divergence, suggesting that robust conservation had been acquired in the jawed vertebrate lineage. Our results indicate that prototypical UCEs originated before the divergence of jawed and jawless vertebrates and have been frozen as perfect conserved sequences in the jawed vertebrate lineage. In addition, our comparative sequence analyses of UCEs and neighboring regions resulted in a discovery of lineage-specific conserved sequences. They were added progressively to prototypical UCEs, suggesting step-wise acquisition of novel regulatory roles. Our results indicate that conserved non-coding elements (CNEs) consist of blocks with distinct evolutionary history, each having been frozen since different evolutionary era along the vertebrate lineage. Copyright © 2012 Elsevier B.V. All rights reserved.
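
    The search criterion quoted above (100% identity over 100 bps) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length with `-` for gaps; the function name `perfect_runs` and its parameters are hypothetical and are not taken from the paper, whose actual pipeline is not described in this abstract.

```python
def perfect_runs(aln_a: str, aln_b: str, min_len: int = 100):
    """Return (start, end) column ranges (end exclusive) where two aligned
    sequences are identical, gap-free, and at least min_len columns long."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    runs = []
    start = None  # column where the current perfect run began, if any
    for i, (a, b) in enumerate(zip(aln_a.upper(), aln_b.upper())):
        match = a == b and a in "ACGT"  # gaps and ambiguity codes break a run
        if match and start is None:
            start = i
        elif not match and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(aln_a) - start >= min_len:
        runs.append((start, len(aln_a)))
    return runs
```

    With the default `min_len=100` this reports only runs that meet the threshold stated in the abstract; a smaller value is convenient for toy inputs.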

  1. Sequence conservation and combinatorial complexity of Drosophila neural precursor cell enhancers

    Directory of Open Access Journals (Sweden)

    Kuzin Alexander

    2008-08-01

    Full Text Available BACKGROUND: The presence of highly conserved sequences within cis-regulatory regions can serve as a valuable starting point for elucidating the basis of enhancer function. This study focuses on regulation of gene expression during the early events of Drosophila neural development. We describe the use of EvoPrinter and cis-Decoder, a suite of interrelated phylogenetic footprinting and alignment programs, to characterize highly conserved sequences that are shared among co-regulating enhancers. RESULTS: Analysis of in vivo characterized enhancers that drive neural precursor gene expression has revealed that they contain clusters of highly conserved sequence blocks (CSBs) made up of shorter shared sequence elements which are present in different combinations and orientations within the different co-regulating enhancers; these elements contain either known consensus transcription factor binding sites or consist of novel sequences that have not been functionally characterized. The CSBs of co-regulated enhancers share a large number of sequence elements, suggesting that a diverse repertoire of transcription factors may interact in a highly combinatorial fashion to coordinately regulate gene expression. We have used information gained from our comparative analysis to discover an enhancer that directs expression of the nervy gene in neural precursor cells of the CNS and PNS. CONCLUSION: The combined use of EvoPrinter and cis-Decoder has yielded important insights into the combinatorial appearance of fundamental sequence elements required for neural enhancer function. Each of the 30 enhancers examined conformed to a pattern of highly conserved blocks of sequences containing shared constituent elements. These data establish a basis for further analysis and understanding of neural enhancer function.

  2. In Vivo Enhancer Analysis Chromosome 16 Conserved Noncoding Sequences

    Energy Technology Data Exchange (ETDEWEB)

    Pennacchio, Len A.; Ahituv, Nadav; Moses, Alan M.; Nobrega, Marcelo; Prabhakar, Shyam; Shoukry, Malak; Minovitsky, Simon; Visel, Axel; Dubchak, Inna; Holt, Amy; Lewis, Keith D.; Plajzer-Frick, Ingrid; Akiyama, Jennifer; De Val, Sarah; Afzal, Veena; Black, Brian L.; Couronne, Olivier; Eisen, Michael B.; Rubin, Edward M.

    2006-02-01

    The identification of enhancers with predicted specificities in vertebrate genomes remains a significant challenge that is hampered by a lack of experimentally validated training sets. In this study, we leveraged extreme evolutionary sequence conservation as a filter to identify putative gene regulatory elements and characterized the in vivo enhancer activity of human-fish conserved and ultraconserved noncoding elements on human chromosome 16 as well as such elements from elsewhere in the genome. We initially tested 165 of these extremely conserved sequences in a transgenic mouse enhancer assay and observed that 48 percent (79/165) functioned reproducibly as tissue-specific enhancers of gene expression at embryonic day 11.5. While driving expression in a broad range of anatomical structures in the embryo, the majority of the 79 enhancers drove expression in various regions of the developing nervous system. Studying a set of DNA elements that specifically drove forebrain expression, we identified DNA signatures specifically enriched in these elements and used these parameters to rank all ~3,400 human-fugu conserved noncoding elements in the human genome. The testing of the top predictions in transgenic mice resulted in a three-fold enrichment for sequences with forebrain enhancer activity. These data dramatically expand the catalogue of in vivo-characterized human gene enhancers and illustrate the future utility of such training sets for a variety of biological applications including decoding the regulatory vocabulary of the human genome.

  3. Fanconi anemia core complex gene promoters harbor conserved transcription regulatory elements.

    Science.gov (United States)

    Meier, Daniel; Schindler, Detlev

    2011-01-01

    The Fanconi anemia (FA) gene family is a recent addition to the complex network of proteins that respond to and repair certain types of DNA damage in the human genome. Since little is known about the regulation of this novel group of genes at the DNA level, we characterized the promoters of the eight genes (FANCA, B, C, E, F, G, L and M) that compose the FA core complex. The promoters of these genes show the characteristic attributes of housekeeping genes, such as a high GC content and CpG islands, a lack of TATA boxes and a low conservation. The promoters functioned in a monodirectional way and were, in their most active regions, comparable in strength to the SV40 promoter in our reporter plasmids. They were also marked by a distinctive transcriptional start site (TSS). In the 5' region of each promoter, we identified a region that was able to negatively regulate the promoter activity in HeLa and HEK 293 cells in isolation. The central and 3' regions of the promoter sequences harbor binding sites for several common and rare transcription factors, including STAT, SMAD, E2F, AP1 and YY1, which indicates that there may be cross-connections to several established regulatory pathways. Electrophoretic mobility shift assays and siRNA experiments confirmed the shared regulatory responses between the prominent members of the TGF-β and JAK/STAT pathways and members of the FA core complex. Although the promoters are not well conserved, they share region and sequence specific regulatory motifs and transcription factor binding sites (TBFs), and we identified a bi-partite nature to these promoters. These results support a hypothesis based on the co-evolution of the FA core complex genes that was expanded to include their promoters.

  4. Fanconi anemia core complex gene promoters harbor conserved transcription regulatory elements.

    Directory of Open Access Journals (Sweden)

    Daniel Meier

    Full Text Available The Fanconi anemia (FA) gene family is a recent addition to the complex network of proteins that respond to and repair certain types of DNA damage in the human genome. Since little is known about the regulation of this novel group of genes at the DNA level, we characterized the promoters of the eight genes (FANCA, B, C, E, F, G, L and M) that compose the FA core complex. The promoters of these genes show the characteristic attributes of housekeeping genes, such as a high GC content and CpG islands, a lack of TATA boxes and a low conservation. The promoters functioned in a monodirectional way and were, in their most active regions, comparable in strength to the SV40 promoter in our reporter plasmids. They were also marked by a distinctive transcriptional start site (TSS). In the 5' region of each promoter, we identified a region that was able to negatively regulate the promoter activity in HeLa and HEK 293 cells in isolation. The central and 3' regions of the promoter sequences harbor binding sites for several common and rare transcription factors, including STAT, SMAD, E2F, AP1 and YY1, which indicates that there may be cross-connections to several established regulatory pathways. Electrophoretic mobility shift assays and siRNA experiments confirmed the shared regulatory responses between the prominent members of the TGF-β and JAK/STAT pathways and members of the FA core complex. Although the promoters are not well conserved, they share region and sequence specific regulatory motifs and transcription factor binding sites (TBFs), and we identified a bi-partite nature to these promoters. These results support a hypothesis based on the co-evolution of the FA core complex genes that was expanded to include their promoters.

  5. Identification of putative regulatory upstream ORFs in the yeast genome using heuristics and evolutionary conservation

    Directory of Open Access Journals (Sweden)

    Bilsland Elizabeth

    2007-08-01

    Full Text Available BACKGROUND: The translational efficiency of an mRNA can be modulated by upstream open reading frames (uORFs) present in certain genes. A uORF can attenuate translation of the main ORF by interfering with translational reinitiation at the main start codon. uORFs also occur by chance in the genome, in which case they do not have a regulatory role. Since the sequence determinants for functional uORFs are not understood, it is difficult to discriminate functional from spurious uORFs by sequence analysis. RESULTS: We have used comparative genomics to identify novel uORFs in yeast with a high likelihood of having a translational regulatory role. We examined uORFs, previously shown to play a role in regulation of translation in Saccharomyces cerevisiae, for evolutionary conservation within seven Saccharomyces species. Inspection of the set of conserved uORFs yielded the following three characteristics useful for discrimination of functional from spurious uORFs: a length between 4 and 6 codons, a distance from the start of the main ORF between 50 and 150 nucleotides, and finally a lack of overlap with, and clear separation from, neighbouring uORFs. These derived rules are inherently associated with uORFs with properties similar to the GCN4 locus, and may not detect most uORFs of other types. uORFs with high scores based on these rules showed a much higher evolutionary conservation than randomly selected uORFs. In a genome-wide scan in S. cerevisiae, we found 34 conserved uORFs from 32 genes that we predict to be functional; subsequent analysis showed the majority of these to be located within transcripts. A total of 252 genes were found containing conserved uORFs with properties indicative of a functional role; all but 7 are novel. Functional content analysis of this set identified an overrepresentation of genes involved in transcriptional control and development. CONCLUSION: Evolutionary conservation of uORFs in yeasts can be traced up to 100
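
    The three rules listed in this abstract (4 to 6 codons, 50 to 150 nucleotides from the main ORF, separation from neighbouring uORFs) can be sketched as a simple filter. This is an illustration under stated assumptions, not the authors' scoring scheme: the names `find_uorfs` and `passes_rules` are hypothetical, the codon count is assumed to include the start codon and exclude the stop codon, the distance is assumed to run from the end of the uORF to the main start codon, and the `min_gap` separation of 10 nucleotides is an arbitrary placeholder.

```python
import re

STOPS = {"TAA", "TAG", "TGA"}

def find_uorfs(leader: str):
    """Return (start, end) for each ATG-initiated ORF that terminates inside
    the leader (the sequence upstream of the main start codon).
    end is exclusive and includes the stop codon."""
    leader = leader.upper()
    uorfs = []
    for m in re.finditer("(?=ATG)", leader):  # lookahead finds overlapping ATGs
        s = m.start()
        for j in range(s + 3, len(leader) - 2, 3):
            if leader[j:j + 3] in STOPS:
                uorfs.append((s, j + 3))
                break
    return uorfs

def passes_rules(uorf, all_uorfs, leader_len, min_gap=10):
    """Apply the three heuristics; min_gap is an assumed separation threshold."""
    s, e = uorf
    codons = (e - s) // 3 - 1   # assumption: count excludes the stop codon
    distance = leader_len - e   # assumption: uORF end to main ORF start
    separated = all(
        e + min_gap <= s2 or e2 + min_gap <= s
        for (s2, e2) in all_uorfs
        if (s2, e2) != uorf
    )
    return 4 <= codons <= 6 and 50 <= distance <= 150 and separated
```

    A candidate list for one gene would then be `[u for u in find_uorfs(leader) if passes_rules(u, find_uorfs(leader), len(leader))]`.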

  6. Whole-genome sequencing approaches for conservation biology: Advantages, limitations and practical recommendations.

    Science.gov (United States)

    Fuentes-Pardo, Angela P; Ruzzante, Daniel E

    2017-10-01

    Whole-genome resequencing (WGR) is a powerful method for addressing fundamental evolutionary biology questions that have not been fully resolved using traditional methods. WGR includes four approaches: the sequencing of individuals to a high depth of coverage with either unresolved or resolved haplotypes, the sequencing of population genomes to a high depth by mixing equimolar amounts of unlabelled-individual DNA (Pool-seq) and the sequencing of multiple individuals from a population to a low depth (lcWGR). These techniques require the availability of a reference genome. This, along with the still high cost of shotgun sequencing and the large demand for computing resources and storage, has limited their implementation in nonmodel species with scarce genomic resources and in fields such as conservation biology. Our goal here is to describe the various WGR methods, their pros and cons and potential applications in conservation biology. WGR offers an unprecedented marker density and surveys a wide diversity of genetic variations not limited to single nucleotide polymorphisms (e.g., structural variants and mutations in regulatory elements), increasing their power for the detection of signatures of selection and local adaptation as well as for the identification of the genetic basis of phenotypic traits and diseases. Currently, though, no single WGR approach fulfils all requirements of conservation genetics, and each method has its own limitations and sources of potential bias. We discuss proposed ways to minimize such biases. We envision a not distant future where the analysis of whole genomes becomes a routine task in many nonmodel species and fields including conservation biology. © 2017 John Wiley & Sons Ltd.

  7. High-throughput sequencing, characterization and detection of new and conserved cucumber miRNAs.

    Directory of Open Access Journals (Sweden)

    Germán Martínez

    Full Text Available MicroRNAs (miRNAs) are a class of endogenous small non-coding RNAs involved in the post-transcriptional regulation of gene expression. In plants, a great number of conserved and specific miRNAs, mainly arising from model species, have been identified to date. However, less is known about the diversity of these regulatory RNAs in plant species with agricultural and/or horticultural importance. Here we report a combined approach of bioinformatics prediction, high-throughput sequencing data and molecular methods to analyze miRNA populations in cucumber (Cucumis sativus) plants. A set of 19 conserved and 6 known but non-conserved miRNA families were found in our cucumber small RNA dataset. We also identified 7 previously undescribed miRNAs (3 with their miRNA* strand), candidates to be cucumber-specific. To validate their description, these new C. sativus miRNAs were detected by northern blot hybridization. Additionally, potential targets for most conserved and new miRNAs were identified in the cucumber genome. In summary, in this study we have identified, for the first time, conserved, known non-conserved and new miRNAs arising from an agronomically important species such as C. sativus. The detection of this complex population of regulatory small RNAs suggests that, as observed in other plant species, cucumber miRNAs may play an important role in diverse biological and metabolic processes.

  8. Genome-wide identification of coding and non-coding conserved sequence tags in human and mouse genomes

    Directory of Open Access Journals (Sweden)

    Maggi Giorgio P

    2008-06-01

    Full Text Available BACKGROUND: The accurate detection of genes and the identification of functional regions is still an open issue in the annotation of genomic sequences. This problem affects new genomes but also those of very well studied organisms such as human and mouse where, despite the great efforts, the inventory of genes and regulatory regions is far from complete. Comparative genomics is an effective approach to address this problem. Unfortunately it is limited by the computational requirements needed to perform genome-wide comparisons and by the problem of discriminating between conserved coding and non-coding sequences. This discrimination is often based (thus dependent) on the availability of annotated proteins. RESULTS: In this paper we present the results of a comprehensive comparison of human and mouse genomes performed with a new high throughput grid-based system which allows the rapid detection of conserved sequences and accurate assessment of their coding potential. By detecting clusters of coding conserved sequences the system is also suitable to accurately identify potential gene loci. Following this analysis we created a collection of human-mouse conserved sequence tags and carefully compared our results to reliable annotations in order to benchmark the reliability of our classifications. Strikingly we were able to detect several potential gene loci supported by EST sequences but not corresponding to as yet annotated genes. CONCLUSION: Here we present a new system which allows comprehensive comparison of genomes to detect conserved coding and non-coding sequences and the identification of potential gene loci. Our system does not require the availability of any annotated sequence thus is suitable for the analysis of new or poorly annotated genomes.

  9. HBVRegDB: Annotation, comparison, detection and visualization of regulatory elements in hepatitis B virus sequences

    Directory of Open Access Journals (Sweden)

    Firth Andrew E

    2007-12-01

    Full Text Available BACKGROUND: The many Hepadnaviridae sequences available have widely varied functional annotation. The genomes are very compact (~3.2 kb) but contain multiple layers of functional regulatory elements in addition to coding regions. Key regions are subject to purifying selection, as mutations in these regions will produce non-functional viruses. RESULTS: These genomic sequences have been organized into a structured database to facilitate research at the molecular level. HBVRegDB is a comparative genomic analysis tool with an integrated underlying sequence database. The database contains genomic sequence data from representative viruses. In addition to INSDC and RefSeq annotation, HBVRegDB also contains expert and systematically calculated annotations (e.g. promoters) and comparative genome analysis results (e.g. blastn, tblastx). It also contains analyses based on curated HBV alignments. Information about conserved regions, including primary conservation (e.g. CDS-Plotcon) and RNA secondary structure predictions (e.g. Alidot), is integrated into the database. A large amount of data is graphically presented using the GBrowse (Generic Genome Browser) adapted for analysis of viral genomes. Flexible query access is provided based on any annotated genomic feature. Novel regulatory motifs can be found by analysing the annotated sequences. CONCLUSION: HBVRegDB serves as a knowledge database and as a comparative genomic analysis tool for molecular biologists investigating HBV. It is publicly available at http://hbvregdb.otago.ac.nz and is complementary to other viral and HBV focused datasets and tools. The availability of multiple and highly annotated sequences of viral genomes in one database combined with comparative analysis tools facilitates detection of novel genomic elements.

  10. Deep sequencing discovery of novel and conserved microRNAs in trifoliate orange (Citrus trifoliata)

    Directory of Open Access Journals (Sweden)

    Yu Huaping

    2010-07-01

    Full Text Available BACKGROUND: MicroRNAs (miRNAs) play a critical role in post-transcriptional gene regulation and have been shown to control many genes involved in various biological and metabolic processes. There have been extensive studies to discover miRNAs and analyze their functions in model plant species, such as Arabidopsis and rice. Deep sequencing technologies have facilitated identification of species-specific or lowly expressed as well as conserved or highly expressed miRNAs in plants. RESULTS: In this research, we used Solexa sequencing to discover new microRNAs in trifoliate orange (Citrus trifoliata), which is an important rootstock of citrus. A total of 13,106,753 reads representing 4,876,395 distinct sequences were obtained from a short RNA library generated from small RNA extracted from C. trifoliata flower and fruit tissues. Based on sequence similarity and hairpin structure prediction, we found that 156,639 reads representing 63 sequences from 42 highly conserved miRNA families have perfect matches to known miRNAs. We also identified 10 novel miRNA candidates whose precursors were all potentially generated from citrus ESTs. In addition, five miRNA* sequences were also sequenced. These sequences had not been earlier described in other plant species, and accumulation of the 10 novel miRNAs was confirmed by qRT-PCR analysis. Potential target genes were predicted for most conserved and novel miRNAs. Moreover, four target genes, including one encoding IRX12 copper ion binding/oxidoreductase and three genes encoding NB-LRR disease resistance protein, have been experimentally verified by detection of the miRNA-mediated mRNA cleavage in C. trifoliata. CONCLUSION: Deep sequencing of short RNAs from C. trifoliata flowers and fruits identified 10 new potential miRNAs and 42 highly conserved miRNA families, indicating that specific miRNAs exist in C. trifoliata. These results show that regulatory miRNAs exist in agronomically important trifoliate orange

  11. Human developmental enhancers conserved between deuterostomes and protostomes.

    Directory of Open Access Journals (Sweden)

    Shoa L Clarke

    Full Text Available The identification of homologies, whether morphological, molecular, or genetic, is fundamental to our understanding of common biological principles. Homologies bridging the great divide between deuterostomes and protostomes have served as the basis for current models of animal evolution and development. It is now appreciated that these two clades share a common developmental toolkit consisting of conserved transcription factors and signaling pathways. These patterning genes sometimes show common expression patterns and genetic interactions, suggesting the existence of similar or even conserved regulatory apparatus. However, previous studies have found no regulatory sequence conserved between deuterostomes and protostomes. Here we describe the first such enhancers, which we call bilaterian conserved regulatory elements (Bicores). Bicores show conservation of sequence and gene synteny. Sequence conservation of Bicores reflects conserved patterns of transcription factor binding sites. We predict that Bicores act as response elements to signaling pathways, and we show that Bicores are developmental enhancers that drive expression of transcriptional repressors in the vertebrate central nervous system. Although the small number of identified Bicores suggests extensive rewiring of cis-regulation between the protostome and deuterostome clades, additional Bicores may be revealed as our understanding of cis-regulatory logic and sample of bilaterian genomes continue to grow.

  12. Phylogeny based discovery of regulatory elements

    Directory of Open Access Journals (Sweden)

    Cohen Barak A

    2006-05-01

    Full Text Available BACKGROUND: Algorithms that locate evolutionarily conserved sequences have become powerful tools for finding functional DNA elements, including transcription factor binding sites; however, most methods do not take advantage of an explicit model for the constrained evolution of functional DNA sequences. RESULTS: We developed a probabilistic framework that combines an HKY85 model, which assigns probabilities to different base substitutions between species, and weight matrix models of transcription factor binding sites, which describe the probabilities of observing particular nucleotides at specific positions in the binding site. The method incorporates the phylogenies of the species under consideration and takes into account the position specific variation of transcription factor binding sites. Using our framework we assessed the suitability of alignments of genomic sequences from commonly used species as substrates for comparative genomic approaches to regulatory motif finding. We then applied this technique to Saccharomyces cerevisiae and related species by examining all possible six base pair DNA sequences (hexamers) and identifying sequences that are conserved in a significant number of promoters. By combining similar conserved hexamers we reconstructed known cis-regulatory motifs and made predictions of previously unidentified motifs. We tested one prediction experimentally, finding it to be a regulatory element involved in the transcriptional response to glucose. CONCLUSION: The experimental validation of a regulatory element prediction missed by other large-scale motif finding studies demonstrates that our approach is a useful addition to the current suite of tools for finding regulatory motifs.
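
    The hexamer step described in this abstract can be sketched in a much simplified form. Note the substitution: the paper scores conservation with a probabilistic HKY85-based framework, whereas the sketch below calls a hexamer conserved only when it is identical and gap-free at the same alignment columns in every species. The function names `conserved_hexamers` and `promoter_counts` are hypothetical, and each promoter is assumed to be supplied as a list of equal-length aligned sequences.

```python
from collections import Counter

def conserved_hexamers(alignment, k=6):
    """Return the set of k-mers that are identical and gap-free at the same
    columns in every sequence of one promoter alignment."""
    ref = alignment[0].upper()
    others = [s.upper() for s in alignment[1:]]
    found = set()
    for i in range(len(ref) - k + 1):
        window = ref[i:i + k]
        if set(window) <= set("ACGT") and all(s[i:i + k] == window for s in others):
            found.add(window)
    return found

def promoter_counts(alignments, k=6):
    """Count, for each k-mer, the number of promoters in which it is conserved."""
    counts = Counter()
    for aln in alignments:
        counts.update(conserved_hexamers(aln, k))  # each promoter counts once
    return counts
```

    Deciding which counts are significant would still require a background model, which this sketch leaves out.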

  13. The relationship of protein conservation and sequence length

    Directory of Open Access Journals (Sweden)

    Panchenko Anna R

    2002-11-01

    Full Text Available BACKGROUND: In general, the length of a protein sequence is determined by its function and the wide variance in the lengths of an organism's proteins reflects the diversity of specific functional roles for these proteins. However, additional evolutionary forces that affect the length of a protein may be revealed by studying the length distributions of proteins evolving under weaker functional constraints. RESULTS: We performed sequence comparisons to distinguish highly conserved and poorly conserved proteins from the bacterium Escherichia coli, the archaeon Archaeoglobus fulgidus, and the eukaryotes Saccharomyces cerevisiae, Drosophila melanogaster, and Homo sapiens. For all organisms studied, the conserved and nonconserved proteins have strikingly different length distributions. The conserved proteins are, on average, longer than the poorly conserved ones, and the length distributions for the poorly conserved proteins have a relatively narrow peak, in contrast to the conserved proteins whose lengths spread over a wider range of values. For the two prokaryotes studied, the poorly conserved proteins approximate the minimal length distribution expected for a diverse range of structural folds. CONCLUSIONS: There is a relationship between protein conservation and sequence length. For all the organisms studied, there seems to be a significant evolutionary trend favoring shorter proteins in the absence of other, more specific functional constraints.

  14. DNA watermarks in non-coding regulatory sequences

    Directory of Open Access Journals (Sweden)

    Pyka Martin

    2009-07-01

    Full Text Available BACKGROUND: DNA watermarks can be applied to identify the unauthorized use of genetically modified organisms. It has been shown that coding regions can be used to encrypt information into living organisms by using the DNA-Crypt algorithm. Yet, if the sequence of interest presents a non-coding DNA sequence, either the function of a resulting functional RNA molecule or a regulatory sequence, such as a promoter, could be affected. For our studies we used the small cytoplasmic RNA 1 in yeast and the lac promoter region of Escherichia coli. FINDINGS: The lac promoter was deactivated by the integrated watermark. In addition, the RNA molecules displayed altered configurations after introducing a watermark, but surprisingly were functionally intact, which has been verified by analyzing the growth characteristics of both wild type and watermarked scR1 transformed yeast cells. In a third approach we introduced a second overlapping watermark into the lac promoter, which did not affect the promoter activity. CONCLUSION: Even though the watermarked RNA and one of the watermarked promoters did not show any significant differences compared to the wild type RNA and wild type promoter region, respectively, it cannot be generalized that other RNA molecules or regulatory sequences behave accordingly. Therefore, we do not recommend integrating watermark sequences into regulatory regions.

  15. RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.

    Science.gov (United States)

    Brule, C E; Dean, K M; Grayhack, E J

    2016-01-01

    The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences. © 2016 Elsevier Inc. All rights reserved.

  16. A ChIP-Seq benchmark shows that sequence conservation mainly improves detection of strong transcription factor binding sites.

    Directory of Open Access Journals (Sweden)

    Tony Håndstad

    Full Text Available BACKGROUND: Transcription factors are important controllers of gene expression and mapping transcription factor binding sites (TFBS) is key to inferring transcription factor regulatory networks. Several methods for predicting TFBS exist, but there are no standard genome-wide datasets on which to assess the performance of these prediction methods. Also, it is believed that information about sequence conservation across different genomes can generally improve accuracy of motif-based predictors, but it is not clear under what circumstances use of conservation is most beneficial. RESULTS: Here we use published ChIP-seq data and an improved peak detection method to create comprehensive benchmark datasets for prediction methods which use known descriptors or binding motifs to detect TFBS in genomic sequences. We use this benchmark to assess the performance of five different prediction methods and find that the methods that use information about sequence conservation generally perform better than simpler motif-scanning methods. The difference is greater on high-affinity peaks and when using short and information-poor motifs. However, if the motifs are specific and information-rich, we find that simple motif-scanning methods can perform better than conservation-based methods. CONCLUSIONS: Our benchmark provides a comprehensive test that can be used to rank the relative performance of transcription factor binding site prediction methods. Moreover, our results show that, contrary to previous reports, sequence conservation is better suited for predicting strong than weak transcription factor binding sites.
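
    For readers unfamiliar with the "simple motif-scanning" baseline mentioned in this abstract, a generic log-odds weight matrix scan looks roughly like the sketch below. It is not one of the five methods benchmarked in the paper; the names `log_odds_matrix` and `scan`, the pseudocount, the uniform background of 0.25, and the forward-strand-only search are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts (a list of dicts) to log2 odds scores."""
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix

def scan(sequence, matrix, threshold):
    """Return (position, score) for each forward-strand window scoring at
    least threshold; windows containing non-ACGT characters are skipped."""
    sequence = sequence.upper()
    width = len(matrix)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        if any(base not in BASES for base in window):
            continue
        score = sum(matrix[j][base] for j, base in enumerate(window))
        if score >= threshold:
            hits.append((i, score))
    return hits
```

    A conservation-aware predictor of the kind the abstract compares against would additionally require the hit to be supported in aligned orthologous sequence, which this sketch does not attempt.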

  17. Genomic dissection of conserved transcriptional regulation in intestinal epithelial cells.

    Directory of Open Access Journals (Sweden)

    Colin R Lickwar

    2017-08-01

    The intestinal epithelium serves critical physiologic functions that are shared among all vertebrates. However, it is unknown how the transcriptional regulatory mechanisms underlying these functions have changed over the course of vertebrate evolution. We generated genome-wide mRNA and accessible chromatin data from adult intestinal epithelial cells (IECs) in zebrafish, stickleback, mouse, and human species to determine if conserved IEC functions are achieved through common transcriptional regulation. We found evidence for substantial common regulation and conservation of gene expression regionally along the length of the intestine from fish to mammals and identified a core set of genes comprising a vertebrate IEC signature. We also identified transcriptional start sites and other putative regulatory regions that are differentially accessible in IECs in all 4 species. Although these sites rarely showed sequence conservation from fish to mammals, surprisingly, they drove highly conserved IEC expression in a zebrafish reporter assay. Common putative transcription factor binding sites (TFBS) found at these sites in multiple species indicate that sequence conservation alone is insufficient to identify much of the functionally conserved IEC regulatory information. Among the rare, highly sequence-conserved, IEC-specific regulatory regions, we discovered an ancient enhancer upstream from her6/HES1 that is active in a distinct population of Notch-positive cells in the intestinal epithelium. Together, these results show how combining accessible chromatin and mRNA datasets with TFBS prediction and in vivo reporter assays can reveal tissue-specific regulatory information conserved across 420 million years of vertebrate evolution. We define an IEC transcriptional regulatory network that is shared between fish and mammals and establish an experimental platform for studying how evolutionarily distilled regulatory information commonly controls IEC development

  18. Sequencing Conservation Actions Through Threat Assessments in the Southeastern United States

    Science.gov (United States)

    Robert D. Sutter; Christopher C. Szell

    2006-01-01

    The identification of conservation priorities is one of the leading issues in conservation biology. We present a project of The Nature Conservancy, called Sequencing Conservation Actions, which prioritizes conservation areas and identifies foci for crosscutting strategies at various geographic scales. We use the term “Sequencing” to mean an ordering of actions over...

  19. Identification of evolutionarily conserved non-AUG-initiated N-terminal extensions in human coding sequences.

    LENUS (Irish Health Repository)

    Ivanov, Ivaylo P

    2011-05-01

    In eukaryotes, it is generally assumed that translation initiation occurs at the AUG codon closest to the messenger RNA 5' cap. However, in certain cases, initiation can occur at codons differing from AUG by a single nucleotide, especially the codons CUG, UUG, GUG, ACG, AUA and AUU. While non-AUG initiation has been experimentally verified for a handful of human genes, the full extent to which this phenomenon is utilized--both for increased coding capacity and potentially also for novel regulatory mechanisms--remains unclear. To address this issue, and hence to improve the quality of existing coding sequence annotations, we developed a methodology based on phylogenetic analysis of predicted 5' untranslated regions from orthologous genes. We use evolutionary signatures of protein-coding sequences as an indicator of translation initiation upstream of annotated coding sequences. Our search identified novel conserved potential non-AUG-initiated N-terminal extensions in 42 human genes including VANGL2, FGFR1, KCNN4, TRPV6, HDGF, CITED2, EIF4G3 and NTF3, and also affirmed the conservation of known non-AUG-initiated extensions in 17 other genes. In several instances, we have been able to obtain independent experimental evidence of the expression of non-AUG-initiated products from the previously published literature and ribosome profiling data.
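
    As a small illustration of the "codons differing from AUG by a single nucleotide" mentioned in the abstract above, the following sketch enumerates all such codons and checks that the six named ones are among them. The function name is hypothetical and the code is not taken from the cited work.

```python
from itertools import product

BASES = "ACGU"


def single_nucleotide_variants(codon: str = "AUG") -> list[str]:
    """Return every codon that differs from `codon` at exactly one position."""
    variants = []
    for pos, base in product(range(len(codon)), BASES):
        if base != codon[pos]:
            variants.append(codon[:pos] + base + codon[pos + 1:])
    return sorted(variants)


near_cognate = single_nucleotide_variants("AUG")
# The six codons singled out in the abstract.
highlighted = {"CUG", "UUG", "GUG", "ACG", "AUA", "AUU"}

print(near_cognate)
print(highlighted.issubset(near_cognate))
```

    Three positions with three alternative bases each give nine single-nucleotide variants, so the six codons listed in the abstract are a subset of the possible near-cognate start codons.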

  20. On the relationship between residue structural environment and sequence conservation in proteins.

    Science.gov (United States)

    Liu, Jen-Wei; Lin, Jau-Ji; Cheng, Chih-Wen; Lin, Yu-Feng; Hwang, Jenn-Kang; Huang, Tsun-Tsao

    2017-09-01

    Residues that are crucial to protein function or structure are usually evolutionarily conserved. To identify the important residues in a protein, sequence conservation is estimated, and current methods rely upon the unbiased collection of homologous sequences. Surprisingly, our previous studies have shown that sequence conservation is closely correlated with the weighted contact number (WCN), a measure of packing density for a residue's structural environment, calculated based only on the Cα positions of a protein structure. Moreover, studies have shown that sequence conservation is correlated with environment-related structural properties calculated based on different protein substructures, such as a protein's all atoms, backbone atoms, side-chain atoms, or side-chain centroid. To determine whether the Cα atomic positions are adequate to show the relationship between residue environment and sequence conservation, here we compared Cα atoms with other substructures in their contributions to the sequence conservation. Our results show that Cα positions are substantially equivalent to the other substructures in calculations of various measures of residue environment. As a result, the overlapping contributions between Cα atoms and the other substructures are high, yielding similar structure-conservation relationships. Taking the WCN as an example, the average overlapping contribution to sequence conservation is 87% between Cα and all-atom substructures. These results indicate that the Cα atoms of a protein structure alone could reflect sequence conservation at the residue level. © 2017 Wiley Periodicals, Inc.
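
    The abstract above does not state the WCN formula. A minimal sketch follows, assuming the commonly used definition in which the WCN of residue i is the sum over all other residues j of 1/r_ij^2, with r_ij the distance between Cα atoms. The function name and the toy coordinates are hypothetical.

```python
def weighted_contact_number(coords):
    """Compute an assumed WCN per residue: sum over j != i of 1 / r_ij**2.

    coords: list of (x, y, z) C-alpha positions, one per residue.
    """
    wcn = []
    for i, (xi, yi, zi) in enumerate(coords):
        total = 0.0
        for j, (xj, yj, zj) in enumerate(coords):
            if i == j:
                continue  # a residue does not contribute to its own WCN
            dist_sq = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            total += 1.0 / dist_sq
        wcn.append(total)
    return wcn


# Toy chain of three residues on a line, spaced 3.8 units apart (illustrative only).
toy_chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
print(weighted_contact_number(toy_chain))
```

    Under this definition the central residue of the toy chain has the highest WCN because it has two close neighbours, which is the sense in which WCN measures packing density.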

  1. Statistical approaches to use a model organism for regulatory sequences annotation of newly sequenced species.

    Directory of Open Access Journals (Sweden)

    Pietro Liò

    A major goal of bioinformatics is the characterization of transcription factors and the transcriptional programs they regulate. Given the speed of genome sequencing, we would like to quickly annotate regulatory sequences in newly sequenced genomes. In such cases, it would be helpful to predict sequence motifs by using experimental data from a closely related model organism. Here we present a general algorithm that identifies transcription factor binding sites in a newly sequenced species by performing Bayesian regression on the annotated species. First we set out the rationale of our method by applying it within the same species, then we extend it to use data available in closely related species. Finally, we generalise the method to handle the case when a certain number of experiments, from several species close to the species on which to make inference, are available. In order to show the performance of the method, we analyse three functionally related networks in the Ascomycota. Two gene network case studies are related to the G2/M phase of the Ascomycota cell cycle; the third is related to morphogenesis. We also compared the method with MatrixReduce and discuss other types of validation and tests. The first network is well known and provides a biological validation test of the method. The two cell cycle case studies, where the gene network size is conserved, demonstrate an effective utility in annotating new species sequences using all the available replicas from model species. The third case, where the gene network size varies among species, shows that the combination of information is less powerful but is still informative. Our methodology is quite general and could be extended to integrate other high-throughput data from model organisms.

  2. Genome-wide discovery and differential regulation of conserved and novel microRNAs in chickpea via deep sequencing.

    Science.gov (United States)

    Jain, Mukesh; Chevala, V V S Narayana; Garg, Rohini

    2014-11-01

    MicroRNAs (miRNAs) are essential components of complex gene regulatory networks that orchestrate plant development. Although several genomic resources have been developed for the legume crop chickpea, miRNAs have not been discovered until now. For genome-wide discovery of miRNAs in chickpea (Cicer arietinum), we sequenced the small RNA content from seven major tissues/organs employing Illumina technology. About 154 million reads were generated, which represented more than 20 million distinct small RNA sequences. We identified a total of 440 conserved miRNAs in chickpea based on sequence similarity with known miRNAs in other plants. In addition, 178 novel miRNAs were identified using a miRDeep pipeline with plant-specific scoring. Some of the conserved and novel miRNAs with significant sequence similarity were grouped into families. The chickpea miRNAs targeted a wide range of mRNAs involved in diverse cellular processes, including transcriptional regulation (transcription factors), protein modification and turnover, signal transduction, and metabolism. Our analysis revealed several miRNAs with differential spatial expression. Many of the chickpea miRNAs were expressed in a tissue-specific manner. The conserved and differential expression of members of the same miRNA family in different tissues was also observed. Some of the same family members were predicted to target different chickpea mRNAs, which suggested the specificity and complexity of miRNA-mediated developmental regulation. This study, for the first time, reveals a comprehensive set of conserved and novel miRNAs along with their expression patterns and putative targets in chickpea, and provides a framework for understanding regulation of developmental processes in legumes. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  3. ChIP-Seq-Annotated Heliconius erato Genome Highlights Patterns of cis-Regulatory Evolution in Lepidoptera

    Directory of Open Access Journals (Sweden)

    James J. Lewis

    2016-09-01

    Uncovering phylogenetic patterns of cis-regulatory evolution remains a fundamental goal for evolutionary and developmental biology. Here, we characterize the evolution of regulatory loci in butterflies and moths using chromatin immunoprecipitation sequencing (ChIP-seq) annotation of regulatory elements across three stages of head development. In the process we provide a high-quality, functionally annotated genome assembly for the butterfly, Heliconius erato. Comparing cis-regulatory element conservation across six lepidopteran genomes, we find that regulatory sequences evolve at a pace similar to that of protein-coding regions. We also observe that elements active at multiple developmental stages are markedly more conserved than elements with stage-specific activity. Surprisingly, we also find that stage-specific proximal and distal regulatory elements evolve at nearly identical rates. Our study provides a benchmark for genome-wide patterns of regulatory element evolution in insects, and it shows that developmental timing of activity strongly predicts patterns of regulatory sequence evolution.

  4. Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium.

    Science.gov (United States)

    Catania, Francesco; Lynch, Michael

    2010-05-04

    In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.
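
    The screening step described above (checking 3' UTRs for a predefined list of 8-mers and asking whether a motif is concentrated in one gene class) can be sketched as follows. This is an illustrative outline only: the function names, the ranking by difference in frequency, and the toy sequences are assumptions, not the authors' pipeline.

```python
def fraction_with_motif(utrs, motif):
    """Fraction of sequences in `utrs` (name -> sequence) containing `motif`."""
    if not utrs:
        return 0.0
    hits = sum(1 for seq in utrs.values() if motif in seq.upper())
    return hits / len(utrs)


def screen_kmers(candidates, target_utrs, background_utrs):
    """Rank candidate k-mers by how much more often they occur in the target set."""
    results = [
        (kmer,
         fraction_with_motif(target_utrs, kmer),
         fraction_with_motif(background_utrs, kmer))
        for kmer in candidates
    ]
    # Largest (target - background) difference first.
    results.sort(key=lambda row: row[1] - row[2], reverse=True)
    return results


# Toy data, purely illustrative: two "ribosomal" UTRs and two background UTRs.
ribosomal = {"rp1": "TTTGTAAATAGCC", "rp2": "AATGTAAATAGTT"}
background = {"g1": "CCCCCCCCCCCCC", "g2": "TTTGTAAATAGCC"}
for row in screen_kmers(["TGTAAATA", "CCCCCCCC"], ribosomal, background):
    print(row)
```

    A real analysis would add a statistical test for enrichment and a comparison across species; the sketch only shows the presence/absence bookkeeping.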

  5. Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.

    Science.gov (United States)

    Nagar, Anurag; Hahsler, Michael

    2013-01-01

    Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. Since the run time is linear and the sequences are converted into a compact clustering model, we are able to
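
    The core idea named in the abstract above, approximating the dissimilarity of two sequence segments from their p-mer frequency counts rather than from an alignment, can be sketched as below. The choice of Manhattan distance between count vectors and the function names are assumptions made for illustration; the abstract does not specify the exact distance or the clustering step, which is omitted here.

```python
from collections import Counter


def pmer_counts(segment, p=3):
    """Count every overlapping p-mer (substring of length p) in `segment`."""
    return Counter(segment[i:i + p] for i in range(len(segment) - p + 1))


def pmer_distance(seg_a, seg_b, p=3):
    """Manhattan distance between the p-mer count vectors of two segments."""
    counts_a = pmer_counts(seg_a, p)
    counts_b = pmer_counts(seg_b, p)
    # Counter returns 0 for p-mers absent from one of the segments.
    return sum(abs(counts_a[k] - counts_b[k])
               for k in counts_a.keys() | counts_b.keys())


print(pmer_distance("ACGTACGT", "ACGTACGT"))  # identical segments
print(pmer_distance("ACGTACGT", "ACGTTCGT"))  # one substitution
```

    Counting p-mers takes time linear in the segment length, which is consistent with the linear scaling the abstract reports, whereas exact edit distance between two segments is quadratic in their length.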

  6. Rare and common regulatory variation in population-scale sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Stephen B Montgomery

    2011-07-01

    Population-scale genome sequencing allows the characterization of functional effects of a broad spectrum of genetic variants underlying human phenotypic variation. Here, we investigate the influence of rare and common genetic variants on gene expression patterns, using variants identified from sequencing data from the 1000 Genomes Project in an African and European population sample and gene expression data from lymphoblastoid cell lines. We detect comparable numbers of expression quantitative trait loci (eQTLs) when compared to genotypes obtained from HapMap 3, but as many as 80% of the top expression quantitative trait variants (eQTVs) discovered from 1000 Genomes data are novel. The properties of the newly discovered variants suggest that mapping common causal regulatory variants is challenging even with full resequencing data; however, we observe significant enrichment of regulatory effects in splice-site and nonsense variants. Using RNA sequencing data, we show that 46.2% of nonsynonymous variants are differentially expressed in at least one individual in our sample, creating widespread potential for interactions between functional protein-coding and regulatory variants. We also use allele-specific expression to identify putative rare causal regulatory variants. Furthermore, we demonstrate that outlier expression values can be due to rare variant effects, and we approximate the number of such effects harboured in an individual by effect size. Our results demonstrate that integration of genomic and RNA sequencing analyses allows for the joint assessment of genome sequence and genome function.

  7. Genome-wide comparative analysis reveals human-mouse regulatory landscape and evolution.

    Science.gov (United States)

    Denas, Olgert; Sandstrom, Richard; Cheng, Yong; Beal, Kathryn; Herrero, Javier; Hardison, Ross C; Taylor, James

    2015-02-14

    Because species-specific gene expression is driven by species-specific regulation, understanding the relationship between sequence and function of the regulatory regions in different species will help elucidate how differences among species arise. Despite active experimental and computational research, relationships among sequence, conservation, and function are still poorly understood. We compared transcription factor occupied segments (TFos) for 116 human and 35 mouse TFs in 546 human and 125 mouse cell types and tissues from the Human and the Mouse ENCODE projects. We based the map between human and mouse TFos on a one-to-one nucleotide cross-species mapper, bnMapper, that utilizes whole genome alignments (WGA). Our analysis shows that TFos are under evolutionary constraint, but a substantial portion (25.1% of mouse and 25.85% of human on average) of the TFos does not have a homologous sequence on the other species; this portion varies among cell types and TFs. Furthermore, 47.67% and 57.01% of the homologous TFos sequence shows binding activity on the other species for human and mouse respectively. However, 79.87% and 69.22% is repurposed such that it binds the same TF in different cells or different TFs in the same cells. Remarkably, within the set of repurposed TFos, the corresponding genome regions in the other species are preferred locations of novel TFos. These events suggest exaptation of some functional regulatory sequences into new function. Despite TFos repurposing, we did not find substantial changes in their predicted target genes, suggesting that CRMs buffer evolutionary events allowing little or no change in the TFos - target gene associations. Thus, the small portion of TFos with strictly conserved occupancy underestimates the degree of conservation of regulatory interactions. We mapped regulatory sequences from an extensive number of TFs and cell types between human and mouse using WGA. A comparative analysis of this correspondence unveiled the

  8. On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

    Science.gov (United States)

    Tarpine, Ryan; Istrail, Sorin

    The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.

  9. Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium

    Directory of Open Access Journals (Sweden)

    Lynch Michael

    2010-05-01

    Background: In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. Results: By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Conclusions: Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.

  10. Mechanistically Distinct Pathways of Divergent Regulatory DNA Creation Contribute to Evolution of Human-Specific Genomic Regulatory Networks Driving Phenotypic Divergence of Homo sapiens.

    Science.gov (United States)

    Glinsky, Gennadi V

    2016-09-19

    Thousands of candidate human-specific regulatory sequences (HSRS) have been identified, supporting the hypothesis that unique to human phenotypes result from human-specific alterations of genomic regulatory networks. Collectively, a compendium of multiple diverse families of HSRS that are functionally and structurally divergent from Great Apes could be defined as the backbone of human-specific genomic regulatory networks. Here, the conservation patterns analysis of 18,364 candidate HSRS was carried out requiring that 100% of bases must remap during the alignments of human, chimpanzee, and bonobo sequences. A total of 5,535 candidate HSRS were identified that are: (i) highly conserved in Great Apes; (ii) evolved by the exaptation of highly conserved ancestral DNA; (iii) defined by either the acceleration of mutation rates on the human lineage or the functional divergence from non-human primates. The exaptation of highly conserved ancestral DNA pathway seems mechanistically distinct from the evolution of regulatory DNA segments driven by the species-specific expansion of transposable elements. Genome-wide proximity placement analysis of HSRS revealed that a small fraction of topologically associating domains (TADs) contain more than half of HSRS from four distinct families. TADs that are enriched for HSRS and termed rapidly evolving in humans TADs (revTADs) comprise 0.8-10.3% of 3,127 TADs in the hESC genome. RevTADs manifest distinct correlation patterns between placements of human accelerated regions, human-specific transcription factor-binding sites, and recombination rates. There is a significant enrichment within revTAD boundaries of hESC-enhancers, primate-specific CTCF-binding sites, human-specific RNAPII-binding sites, hCONDELs, and H3K4me3 peaks with human-specific enrichment at TSS in prefrontal cortex neurons (P sapiens is driven by the evolution of human-specific genomic regulatory networks via at least two mechanistically distinct pathways of creation of

  11. A method for selecting cis-acting regulatory sequences that respond to small molecule effectors

    Directory of Open Access Journals (Sweden)

    Allas Ülar

    2010-08-01

    Background: Several cis-acting regulatory sequences functioning at the level of mRNA or nascent peptide and specifically influencing transcription or translation have been described. These regulatory elements often respond to specific chemicals. Results: We have developed a method that allows us to select cis-acting regulatory sequences that respond to diverse chemicals. The method is based on the β-lactamase gene containing a random sequence inserted into the beginning of the ORF. Several rounds of selection are used to isolate sequences that suppress β-lactamase expression in response to the compound under study. We have isolated sequences that respond to erythromycin, troleandomycin, chloramphenicol, meta-toluate and homoserine lactone. By introducing synonymous and non-synonymous mutations we have shown that at least in the case of erythromycin the sequences act at the peptide level. We have also tested the cross-activities of the constructs and found that in most cases the sequences respond most strongly to the compound on which they were isolated. Conclusions: Several selected peptides showed ligand-specific changes in amino acid frequencies, but no consensus motif could be identified. This is consistent with previous observations on natural cis-acting peptides, showing that it is often impossible to demonstrate a consensus. Applying the currently developed method on a larger scale, by selecting and comparing an extended set of sequences, might allow the sequence rules underlying the activity of cis-acting regulatory peptides to be identified.

  12. Systematic identification of cis-regulatory sequences active in mouse and human embryonic stem cells.

    Directory of Open Access Journals (Sweden)

    Marica Grskovic

    2007-08-01

    Understanding the transcriptional regulation of pluripotent cells is of fundamental interest and will greatly inform efforts aimed at directing differentiation of embryonic stem (ES) cells or reprogramming somatic cells. We first analyzed the transcriptional profiles of mouse ES cells and primordial germ cells and identified genes upregulated in pluripotent cells both in vitro and in vivo. These genes are enriched for roles in transcription, chromatin remodeling, cell cycle, and DNA repair. We developed a novel computational algorithm, CompMoby, which combines analyses of sequences both aligned and non-aligned between different genomes with a probabilistic segmentation model to systematically predict short DNA motifs that regulate gene expression. CompMoby was used to identify conserved overrepresented motifs in genes upregulated in pluripotent cells. We show that the motifs are preferentially active in undifferentiated mouse ES and embryonic germ cells in a sequence-specific manner, and that they can act as enhancers in the context of an endogenous promoter. Importantly, the activity of the motifs is conserved in human ES cells. We further show that the transcription factor NF-Y specifically binds to one of the motifs, is differentially expressed during ES cell differentiation, and is required for ES cell proliferation. This study provides novel insights into the transcriptional regulatory networks of pluripotent cells. Our results suggest that this systematic approach can be broadly applied to understanding transcriptional networks in mammalian species.

  13. Alignment and prediction of cis-regulatory modules based on a probabilistic model of evolution.

    Directory of Open Access Journals (Sweden)

    Xin He

    2009-03-01

    Cross-species comparison has emerged as a powerful paradigm for predicting cis-regulatory modules (CRMs) and understanding their evolution. The comparison requires reliable sequence alignment, which remains a challenging task for less conserved noncoding sequences. Furthermore, the existing models of DNA sequence evolution generally do not explicitly treat the special properties of CRM sequences. To address these limitations, we propose a model of CRM evolution that captures different modes of evolution of functional transcription factor binding sites (TFBSs) and the background sequences. A particularly novel aspect of our work is a probabilistic model of gains and losses of TFBSs, a process being recognized as an important part of regulatory sequence evolution. We present a computational framework that uses this model to solve the problems of CRM alignment and prediction. Our alignment method is similar to existing methods of statistical alignment but uses the conserved binding sites to improve alignment. Our CRM prediction method deals with the inherent uncertainties of binding site annotations and sequence alignment in a probabilistic framework. In simulated as well as real data, we demonstrate that our program is able to improve both alignment and prediction of CRM sequences over several state-of-the-art methods. Finally, we used alignments produced by our program to study binding site conservation in genome-wide binding data of key transcription factors in the Drosophila blastoderm, with two intriguing results: (i) the factor-bound sequences are under strong evolutionary constraints even if their neighboring genes are not expressed in the blastoderm and (ii) binding sites in distal bound sequences (relative to transcription start sites) tend to be more conserved than those in proximal regions. Our approach is implemented as software, EMMA (Evolutionary Model-based cis-regulatory Module Analysis), ready to be applied in a broad biological context.

  14. Molecular evidence for increased regulatory conservation during metamorphosis, and against deleterious cascading effects of hybrid breakdown in Drosophila

    Directory of Open Access Journals (Sweden)

    Artieri Carlo G

    2010-03-01

    Background: Speculation regarding the importance of changes in gene regulation in determining major phylogenetic patterns continues to accrue, despite a lack of broad-scale comparative studies examining how patterns of gene expression vary during development. Comparative transcriptional profiling of adult interspecific hybrids and their parental species has uncovered widespread divergence of the mechanisms controlling gene regulation, revealing incompatibilities that are masked in comparisons between the pure species. However, this has prompted the suggestion that misexpression in adult hybrids results from the downstream cascading effects of a subset of genes improperly regulated in early development. Results: We sought to determine how gene expression diverges over development, as well as test the cascade hypothesis, by profiling expression in males of Drosophila melanogaster, D. sechellia, and D. simulans, as well as the D. simulans (♀) × D. sechellia (♂) male F1 hybrids, at four different developmental time points (3rd instar larval, early pupal, late pupal, and newly-emerged adult). Contrary to the cascade model of misexpression, we find that there is considerable stage-specific autonomy of regulatory breakdown in hybrids, with the larval and adult stages showing significantly more hybrid misexpression as compared to the pupal stage. However, comparisons between pure species indicate that genes expressed during earlier stages of development tend to be more conserved in terms of their level of expression than those expressed during later stages, suggesting that while Von Baer's famous law applies at both the level of nucleotide sequence and expression, it may not apply necessarily to the underlying overall regulatory network, which appears to diverge over the course of ontogeny and which can only be ascertained by combining divergent genomes in species hybrids. Conclusion: Our results suggest that complex integration of regulatory

  15. Functional dissection of the promoter of the pollen-specific gene NTP303 reveals a novel pollen-specific, and conserved cis-regulatory element.

    Science.gov (United States)

    Weterings, K; Schrauwen, J; Wullems, G; Twell, D

    1995-07-01

    Regulatory elements within the promoter of the pollen-specific NTP303 gene from tobacco were analysed by transient and stable expression analyses. Analysis of precisely targeted mutations showed that the NTP303 promoter is not regulated by any of the previously described pollen-specific cis-regulatory elements. However, two adjacent regions from -103 to -86 bp and from -86 to -59 bp were shown to contain sequences which positively regulated the NTP303 promoter. Both of these regions were capable of driving pollen-specific expression from a heterologous promoter, independent of orientation and in an additive manner. The boundaries of the minimal, functional NTP303 promoter were determined to lie within the region -86 to -51 bp. The sequence AAATGA localized from -94 to -89 bp was identified as a novel cis-acting element, of which the TGA triplet was shown to comprise an active part. This element was shown to be completely conserved in the similarly regulated promoter of the Bp 10 gene from Brassica napus encoding a homologue of the NTP303 gene.
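
    The promoter coordinates quoted above (for example, AAATGA at -94 to -89 bp) follow the usual convention of counting upstream positions as negative numbers relative to the transcription start site, with no position 0. A minimal sketch of that bookkeeping is given below. The sequence is synthetic filler built so that the motif lands at the reported coordinates; it is not the NTP303 promoter, and the function name and the tss_index parameter are illustrative assumptions.

```python
def find_motif_relative_to_tss(promoter: str, motif: str, tss_index: int):
    """Return (start, end) positions of each motif hit relative to the TSS.

    tss_index is the 0-based index of the +1 nucleotide in `promoter`.
    Upstream positions are negative and there is no position 0.
    """
    def relative(i: int) -> int:
        return i - tss_index if i < tss_index else i - tss_index + 1

    hits = []
    start = promoter.find(motif)
    while start != -1:
        end = start + len(motif) - 1
        hits.append((relative(start), relative(end)))
        start = promoter.find(motif, start + 1)  # allow overlapping hits
    return hits


# Synthetic example (NOT the real NTP303 sequence): 110 bp of upstream
# filler with AAATGA placed so that it spans -94 to -89.
upstream = "C" * 16 + "AAATGA" + "C" * 88
promoter = upstream + "GCC"          # index 110 is the +1 nucleotide
hits = find_motif_relative_to_tss(promoter, "AAATGA", tss_index=len(upstream))
print(hits)
```

    The same helper could be pointed at a real promoter sequence once the index of the +1 nucleotide is known.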

  16. Peptomics, identification of novel cationic Arabidopsis peptides with conserved sequence motifs

    DEFF Research Database (Denmark)

    Olsen, Addie Nina; Mundy, John; Skriver, Karen

    2002-01-01

    Arabidopsis family of 34 genes. The predicted peptides are characterized by a conserved C-terminal sequence motif and additional primary structure conservation in a core region. The majority of these genes had not previously been annotated. A subset of the predicted peptides show high overall sequence...... similarity to Rapid Alkalinization Factor (RALF), a peptide isolated from tobacco. We therefore refer to this peptide family as RALFL for RALF-Like. RT-PCR analysis confirmed that several of the Arabidopsis genes are expressed and that their expression patterns vary. The identification of a large gene family...

  17. Genes involved in complex adaptive processes tend to have highly conserved upstream regions in mammalian genomes

    Directory of Open Access Journals (Sweden)

    Kohane Isaac

    2005-11-01

    Abstract Background Recent advances in genome sequencing suggest a remarkable conservation in gene content of mammalian organisms. The similarity in gene repertoire present in different organisms has increased interest in studying regulatory mechanisms of gene expression aimed at elucidating the differences in phenotypes. In particular, a proximal promoter region contains a large number of regulatory elements that control the expression of its downstream gene. Although many studies have focused on identification of these elements, a broader picture on the complexity of transcriptional regulation of different biological processes has not been addressed in mammals. The regulatory complexity may strongly correlate with gene function, as different evolutionary forces must act on the regulatory systems under different biological conditions. We investigate this hypothesis by comparing the conservation of promoters upstream of genes classified in different functional categories. Results By conducting a rank correlation analysis between functional annotation and upstream sequence alignment scores obtained by human-mouse and human-dog comparison, we found a significantly greater conservation of the upstream sequence of genes involved in development, cell communication, neural functions and signaling processes than those involved in more basic processes shared with unicellular organisms such as metabolism and ribosomal function. This observation persists after controlling for G+C content. Considering conservation as a functional signature, we hypothesize a higher density of cis-regulatory elements upstream of genes participating in complex and adaptive processes. Conclusion We identified a class of functions that are associated with either high or low promoter conservation in mammals. 
We detected a significant tendency for complex and adaptive processes to be associated with higher promoter conservation, despite the fact that they have emerged
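
    The rank correlation analysis mentioned in this abstract can be illustrated with a small Spearman calculation. The sketch below is a generic implementation under stated assumptions, not the authors' pipeline: membership in a functional category is coded as 1 or 0, the alignment scores are invented toy values, and tied values receive average ranks.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): 1 = gene annotated to a category, 0 = not,
# paired with a made-up upstream alignment score for each gene.
in_category = [1, 1, 1, 0, 0, 0]
alignment_score = [0.9, 0.8, 0.7, 0.3, 0.2, 0.4]
rho = spearman(in_category, alignment_score)
print(round(rho, 3))
```

    A positive coefficient here means that genes in the category tend to have higher alignment scores; a real analysis would also need a significance test and the G+C control the abstract describes.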

  18. A conserved RNA structural element within the hepatitis B virus post-transcriptional regulatory element enhance nuclear export of intronless transcripts and repress the splicing mechanism.

    Science.gov (United States)

    Visootsat, Akasit; Payungporn, Sunchai; T-Thienprasert, Nattanan P

    2015-12-01

    Hepatitis B virus (HBV) infection is a primary cause of hepatocellular carcinoma and liver cirrhosis worldwide. To develop novel antiviral drugs, a better understanding of HBV gene expression regulation is vital. One important aspect is to understand how HBV hijacks the cellular machinery to export unspliced RNA from the nucleus. The HBV post-transcriptional regulatory element (HBV PRE) has been proposed to be the HBV RNA nuclear export element. However, the function remains controversial, and the core element is unclear. This study, therefore, aimed to identify functional regulatory elements within the HBV PRE and investigate their functions. Using bioinformatics programs based on sequence conservation and conserved RNA secondary structures, three regulatory elements were predicted, namely PRE 1151-1410, PRE 1520-1620 and PRE 1650-1684. PRE 1151-1410 significantly increased intronless and unspliced luciferase activity in both HepG2 and COS-7 cells. Likewise, PRE 1151-1410 significantly elevated intronless and unspliced HBV surface transcripts in liver cancer cells. Moreover, motif analysis predicted that PRE 1151-1410 contains several regulatory motifs. This study reported the roles of PRE 1151-1410 in intronless transcript nuclear export and the splicing mechanism. Additionally, these results provide knowledge in the field of HBV RNA regulation. Moreover, PRE 1151-1410 may be used to enhance the expression of other mRNAs in intronless reporter plasmids.

  19. Validation of Skeletal Muscle cis-Regulatory Module Predictions Reveals Nucleotide Composition Bias in Functional Enhancers

    Science.gov (United States)

    Kwon, Andrew T.; Chou, Alice Yi; Arenillas, David J.; Wasserman, Wyeth W.

    2011-01-01

    We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences identifies nucleotide sequence composition can be an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions. PMID:22144875

  20. Validation of skeletal muscle cis-regulatory module predictions reveals nucleotide composition bias in functional enhancers.

    Directory of Open Access Journals (Sweden)

    Andrew T Kwon

    2011-12-01

    We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences identifies nucleotide sequence composition can be an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions.

  1. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    Energy Technology Data Exchange (ETDEWEB)

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  2. An in vivo cis-regulatory screen at the type 2 diabetes associated TCF7L2 locus identifies multiple tissue-specific enhancers.

    Directory of Open Access Journals (Sweden)

    Daniel Savic

    Genome-wide association studies (GWAS) have repeatedly shown an association between non-coding variants in the TCF7L2 locus and risk for type 2 diabetes (T2D), implicating a role for cis-regulatory variation within this locus in disease etiology. Supporting this hypothesis, we previously localized complex regulatory activity to the TCF7L2 T2D-associated interval using an in vivo bacterial artificial chromosome (BAC) enhancer-trapping reporter strategy. To follow up on this broad initial survey of the TCF7L2 regulatory landscape, we performed a fine-mapping enhancer scan using in vivo mouse transgenic reporter assays. We functionally interrogated approximately 50% of the sequences within the T2D-associated interval, utilizing sequence conservation within this 92-kb interval to determine the regulatory potential of all evolutionary conserved sequences that exhibited conservation to the non-eutherian mammal opossum. Included in this study was a detailed functional interrogation of sequences spanning both protective and risk alleles of single nucleotide polymorphism (SNP) rs7903146, which has exhibited allele-specific enhancer function in pancreatic beta cells. Using these assays, we identified nine segments regulating various aspects of the TCF7L2 expression profile and that constitute nearly 70% of the sequences tested. These results highlight the regulatory complexity of this interval and support the notion that a TCF7L2 cis-regulatory disruption leads to T2D predisposition.

  3. Seed storage protein gene promoters contain conserved DNA motifs in Brassicaceae, Fabaceae and Poaceae

    Science.gov (United States)

    Fauteux, François; Strömvik, Martina V

    2009-01-01

    Background Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses) using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. 
Conclusion Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination of conserved motifs

  4. Seed storage protein gene promoters contain conserved DNA motifs in Brassicaceae, Fabaceae and Poaceae

    Directory of Open Access Journals (Sweden)

    Fauteux François

    2009-10-01

    Abstract Background Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses) using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. 
Conclusion Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination

  5. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

    DEFF Research Database (Denmark)

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.

    2005-01-01

    years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences......We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each...... between the species-but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence...

  6. Functional evolution of cis-regulatory modules at a homeotic gene in Drosophila.

    Directory of Open Access Journals (Sweden)

    Margaret C W Ho

    2009-11-01

    It is a long-held belief in evolutionary biology that the rate of molecular evolution for a given DNA sequence is inversely related to the level of functional constraint. This belief holds true for the protein-coding homeotic (Hox) genes originally discovered in Drosophila melanogaster. Expression of the Hox genes in Drosophila embryos is essential for body patterning and is controlled by an extensive array of cis-regulatory modules (CRMs). How the regulatory modules functionally evolve in different species is not clear. A comparison of the CRMs for the Abdominal-B gene from different Drosophila species reveals relatively low levels of overall sequence conservation. However, embryonic enhancer CRMs from other Drosophila species direct transgenic reporter gene expression in the same spatial and temporal patterns during development as their D. melanogaster orthologs. Bioinformatic analysis reveals the presence of short conserved sequences within defined CRMs, representing gap and pair-rule transcription factor binding sites. One predicted binding site for the gap transcription factor KRUPPEL in the IAB5 CRM was found to be altered in Superabdominal (Sab) mutations. In Sab mutant flies, the third abdominal segment is transformed into a copy of the fifth abdominal segment. A model for KRUPPEL-mediated repression at this binding site is presented. These findings challenge our current understanding of the relationship between sequence evolution at the molecular level and functional activity of a CRM. While the overall sequence conservation at Drosophila CRMs is not distinctive from neighboring genomic regions, functionally critical transcription factor binding sites within embryonic enhancer CRMs are highly conserved. These results have implications for understanding mechanisms of gene expression during embryonic development, enhancer function, and the molecular evolution of eukaryotic regulatory modules.

  7. Highly conserved non-coding elements on either side of SOX9 associated with Pierre Robin sequence.

    Science.gov (United States)

    Benko, Sabina; Fantes, Judy A; Amiel, Jeanne; Kleinjan, Dirk-Jan; Thomas, Sophie; Ramsay, Jacqueline; Jamshidi, Negar; Essafi, Abdelkader; Heaney, Simon; Gordon, Christopher T; McBride, David; Golzio, Christelle; Fisher, Malcolm; Perry, Paul; Abadie, Véronique; Ayuso, Carmen; Holder-Espinasse, Muriel; Kilpatrick, Nicky; Lees, Melissa M; Picard, Arnaud; Temple, I Karen; Thomas, Paul; Vazquez, Marie-Paule; Vekemans, Michel; Roest Crollius, Hugues; Hastie, Nicholas D; Munnich, Arnold; Etchevers, Heather C; Pelet, Anna; Farlie, Peter G; Fitzpatrick, David R; Lyonnet, Stanislas

    2009-03-01

    Pierre Robin sequence (PRS) is an important subgroup of cleft palate. We report several lines of evidence for the existence of a 17q24 locus underlying PRS, including linkage analysis results, a clustering of translocation breakpoints 1.06-1.23 Mb upstream of SOX9, and microdeletions both approximately 1.5 Mb centromeric and approximately 1.5 Mb telomeric of SOX9. We have also identified a heterozygous point mutation in an evolutionarily conserved region of DNA with in vitro and in vivo features of a developmental enhancer. This enhancer is centromeric to the breakpoint cluster and maps within one of the microdeletion regions. The mutation abrogates the in vitro enhancer function and alters binding of the transcription factor MSX1 as compared to the wild-type sequence. In the developing mouse mandible, the 3-Mb region bounded by the microdeletions shows a regionally specific chromatin decompaction in cells expressing Sox9. Some cases of PRS may thus result from developmental misexpression of SOX9 due to disruption of very-long-range cis-regulatory elements.

  8. Transcriptional activation signals found in the Epstein-Barr virus (EBV) latency C promoter are conserved in the latency C promoter sequences from baboon and Rhesus monkey EBV-like lymphocryptoviruses (cercopithicine herpesviruses 12 and 15).

    Science.gov (United States)

    Fuentes-Pananá, E M; Swaminathan, S; Ling, P D

    1999-01-01

    The Epstein-Barr virus (EBV) EBNA2 protein is a transcriptional activator that controls viral latent gene expression and is essential for EBV-driven B-cell immortalization. EBNA2 is expressed from the viral C promoter (Cp) and regulates its own expression by activating Cp through interaction with the cellular DNA binding protein CBF1. Through regulation of Cp and EBNA2 expression, EBV controls the pattern of latent protein expression and the type of latency established. To gain further insight into the important regulatory elements that modulate Cp usage, we isolated and sequenced the Cp regions corresponding to nucleotides 10251 to 11479 of the EBV genome (-1079 to +144 relative to the transcription initiation site) from the EBV-like lymphocryptoviruses found in baboons (herpesvirus papio; HVP) and Rhesus macaques (RhEBV). Sequence comparison of the approximately 1,230-bp Cp regions from these primate viruses revealed that EBV and HVP Cp sequences are 64% conserved, EBV and RhEBV Cp sequences are 66% conserved, and HVP and RhEBV Cp sequences are 65% conserved relative to each other. Approximately 50% of the residues are conserved among all three sequences, yet all three viruses have retained response elements for glucocorticoids, two positionally conserved CCAAT boxes, and positionally conserved TATA boxes. The putative EBNA2 100-bp enhancers within these promoters contain 54 conserved residues, and the binding sites for CBF1 and CBF2 are well conserved. Cp usage in the HVP- and RhEBV-transformed cell lines was detected by S1 nuclease protection analysis. Transient-transfection analysis showed that promoters of both HVP and RhEBV are responsive to EBNA2 and that they bind CBF1 and CBF2 in gel mobility shift assays. These results suggest that similar mechanisms for regulation of latent gene expression are conserved among the EBV-related lymphocryptoviruses found in nonhuman primates.
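
    The abstract reports pairwise conservation of 64-66% between the three Cp regions but only about 50% of residues conserved in all three. The sketch below shows, on a toy alignment, why the three-way figure is lower than every pairwise figure. The sequences are invented, and the gap handling (gapped columns skipped pairwise and counted as non-conserved three-way) is an assumption, since the abstract does not state how the percentages were computed.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent of ungapped aligned columns where a and b carry the same residue."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


def percent_conserved_in_all(seqs) -> float:
    """Percent of alignment columns that are gap-free and identical in every sequence."""
    columns = list(zip(*seqs))
    conserved = sum(1 for col in columns if "-" not in col and len(set(col)) == 1)
    return 100.0 * conserved / len(columns)


# Toy 10-column alignment; labels are placeholders, not real viral sequence.
alignment = {
    "EBV": "ACGTACGTAC",
    "HVP": "ACGTTCGTAA",
    "RhEBV": "ACCTACGTAA",
}

for (name_a, seq_a), (name_b, seq_b) in combinations(alignment.items(), 2):
    print(name_a, name_b, percent_identity(seq_a, seq_b))
print("all three", percent_conserved_in_all(list(alignment.values())))
```

    Each pair differs at two columns, but the differing columns are not the same ones for every pair, so fewer columns survive the requirement of identity across all three sequences.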

  9. Selective constraints in experimentally defined primate regulatory regions.

    Directory of Open Access Journals (Sweden)

    Daniel J Gaffney

    2008-08-01

    Changes in gene regulation may be important in evolution. However, the evolutionary properties of regulatory mutations are currently poorly understood. This is partly the result of an incomplete annotation of functional regulatory DNA in many species. For example, transcription factor binding sites (TFBSs), a major component of eukaryotic regulatory architecture, are typically short, degenerate, and therefore difficult to differentiate from randomly occurring, nonfunctional sequences. Furthermore, although sites such as TFBSs can be computationally predicted using evolutionary conservation as a criterion, estimates of the true level of selective constraint (defined as the fraction of strongly deleterious mutations occurring at a locus) in regulatory regions will, by definition, be upwardly biased in datasets that are a priori evolutionarily conserved. Here we investigate the fitness effects of regulatory mutations using two complementary datasets of human TFBSs that are likely to be relatively free of ascertainment bias with respect to evolutionary conservation but, importantly, are supported by experimental data. The first is a collection of almost >2,100 human TFBSs drawn from the literature in the TRANSFAC database, and the second is derived from several recent high-throughput chromatin immunoprecipitation coupled with genomic microarray (ChIP-chip) analyses. We also define a set of putative cis-regulatory modules (pCRMs) by spatially clustering multiple TFBSs that regulate the same gene. We find that a relatively high proportion (approximately 37%) of mutations at TFBSs are strongly deleterious, similar to that at a 2-fold degenerate protein-coding site. However, constraint is significantly reduced in human and chimpanzee pCRMs and ChIP-chip sequences, relative to macaques. We estimate that the fraction of regulatory mutations that have been driven to fixation by positive selection in humans is not significantly different from zero. 
We also find

  10. AlignMiner: a Web-based tool for detection of divergent regions in multiple sequence alignments of conserved sequences

    Directory of Open Access Journals (Sweden)

    Claros M Gonzalo

    2010-06-01

    Abstract Background Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid) obtained using any of a variety of algorithms, which does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can even inspect their results on a different computer. Data can be downloaded onto a user disk, in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to design specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly". 
Conclusions AlignMiner can be used
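
    The abstract names an entropy-based score for separating conserved from divergent alignment columns but does not give its formula. As a rough illustration only, the sketch below uses plain Shannon entropy per column and reports runs of high-entropy columns as divergent regions. The threshold, the minimum run length and the function names are assumptions and should not be read as AlignMiner's implementation.

```python
import math
from collections import Counter


def column_entropy(column) -> float:
    """Shannon entropy (bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def divergent_regions(alignment, threshold: float, min_length: int):
    """Return 0-based inclusive (start, end) column ranges where entropy
    stays at or above `threshold` for at least `min_length` columns."""
    entropies = [column_entropy(col) for col in zip(*alignment)]
    regions = []
    start = None
    # The trailing sentinel closes a run that reaches the last column.
    for i, h in enumerate(entropies + [float("-inf")]):
        if h >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                regions.append((start, i - 1))
            start = None
    return regions


# Toy alignment (hypothetical): columns 4-6 vary, all others are identical.
toy_alignment = [
    "ACGTACGTAC",
    "ACGTTGCTAC",
    "ACGTGAATAC",
    "ACGTCTTTAC",
]
print(divergent_regions(toy_alignment, threshold=1.0, min_length=2))
```

    Short high-entropy runs flanked by conserved columns are the kind of region the abstract describes as useful for specific primers or antibodies.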

  11. The BsaHI restriction-modification system: Cloning, sequencing and analysis of conserved motifs

    Directory of Open Access Journals (Sweden)

    Roberts Richard J

    2008-05-01

    Abstract Background Restriction and modification enzymes typically recognise short DNA sequences of between two and eight bases in length. Understanding the mechanism of this recognition represents a significant challenge that we begin to address for the BsaHI restriction-modification system, which recognises the six base sequence GRCGYC. Results The DNA sequences of the genes for the BsaHI methyltransferase, bsaHIM, and restriction endonuclease, bsaHIR, have been determined (GenBank accession #EU386360), cloned and expressed in E. coli. Both the restriction endonuclease and methyltransferase enzymes share significant similarity with a group of 6 other enzymes comprising the restriction-modification systems HgiDI and HgiGI and the putative HindVP, NlaCORFDP, NpuORFC228P and SplZORFNP restriction-modification systems. A sequence alignment of these homologues shows that their amino acid sequences are largely conserved and highlights several motifs of interest. We target one such conserved motif, reading SPERRFD, at the C-terminal end of the bsaHIR gene. A mutational analysis of these amino acids indicates that the motif is crucial for enzymatic activity. Sequence alignment of the methyltransferase gene reveals a short motif within the target recognition domain that is conserved among enzymes recognising the same sequences. Thus, this motif may be used as a diagnostic tool to define the recognition sequences of the cytosine C5 methyltransferases. Conclusion We have cloned and sequenced the BsaHI restriction and modification enzymes. We have identified a region of the R.BsaHI enzyme that is crucial for its activity. Analysis of the amino acid sequence of the BsaHI methyltransferase enzyme led us to propose two new motifs that can be used in the diagnosis of the recognition sequence of the cytosine C5-methyltransferases.
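
    The recognition sequence GRCGYC is written with IUPAC degenerate codes, where R stands for a purine (A or G) and Y for a pyrimidine (C or T), so it covers four concrete hexamers. A minimal sketch of scanning a DNA string for such a degenerate site follows; the example DNA is invented and the helper names are illustrative.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def site_regex(site: str):
    """Compile a degenerate site into a regex; the lookahead keeps overlapping hits."""
    pattern = "".join(IUPAC[base] for base in site.upper())
    return re.compile("(?=(" + pattern + "))")


def find_sites(dna: str, site: str):
    """Return (0-based start, matched sequence) for every occurrence of the site."""
    return [(m.start(), m.group(1)) for m in site_regex(site).finditer(dna.upper())]


# Invented test sequence containing two sites and one near miss (GACGCT).
dna = "TTGACGTCAAGGCGCCTTGACGCTAA"
print(find_sites(dna, "GRCGYC"))
```

    Because GRCGYC reads the same on both strands after reverse complementing (R pairs with Y), scanning one strand is sufficient for this particular site.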

  12. Linkage disequilibrium of evolutionarily conserved regions in the human genome

    Directory of Open Access Journals (Sweden)

    Johnson Todd A

    2006-12-01

    Abstract Background The strong linkage disequilibrium (LD) recently found in genic or exonic regions of the human genome demonstrated that LD can be increased by evolutionary mechanisms that select for functionally important loci. This suggests that LD might be stronger in regions conserved among species than in non-conserved regions, since regions exposed to natural selection tend to be conserved. To assess this hypothesis, we used genome-wide polymorphism data from the HapMap project and investigated LD within DNA sequences conserved between the human and mouse genomes. Results Unexpectedly, we observed that LD was significantly weaker in conserved regions than in non-conserved regions. To investigate why, we examined sequence features that may distort the relationship between LD and conserved regions. We found that interspersed repeats, and not other sequence features, were associated with the weak LD tendency in conserved regions. To appropriately understand the relationship between LD and conserved regions, we removed the effect of repetitive elements and found that the high degree of sequence conservation was strongly associated with strong LD in coding regions but not with that in non-coding regions. Conclusion Our work demonstrates that the degree of sequence conservation does not simply increase LD as predicted by the hypothesis. Rather, it implies that purifying selection changes the polymorphic patterns of coding sequences but has little influence on the patterns of functional units such as regulatory elements present in non-coding regions, since the former are generally restricted by the constraint of maintaining a functional protein product across multiple exons while the latter may exist more as individually isolated units.

  13. Enhanced regulatory sequence prediction using gapped k-mer features.

    Science.gov (United States)

    Ghandi, Mahmoud; Lee, Dongwon; Mohammad-Noori, Morteza; Beer, Michael A

    2014-07-01

    Oligomers of length k, or k-mers, are convenient and widely used features for modeling the properties and functions of DNA and protein sequences. However, k-mers suffer from the inherent limitation that if the parameter k is increased to resolve longer features, the probability of observing any specific k-mer becomes very small, and k-mer counts approach a binary variable, with most k-mers absent and a few present once. Thus, any statistical learning approach using k-mers as features becomes susceptible to noisy training set k-mer frequencies once k becomes large. To address this problem, we introduce alternative feature sets using gapped k-mers, a new classifier, gkm-SVM, and a general method for robust estimation of k-mer frequencies. To make the method applicable to large-scale genome wide applications, we develop an efficient tree data structure for computing the kernel matrix. We show that compared to our original kmer-SVM and alternative approaches, our gkm-SVM predicts functional genomic regulatory elements and tissue specific enhancers with significantly improved accuracy, increasing the precision by up to a factor of two. We then show that gkm-SVM consistently outperforms kmer-SVM on human ENCODE ChIP-seq datasets, and further demonstrate the general utility of our method using a Naïve-Bayes classifier. Although developed for regulatory sequence analysis, these methods can be applied to any sequence classification problem.
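
    To make the gapped k-mer idea concrete, here is a minimal sketch of the feature extraction step alone. It assumes the usual convention that a gapped k-mer is a word of total length l with k informative positions and l - k wildcard positions. The function name, the `-` gap symbol and the toy sequence are illustrative, and the naive enumeration is not the efficient tree-based kernel computation the abstract describes.

```python
from collections import Counter
from itertools import combinations


def gapped_kmer_counts(seq, l=4, k=2):
    """Count gapped k-mers: every l-mer window with all but k positions masked by '-'."""
    counts = Counter()
    seq = seq.upper()
    for start in range(len(seq) - l + 1):
        window = seq[start:start + l]
        # Choose which k of the l positions stay informative.
        for keep in combinations(range(l), k):
            pattern = "".join(window[j] if j in keep else "-" for j in range(l))
            counts[pattern] += 1
    return counts


# Toy example (hypothetical sequence): one window of length 4 gives C(4, 2) = 6 features.
for pattern, count in sorted(gapped_kmer_counts("ACGT", l=4, k=2).items()):
    print(pattern, count)
```

    Each window contributes to several gapped patterns, so counts accumulate faster than for ungapped l-mers. That is the intuition behind the more robust frequency estimates the abstract claims for large k.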

  14. Enhanced regulatory sequence prediction using gapped k-mer features.

    Directory of Open Access Journals (Sweden)

    Mahmoud Ghandi

    2014-07-01

    Full Text Available Oligomers of length k, or k-mers, are convenient and widely used features for modeling the properties and functions of DNA and protein sequences. However, k-mers suffer from the inherent limitation that if the parameter k is increased to resolve longer features, the probability of observing any specific k-mer becomes very small, and k-mer counts approach a binary variable, with most k-mers absent and a few present once. Thus, any statistical learning approach using k-mers as features becomes susceptible to noisy training set k-mer frequencies once k becomes large. To address this problem, we introduce alternative feature sets using gapped k-mers, a new classifier, gkm-SVM, and a general method for robust estimation of k-mer frequencies. To make the method applicable to large-scale genome wide applications, we develop an efficient tree data structure for computing the kernel matrix. We show that compared to our original kmer-SVM and alternative approaches, our gkm-SVM predicts functional genomic regulatory elements and tissue specific enhancers with significantly improved accuracy, increasing the precision by up to a factor of two. We then show that gkm-SVM consistently outperforms kmer-SVM on human ENCODE ChIP-seq datasets, and further demonstrate the general utility of our method using a Naïve-Bayes classifier. Although developed for regulatory sequence analysis, these methods can be applied to any sequence classification problem.

  15. CodonLogo: a sequence logo-based viewer for codon patterns.

    Science.gov (United States)

    Sharma, Virag; Murphy, David P; Provan, Gregory; Baranov, Pavel V

    2012-07-15

    Conserved patterns across a multiple sequence alignment can be visualized by generating sequence logos. Sequence logos show each column in the alignment as stacks of symbol(s) where the height of a stack is proportional to its informational content, whereas the height of each symbol within the stack is proportional to its frequency in the column. Sequence logos use symbols of either nucleotide or amino acid alphabets. However, certain regulatory signals in messenger RNA (mRNA) act as combinations of codons. Yet no tool is available for visualization of conserved codon patterns. We present the first application which allows visualization of conserved regions in a multiple sequence alignment in the context of codons. CodonLogo is based on WebLogo3 and uses the same heuristics but treats codons as inseparable units of a 64-letter alphabet. CodonLogo can discriminate patterns of codon conservation from patterns of nucleotide conservation that appear indistinguishable in standard sequence logos. The CodonLogo source code and its implementation (in a local version of the Galaxy Browser) are available at http://recode.ucc.ie/CodonLogo and through the Galaxy Tool Shed at http://toolshed.g2.bx.psu.edu/.
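
    The stack and symbol heights described above can be sketched as follows. The code computes the information content of a column as log2(alphabet size) minus the Shannon entropy and scales each symbol by its frequency. It applies equally to a 4-letter nucleotide alphabet or to codons treated as a 64-letter alphabet. The function names are hypothetical, and the sketch omits any small-sample correction that logo tools may apply.

```python
import math
from collections import Counter


def column_heights(column, alphabet_size):
    """Return (stack_height_in_bits, {symbol: symbol_height}) for one alignment column."""
    n = len(column)
    freqs = {sym: c / n for sym, c in Counter(column).items()}
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    info = math.log2(alphabet_size) - entropy   # information content of the column
    return info, {sym: f * info for sym, f in freqs.items()}


def codon_columns(aligned_seqs):
    """Split in-frame aligned sequences into columns of codons (trailing bases dropped)."""
    usable = len(aligned_seqs[0]) - len(aligned_seqs[0]) % 3
    return [[s[i:i + 3] for s in aligned_seqs] for i in range(0, usable, 3)]


# Toy alignment (hypothetical): first codon invariant, second codon split between two forms.
alignment = ["ATGGCT", "ATGGCT", "ATGGCC", "ATGGCC"]
for col in codon_columns(alignment):
    print(column_heights(col, alphabet_size=64))
```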

  16. Conserved regulatory modules in the Sox9 testis-specific enhancer predict roles for SOX, TCF/LEF, Forkhead, DMRT, and GATA proteins in vertebrate sex determination.

    Science.gov (United States)

    Bagheri-Fam, Stefan; Sinclair, Andrew H; Koopman, Peter; Harley, Vincent R

    2010-03-01

    While the primary sex determining switch varies between vertebrate species, a key downstream event in testicular development, namely the male-specific up-regulation of Sox9, is conserved. To date, only two sex determining switch genes have been identified, Sry in mammals and the Dmrt1-related gene Dmy (Dmrt1bY) in the medaka fish Oryzias latipes. In mice, Sox9 expression is evidently up-regulated by SRY and maintained by SOX9 both of which directly activate the core 1.3 kb testis-specific enhancer of Sox9 (TESCO). How Sox9 expression is up-regulated and maintained in species without Sry (i.e. non-mammalian species) is not understood. In this study, we have undertaken an in-depth comparative genomics approach and show that TESCO contains an evolutionarily conserved region (ECR) of 180 bp which is present in marsupials, monotremes, birds, reptiles and amphibians. The ECR contains highly conserved modules that predict regulatory roles for SOX, TCF/LEF, Forkhead, DMRT, and GATA proteins in vertebrate sex determination/differentiation. Our data suggest that tetrapods share common aspects of Sox9 regulation in the testis, despite having different sex determining switch mechanisms. They also suggest that Sox9 autoregulation is an ancient mechanism shared by all tetrapods, raising the possibility that in mammals, SRY evolved by mimicking this regulation. The validation of ECR regulatory sequences conserved from human to frogs will provide new insights into vertebrate sex determination. Copyright 2009 Elsevier Ltd. All rights reserved.

  17. Correlation between sequence conservation and structural thermodynamics of microRNA precursors from human, mouse, and chicken genomes

    Directory of Open Access Journals (Sweden)

    Wang Shengqi

    2010-10-01

    Full Text Available Abstract Background Previous studies have shown that microRNA precursors (pre-miRNAs) have considerably more stable secondary structures than other native RNAs (tRNA, rRNA, and mRNA) and artificial RNA sequences. However, pre-miRNAs with ultra stable secondary structures have not been investigated. It is not known whether there is a tendency in pre-miRNA sequences towards or against ultra stable structures. Furthermore, the relationship between the structural thermodynamic stability of pre-miRNA and their evolution remains unclear. Results We investigated the correlation between pre-miRNA sequence conservation and structural stability as measured by adjusted minimum folding free energies in pre-miRNAs isolated from human, mouse, and chicken. The analysis revealed that conserved and non-conserved pre-miRNA sequences had structures with similar average stabilities. However, the relatively ultra stable and unstable pre-miRNAs were more likely to be non-conserved than pre-miRNAs with moderate stability. Non-conserved pre-miRNAs had more G+C than A+U nucleotides, while conserved pre-miRNAs contained more A+U nucleotides. Notably, the U content of conserved pre-miRNAs was especially higher than that of non-conserved pre-miRNAs. Further investigations showed that conserved and non-conserved pre-miRNAs exhibited different structural element features, even though they had comparable levels of stability. Conclusions We proposed that there is a correlation between structural thermodynamic stability and sequence conservation for pre-miRNAs from human, mouse, and chicken genomes. Our analyses suggested that pre-miRNAs with relatively ultra stable or unstable structures were less favoured by natural selection than those with moderately stable structures. Comparison of nucleotide compositions between non-conserved and conserved pre-miRNAs indicated the importance of U nucleotides in the pre-miRNA evolutionary process.
Several characteristic structural elements were

  18. Transcriptome sequencing from diverse human populations reveals differentiated regulatory architecture.

    Directory of Open Access Journals (Sweden)

    Alicia R Martin

    2014-08-01

    Full Text Available Large-scale sequencing efforts have documented extensive genetic variation within the human genome. However, our understanding of the origins, global distribution, and functional consequences of this variation is far from complete. While regulatory variation influencing gene expression has been studied within a handful of populations, the breadth of transcriptome differences across diverse human populations has not been systematically analyzed. To better understand the spectrum of gene expression variation, alternative splicing, and the population genetics of regulatory variation in humans, we have sequenced the genomes, exomes, and transcriptomes of EBV transformed lymphoblastoid cell lines derived from 45 individuals in the Human Genome Diversity Panel (HGDP). The populations sampled span the geographic breadth of human migration history and include Namibian San, Mbuti Pygmies of the Democratic Republic of Congo, Algerian Mozabites, Pathan of Pakistan, Cambodians of East Asia, Yakut of Siberia, and Mayans of Mexico. We discover that approximately 25.0% of the variation in gene expression found amongst individuals can be attributed to population differences. However, we find few genes that are systematically differentially expressed among populations. Of this population-specific variation, 75.5% is due to expression rather than splicing variability, and we find few genes with strong evidence for differential splicing across populations. Allelic expression analyses indicate that previously mapped common regulatory variants identified in eight populations from the International Haplotype Map Phase 3 project have similar effects in our seven sampled HGDP populations, suggesting that the cellular effects of common variants are shared across diverse populations. Together, these results provide a resource for studies analyzing functional differences across populations by estimating the degree of shared gene expression, alternative splicing, and

  19. Remarkable sequence conservation of the last intron in the PKD1 gene.

    Science.gov (United States)

    Rodova, Marianna; Islam, M Rafiq; Peterson, Kenneth R; Calvet, James P

    2003-10-01

    The last intron of the PKD1 gene (intron 45) was found to have exceptionally high sequence conservation across four mammalian species: human, mouse, rat, and dog. This conservation did not extend to the comparable intron in pufferfish. Pairwise comparisons for intron 45 showed 91% identity (human vs. dog) to 100% identity (mouse vs. rat) for an average for all four species of 94% identity. In contrast, introns 43 and 44 of the PKD1 gene had average pairwise identities of 57% and 54%, and exons 43, 44, and 45 and the coding region of exon 46 had average pairwise identities of 80%, 84%, 82%, and 80%. Intron 45 is 90 to 95 bp in length, with the major region of sequence divergence being in a central 4-bp to 9-bp variable region. RNA secondary structure analysis of intron 45 predicts a branching stem-loop structure in which the central variable region lies in one loop and the putative branch point sequence lies in another loop, suggesting that the intron adopts a specific stem-loop structure that may be important for its removal. Although intron 45 appears to conform to the class of small, G-triplet-containing introns that are spliced by a mechanism utilizing intron definition, its high sequence conservation may be a reflection of constraints imposed by a unique mechanism that coordinates splicing of this last PKD1 intron with polyadenylation.
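
    The pairwise-identity figures quoted above are averages over all species pairs. A minimal sketch of that calculation for pre-aligned sequences is shown below. The toy sequences are invented for illustration (they are not PKD1 data), the function names are hypothetical, and ignoring gapped columns is just one of several conventions; the paper may have used a different one.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity of two aligned sequences, ignoring columns with a gap in either."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def average_pairwise_identity(aligned):
    """Mean percent identity over every unordered pair in a {name: sequence} dict."""
    scores = [percent_identity(aligned[p], aligned[q])
              for p, q in combinations(sorted(aligned), 2)]
    return sum(scores) / len(scores)


# Invented toy alignment, for illustration only.
toy = {"sp1": "ACGT", "sp2": "ACGA", "sp3": "ACGT"}
print(round(average_pairwise_identity(toy), 2))
```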

  20. Genome-wide analysis of regulatory proteases sequences identified through bioinformatics data mining in Taenia solium.

    Science.gov (United States)

    Yan, Hong-Bin; Lou, Zhong-Zi; Li, Li; Brindley, Paul J; Zheng, Yadong; Luo, Xuenong; Hou, Junling; Guo, Aijiang; Jia, Wan-Zhong; Cai, Xuepeng

    2014-06-04

    Cysticercosis remains a major neglected tropical disease of humanity in many regions, especially in sub-Saharan Africa, Central America and elsewhere. Owing to the emerging drug resistance and the inability of current drugs to prevent re-infection, identification of novel vaccines and chemotherapeutic agents against Taenia solium and related helminth pathogens is a public health priority. The T. solium genome and the predicted proteome were reported recently, providing a wealth of information from which new interventional targets might be identified. In order to characterize and classify the entire repertoire of protease-encoding genes of T. solium, which play fundamental biological roles in all life processes, we analyzed the predicted proteins of this cestode through a combination of bioinformatics tools. Functional annotation was performed to yield insights into the signaling processes relevant to the complex developmental cycle of this tapeworm and to highlight a suite of the proteases as potential intervention targets. Within the genome of this helminth parasite, we identified 200 open reading frames encoding proteases from five clans, which correspond to 1.68% of the 11,902 protein-encoding genes predicted to be present in its genome. These proteases include calpains, cytosolic, mitochondrial signal peptidases, ubiquitylation related proteins, and others. Many not only show significant similarity to proteases in the Conserved Domain Database but have conserved active sites and catalytic domains. KEGG Automatic Annotation Server (KAAS) analysis indicated that ~60% of these proteases share strong sequence identities with proteins of the KEGG database, which are involved in human disease, metabolic pathways, genetic information processes, cellular processes, environmental information processes and organismal systems. Also, we identified signal peptides and transmembrane helices through comparative analysis with classes of important regulatory proteases

  1. Prediction of transcriptional regulatory sites in the complete genome sequence of Escherichia coli K-12.

    Science.gov (United States)

    Thieffry, D; Salgado, H; Huerta, A M; Collado-Vides, J

    1998-06-01

    As one of the best-characterized free-living organisms, Escherichia coli and its recently completed genomic sequence offer a special opportunity to exploit systematically the variety of regulatory data available in the literature in order to make a comprehensive set of regulatory predictions in the whole genome. The complete genome sequence of E.coli was analyzed for the binding of transcriptional regulators upstream of coding sequences. The biological information contained in RegulonDB (Huerta, A.M. et al., Nucleic Acids Res.,26,55-60, 1998) for 56 different transcriptional proteins was the support to implement a stringent strategy combining string search and weight matrices. We estimate that our search included representatives of 15-25% of the total number of regulatory binding proteins in E.coli. This search was performed on the set of 4288 putative regulatory regions, each 450 bp long. Within the regions with predicted sites, 89% are regulated by one protein and 81% involve only one site. These numbers are reasonably consistent with the distribution of experimental regulatory sites. Regulatory sites are found in 603 regions corresponding to 16% of operon regions and 10% of intra-operonic regions. Additional evidence gives stronger support to some of these predictions, including the position of the site, biological consistency with the function of the downstream gene, as well as genetic evidence for the regulatory interaction. The predictions described here were incorporated into the map presented in the paper describing the complete E.coli genome (Blattner,F.R. et al., Science, 277, 1453-1461, 1997). The complete set of predictions in GenBank format is available at the url: http://www.cifn.unam.mx/Computational_Biology/E.coli-predictions ecoli-reg@cifn.unam.mx, collado@cifn.unam.mx
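
    The weight-matrix half of the strategy mentioned above can be illustrated with a small sketch: build a log-odds position weight matrix from aligned example sites, then slide it along a region and report windows scoring above a threshold. The example sites, the pseudocount, the uniform background and the threshold are all assumptions chosen for illustration. They are not the RegulonDB data or the parameters used by the authors, and the string-search step is not shown.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds weight matrix from equal-length aligned sites (uppercase A/C/G/T only)."""
    pwm = []
    for i in range(len(sites[0])):
        column = [s[i] for s in sites]
        total = len(column) + 4 * pseudocount
        pwm.append({b: math.log2(((column.count(b) + pseudocount) / total) / background)
                    for b in BASES})
    return pwm


def scan(seq, pwm, threshold):
    """Return (position, score) for every window whose summed score meets the threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        score = sum(pwm[j][seq[i + j]] for j in range(width))
        if score >= threshold:
            hits.append((i, round(score, 2)))
    return hits


# Hypothetical binding sites and a hypothetical upstream region.
pwm = build_pwm(["TGTGA", "TGTGA", "TGTGA", "TGAGA"])
print(scan("CCTGTGACC", pwm, threshold=3.0))
```

    In practice the threshold controls the trade-off between missed sites and false predictions, which is presumably why the abstract describes the overall strategy as stringent.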

  2. Extreme sequence divergence but conserved ligand-binding specificity in Streptococcus pyogenes M protein.

    Directory of Open Access Journals (Sweden)

    2006-05-01

    Full Text Available Many pathogenic microorganisms evade host immunity through extensive sequence variability in a protein region targeted by protective antibodies. In spite of the sequence variability, a variable region commonly retains an important ligand-binding function, reflected in the presence of a highly conserved sequence motif. Here, we analyze the limits of sequence divergence in a ligand-binding region by characterizing the hypervariable region (HVR) of Streptococcus pyogenes M protein. Our studies were focused on HVRs that bind the human complement regulator C4b-binding protein (C4BP), a ligand that confers phagocytosis resistance. A previous comparison of C4BP-binding HVRs identified residue identities that could be part of a binding motif, but the extended analysis reported here shows that no residue identities remain when additional C4BP-binding HVRs are included. Characterization of the HVR in the M22 protein indicated that two relatively conserved Leu residues are essential for C4BP binding, but these residues are probably core residues in a coiled-coil, implying that they do not directly contribute to binding. In contrast, substitution of either of two relatively conserved Glu residues, predicted to be solvent-exposed, had no effect on C4BP binding, although each of these changes had a major effect on the antigenic properties of the HVR. Together, these findings show that HVRs of M proteins have an extraordinary capacity for sequence divergence and antigenic variability while retaining a specific ligand-binding function.

  3. Comparative analyses of six solanaceous transcriptomes reveal a high degree of sequence conservation and species-specific transcripts

    Directory of Open Access Journals (Sweden)

    Ouyang Shu

    2005-09-01

    Full Text Available Abstract Background The Solanaceae is a family of closely related species with diverse phenotypes that have been exploited for agronomic purposes. Previous studies involving a small number of genes suggested sequence conservation across the Solanaceae. The availability of large collections of Expressed Sequence Tags (ESTs) for the Solanaceae now provides the opportunity to assess sequence conservation and divergence on a genomic scale. Results All available ESTs and Expressed Transcripts (ETs), 449,224 sequences for six Solanaceae species (potato, tomato, pepper, petunia, tobacco and Nicotiana benthamiana), were clustered and assembled into gene indices. Examination of gene ontologies revealed that the transcripts within the gene indices encode a similar suite of biological processes. Although the ESTs and ETs were derived from a variety of tissues, 55–81% of the sequences had significant similarity at the nucleotide level with sequences among the six species. Putative orthologs could be identified for 28–58% of the sequences. This high degree of sequence conservation was supported by expression profiling using heterologous hybridizations to potato cDNA arrays that showed similar expression patterns in mature leaves for all six solanaceous species. 16–19% of the transcripts within the six Solanaceae gene indices did not have matches among Solanaceae, Arabidopsis, rice or 21 other plant gene indices. Conclusion Results from this genome scale analysis confirmed a high level of sequence conservation at the nucleotide level of the coding sequence among Solanaceae. Additionally, the results indicated that part of the Solanaceae transcriptome is likely to be unique for each species.

  4. The identification and functional annotation of RNA structures conserved in vertebrates

    DEFF Research Database (Denmark)

    Seemann, Ernst Stefan; Mirza, Aashiq Hussain; Hansen, Claus

    2017-01-01

    Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for Conserved RNA Structures (CRSs), leveraging structure-based, rather than sequence-based, alignments. After careful correction for sequence identity and GC content, we predict ~516k human genomic regions containing CRSs. We find that a substantial fraction of human-mouse CRS regions (i) co-localize consistently with binding sites of the same RNA binding proteins ... -structured counterparts. Our findings of transcribed uncharacterized regulatory regions that contain CRSs support their RNA-mediated functionality.

  5. Systematic discovery of regulatory motifs in Fusarium graminearum by comparing four Fusarium genomes

    Directory of Open Access Journals (Sweden)

    Kistler Corby

    2010-03-01

    Full Text Available Abstract Background Fusarium graminearum (Fg), a major fungal pathogen of cultivated cereals, is responsible for billions of dollars in agricultural losses. There is a growing interest in understanding the transcriptional regulation of this organism, especially the regulation of genes underlying its pathogenicity. The generation of whole genome sequence assemblies for Fg and three closely related Fusarium species provides a unique opportunity for such a study. Results Applying comparative genomics approaches, we developed a computational pipeline to systematically discover evolutionarily conserved regulatory motifs in the promoter, downstream and the intronic regions of Fg genes, based on the multiple alignments of sequenced Fusarium genomes. Using this method, we discovered 73 candidate regulatory motifs in the promoter regions. Nearly 30% of these motifs are highly enriched in promoter regions of Fg genes that are associated with a specific functional category. Through comparison to Saccharomyces cerevisiae (Sc) and Schizosaccharomyces pombe (Sp), we observed conservation of transcription factors (TFs), their binding sites and the target genes regulated by these TFs related to pathways known to respond to stress conditions or phosphate metabolism. In addition, this study revealed 69 and 39 conserved motifs in the downstream regions and the intronic regions, respectively, of Fg genes. The top intronic motif is the splice donor site. For the downstream regions, we noticed an intriguing absence of the mammalian and Sc poly-adenylation signals among the list of conserved motifs. Conclusion This study provides the first comprehensive list of candidate regulatory motifs in Fg, and underscores the power of comparative genomics in revealing functional elements among related genomes. The conservation of regulatory pathways among the Fusarium genomes and the two yeast species reveals their functional significance, and provides new insights in their

  6. Comparative Bioinformatics Analysis of Transcription Factor Genes Indicates Conservation of Key Regulatory Domains among Babesia bovis, Babesia microti, and Theileria equi.

    Science.gov (United States)

    Alzan, Heba F; Knowles, Donald P; Suarez, Carlos E

    2016-11-01

    Apicomplexa tick-borne hemoparasites, including Babesia bovis, Babesia microti, and Theileria equi are responsible for bovine and human babesiosis and equine theileriosis, respectively. These parasites of vast medical, epidemiological, and economic impact have complex life cycles in their vertebrate and tick hosts. Large gaps in knowledge concerning the mechanisms used by these parasites for gene regulation remain. Regulatory genes coding for DNA binding proteins such as members of the Api-AP2, HMG, and Myb families are known to play crucial roles as transcription factors. Although the repertoire of Api-AP2 has been defined and a HMG gene was previously identified in the B. bovis genome, these regulatory genes have not been described in detail in B. microti and T. equi. In this study, comparative bioinformatics was used to: (i) identify and map genes encoding for these transcription factors among three parasites' genomes; (ii) identify a previously unreported HMG gene in B. microti; (iii) define a repertoire of eight conserved Myb genes; and (iv) identify AP2 correlates among B. bovis and the better-studied Plasmodium parasites. Searching the available transcriptome of B. bovis defined patterns of transcription of these three gene families in B. bovis erythrocyte stage parasites. Sequence comparisons show conservation of functional domains and general architecture in the AP2, Myb, and HMG proteins, which may be significant for the regulation of common critical parasite life cycle transitions in B. bovis, B. microti, and T. equi. A detailed understanding of the role of gene families encoding DNA binding proteins will provide new tools for unraveling regulatory mechanisms involved in B. bovis, B. microti, and T. equi life cycles and environmental adaptive responses and potentially contributes to the development of novel convergent strategies for improved control of babesiosis and equine piroplasmosis.

  7. Comparative Bioinformatics Analysis of Transcription Factor Genes Indicates Conservation of Key Regulatory Domains among Babesia bovis, Babesia microti, and Theileria equi.

    Directory of Open Access Journals (Sweden)

    Heba F Alzan

    2016-11-01

    Full Text Available Apicomplexa tick-borne hemoparasites, including Babesia bovis, Babesia microti, and Theileria equi are responsible for bovine and human babesiosis and equine theileriosis, respectively. These parasites of vast medical, epidemiological, and economic impact have complex life cycles in their vertebrate and tick hosts. Large gaps in knowledge concerning the mechanisms used by these parasites for gene regulation remain. Regulatory genes coding for DNA binding proteins such as members of the Api-AP2, HMG, and Myb families are known to play crucial roles as transcription factors. Although the repertoire of Api-AP2 has been defined and a HMG gene was previously identified in the B. bovis genome, these regulatory genes have not been described in detail in B. microti and T. equi. In this study, comparative bioinformatics was used to: (i) identify and map genes encoding for these transcription factors among three parasites' genomes; (ii) identify a previously unreported HMG gene in B. microti; (iii) define a repertoire of eight conserved Myb genes; and (iv) identify AP2 correlates among B. bovis and the better-studied Plasmodium parasites. Searching the available transcriptome of B. bovis defined patterns of transcription of these three gene families in B. bovis erythrocyte stage parasites. Sequence comparisons show conservation of functional domains and general architecture in the AP2, Myb, and HMG proteins, which may be significant for the regulation of common critical parasite life cycle transitions in B. bovis, B. microti, and T. equi. A detailed understanding of the role of gene families encoding DNA binding proteins will provide new tools for unraveling regulatory mechanisms involved in B. bovis, B. microti, and T. equi life cycles and environmental adaptive responses and potentially contributes to the development of novel convergent strategies for improved control of babesiosis and equine piroplasmosis.

  8. Implications of duplicated cis-regulatory elements in the evolution of metazoans: the DDI model or how simplicity begets novelty.

    Science.gov (United States)

    Jiménez-Delgado, Senda; Pascual-Anaya, Juan; Garcia-Fernàndez, Jordi

    2009-07-01

    The discovery that most regulatory genes were conserved among animals from distant phyla challenged the ideas that gene duplication and divergence of homologous coding sequences were the basis for major morphological changes in metazoan evolution. In recent years, however, interest in the roles, conservation and changes of non-coding sequences grew in parallel with genome sequencing projects. Presently, many independent studies are highlighting the importance that subtle changes in cis-regulatory regions had in the evolution of morphology through the Animal Kingdom. Here we will show and discuss some of these studies, and underscore the future of cis-Evo-Devo research. Nevertheless, we also explore how gene duplication, which includes duplication of regulatory regions, may have been critical for spatial or temporal co-option of new regulatory networks, causing the deployment of new transcriptome scenarios, and how these induced morphological changes were critical for the evolution of new forms. Forty years after Susumu Ohno's famous sentence 'natural selection merely modifies, while redundancy creates', we suggest the alternative: 'natural selection modifies, while redundancy of cis-regulatory elements innovates', and propose the Duplication-Degeneration-Innovation model to explain the increased evolvability of duplicated cis-regulatory regions. Paradoxically, making regulation simpler by subfunctionalization paved the path for future complexity or, in other words, 'to make it simple to make it complex'.

  9. Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi.

    Science.gov (United States)

    Subramanian, Sankar; Huynen, Leon; Millar, Craig D; Lambert, David M

    2010-12-15

    Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli) and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.

  10. Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi

    Directory of Open Access Journals (Sweden)

    Huynen Leon

    2010-12-01

    Background: Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli) and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Results: Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. Conclusions: The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.

  11. Metazoan Remaining Genes for Essential Amino Acid Biosynthesis: Sequence Conservation and Evolutionary Analyses

    Directory of Open Access Journals (Sweden)

    Igor R. Costa

    2014-12-01

    Essential amino acids (EAA) consist of a group of nine amino acids that animals are unable to synthesize via de novo pathways. Recently, it has been found that most metazoans lack the same set of enzymes responsible for the de novo EAA biosynthesis. Here we investigate the sequence conservation and evolution of all the metazoan remaining genes for EAA pathways. Initially, the set of all 49 enzymes responsible for the EAA de novo biosynthesis in yeast was retrieved. These enzymes were used as BLAST queries to search for similar sequences in a database containing 10 complete metazoan genomes. Eight enzymes typically attributed to EAA pathways were found to be ubiquitous in metazoan genomes, suggesting a conserved functional role. In this study, we address the question of how these genes evolved after losing their pathway partners. To do this, we compared metazoan genes with their fungal and plant orthologs. Using phylogenetic analysis with maximum likelihood, we found that acetolactate synthase (ALS) and betaine-homocysteine S-methyltransferase (BHMT) diverged from the expected Tree of Life (ToL) relationships. High sequence conservation in the paraphyletic group Plant-Fungi was identified for these two genes using a newly developed Python algorithm. Selective pressure analysis of ALS and BHMT protein sequences showed higher non-synonymous mutation ratios in comparisons between metazoans/fungi and metazoans/plants, supporting the hypothesis that these two genes have undergone non-ToL evolution in animals.

  12. Conserved-peptide upstream open reading frames (CPuORFs) are associated with regulatory genes in angiosperms

    Directory of Open Access Journals (Sweden)

    Richard A Jorgensen

    2012-08-01

    Upstream open reading frames (uORFs) are common in eukaryotic transcripts, but those that encode conserved peptides (CPuORFs) occur in less than 1% of transcripts. The peptides encoded by three plant CPuORF families are known to control translation of the downstream ORF in response to a small signal molecule (sucrose, polyamines and phosphocholine). In flowering plants, transcription factors are statistically over-represented among genes that possess CPuORFs, and in general it appeared that many CPuORF genes also had other regulatory functions, though the significance of this suggestion was uncertain (Hayden and Jorgensen, 2007). Five years later the literature provides much more information on the functions of many CPuORF genes. Here we reassess the functions of 27 known CPuORF gene families and find that 22 of these families play a variety of different regulatory roles, from transcriptional control to protein turnover, and from small signal molecules to signal transduction kinases. Clearly then, there is indeed a strong association of CPuORFs with regulatory genes. In addition, 16 of these families play key roles in a variety of different biological processes. Most strikingly, the core sucrose response network includes three different CPuORFs, creating the potential for sophisticated balancing of the network in response to three different molecular inputs. We propose that the function of most CPuORFs is to modulate translation of a downstream major ORF (mORF) in response to a signal molecule recognized by the conserved peptide and that because the mORFs of CPuORF genes generally encode regulatory proteins, many of them centrally important in the biology of plants, CPuORFs play key roles in balancing such regulatory networks.

  13. Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura

    Energy Technology Data Exchange (ETDEWEB)

    Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.; Salzberg, Steven L.; Rubin, Gerald M.; Eisen, Michael B.; Celniker, SusanE.

    2004-08-06

    The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Measuring conservation of sequence features closely linked to function--such as binding-site clustering--makes better use of comparative sequence data than commonly used methods that examine only sequence identity.
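
    The cluster search described in this abstract can be illustrated with a small sliding-window sketch. It is not the authors' pipeline: the window size, the minimum site count and the function name `find_site_clusters` are assumptions chosen for illustration, and the input is taken to be a plain list of predicted binding-site start coordinates.

```python
from typing import List, Tuple


def find_site_clusters(
    site_positions: List[int],
    window: int = 700,      # hypothetical window size in bp
    min_sites: int = 13,    # hypothetical minimum number of sites per window
) -> List[Tuple[int, int]]:
    """Return merged (start, end) spans where at least `min_sites`
    predicted sites fall within `window` bp of each other."""
    positions = sorted(site_positions)
    clusters: List[Tuple[int, int]] = []
    start = 0
    for end, pos in enumerate(positions):
        # Shrink the window from the left until it spans less than `window` bp.
        while pos - positions[start] >= window:
            start += 1
        if end - start + 1 >= min_sites:
            span_start = positions[start]
            if clusters and span_start <= clusters[-1][1]:
                # Overlaps the previous dense span: extend it.
                clusters[-1] = (clusters[-1][0], pos)
            else:
                clusters.append((span_start, pos))
    return clusters


if __name__ == "__main__":
    sites = [10, 20, 30, 500, 1000, 1010, 1020, 1030]
    print(find_site_clusters(sites, window=50, min_sites=3))
```

    Running the same function on the orthologous region of a second genome and checking whether a dense span is also found there would be one simple way to ask whether clustering, rather than primary sequence, is conserved.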

  14. KIRMES: kernel-based identification of regulatory modules in euchromatic sequences.

    Science.gov (United States)

    Schultheiss, Sebastian J; Busch, Wolfgang; Lohmann, Jan U; Kohlbacher, Oliver; Rätsch, Gunnar

    2009-08-15

    Understanding transcriptional regulation is one of the main challenges in computational biology. An important problem is the identification of transcription factor (TF) binding sites in promoter regions of potential TF target genes. It is typically approached by position weight matrix-based motif identification algorithms using Gibbs sampling, or heuristics to extend seed oligos. Such algorithms succeed in identifying single, relatively well-conserved binding sites, but tend to fail when it comes to the identification of combinations of several degenerate binding sites, as those often found in cis-regulatory modules. We propose a new algorithm that combines the benefits of existing motif finding with the ones of support vector machines (SVMs) to find degenerate motifs in order to improve the modeling of regulatory modules. In experiments on microarray data from Arabidopsis thaliana, we were able to show that the newly developed strategy significantly improves the recognition of TF targets. The python source code (open source-licensed under GPL), the data for the experiments and a Galaxy-based web service are available at http://www.fml.mpg.de/raetsch/suppl/kirmes/.

  15. HMMerThread: detecting remote, functional conserved domains in entire genomes by combining relaxed sequence-database searches with fold recognition.

    Directory of Open Access Journals (Sweden)

    Charles Richard Bradshaw

    Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in

  16. Conserved antigenic sites between MERS-CoV and Bat-coronavirus are revealed through sequence analysis.

    Science.gov (United States)

    Sharmin, Refat; Islam, Abul B M M K

    2016-01-01

    MERS-CoV is a newly emerged human coronavirus reported to be closely related to the HKU4 and HKU5 bat coronaviruses. Bat and MERS coronaviruses are structurally related. Therefore, it is of interest to estimate the degree of conservation of antigenic sites among them. It is important to elucidate the shared antigenic sites and the extent of conservation between them in order to understand the evolutionary dynamics of MERS-CoV. Multiple sequence alignment of the spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins was employed to identify the sequence conservation among MERS and bat (HKU4, HKU5) coronaviruses. We used various in silico tools to predict the conserved antigenic sites. We found that MERS-CoV shared 30% of its S protein antigenic sites with HKU4 and 70% with HKU5 bat-CoV, whereas 100% of its E, M and N protein antigenic sites were conserved with those in HKU4 and HKU5. This sharing suggests that, in terms of pathogenicity, MERS-CoV is more closely related to HKU5 bat-CoV than to HKU4 bat-CoV. The conserved epitopes indicate their evolutionary relationship and ancestry of pathogenicity.

  17. Sequence recombination and conservation of Varroa destructor virus-1 and deformed wing virus in field collected honey bees (Apis mellifera).

    Directory of Open Access Journals (Sweden)

    Hui Wang

    We sequenced small (s) RNAs from field collected honeybees (Apis mellifera) and bumblebees (Bombus pascuorum) using the Illumina technology. The sRNA reads were assembled and resulting contigs were used to search for virus homologues in GenBank. Matches with Varroa destructor virus-1 (VDV1) and Deformed wing virus (DWV) genomic sequences were obtained for A. mellifera but not B. pascuorum. Further analyses suggested that the prevalent virus population was composed of VDV-1 and a chimera of 5'-DWV-VDV1-DWV-3'. The recombination junctions in the chimera genomes were confirmed by using RT-PCR, cDNA cloning and Sanger sequencing. We then focused on conserved short fragments (CSF, size > 25 nt) in the virus genomes by using GenBank sequences and the deep sequencing data obtained in this study. The majority of CSF sites confirmed conservation at both between-species (GenBank sequences) and within-population (dataset of this study) levels. However, conserved nucleotide positions in the GenBank sequences might be variable at the within-population level. High mutation rates (Pi > 10%) were observed at a number of sites using the deep sequencing data, suggesting that sequence conservation might not always be maintained at the population level. Virus-host interactions and strategies for developing RNAi treatments against VDV1/DWV infections are discussed.
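
    One way to picture the search for conserved short fragments (size > 25 nt) is as a scan for runs of identical columns in a multiple alignment. The sketch below assumes pre-aligned, equal-length sequences with "-" as the gap character; the function name and the toy sequences are hypothetical and the abstract does not say how the authors located CSFs.

```python
from typing import List, Tuple


def conserved_short_fragments(
    aligned: List[str], min_len: int = 26  # "size > 25 nt" in the abstract
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges in which every aligned
    sequence carries the same non-gap nucleotide for at least `min_len` columns."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    fragments: List[Tuple[int, int]] = []
    run_start = None
    # Iterate one column past the end so a run reaching the last column is closed.
    for i in range(length + 1):
        identical = (
            i < length
            and aligned[0][i] != "-"
            and all(seq[i] == aligned[0][i] for seq in aligned)
        )
        if identical:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_len:
                fragments.append((run_start, i))
            run_start = None
    return fragments


if __name__ == "__main__":
    seq_a = "A" * 30 + "C" + "G" * 10
    seq_b = "A" * 30 + "T" + "G" * 10
    print(conserved_short_fragments([seq_a, seq_b]))
```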

  18. Relationships between residue Voronoi volume and sequence conservation in proteins.

    Science.gov (United States)

    Liu, Jen-Wei; Cheng, Chih-Wen; Lin, Yu-Feng; Chen, Shao-Yu; Hwang, Jenn-Kang; Yen, Shih-Chung

    2018-02-01

    Functional and biophysical constraints can cause different levels of sequence conservation in proteins. Previously, structural properties, e.g., relative solvent accessibility (RSA) and packing density of the weighted contact number (WCN), have been found to be related to protein sequence conservation (CS). The Voronoi volume has recently been recognized as a new structural property of the local protein structural environment reflecting CS. However, for surface residues, it is sensitive to water molecules surrounding the protein structure. Herein, we present a simple structural determinant termed the relative space of Voronoi volume (RSV); it uses the Voronoi volume and the van der Waals volume of particular residues to quantify the local structural environment. RSV (range, 0-1) is defined as (Voronoi volume-van der Waals volume)/Voronoi volume of the target residue. The concept of RSV describes the extent of available space for every protein residue. RSV and Voronoi profiles with and without water molecules (RSVw, RSV, VOw, and VO) were compared for 554 non-homologous proteins. RSV (without water) showed better Pearson's correlations with CS than did RSVw, VO, or VOw values. The mean correlation coefficient between RSV and CS was 0.51, which is comparable to the correlation between RSA and CS (0.49) and that between WCN and CS (0.56). RSV is a robust structural descriptor with and without water molecules and can quantitatively reflect evolutionary information in a single protein structure. Therefore, it may represent a practical structural determinant to study protein sequence, structure, and function relationships. Copyright © 2017 Elsevier B.V. All rights reserved.
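
    The RSV definition given in this abstract, (Voronoi volume - van der Waals volume) / Voronoi volume, can be written as a short function. This is only a sketch of the stated formula: the function name and the example volumes are hypothetical, and computing the Voronoi and van der Waals volumes themselves is outside its scope.

```python
def relative_space_of_voronoi_volume(voronoi_volume: float, vdw_volume: float) -> float:
    """RSV = (Voronoi volume - van der Waals volume) / Voronoi volume.

    Both volumes must be in the same units (for example cubic angstroms).
    """
    if voronoi_volume <= 0:
        raise ValueError("Voronoi volume must be positive")
    if not 0 <= vdw_volume <= voronoi_volume:
        raise ValueError("van der Waals volume must lie between 0 and the Voronoi volume")
    return (voronoi_volume - vdw_volume) / voronoi_volume


if __name__ == "__main__":
    # Hypothetical residue: 100 units of Voronoi volume, 60 of van der Waals volume.
    print(relative_space_of_voronoi_volume(100.0, 60.0))
```

    With these input checks the result always falls in the 0-1 range quoted in the abstract; a value near 0 means the residue fills almost all of its Voronoi cell.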

  19. Discovery and profiling of novel and conserved microRNAs during flower development in Carya cathayensis via deep sequencing.

    Science.gov (United States)

    Wang, Zheng Jia; Huang, Jian Qin; Huang, You Jun; Li, Zheng; Zheng, Bing Song

    2012-08-01

    Hickory (Carya cathayensis Sarg.) is an economically important woody plant in China, but its long juvenile phase delays yield. MicroRNAs (miRNAs) are critical regulators of genes and important for normal plant development and physiology, including flower development. We used Solexa technology to sequence two small RNA libraries from two floral differentiation stages in hickory to identify miRNAs related to flower development. We identified 39 conserved miRNA sequences from 114 loci belonging to 23 families as well as two novel and ten potential novel miRNAs belonging to nine families. Moreover, 35 conserved miRNA*s and two novel miRNA*s were detected. Twenty miRNA sequences from 49 loci belonging to 11 families were differentially expressed; all were up-regulated at the later stage of flower development in hickory. Quantitative real-time PCR of 12 conserved miRNA sequences, five novel miRNA families, and two novel miRNA*s validated that all were expressed during hickory flower development, and the expression patterns were similar to those detected with Solexa sequencing. Finally, a total of 146 targets of the novel and conserved miRNAs were predicted. This study identified a diverse set of miRNAs that were closely related to hickory flower development and that could help in plant floral induction.

  20. FDA's Activities Supporting Regulatory Application of "Next Gen" Sequencing Technologies.

    Science.gov (United States)

    Wilson, Carolyn A; Simonyan, Vahan

    2014-01-01

    Applications of next-generation sequencing (NGS) technologies require availability and access to an information technology (IT) infrastructure and bioinformatics tools for large amounts of data storage and analyses. The U.S. Food and Drug Administration (FDA) anticipates that the use of NGS data to support regulatory submissions will continue to increase as the scientific and clinical communities become more familiar with the technologies and identify more ways to apply these advanced methods to support development and evaluation of new biomedical products. FDA laboratories are conducting research on different NGS platforms and developing the IT infrastructure and bioinformatics tools needed to enable regulatory evaluation of the technologies and the data sponsors will submit. A High-performance Integrated Virtual Environment, or HIVE, has been launched, and development and refinement continues as a collaborative effort between the FDA and George Washington University to provide the tools to support these needs. The use of a highly parallelized environment facilitated by use of distributed cloud storage and computation has resulted in a platform that is both rapid and responsive to changing scientific needs. The FDA plans to further develop in-house capacity in this area, while also supporting engagement by the external community, by sponsoring an open, public workshop to discuss NGS technologies and data formats standardization, and to promote the adoption of interoperability protocols in September 2014. Next-generation sequencing (NGS) technologies are enabling breakthroughs in how the biomedical community is developing and evaluating medical products. One example is the potential application of this method to the detection and identification of microbial contaminants in biologic products. In order for the U.S. Food and Drug Administration (FDA) to be able to evaluate the utility of this technology, we need to have the information technology infrastructure and

  1. SRD: a Staphylococcus regulatory RNA database.

    Science.gov (United States)

    Sassi, Mohamed; Augagneur, Yoann; Mauro, Tony; Ivain, Lorraine; Chabelskaya, Svetlana; Hallier, Marc; Sallou, Olivier; Felden, Brice

    2015-05-01

    An abundance of regulatory RNAs (sRNAs) has been identified in a wide range of bacteria. We designed and implemented a new resource for the hundreds of sRNAs identified in Staphylococci, with primary focus on the human pathogen Staphylococcus aureus. The "Staphylococcal Regulatory RNA Database" (SRD, http://srd.genouest.org/) compiles all published data in a single interface including genetic locations, sequences and other features. SRD proposes novel and simplified identifiers for Staphylococcal regulatory RNAs (srn) based on the sRNA's genetic location in S. aureus strain N315, which served as a reference. From a set of 894 sequences and after an in-depth cleaning, SRD provides a list of 575 srn free of redundant sequences. For each sRNA, the experimental support is provided, allowing the user to individually assess its validity and significance. RNA-seq analysis performed on strains N315, NCTC8325, and Newman allowed us to provide further details, upgrade the initial annotation, and identify 159 RNA-seq independent transcribed sRNAs. The lists of 575 and 159 sRNA sequences were used to predict the number and location of srns in 18 S. aureus strains and 10 other Staphylococci. A comparison of the srn contents within 32 Staphylococcal genomes revealed a poor conservation between species. In addition, sRNA structure predictions obtained with MFold are accessible. A BLAST server and the intaRNA program, which is dedicated to target prediction, were implemented. SRD is the first sRNA database centered on a genus; it is a user-friendly and scalable device with the possibility to submit new sequences that should spread in the literature. © 2015 Sassi et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  2. Comparative analysis of regulatory elements between Escherichia coli and Klebsiella pneumoniae by genome-wide transcription start site profiling.

    Directory of Open Access Journals (Sweden)

    Donghyuk Kim

    Genome-wide transcription start site (TSS) profiles of the enterobacteria Escherichia coli and Klebsiella pneumoniae were experimentally determined through modified 5' RACE followed by deep sequencing of intact primary mRNA. This identified 3,746 and 3,143 TSSs for E. coli and K. pneumoniae, respectively. Experimentally determined TSSs were then used to define promoter regions and 5' UTRs upstream of coding genes. Comparative analysis of these regulatory elements revealed the use of multiple TSSs, identical sequence motifs of promoter and Shine-Dalgarno sequence, reflecting conserved gene expression apparatuses between the two species. In both species, over 70% of primary transcripts were expressed from operons having orthologous genes during exponential growth. However, expressed orthologous genes in E. coli and K. pneumoniae showed a strikingly different organization of upstream regulatory regions with only 20% identical promoters with TSSs in both species. Over 40% of promoters had TSSs identified in only one species, despite conserved promoter sequences existing in the other species. 662 conserved promoters having TSSs in both species resulted in the same number of comparable 5' UTR pairs, and that regulatory element was found to be the most variant region in sequence among promoter, 5' UTR, and ORF. In K. pneumoniae, 48 sRNAs were predicted and 36 of them were expressed during exponential growth. Among them, 34 orthologous sRNAs between two species were analyzed in depth, and the analysis showed that many sRNAs of K. pneumoniae, including pleiotropic sRNAs such as rprA, arcZ, and sgrS, may work in the same way as in E. coli. These results reveal a new dimension of comparative genomics such that a comparison of two genomes needs to be comprehensive over all levels of genome organization.

  3. Sequence analysis of the L protein of the Ebola 2014 outbreak: Insight into conserved regions and mutations.

    Science.gov (United States)

    Ayub, Gohar; Waheed, Yasir

    2016-06-01

    The 2014 Ebola outbreak was one of the largest that have occurred; it started in Guinea and spread to Nigeria, Liberia and Sierra Leone. Phylogenetic analysis of the current virus species indicated that this outbreak is the result of a divergent lineage of the Zaire ebolavirus. The L protein of Ebola virus (EBOV) is the catalytic subunit of the RNA‑dependent RNA polymerase complex, which, with VP35, is key for the replication and transcription of viral RNA. Earlier sequence analysis demonstrated that the L protein of all non‑segmented negative‑sense (NNS) RNA viruses consists of six domains containing conserved functional motifs. The aim of the present study was to analyze the presence of these motifs in 2014 EBOV isolates, highlight their function and how they may contribute to the overall pathogenicity of the isolates. For this purpose, 81 2014 EBOV L protein sequences were aligned with 475 other NNS RNA viruses, including Paramyxoviridae and Rhabdoviridae viruses. Phylogenetic analysis of all EBOV outbreak L protein sequences was also performed. Analysis of the amino acid substitutions in the 2014 EBOV outbreak was conducted using sequence analysis. The alignment demonstrated the presence of previously conserved motifs in the 2014 EBOV isolates and novel residues. Notably, all the mutations identified in the 2014 EBOV isolates were tolerant, they were pathogenic with certain examples occurring within previously determined functional conserved motifs, possibly altering viral pathogenicity, replication and virulence. The phylogenetic analysis demonstrated that all sequences with the exception of the 2014 EBOV sequences were clustered together. The 2014 EBOV outbreak has acquired a great number of mutations, which may explain the reasons behind this unprecedented outbreak. Certain residues critical to the function of the polymerase remain conserved and may be targets for the development of antiviral therapeutic agents.

  4. Phylogenetic analysis reveals conservation and diversification of micro RNA166 genes among diverse plant species.

    Science.gov (United States)

    Barik, Suvakanta; SarkarDas, Shabari; Singh, Archita; Gautam, Vibhav; Kumar, Pramod; Majee, Manoj; Sarkar, Ananda K

    2014-01-01

    Similar to the majority of the microRNAs, mature miR166s are derived from multiple members of MIR166 genes (precursors) and regulate various aspects of plant development by negatively regulating their target genes (Class III HD-ZIP). The evolutionary conservation or functional diversification of miRNA166 family members remains elusive. Here, we show the phylogenetic relationships among MIR166 precursor and mature sequences from three diverse model plant species. Despite strong conservation, some mature miR166 sequences, such as ppt-miR166m, have undergone sequence variation. Critical sequence variation in ppt-miR166m has led to functional diversification, as it targets non-HD-ZIPIII gene transcript(s). MIR166 precursor sequences have diverged in a lineage-specific manner, and both precursors and mature osa-miR166i/j are highly conserved. Interestingly, polycistronic MIR166s were present in Physcomitrella and Oryza but not in Arabidopsis. The nature of cis-regulatory motifs on the upstream promoter sequences of MIR166 genes indicates their possible contribution to the functional variation observed among miR166 species. Copyright © 2013 Elsevier Inc. All rights reserved.

  5. Sequence and structural analysis of the chitinase insertion domain reveals two conserved motifs involved in chitin-binding.

    Directory of Open Access Journals (Sweden)

    Hai Li

    2010-01-01

    Chitinases are prevalent in life and are found in species including archaea, bacteria, fungi, plants, and animals. They break down chitin, which is the second most abundant carbohydrate in nature after cellulose. Hence, they are important for maintaining a balance between carbon and nitrogen trapped as insoluble chitin in biomass. Chitinases are classified into two families, 18 and 19 glycoside hydrolases. In addition to a catalytic domain, which is a triosephosphate isomerase barrel, many family 18 chitinases contain another module, i.e., chitinase insertion domain. While numerous studies focus on the biological role of the catalytic domain in chitinase activity, the function of the chitinase insertion domain is not completely understood. Bioinformatics offers an important avenue in which to facilitate understanding the role of residues within the chitinase insertion domain in chitinase function. Twenty-seven chitinase insertion domain sequences, which include four experimentally determined structures and span five kingdoms, were aligned and analyzed using a modified sequence entropy parameter. Thirty-two positions with conserved residues were identified. The role of these conserved residues was explored by conducting a structural analysis of a number of holo-enzymes. Hydrogen bonding and van der Waals calculations revealed a distinct subset of four conserved residues constituting two sequence motifs that interact with oligosaccharides. The other conserved residues may be key to the structure, folding, and stability of this domain. Sequence and structural studies of the chitinase insertion domains conducted within the framework of evolution identified four conserved residues which clearly interact with the substrates. Furthermore, evolutionary studies propose a link between the appearance of the chitinase insertion domain and the function of family 18 chitinases in the subfamily A.
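
    The abstract mentions a "modified sequence entropy parameter" without defining it. As a rough illustration of entropy-based conservation scoring in general, the sketch below uses plain, unmodified Shannon entropy per alignment column, which is an assumption and not the authors' parameter. The function names, the toy alignment and the choice to count the gap character as an ordinary symbol are also illustrative.

```python
import math
from collections import Counter
from typing import List


def column_entropy(column: str) -> float:
    """Shannon entropy (in bits) of one alignment column.

    Gap characters are counted as an ordinary symbol here, so a column
    mixing residues and gaps is scored as variable.
    """
    counts = Counter(column)
    total = len(column)
    entropy = 0.0
    for n in counts.values():
        p = n / total
        entropy -= p * math.log2(p)
    return entropy


def conserved_positions(alignment: List[str], max_entropy: float = 0.0) -> List[int]:
    """Return 0-based columns whose entropy does not exceed `max_entropy`."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        i
        for i in range(length)
        if column_entropy("".join(seq[i] for seq in alignment)) <= max_entropy
    ]


if __name__ == "__main__":
    toy_alignment = ["ACDE", "ACDF", "ACGE"]  # hypothetical aligned peptides
    print(conserved_positions(toy_alignment))
```

    Raising `max_entropy` above zero relaxes the criterion from strictly invariant columns to columns dominated by one residue.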

  6. A unique genomic sequence in the Wolf-Hirschhorn syndrome [WHS] region of humans is conserved in the great apes.

    Science.gov (United States)

    Tarzami, S T; Kringstein, A M; Conte, R A; Verma, R S

    1996-10-01

    The Wolf-Hirschhorn syndrome (WHS) is caused by a partial deletion in the short arm of chromosome 4, band 16.3 (4p16.3). A unique-sequence human DNA probe (39 kb) localized within this region has been used to search for sequence homology in the apes' equivalent chromosome 3 by the FISH technique. The WHS loci are conserved in higher primates at the expected position. In contrast, a control probe, which detects alphoid sequences of the pericentromeric region in humans, has diverged in chimpanzee, gorilla, and orangutan. The conservation of WHS loci and divergence of DNA alphoid sequences have further added to the controversy concerning human descent.

  7. Third-Generation Sequencing and Analysis of Four Complete Pig Liver Esterase Gene Sequences in Clones Identified by Screening BAC Library.

    Science.gov (United States)

    Zhou, Qiongqiong; Sun, Wenjuan; Liu, Xiyan; Wang, Xiliang; Xiao, Yuncai; Bi, Dingren; Yin, Jingdong; Shi, Deshi

    2016-01-01

    Pig liver carboxylesterase (PLE) gene sequences in GenBank are incomplete, which has led to difficulties in studying the genetic structure and regulation mechanisms of gene expression of PLE family genes. The aim of this study was to obtain and analyse the complete gene sequences of the PLE family by screening a Rongchang pig BAC library and third-generation PacBio sequencing. After a number of existing incomplete PLE isoform gene sequences were analysed, primers were designed based on conserved regions in PLE exons, and the whole pig genome was used as a template for polymerase chain reaction (PCR) amplification. Specific primers were then selected based on the PCR amplification results. A three-step PCR screening method was used to identify PLE-positive clones by screening a Rongchang pig BAC library, and PacBio third-generation sequencing was performed. BLAST comparisons and other bioinformatics methods were applied for sequence analysis. Five PLE-positive BAC clones, designated BAC-10, BAC-70, BAC-75, BAC-119 and BAC-206, were identified. Sequence analysis yielded the complete sequences of four PLE genes, PLE1, PLE-B9, PLE-C4, and PLE-G2. Complete PLE gene sequences were defined as those containing regulatory sequences, exons, and introns. It was found not only that the PLE exon sequences of the four genes showed a high degree of homology, but also that the intron sequences were highly similar. Additionally, the regulatory region of the genes contained two 720-bp reverse-complement sequences that may have an important function in the regulation of PLE gene expression. This is the first report to confirm the complete sequences of four PLE genes. In addition, the study demonstrates that each PLE isoform is encoded by a single gene and that the various genes exhibit a high degree of sequence homology, suggesting that the PLE family evolved from a single ancestral gene. Obtaining the complete sequences of these PLE genes provides the necessary foundation for

  8. Sample sequencing of vascular plants demonstrates widespread conservation and divergence of microRNAs.

    Science.gov (United States)

    Chávez Montes, Ricardo A; de Fátima Rosas-Cárdenas, Flor; De Paoli, Emanuele; Accerbi, Monica; Rymarquis, Linda A; Mahalingam, Gayathri; Marsch-Martínez, Nayelli; Meyers, Blake C; Green, Pamela J; de Folter, Stefan

    2014-04-23

    Small RNAs are pivotal regulators of gene expression that guide transcriptional and post-transcriptional silencing mechanisms in eukaryotes, including plants. Here we report a comprehensive atlas of sRNA and miRNA from 3 species of algae and 31 representative species across vascular plants, including non-model plants. We sequence and quantify sRNAs from 99 different tissues or treatments across species, resulting in a data set of over 132 million distinct sequences. Using miRBase mature sequences as a reference, we identify the miRNA sequences present in these libraries. We apply diverse profiling methods to examine critical sRNA and miRNA features, such as size distribution, tissue-specific regulation and sequence conservation between species, as well as to predict putative new miRNA sequences. We also develop database resources, computational analysis tools and a dedicated website, http://smallrna.udel.edu/. This study provides new insights on plant sRNAs and miRNAs, and a foundation for future studies.

  9. A Simple Predictive Enhancer Syntax for Hindbrain Patterning Is Conserved in Vertebrate Genomes.

    Directory of Open Access Journals (Sweden)

    Joseph Grice

    Full Text Available Determining the function of regulatory elements is fundamental for our understanding of development, disease and evolution. However, the sequence features that mediate these functions are often unclear and the prediction of tissue-specific expression patterns from sequence alone is non-trivial. Previous functional studies have demonstrated a link between PBX-HOX and MEIS/PREP binding interactions and hindbrain enhancer activity, but the defining grammar of these sites, if any exists, has remained elusive. Here, we identify a shared sequence signature (syntax) within a heterogeneous set of conserved vertebrate hindbrain enhancers composed of spatially co-occurring PBX-HOX and MEIS/PREP transcription factor binding motifs. We use this syntax to accurately predict hindbrain enhancers in 89% of cases (67/75 predicted elements) from a set of conserved non-coding elements (CNEs). Furthermore, mutagenesis of the sites abolishes activity or generates ectopic expression, demonstrating their requirement for segmentally restricted enhancer activity in the hindbrain. We refine and use our syntax to predict over 3,000 hindbrain enhancers across the human genome. These sequences tend to be located near developmental transcription factors and are enriched in known hindbrain activating elements, demonstrating the predictive power of this simple model. Our findings support the theory that hundreds of CNEs, and perhaps thousands of regions across the human genome, function to coordinate gene expression in the developing hindbrain. We speculate that deeply conserved sequences of this kind contributed to the co-option of new genes into the hindbrain gene regulatory network during early vertebrate evolution by linking patterns of hox expression to downstream genes involved in segmentation and patterning, and evolutionarily newer instances may have continued to contribute to lineage-specific elaboration of the hindbrain.
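
    The "syntax" described above is a spatial co-occurrence of two motif classes. The sketch below shows one simple way such a rule could be encoded. The consensus patterns, the 50-bp spacing limit, and the function name are assumptions chosen for illustration; the abstract does not state the actual motif models or distance constraints used in the study.

```python
import re

# Hypothetical consensus patterns (assumptions, not taken from the abstract).
PBX_HOX = re.compile(r"TGAT[ACGT]{2}AT")   # forward strand only, for brevity
MEIS_PREP = re.compile(r"TGACAG|CTGTCA")   # motif and its reverse complement


def has_syntax(seq: str, max_gap: int = 50) -> bool:
    """Return True if a PBX-HOX-like and a MEIS/PREP-like match start
    within max_gap bases of each other (max_gap is an assumed value)."""
    seq = seq.upper()
    pbx = [m.start() for m in PBX_HOX.finditer(seq)]
    meis = [m.start() for m in MEIS_PREP.finditer(seq)]
    return any(abs(p - q) <= max_gap for p in pbx for q in meis)


# The reported success rate follows directly from the counts in the abstract.
success_rate = 67 / 75  # about 0.893, reported as 89%
```

    A production scan would normally use position weight matrices and both strands for every motif; regular expressions are used here only to keep the co-occurrence logic readable.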

  10. Regulatory interventions necessitated by non-conservative operator decisions

    International Nuclear Information System (INIS)

    Ojha, D.; Chande, S.K.; Sharma, S.K.

    2005-01-01

    Presently, India has 15 nuclear power units in operation and 8 units under construction. Though the safety performance of the Nuclear Power Plants (NPPs) in India has been excellent, a few recent events indicate that the conservative decision-making process can be affected by perceived production goals. In one of the events, a need for some maintenance work arose while reactor start-up was in progress. After it was realized that the maintenance would require considerable time, the proper course of action would have been to shut down the reactor and add neutron poison to the moderator to ensure sufficient subcriticality. This was not done, as it would have delayed the start-up of the reactor on completion of maintenance. In another incident, an unintended slow increase in reactor power occurred because the adjuster rods became inoperable when fuses in their power supply blew. Under this condition, the reactor should have been tripped, which was not done. Further, the automatic addition of boron poison to the reactor was inhibited. Regulatory review showed that both incidents were indicative of degradation in safety culture and reflected the operator's overriding concern for keeping the units in operation. Appropriate corrective actions were taken to prevent recurrence of such events in the respective units and in all other operating units of similar type. In the wake of improved production performance, operators may develop a tendency to set new operational records and compete with other units. This points to a need for careful study of events to check for any element of non-conservative decision-making and to identify leading indicators of degradation in safety performance. (author)

  11. The Evolution of Bony Vertebrate Enhancers at Odds with Their Coding Sequence Landscape.

    Science.gov (United States)

    Yousaf, Aisha; Sohail Raza, Muhammad; Ali Abbasi, Amir

    2015-08-06

    Enhancers lie at the heart of transcriptional and developmental gene regulation. Therefore, changes in enhancer sequences usually disrupt the target gene expression and result in disease phenotypes. Despite the well-established role of enhancers in development and disease, evolutionary sequence studies are lacking. The current study attempts to unravel the puzzle of bony vertebrates' conserved noncoding elements (CNE) enhancer evolution. Bayesian phylogenetics of enhancer sequences spotlights promising interordinal relationships among placental mammals, proposing a closer relationship between humans and laurasiatherians while placing rodents at the basal position. Clock-based estimates of enhancer evolution provided a dynamic picture of interspecific rate changes across the bony vertebrate lineage. Moreover, including coelacanth in the study augmented our appreciation of vertebrate cis-regulatory evolution during the water-land transition. Intriguingly, we observed a pronounced upsurge in enhancer evolution in land-dwelling vertebrates. These novel findings prompted us to further investigate the evolutionary trend of coding as well as CNE nonenhancer repertoires, to highlight the relative evolutionary dynamics of diverse genomic landscapes. Surprisingly, the evolutionary rates of enhancer sequences were clearly at odds with those of the coding and the CNE nonenhancer sequences during vertebrate adaptation to land, with land vertebrates exhibiting significantly reduced rates of coding sequence evolution in comparison to their fast evolving regulatory landscape. The observed variation in tetrapod cis-regulatory elements caused the fine-tuning of associated gene regulatory networks. Therefore, the increased evolutionary rate of tetrapods' enhancer sequences might be responsible for the variation in developmental regulatory circuits during the process of vertebrate adaptation to land. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for

  12. High throughput sequencing of small RNA component of leaves and inflorescence revealed conserved and novel miRNAs as well as phasiRNA loci in chickpea.

    Science.gov (United States)

    Srivastava, Sangeeta; Zheng, Yun; Kudapa, Himabindu; Jagadeeswaran, Guru; Hivrale, Vandana; Varshney, Rajeev K; Sunkar, Ramanjulu

    2015-06-01

    Among legumes, chickpea (Cicer arietinum L.) is the second most important crop after soybean. MicroRNAs (miRNAs) play important roles by regulating target gene expression important for plant development and tolerance to stress conditions. Additionally, recently discovered phased siRNAs (phasiRNAs), a new class of small RNAs, are abundantly produced in legumes. Nevertheless, little is known about these regulatory molecules in chickpea. The small RNA population was sequenced from leaves and flowers of chickpea to identify conserved and novel miRNAs as well as phasiRNAs/phasiRNA loci. Bioinformatics analysis revealed 157 miRNA loci for the 96 highly conserved and known miRNA homologs belonging to 38 miRNA families in chickpea. Furthermore, 20 novel miRNAs belonging to 17 miRNA families were identified. Sequence analysis revealed approximately 60 phasiRNA loci. Potential target genes likely to be regulated by these miRNAs were predicted and some were confirmed by modified 5' RACE assay. Predicted targets are mostly transcription factors that might be important for developmental processes, and others include superoxide dismutases, plantacyanin, laccases and F-box proteins that could participate in stress responses and protein degradation. Overall, this study provides an inventory of miRNA-target gene interactions for chickpea, useful for the comparative analysis of small RNAs among legumes. Copyright © 2015 The Authors. Published by Elsevier Ireland Ltd.. All rights reserved.

  13. Solexa sequencing identification of conserved and novel microRNAs in backfat of Large White and Chinese Meishan pigs.

    Directory of Open Access Journals (Sweden)

    Chen Chen

    Full Text Available The domestic pig (Sus scrofa), an important species in the animal production industry, is a suitable model for studying adipogenesis and fat deposition. In order to expand the repertoire of porcine miRNAs and further explore potential regulatory miRNAs that influence adipogenesis, a high-throughput Solexa sequencing approach was adopted to identify miRNAs in the backfat of Large White (lean type) and Meishan (Chinese indigenous fatty type) pigs. We identified 215 unique miRNAs comprising 75 known pre-miRNAs, of which 49 miRNA*s were first identified in our study, 73 miRNAs overlapped in both libraries, and 140 were newly predicted miRNAs; the 215 unique miRNAs collectively corresponded to 235 independent genomic loci. Furthermore, we analyzed the sequence variations, seed edits and phylogenetic development of the miRNAs. 17 miRNAs were widely conserved from vertebrates to invertebrates, suggesting that these miRNAs may serve as potential evolutionary biomarkers. 9 conserved miRNAs with significantly differential expression were determined. The expression of miR-215, miR-135, miR-224 and miR-146b was higher in Large White pigs, opposite to the patterns shown by miR-1a, miR-133a, miR-122, miR-204 and miR-183. Almost all novel miRNAs could be considered pig-specific except ssc-miR-1343, miR-2320, miR-2326, miR-2411 and miR-2483, which had homologs in Bos taurus, among which ssc-miR-1343, miR-2320, miR-2411 and miR-2483 were validated in backfat tissue by stem-loop qPCR. Our results displayed a high level of concordance between the qPCR and Solexa sequencing methods in 9 of 10 miRNA comparisons, the exception being miR-1a. Moreover, we found that 2 miRNAs, miR-135 and miR-183, may exert impacts on porcine backfat development through the WNT signaling pathway. In conclusion, our research expands the set of known porcine miRNAs and should benefit miRNA-based studies of adipogenesis and fat deposition in different pig breeds.

  14. PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences

    OpenAIRE

    Lescot, Magali; Déhais, Patrice; Thijs, Gert; Marchal, Kathleen; Moreau, Yves; Van de Peer, Yves; Rouzé, Pierre; Rombauts, Stephane

    2002-01-01

    PlantCARE is a database of plant cis-acting regulatory elements, enhancers and repressors. Regulatory elements are represented by positional matrices, consensus sequences and individual sites on particular promoter sequences. Links to the EMBL, TRANSFAC and MEDLINE databases are provided when available. Data about the transcription sites are extracted mainly from the literature, supplemented with an increasing number of in silico predicted data. Apart from a general description for specific t...

  15. A conserved regulatory mechanism in bifunctional biotin protein ligases.

    Science.gov (United States)

    Wang, Jingheng; Beckett, Dorothy

    2017-08-01

    Class II bifunctional biotin protein ligases (BirA), which catalyze post-translational biotinylation and repress transcription initiation, are broadly distributed in eubacteria and archaea. However, it is unclear if these proteins all share the same molecular mechanism of transcription regulation. In Escherichia coli the corepressor biotinoyl-5'-AMP (bio-5'-AMP), which is also the intermediate in biotin transfer, promotes operator binding and resulting transcription repression by enhancing BirA dimerization. Like E. coli BirA (EcBirA), the Staphylococcus aureus and Bacillus subtilis BirA proteins (SaBirA and BsBirA) repress transcription in vivo in a biotin-dependent manner. In this work, sedimentation equilibrium measurements were performed to investigate the molecular basis of this biotin-responsive transcription regulation. The results reveal that, as observed for EcBirA, the SaBirA and BsBirA dimerization reactions are significantly enhanced by bio-5'-AMP binding. Thus, the molecular mechanism of the Biotin Regulatory System is conserved in the biotin repressors from these three organisms. © 2017 The Protein Society.

  16. Conserved cis-regulatory regions in a large genomic landscape control SHH and BMP-regulated Gremlin1 expression in mouse limb buds

    Directory of Open Access Journals (Sweden)

    Zuniga Aimée

    2012-08-01

    Full Text Available Abstract. Background: The mouse limb bud is a prime model to study the regulatory interactions that control vertebrate organogenesis. Major aspects of limb bud development are controlled by feedback loops that define a self-regulatory signalling system. The SHH/GREM1/AER-FGF feedback loop forms the core of this signalling system that operates between the posterior mesenchymal organiser and the ectodermal signalling centre. The BMP antagonist Gremlin1 (GREM1) is a critical node in this system, whose dynamic expression is controlled by BMP, SHH, and FGF signalling and is key to normal progression of limb bud development. Previous analysis identified a distant cis-regulatory landscape within the neighbouring Formin1 (Fmn1) locus that is required for Grem1 expression, reminiscent of the genomic landscapes controlling HoxD and Shh expression in limb buds. Results: Three highly conserved regions (HMCO1-3) were identified within the previously defined critical genomic region and tested for their ability to regulate Grem1 expression in mouse limb buds. Using a combination of BAC and conventional transgenic approaches, a 9 kb region located ~70 kb downstream of the Grem1 transcription unit was identified. This region, termed Grem1 Regulatory Sequence 1 (GRS1), is able to recapitulate major aspects of Grem1 expression, as it drives expression of a LacZ reporter in the posterior and, to a lesser extent, in the distal-anterior mesenchyme. Crossing the GRS1 transgene into embryos with alterations in the SHH and BMP pathways established that GRS1 depends on SHH and is modulated by BMP signalling, i.e. integrates inputs from these pathways. Chromatin immunoprecipitation revealed interaction of endogenous GLI3 proteins with the core cis-regulatory elements in the GRS1 region. As GLI3 is a mediator of SHH signal transduction, these results indicated that SHH directly controls Grem1 expression through the GRS1 region. Finally, all cis-regulatory regions within the Grem1

  17. RNA expression in a cartilaginous fish cell line reveals ancient 3′ noncoding regions highly conserved in vertebrates

    Science.gov (United States)

    Forest, David; Nishikawa, Ryuhei; Kobayashi, Hiroshi; Parton, Angela; Bayne, Christopher J.; Barnes, David W.

    2007-01-01

    We have established a cartilaginous fish cell line [Squalus acanthias embryo cell line (SAE)], a mesenchymal stem cell line derived from the embryo of an elasmobranch, the spiny dogfish shark S. acanthias. Elasmobranchs (sharks and rays) first appeared >400 million years ago, and existing species provide useful models for comparative vertebrate cell biology, physiology, and genomics. Comparative vertebrate genomics among evolutionarily distant organisms can provide sequence conservation information that facilitates identification of critical coding and noncoding regions. Although these genomic analyses are informative, experimental verification of functions of genomic sequences depends heavily on cell culture approaches. Using ESTs defining mRNAs derived from the SAE cell line, we identified lengthy and highly conserved gene-specific nucleotide sequences in the noncoding 3′ UTRs of eight genes involved in the regulation of cell growth and proliferation. Conserved noncoding 3′ mRNA regions detected by using the shark nucleotide sequences as a starting point were found in a range of other vertebrate orders, including bony fish, birds, amphibians, and mammals. Nucleotide identity of shark and human in these regions was remarkably well conserved. Our results indicate that highly conserved gene sequences dating from the appearance of jawed vertebrates and representing potential cis-regulatory elements can be identified through the use of cartilaginous fish as a baseline. Because the expression of genes in the SAE cell line was prerequisite for their identification, this cartilaginous fish culture system also provides a physiologically valid tool to test functional hypotheses on the role of these ancient conserved sequences in comparative cell biology. PMID:17227856

  18. RNA-ID, a highly sensitive and robust method to identify cis-regulatory sequences using superfolder GFP and a fluorescence-based assay.

    Science.gov (United States)

    Dean, Kimberly M; Grayhack, Elizabeth J

    2012-12-01

    We have developed a robust and sensitive method, called RNA-ID, to screen for cis-regulatory sequences in RNA using fluorescence-activated cell sorting (FACS) of yeast cells bearing a reporter in which expression of both superfolder green fluorescent protein (GFP) and yeast codon-optimized mCherry red fluorescent protein (RFP) is driven by the bidirectional GAL1,10 promoter. This method recapitulates previously reported progressive inhibition of translation mediated by increasing numbers of CGA codon pairs, and restoration of expression by introduction of a tRNA with an anticodon that base pairs exactly with the CGA codon. This method also reproduces effects of paromomycin and context on stop codon read-through. Five key features of this method contribute to its effectiveness as a selection for regulatory sequences: The system exhibits greater than a 250-fold dynamic range, a quantitative and dose-dependent response to known inhibitory sequences, exquisite resolution that allows nearly complete physical separation of distinct populations, and a reproducible signal between different cells transformed with the identical reporter, all of which are coupled with simple methods involving ligation-independent cloning, to create large libraries. Moreover, we provide evidence that there are sequences within a 9-nt library that cause reduced GFP fluorescence, suggesting that there are novel cis-regulatory sequences to be found even in this short sequence space. This method is widely applicable to the study of both RNA-mediated and codon-mediated effects on expression.

  19. JDet: interactive calculation and visualization of function-related conservation patterns in multiple sequence alignments and structures.

    Science.gov (United States)

    Muth, Thilo; García-Martín, Juan A; Rausell, Antonio; Juan, David; Valencia, Alfonso; Pazos, Florencio

    2012-02-15

    We have implemented in a single package all the features required for extracting, visualizing and manipulating fully conserved positions as well as those with a family-dependent conservation pattern in multiple sequence alignments. The program allows users, among other things, to run different methods for extracting these positions, combine the results and visualize them in protein 3D structures and sequence spaces. JDet is a multiplatform application written in Java. It is freely available, including the source code, at http://csbg.cnb.csic.es/JDet. The package includes two of our recently developed programs for detecting functional positions in protein alignments (Xdet and S3Det), and support for other methods can be added as plug-ins. A help file and a guided tutorial for JDet are also available.
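
    To make the two kinds of position mentioned above concrete, here is a deliberately naive sketch that labels alignment columns as fully conserved or as conserved within each family but different between families. It is not Xdet, S3Det, or any part of JDet; the function names, the family labels, and the toy alignment are assumptions for illustration.

```python
def fully_conserved(alignment):
    """Indices of columns where every sequence has the same non-gap residue."""
    return [i for i, col in enumerate(zip(*alignment))
            if len(set(col)) == 1 and col[0] != "-"]


def family_specific(alignment, families):
    """Indices of columns that are invariant within each family but
    differ between at least two families (a crude stand-in for a
    family-dependent conservation pattern)."""
    cols = []
    for i in range(len(alignment[0])):
        per_family = {}
        for seq, fam in zip(alignment, families):
            per_family.setdefault(fam, set()).add(seq[i])
        # keep one residue per family, but only for invariant families
        residues = [next(iter(s)) for s in per_family.values() if len(s) == 1]
        if (len(residues) == len(per_family)
                and "-" not in residues
                and len(set(residues)) > 1):
            cols.append(i)
    return cols


# Toy alignment: two hypothetical families, "a" and "b".
toy = ["MKAL", "MKAV", "MRGL", "MRGI"]
labels = ["a", "a", "b", "b"]
```

    With the toy data, column 0 is fully conserved, columns 1 and 2 are family-specific, and column 3 is variable. Published methods score such positions statistically rather than requiring strict invariance.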

  20. How conserved are the conserved 16S-rRNA regions?

    Directory of Open Access Journals (Sweden)

    Marcel Martinez-Porchas

    2017-02-01

    Full Text Available The 16S rRNA gene has been used as a master key for studying prokaryotic diversity in almost every environment. Despite the claim of several researchers to have the best universal primers, the reality is that no primer has been demonstrated to be truly universal. This suggests that conserved regions of the gene may not be as conserved as expected. The aim of this study was to evaluate the degree of conservation of the so-called conserved regions flanking the hypervariable regions of the 16S rRNA gene. Data contained in the SILVA database (release 123) were used for the study. Primers reported as matches of each conserved region were assembled to form contigs; sequences of 12 nucleotides (12-mers) were extracted from these contigs and searched against the entire set of SILVA sequences. Frequency analysis showed that the extreme regions, 1 and 10, registered the lowest frequencies. 12-mer frequencies revealed segments of contigs that were not as conserved as expected (≤90%). Fragments corresponding to the primer contigs 3, 4, 5b and 6a were recovered from all sequences in the SILVA database. Nucleotide frequency analysis in each consensus demonstrated that only a small fraction of these so-called conserved regions is truly conserved in non-redundant sequences. It could be concluded that conserved regions of the 16S rRNA gene exhibit considerable variation that has to be considered when using this gene as a biomarker.
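
    The 12-mer frequency analysis described above can be sketched roughly as follows. This is an exact-match toy version under stated assumptions: the function names are hypothetical, the sequences are placeholders rather than SILVA records, and the authors' actual search procedure is not described in the abstract.

```python
def kmers(contig: str, k: int = 12):
    """All overlapping k-mers of a contig, in order of appearance."""
    return [contig[i:i + k] for i in range(len(contig) - k + 1)]


def kmer_frequencies(contig: str, sequences, k: int = 12):
    """Fraction of sequences containing each k-mer of the contig
    (exact substring match; sequences must be non-empty)."""
    total = len(sequences)
    return {km: sum(km in s for s in sequences) / total
            for km in kmers(contig, k)}


def weakly_conserved(freqs, threshold: float = 0.90):
    """k-mers found in at most `threshold` of the sequences, mirroring
    the <=90% cut-off quoted in the abstract."""
    return [km for km, f in freqs.items() if f <= threshold]


# Toy data with k=4 so the result is easy to check by eye.
toy_freqs = kmer_frequencies("ACGTAC", ["ACGTAC", "ACGTTT", "GGGTAC"], k=4)
```

    On a database of SILVA's size, scanning every sequence for every k-mer as done here would be slow; an index of k-mers per sequence would be the usual choice.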

  1. Analysis of the conservation of synteny between Fugu and human chromosome 12

    Directory of Open Access Journals (Sweden)

    Koop Ben F

    2003-07-01

    Full Text Available Abstract. Background: The pufferfish Fugu rubripes (Fugu), with its compact genome, is increasingly recognized as an important vertebrate model for comparative genomic studies. In particular, large regions of conserved synteny between the human and Fugu genomes indicate its utility for identifying disease-causing genes. The human chromosome 12p12 is frequently deleted in various hematological malignancies and solid tumors, but the actual tumor suppressor gene remains unidentified. Results: We investigated approximately 200 kb of the genomic region surrounding the ETV6 locus in Fugu (fETV6) in order to find conserved functional features, such as genes or regulatory regions, that could give insight into the nature of the genes targeted by deletions in human cancer cells. Seven genes were identified near the fETV6 locus. We found that the synteny with human chromosome 12 was conserved, but extensive genomic rearrangements occurred between the Fugu and human ETV6 loci. Conclusion: This comparative analysis led to the identification of previously uncharacterized genes in the human genome and some potentially important regulatory sequences as well. This is a good indication that the analysis of the compact Fugu genome will be valuable for identifying functional features that have been conserved throughout the evolution of vertebrates.

  2. A regulatory code for neuron-specific odor receptor expression.

    Directory of Open Access Journals (Sweden)

    Anandasankar Ray

    2008-05-01

    Full Text Available Olfactory receptor neurons (ORNs) must select, from a large repertoire, which odor receptors to express. In Drosophila, most ORNs express one of 60 Or genes, and most Or genes are expressed in a single ORN class in a process that produces a stereotyped receptor-to-neuron map. The construction of this map poses a problem of receptor gene regulation that is remarkable in its dimension and about which little is known. By using a phylogenetic approach and the genome sequences of 12 Drosophila species, we systematically identified regulatory elements that are evolutionarily conserved and specific for individual Or genes of the maxillary palp. Genetic analysis of these elements supports a model in which each receptor gene contains a zip code, consisting of elements that act positively to promote expression in a subset of ORN classes, and elements that restrict expression to a single ORN class. We identified a transcription factor, Scalloped, that mediates repression. Some elements are used in other chemosensory organs, and some are conserved upstream of axon-guidance genes. Surprisingly, the odor response spectra and organization of maxillary palp ORNs have been extremely well-conserved for tens of millions of years, even though the amino acid sequences of the receptors are not highly conserved. These results, taken together, define the logic by which individual ORNs in the maxillary palp select which odor receptors to express.

  3. Stanniocalcin 1 binds hemin through a partially conserved heme regulatory motif

    International Nuclear Information System (INIS)

    Westberg, Johan A.; Jiang, Ji; Andersson, Leif C.

    2011-01-01

    Highlights: → Stanniocalcin 1 (STC1) binds heme through a novel heme binding motif. → Central iron atom of heme and cysteine-114 of STC1 are essential for binding. → STC1 binds Fe²⁺ and Fe³⁺ heme. → STC1 peptide prevents oxidative decay of heme. -- Abstract: Hemin (iron protoporphyrin IX) is a necessary component of many proteins, functioning either as a cofactor or an intracellular messenger. Hemoproteins have diverse functions, such as transportation of gases, gas detection, chemical catalysis and electron transfer. Stanniocalcin 1 (STC1) is a protein involved in respiratory responses of the cell but whose mechanism of action is still undetermined. We examined the ability of STC1 to bind hemin in both its reduced and oxidized states and located Cys114 as the axial ligand of the central iron atom of hemin. The amino acid sequence differs from the established (Cys-Pro) heme regulatory motif (HRM) and therefore presents a novel heme binding motif (Cys-Ser). An STC1 peptide containing the heme binding sequence was able to inhibit both spontaneous and H₂O₂-induced decay of hemin. Binding of hemin does not affect the mitochondrial localization of STC1.

  4. T-cell recognition is shaped by epitope sequence conservation in the host proteome and microbiome

    DEFF Research Database (Denmark)

    Bresciani, Anne Gøther; Paul, Sinu; Schommer, Nina

    2016-01-01

    or allergen with the conservation of its sequence in the human proteome or the healthy human microbiome. Indeed, performing such comparisons on large sets of validated T-cell epitopes, we found that epitopes that are similar with self-antigens above a certain threshold showed lower immunogenicity, presumably...... as a result of negative selection of T cells capable of recognizing such peptides. Moreover, we also found a reduced level of immune recognition for epitopes conserved in the commensal microbiome, presumably as a result of peripheral tolerance. These findings indicate that the existence (and potentially...

  5. Sequence, structure and function relationships in flaviviruses as assessed by evolutive aspects of its conserved non-structural protein domains.

    Science.gov (United States)

    da Fonseca, Néli José; Lima Afonso, Marcelo Querino; Pedersolli, Natan Gonçalves; de Oliveira, Lucas Carrijo; Andrade, Dhiego Souto; Bleicher, Lucas

    2017-10-28

    Flaviviruses are responsible for serious diseases such as dengue, yellow fever, and Zika fever. Their genomes encode a polyprotein which, after cleavage, results in three structural and seven non-structural proteins. Homologous proteins can be studied by conservation and coevolution analysis as detected in multiple sequence alignments, usually reporting positions which are strictly necessary for the structure and/or function of all members in a protein family or which are involved in a specific sub-class feature requiring the coevolution of residue sets. This study provides a complete conservation and coevolution analysis of all flavivirus non-structural proteins, with results mapped on all well-annotated available sequences. A literature review on the residues found in the analysis enabled us to compile available information on their roles and distribution among different flaviviruses. Also, we provide the mapping of conserved and coevolved residues for all sequences currently in SwissProt as supplementary material, so that particularities in different viruses can be easily analyzed. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. An atlas of over 90,000 conserved noncoding sequences provides insight into crucifer regulatory regions

    NARCIS (Netherlands)

    Haudry, A.; Platts, A.E.; Vello, E.; Hoen, D.R.; Leclerq, M.; Williamson, R.J.; Forczek, E.; Joly-Lopez, Z.; Steffen, J.G.; Hazzouri, K.M.; Dewar, K.; Stinchcombe, J.R.; Schoen, D.J.; Wang, X.; Schmutz, J.; Town, C.D.; Edger, P.P.; Pires, J.C.; Schumaker, K.S.; Jarvis, D.E.; Mandakova, T.; Lysak, M.; Bergh, van den E.; Schranz, M.E.; Harrison, P.M.

    2013-01-01

    Despite the central importance of noncoding DNA to gene regulation and evolution, understanding of the extent of selection on plant noncoding DNA remains limited compared to that of other organisms. Here we report sequencing of genomes from three Brassicaceae species (Leavenworthia alabamica,

  7. Characterization of Cer-1 cis-regulatory region during early Xenopus development.

    Science.gov (United States)

    Silva, Ana Cristina; Filipe, Mário; Steinbeisser, Herbert; Belo, José António

    2011-05-01

    Cerberus-related molecules are well-known Wnt, Nodal, and BMP inhibitors that have been implicated in different processes including anterior–posterior patterning and left–right asymmetry. In both mouse and frog, two Cerberus-related genes have been isolated, mCer-1 and mCer-2, and Xcer and Xcoco, respectively. Until now, little has been known about the mechanisms involved in their transcriptional regulation. Here, we report a heterologous analysis of the mouse Cerberus-1 gene upstream regulatory regions responsible for its expression in the visceral endodermal cells. Our analysis showed that the consensus sequences for TATA, CAAT, or GC boxes were absent but a TGTGG sequence was present at position -172 to -168 bp, relative to the ATG. Using a series of deletion constructs and transient expression in Xenopus embryos, we found that a fragment of 1.4 kb of Cer-1 promoter sequence could reproduce the endogenous expression pattern of Xenopus cerberus. A 0.7-kb mCer-1 upstream region was able to drive reporter expression in the involuting mesendodermal cells, while further deletions abolished reporter gene expression. Our results suggest that although no sequence similarity was found between mouse and Xenopus cerberus cis-regulatory regions, the signaling cascades regulating cerberus expression during gastrulation are conserved.
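
    The abstract above reports motif coordinates relative to the ATG. A small sketch of how such coordinates can be derived from an upstream sequence is given below. The function name, the toy sequence, and the convention that the base immediately 5' of the ATG is position -1 are assumptions made for illustration; the abstract does not specify the numbering convention the authors used.

```python
def motif_positions(upstream: str, motif: str):
    """Return (start, end) coordinates of every exact motif occurrence,
    numbered relative to the ATG. `upstream` must end immediately 5'
    of the ATG, so its last base is position -1 (assumed convention)."""
    upstream, motif = upstream.upper(), motif.upper()
    n = len(upstream)
    hits = []
    i = upstream.find(motif)
    while i != -1:
        hits.append((i - n, i - n + len(motif) - 1))
        i = upstream.find(motif, i + 1)  # allow overlapping occurrences
    return hits


# Toy upstream region (placeholder bases, not the real Cer-1 sequence)
# built so that TGTGG lands at -172 to -168.
toy_upstream = "A" * 10 + "TGTGG" + "C" * 167
```

    The same helper can be used to confirm the absence of a consensus element, for example by checking that a search for a TATA-like string returns an empty list.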

  8. Production of recombinant AAV vectors encoding insulin-like growth factor I is enhanced by interaction among AAV rep regulatory sequences

    Directory of Open Access Journals (Sweden)

    Dilley Robert

    2009-01-01

    Abstract. Background: Adeno-associated virus (AAV) vectors are promising tools for gene therapy. Currently, their potential is limited by difficulties in producing high vector yields with which to generate transgene protein product. AAV vector production depends in part upon the replication (Rep) proteins required for viral replication. We tested the hypothesis that mutations in the start codon and upstream regulatory elements of Rep78/68 in AAV helper plasmids can regulate recombinant AAV (rAAV) vector production. We further tested whether the resulting rAAV vector preparation augments the production of the potentially therapeutic transgene, insulin-like growth factor I (IGF-I). Results: We constructed a series of AAV helper plasmids containing different Rep78/68 start codons in combination with different gene regulatory sequences. rAAV vectors carrying the human IGF-I gene were prepared with these vectors and the vector preparations used to transduce HT1080 target cells. We found that the substitution of ATG by ACG in the Rep78/68 start codon in an AAV helper plasmid (pAAV-RC) eliminated Rep78/68 translation, rAAV and IGF-I production. Replacement of the heterologous sequence upstream of Rep78/68 in pAAV-RC with the AAV2 endogenous p5 promoter restored translational activity to the ACG mutant, and restored rAAV and IGF-I production. Insertion of the AAV2 p19 promoter sequence into pAAV-RC in front of the heterologous sequence also enabled ACG to function as a start codon for Rep78/68 translation. The data further indicate that the function of the AAV helper construct (pAAV-RC), which is in current widespread use for rAAV production, may be improved by replacement of its AAV2-unrelated heterologous sequence with the native AAV2 p5 promoter. Conclusion: Taken together, the data demonstrate an interplay between the start codon and upstream regulatory sequences in the regulation of Rep78/68 and indicate that selective mutations in Rep78/68 regulatory elements

  9. Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks

    Science.gov (United States)

    Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E.; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A.; Kellis, Manolis

    2012-01-01

    Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level. PMID:22456606

  10. Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks.

    Science.gov (United States)

    Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A; Kellis, Manolis

    2012-07-01

    Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein-protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level.

  11. A Potential Tool for Swift Fox (Vulpes velox) Conservation: Individuality of Long-Range Barking Sequences

    DEFF Research Database (Denmark)

    Darden, Safi-Kirstine Klem; Dabelsteen, Torben; Pedersen, Simon Boel

    2003-01-01

    Vocal individuality has been found in a number of canid species. This natural variation can have applications in several aspects of species conservation, from behavioral studies to estimating population density or abundance. The swift fox (Vulpes velox) is a North American canid listed as endangered in Canada and extirpated, endangered, or threatened in parts of the United States. The barking sequence is a long-range vocalization in the species' vocal repertoire. It consists of a series of barks and is most common during the mating season. We analyzed barking sequences recorded in a standardized...

  12. The Number, Organization, and Size of Polymorphic Membrane Protein Coding Sequences as well as the Most Conserved Pmp Protein Differ within and across Chlamydia Species.

    Science.gov (United States)

    Van Lent, Sarah; Creasy, Heather Huot; Myers, Garry S A; Vanrompay, Daisy

    2016-01-01

    Variation is a central trait of the polymorphic membrane protein (Pmp) family. The number of pmp coding sequences differs between Chlamydia species, but it is unknown whether the number of pmp coding sequences is constant within a Chlamydia species. The level of conservation of the Pmp proteins has previously only been determined for Chlamydia trachomatis. As different Pmp proteins might be indispensable for the pathogenesis of different Chlamydia species, this study investigated the conservation of Pmp proteins both within and across C. trachomatis, C. pneumoniae, C. abortus, and C. psittaci. The pmp coding sequences were annotated in 16 C. trachomatis, 6 C. pneumoniae, 2 C. abortus, and 16 C. psittaci genomes. The number and organization of polymorphic membrane coding sequences differed within and across the analyzed Chlamydia species. The length of coding sequences of pmpA, pmpB, and pmpH was conserved among all analyzed genomes, while the length of pmpE/F and pmpG, and remarkably also of the subtype pmpD, differed among the analyzed genomes. PmpD, PmpA, PmpH, and PmpA were the most conserved Pmp in C. trachomatis, C. pneumoniae, C. abortus, and C. psittaci, respectively. PmpB was the most conserved Pmp across the 4 analyzed Chlamydia species. © 2016 S. Karger AG, Basel.

  13. Identification of new TSGA10 transcript variants in human testis with conserved regulatory RNA elements in 5'untranslated region and distinct expression in breast cancer.

    Science.gov (United States)

    Salehipour, Pouya; Nematzadeh, Mahsa; Mobasheri, Maryam Beigom; Afsharpad, Mandana; Mansouri, Kamran; Modarressi, Mohammad Hossein

    2017-09-01

    Testis specific gene antigen 10 (TSGA10) is a cancer testis antigen involved in the process of spermatogenesis. TSGA10 could also play an important role in the inhibition of angiogenesis by preventing nuclear localization of HIF-1α. Although it has been shown that TSGA10 messenger RNA (mRNA) is mainly expressed in testis and some tumors, the transcription pattern and regulatory mechanisms of this gene remain largely unknown. Here, we report that human TSGA10 comprises at least 22 exons and generates four different transcript variants. We found that the use of two distinct promoters and splicing of exons 4 and 7 produce these transcript variants, which have the same coding sequence but differ in the sequence of their 5' untranslated region (5'UTR). This is significant because conserved regulatory RNA elements such as upstream open reading frames (uORFs) and a putative internal ribosome entry site (IRES) were found in this region; they occur in different combinations in each transcript variant and may influence the translational efficiency of the variants under normal or unusual environmental conditions such as hypoxia. To characterize the transcription pattern of TSGA10 in breast cancer, expression of the identified transcript variants was analyzed in 62 breast cancer samples. We found that TSGA10 tends to express variants with shorter 5'UTR and fewer uORF elements in breast cancer tissues. Our study demonstrates for the first time the expression of different TSGA10 transcript variants in testis and breast cancer tissues and provides a first clue to a role of the TSGA10 5'UTR in regulation of translation in unusual environmental conditions such as hypoxia. Copyright © 2017. Published by Elsevier B.V.

  14. Water Well Locations - Conservation Wells

    Data.gov (United States)

    NSGIC Education | GIS Inventory — The conservation well layer identifies the permitted surface location of oil and gas conservation wells that have not been plugged. These include active, regulatory...

  15. Characterization of the bovine pregnancy-associated glycoprotein gene family – analysis of gene sequences, regulatory regions within the promoter and expression of selected genes

    Directory of Open Access Journals (Sweden)

    Walker Angela M

    2009-04-01

    Abstract. Background: The pregnancy-associated glycoproteins (PAGs) belong to a large family of aspartic peptidases expressed exclusively in the placenta of species in the Artiodactyla order. In cattle, the PAG gene family is comprised of at least 22 transcribed genes, as well as some variants. Phylogenetic analyses have shown that the PAG family segregates into 'ancient' and 'modern' groupings. Along with sequence differences between family members, there are clear distinctions in their spatio-temporal distribution and in their relative level of expression. In this report, (1) we performed an in silico analysis of the bovine genome to further characterize the PAG gene family, (2) we scrutinized proximal promoter sequences of the PAG genes to evaluate the evolutionary pressures operating on them and to identify putative regulatory regions, (3) we determined relative transcript abundance of selected PAGs during pregnancy, and (4) we performed preliminary characterization of the putative regulatory elements for one of the candidate PAGs, bovine (bo) PAG-2. Results: From our analysis of the bovine genome, we identified 18 distinct PAG genes and 14 pseudogenes. We observed that the first 500 base pairs upstream of the translational start site contained multiple regions that are conserved among all boPAGs. However, a preponderance of conserved regions that harbor recognition sites for putative transcription factors (TFs) were found to be unique to the modern boPAG grouping and absent from the ancient boPAGs. We gathered evidence by means of Q-PCR and screening of EST databases to show that boPAG-2 is the most abundant of all boPAG transcripts. Finally, we provided preliminary evidence for the role of ETS- and DDVL-related TFs in the regulation of the boPAG-2 gene. Conclusion: PAGs represent a relatively large gene family in the bovine genome. The proximal promoter regions of these genes display differences in putative TF binding sites, likely contributing to observed

  16. The Evolution of the Secreted Regulatory Protein Progranulin.

    Science.gov (United States)

    Palfree, Roger G E; Bennett, Hugh P J; Bateman, Andrew

    2015-01-01

    Progranulin is a secreted growth factor that is active in tumorigenesis, wound repair, and inflammation. Haploinsufficiency of the human progranulin gene, GRN, causes frontotemporal dementia. Progranulins are composed of chains of cysteine-rich granulin modules. Modules may be released from progranulin by proteolysis as 6 kDa granulin polypeptides. Both intact progranulin and some of the granulin polypeptides are biologically active. The granulin module occurs in certain plant proteases and progranulins are present in early diverging metazoan clades such as the sponges, indicating their ancient evolutionary origin. There is only one Grn gene in mammalian genomes. More gene-rich Grn families occur in teleost fish, with between 3 and 6 members per species, including short-form Grns that have no tetrapod counterparts. Our goals are to elucidate progranulin and granulin module evolution by investigating (i) the origins of metazoan progranulins; (ii) the evolutionary relationships between the single Grn of tetrapods and the multiple Grn genes of fish; (iii) the evolution of granulin module architectures of vertebrate progranulins; and (iv) the conservation of mammalian granulin polypeptide sequences and how the conserved granulin amino acid sequences map to the known three-dimensional structures of granulin modules. We report that progranulin-like proteins are present in unicellular eukaryotes that are closely related to metazoa, suggesting that progranulin is among the earliest extracellular regulatory proteins still employed by multicellular animals. From the genomes of the elephant shark and coelacanth we identified contemporary representatives of a precursor for short-form Grn genes of ray-finned fish that is lost in tetrapods. In vertebrate Grns, pathways of exon duplication resulted in a conserved module architecture at the amino-terminus that is frequently accompanied by an unusual pattern of tandem nearly identical module repeats near the carboxyl-terminus. Polypeptide

  17. Phylogenetic conservation of the regulatory and functional properties of the Vav oncoprotein family

    International Nuclear Information System (INIS)

    Couceiro, Jose R.; Martin-Bermudo, Maria D.; Bustelo, Xose R.

    2005-01-01

    Vav proteins are phosphorylation-dependent GDP/GTP exchange factors for Rho/Rac GTPases. Despite intense characterization of mammalian Vav proteins both biochemically and genetically, there is little information regarding the conservation of their biological properties in lower organisms. To approach this issue, we have performed a characterization of the regulatory, catalytic, and functional properties of the single Vav family member of Drosophila melanogaster. These analyses have shown that the intramolecular mechanisms controlling the enzyme activity of mammalian Vav proteins are already present in Drosophila, suggesting that such properties were set up before the divergence between protostomes and deuterostomes during evolution. We also show that Drosophila and mammalian Vav proteins have similar catalytic specificities. As a consequence, Drosophila Vav can trigger oncogenic transformation, morphological change, and enhanced cell motility in mammalian cells. Gain-of-function studies using transgenic flies support the implication of this protein in cytoskeletal-dependent processes such as embryonic dorsal closure, myoblast fusion, tracheal development, and the migration/guidance of different cell types. These results highlight the important roles of Vav proteins in the signal transduction pathways regulating cytoskeletal dynamics. Moreover, they indicate that the foundations for the regulatory and enzymatic activities of this protein family were set up very early during evolution.

  18. Stanniocalcin 1 binds hemin through a partially conserved heme regulatory motif

    Energy Technology Data Exchange (ETDEWEB)

    Westberg, Johan A., E-mail: johan.westberg@helsinki.fi [Department of Pathology, Haartman Institute, University of Helsinki and HUSLAB, P.O. Box 21, Haartmaninkatu 3, FI-00014 Helsinki (Finland); Jiang, Ji, E-mail: ji.jiang@helsinki.fi [Department of Pathology, Haartman Institute, University of Helsinki and HUSLAB, P.O. Box 21, Haartmaninkatu 3, FI-00014 Helsinki (Finland); Andersson, Leif C., E-mail: leif.andersson@helsinki.fi [Department of Pathology, Haartman Institute, University of Helsinki and HUSLAB, P.O. Box 21, Haartmaninkatu 3, FI-00014 Helsinki (Finland)

    2011-06-03

    Highlights: • Stanniocalcin 1 (STC1) binds heme through novel heme binding motif. • Central iron atom of heme and cysteine-114 of STC1 are essential for binding. • STC1 binds Fe²⁺ and Fe³⁺ heme. • STC1 peptide prevents oxidative decay of heme. Abstract: Hemin (iron protoporphyrin IX) is a necessary component of many proteins, functioning either as a cofactor or an intracellular messenger. Hemoproteins have diverse functions, such as transportation of gases, gas detection, chemical catalysis and electron transfer. Stanniocalcin 1 (STC1) is a protein involved in respiratory responses of the cell but whose mechanism of action is still undetermined. We examined the ability of STC1 to bind hemin in both its reduced and oxidized states and located Cys¹¹⁴ as the axial ligand of the central iron atom of hemin. The amino acid sequence differs from the established (Cys-Pro) heme regulatory motif (HRM) and therefore presents a novel heme binding motif (Cys-Ser). A STC1 peptide containing the heme binding sequence was able to inhibit both spontaneous and H₂O₂-induced decay of hemin. Binding of hemin does not affect the mitochondrial localization of STC1.

  19. Discovery of Conservation and Diversification of miR171 Genes by Phylogenetic Analysis based on Global Genomes

    Directory of Open Access Journals (Sweden)

    Xudong Zhu

    2015-07-01

    The microRNA171 (miR171) family is widely distributed and highly conserved in a range of species and plays critical roles in regulating plant growth and development through repressing expression of [...] transcription factors. However, information on the evolutionary conservation and functional diversification of the miRNA171 family members remains scanty. We reconstructed the phylogenetic relationships among miR171 precursor and mature sequences so as to investigate the extent and degree of evolutionary conservation of miR171 in Arabidopsis thaliana (L.) Heynh. (ath), grape (Vitis vinifera L.) (vvi), poplar (Populus trichocarpa Torr. & A.Gray ex Hook.) (ptc), and rice (Oryza sativa L.) (osa). Despite strong conservation of over 80%, some mature miR171 sequences [...] have undergone critical sequence variation, leading to functional diversification, since they target non-[...] gene transcript(s). Phylogenetic analyses revealed a combination of old ancestral relationships and recent lineage-specific diversification in the miR171 family within the four model plants. The cis-regulatory motifs on the upstream promoter sequences of genes were highly divergent and shared some similar elements, indicating their possible contribution to the functional variation observed within the miR171 family. This study will buttress our understanding of the functional differentiation of miRNAs and the relationships of miRNA–target pairs based on the evolutionary history of genes.

  20. Some AFLP amplicons are highly conserved DNA sequences mapping to the same linkage groups in two F2 populations of carrot

    Directory of Open Access Journals (Sweden)

    Santos Carlos A.F.

    2002-01-01

    Amplified fragment length polymorphism (AFLP) is a fast and reliable tool to generate a large number of DNA markers. In two unrelated F2 populations of carrot (Daucus carota L.), Brasilia x HCM and B493 x QAL (wild carrot), it was hypothesized that DNA fragments (1) digested with the same restriction endonuclease enzymes and amplified with the same primer combination and (2) sharing the same position in polyacrylamide gels should be conserved sequences. To test this hypothesis, AFLP fragments from polyacrylamide gels were eluted, reamplified, separated in agarose gels, purified, cloned and sequenced. Among thirty-one paired fragments from each F2 population, twenty-six had identity greater than 91% and five presented identity of 24% to 44%. Among the twenty-six conserved AFLPs only one mapped to different linkage groups in the two populations, while four of the five less-conserved bands mapped to different linkage groups. Of eight SCAR (sequence characterized amplified regions) primers tested, one conserved AFLP resulted in co-dominant markers in both populations. Screening among 14 carrot inbreds or cultivars with three AFLP-SCAR primers revealed clear and polymorphic PCR products, with similar molecular sizes on agarose gels. The development of co-dominant markers based on conserved AFLP fragments will be useful to detect seed mixtures among hybrids, to improve and to merge linkage maps and to study diversity and phylogenetic relationships.
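
    The identity figures quoted above (greater than 91% versus 24% to 44%) are percent-identity values between paired fragment sequences. The record does not say how the sequences were aligned or how gaps were scored, so the following is only a minimal sketch under stated assumptions: the two sequences are already aligned to equal length with "-" marking gaps, and a gap opposite a base counts as a mismatch. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences.

    Assumptions (not from the record): inputs have equal length, '-' marks
    a gap, columns where both sequences have a gap are ignored, and a gap
    opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy pair of aligned fragments (illustrative only).
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))  # 9 of 10 columns match
```

    Other conventions (for example, dividing by the shorter sequence length or excluding gapped columns) give different percentages, so values are comparable only when computed the same way.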

  1. Divergent evolutionary rates in vertebrate and mammalian specific conserved non-coding elements (CNEs) in echolocating mammals.

    Science.gov (United States)

    Davies, Kalina T J; Tsagkogeorga, Georgia; Rossiter, Stephen J

    2014-12-19

    The majority of DNA contained within vertebrate genomes is non-coding, with a certain proportion of this thought to play regulatory roles during development. Conserved Non-coding Elements (CNEs) are an abundant group of putative regulatory sequences that are highly conserved across divergent groups and thus assumed to be under strong selective constraint. Many CNEs may contain regulatory factor binding sites, and their frequent spatial association with key developmental genes - such as those regulating sensory system development - suggests crucial roles in regulating gene expression and cellular patterning. Yet surprisingly little is known about the molecular evolution of CNEs across diverse mammalian taxa or their role in specific phenotypic adaptations. We examined 3,110 vertebrate-specific and ~82,000 mammalian-specific CNEs across 19 and 9 mammalian orders respectively, and tested for changes in the rate of evolution of CNEs located in the proximity of genes underlying the development or functioning of auditory systems. As we focused on CNEs putatively associated with genes underlying the development/functioning of auditory systems, we incorporated echolocating taxa in our dataset because of their highly specialised and derived auditory systems. Phylogenetic reconstructions of concatenated CNEs broadly recovered accepted mammal relationships despite high levels of sequence conservation. We found that CNE substitution rates were highest in rodents and lowest in primates, consistent with previous findings. Comparisons of CNE substitution rates from several genomic regions containing genes linked to auditory system development and hearing revealed differences between echolocating and non-echolocating taxa. Wider taxonomic sampling of four CNEs associated with the homeobox genes Hmx2 and Hmx3 - which are required for inner ear development - revealed family-wise variation across diverse bat species. Specifically within one family of echolocating bats that utilise

  2. Cis-regulatory signatures of orthologous stress-associated bZIP transcription factors from rice, sorghum and Arabidopsis based on phylogenetic footprints

    Directory of Open Access Journals (Sweden)

    Xu Fuyu

    2012-09-01

    Abstract. Background: The potential contribution of upstream sequence variation to the unique features of orthologous genes is just beginning to be unraveled. A core subset of stress-associated bZIP transcription factors from rice (Oryza sativa) formed ten clusters of orthologous groups (COG) with genes from the monocot sorghum (Sorghum bicolor) and dicot Arabidopsis (Arabidopsis thaliana). The total cis-regulatory information content of each stress-associated COG was examined by phylogenetic footprinting to reveal ortholog-specific, lineage-specific and species-specific conservation patterns. Results: The most apparent pattern observed was the occurrence of spatially conserved ‘core modules’ among the COGs but not among paralogs. These core modules are comprised of various combinations of two to four putative transcription factor binding site (TFBS) classes associated with either developmental or stress-related functions. Outside the core modules are specific stress (ABA, oxidative, abiotic, biotic) or organ-associated signals, which may be functioning as ‘regulatory fine-tuners’ and further define lineage-specific and species-specific cis-regulatory signatures. Orthologous monocot and dicot promoters have distinct TFBS classes involved in disease and oxidative-regulated expression, while the orthologous rice and sorghum promoters have distinct combinations of root-specific signals, a pattern that is not particularly conserved in Arabidopsis. Conclusions: Patterns of cis-regulatory conservation imply that each ortholog has distinct signatures, further suggesting that they are potentially unique in a regulatory context despite the presumed conservation of broad biological function during speciation. Based on the observed patterns of conservation, we postulate that core modules are likely primary determinants of basal developmental programming, which may be integrated with and further elaborated by additional intrinsic or extrinsic signals in

  3. Molecular characterization, tissue expression and sequence variability of the barramundi (Lates calcarifer) myostatin gene

    Directory of Open Access Journals (Sweden)

    Smith-Keune Carolyn

    2008-02-01

    Abstract. Background: Myostatin (MSTN) is a member of the transforming growth factor-β superfamily that negatively regulates growth of skeletal muscle tissue. The gene encoding the MSTN peptide is a consolidated candidate for the enhancement of productivity in terrestrial livestock. This gene potentially represents an important target for growth improvement of cultured finfish. Results: Here we report molecular characterization, tissue expression and sequence variability of the barramundi (Lates calcarifer) MSTN-1 gene. The barramundi MSTN-1 was encoded by three exons 379, 371 and 381 bp in length and translated into a 376-amino acid peptide. Introns 1 and 2 were 412 and 819 bp in length and presented typical GT...AG splicing sites. The upstream region contained cis-regulatory elements such as a TATA-box and E-boxes. A first assessment of sequence variability suggested that higher mutation rates are found in the 5' flanking region, with several SNPs present in this species. A putative microRNA target site has also been observed in the 3'UTR (untranslated region) and is highly conserved across teleost fish. The deduced amino acid sequence was conserved across vertebrates and exhibited characteristic conserved putative functional residues including a proteolytic cleavage motif (RXXR), nine cysteines and two glycosylation sites. A qualitative analysis of the barramundi MSTN-1 expression pattern revealed that, in adult fish, transcripts are differentially expressed in various tissues other than skeletal muscles, including gill, heart, kidney, intestine, liver, spleen, eye, gonad and brain. Conclusion: Our findings provide valuable insights such as sequence variation and genomic information which will aid the further investigation of the barramundi MSTN-1 gene in association with growth. The finding for the first time in finfish MSTN of a miRNA target site in the 3'UTR provides an opportunity for the identification of regulatory mutations on the

  4. Quantitative statistical analysis of cis-regulatory sequences in ABA/VP1- and CBF/DREB1-regulated genes of Arabidopsis.

    Science.gov (United States)

    Suzuki, Masaharu; Ketterling, Matthew G; McCarty, Donald R

    2005-09-01

    We have developed a simple quantitative computational approach for objective analysis of cis-regulatory sequences in promoters of coregulated genes. The program, designated MotifFinder, identifies oligo sequences that are overrepresented in promoters of coregulated genes. We used this approach to analyze promoter sequences of Viviparous1 (VP1)/abscisic acid (ABA)-regulated genes and cold-regulated genes, respectively, of Arabidopsis (Arabidopsis thaliana). We detected significantly enriched sequences in up-regulated genes but not in down-regulated genes. This result suggests that gene activation but not repression is mediated by specific and common sequence elements in promoters. The enriched motifs include several known cis-regulatory sequences as well as previously unidentified motifs. With respect to known cis-elements, we dissected the flanking nucleotides of the core sequences of Sph element, ABA response elements (ABREs), and the C repeat/dehydration-responsive element. This analysis identified the motif variants that may correlate with qualitative and quantitative differences in gene expression. While both VP1 and cold responses are mediated in part by ABA signaling via ABREs, these responses correlate with unique ABRE variants distinguished by nucleotides flanking the ACGT core. ABRE and Sph motifs are tightly associated uniquely in the coregulated set of genes showing a strict dependence on VP1 and ABA signaling. Finally, analysis of distribution of the enriched sequences revealed a striking concentration of enriched motifs in a proximal 200-base region of VP1/ABA and cold-regulated promoters. Overall, each class of coregulated genes possesses a discrete set of the enriched motifs with unique distributions in their promoters that may account for the specificity of gene regulation.
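
    The idea of scoring oligo sequences that are overrepresented in promoters of coregulated genes can be sketched as follows. This is not the MotifFinder program: the abstract does not state which statistic it uses, so a one-sided hypergeometric test on promoter counts is assumed here as a stand-in, and all function names, the oligo length `k=6`, and the `alpha` cutoff are hypothetical choices for illustration.

```python
from collections import Counter
from math import comb


def kmers_present(seq, k):
    """Return the set of distinct k-mers that occur in one promoter."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hypergeom_tail(hits, sample, successes, population):
    """P(X >= hits) when `sample` promoters are drawn without replacement
    from `population` promoters, `successes` of which contain the oligo."""
    upper = min(sample, successes)
    total = comb(population, sample)
    return sum(
        comb(successes, x) * comb(population - successes, sample - x)
        for x in range(hits, upper + 1)
    ) / total


def enriched_oligos(coregulated, background, k=6, alpha=0.05):
    """Rank k-mers present in more coregulated promoters than expected.

    Assumption: `background` contains the coregulated promoters as a subset.
    Returns (oligo, coregulated_hits, background_hits, p_value) tuples,
    sorted by p-value. No multiple-testing correction is applied here.
    """
    fg = Counter(m for s in coregulated for m in kmers_present(s, k))
    bg = Counter(m for s in background for m in kmers_present(s, k))
    results = []
    for oligo, hits in fg.items():
        p = hypergeom_tail(hits, len(coregulated), bg[oligo], len(background))
        if p <= alpha:
            results.append((oligo, hits, bg[oligo], p))
    return sorted(results, key=lambda r: r[3])


# Toy promoters sharing an ACGT-containing core (illustrative only).
coregulated = ["TTACGTGGCAA", "GGACGTGGCTT", "CCACGTGGCGG"]
background = coregulated + ["TTTTTTTTTTTT"] * 17
for oligo, fg_hits, bg_hits, p in enriched_oligos(coregulated, background):
    print(oligo, fg_hits, bg_hits, round(p, 5))
```

    Counting each oligo once per promoter, rather than once per occurrence, keeps a single repeat-rich promoter from dominating the score; a real analysis would also need a multiple-testing correction across all oligos tested.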

  5. Effects of temperature and mass conservation on the typical chemical sequences of hydrogen oxidation

    Science.gov (United States)

    Nicholson, Schuyler B.; Alaghemandi, Mohammad; Green, Jason R.

    2018-01-01

    Macroscopic properties of reacting mixtures are necessary to design synthetic strategies, determine yield, and improve the energy and atom efficiency of many chemical processes. The set of time-ordered sequences of chemical species is one representation of the evolution from reactants to products. However, only a fraction of the possible sequences is typical, having the majority of the joint probability and characterizing the succession of chemical nonequilibrium states. Here, we extend a variational measure of typicality and apply it to atomistic simulations of a model for hydrogen oxidation over a range of temperatures. We demonstrate an information-theoretic methodology to identify typical sequences under the constraints of mass conservation. Including these constraints leads to an improved ability to learn the chemical sequence mechanism from experimentally accessible data. From these typical sequences, we show that two quantities defining the variational typical set of sequences (the joint entropy rate and the topological entropy rate) increase linearly with temperature. These results suggest that, away from explosion limits, data over a narrow range of thermodynamic parameters could be sufficient to extrapolate these typical features of combustion chemistry to other conditions.

  6. PDL1 Signals through Conserved Sequence Motifs to Overcome Interferon-Mediated Cytotoxicity

    Directory of Open Access Journals (Sweden)

    Maria Gato-Cañas

    2017-08-01

    PDL1 blockade produces remarkable clinical responses, thought to occur by T cell reactivation through prevention of PDL1-PD1 T cell inhibitory interactions. Here, we find that PDL1 cell-intrinsic signaling protects cancer cells from interferon (IFN) cytotoxicity and accelerates tumor progression. PDL1 inhibited IFN signal transduction through a conserved class of sequence motifs that mediate crosstalk with IFN signaling. Abrogation of PDL1 expression or antibody-mediated PDL1 blockade strongly sensitized cancer cells to IFN cytotoxicity through a STAT3/caspase-7-dependent pathway. Moreover, somatic mutations found in human carcinomas within these PDL1 sequence motifs disrupted motif regulation, resulting in PDL1 molecules with enhanced protective activities from type I and type II IFN cytotoxicity. Overall, our results reveal a mode of action of PDL1 in cancer cells as a first line of defense against IFN cytotoxicity.

  7. Sequence-based model of gap gene regulatory network.

    Science.gov (United States)

    Kozlov, Konstantin; Gursky, Vitaly; Kulakovskiy, Ivan; Samsonova, Maria

    2014-01-01

    The detailed analysis of transcriptional regulation is crucially important for understanding biological processes. The gap gene network in Drosophila attracts great interest among researchers studying mechanisms of transcriptional regulation. It implements the most upstream regulatory layer of the segmentation gene network. The knowledge of molecular mechanisms involved in gap gene regulation is far less complete than that of the genetics of the system. Mathematical modeling goes beyond insights gained by genetics and molecular approaches. It allows us to reconstruct wild-type gene expression patterns in silico, infer the underlying regulatory mechanism and prove its sufficiency. We developed a new model that provides a dynamical description of gap gene regulatory systems, using detailed DNA-based information, as well as spatial transcription factor concentration data at varying time points. We showed that this model correctly reproduces gap gene expression patterns in wild-type embryos and is able to predict gap expression patterns in Kr mutants and four reporter constructs. We used a four-fold cross-validation test and fitting to a random dataset to validate the model and prove its sufficiency in describing the data. The identifiability analysis showed that most model parameters are well identifiable. We reconstructed the gap gene network topology and studied the impact of individual transcription factor binding sites on the model output. We measured this impact by calculating the site regulatory weight as a normalized difference between the residual sum of squares error for the set of all annotated sites and for the set with the site of interest excluded. The reconstructed topology of the gap gene network is in agreement with previous modeling results and data from the literature. We showed that 1) the regulatory weights of transcription factor binding sites show very weak correlation with their PWM score; 2) sites with low regulatory weight are important for the model output; 3
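
    The site regulatory weight described above can be illustrated with a leave-one-out calculation. This is a sketch under stated assumptions: the abstract does not give the normalization, so dividing by the residual sum of squares of the full model is a choice made here, and the predict callback stands in for the authors' dynamical model, which is not reproduced.

```python
def rss(observed, predicted):
    """Residual sum of squares between two equal-length sequences."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def regulatory_weights(sites, observed, predict):
    """Leave-one-site-out regulatory weights.

    sites    -- list of binding-site identifiers
    observed -- observed expression values
    predict  -- hypothetical callback: predict(list_of_sites) -> predictions

    Assumed normalization: (RSS without site - RSS with all sites) / RSS with all sites.
    """
    rss_all = rss(observed, predict(sites))
    if rss_all == 0:
        raise ValueError("full model fits exactly; the ratio is undefined")
    weights = {}
    for site in sites:
        reduced = [s for s in sites if s != site]
        weights[site] = (rss(observed, predict(reduced)) - rss_all) / rss_all
    return weights
```

    A site whose removal barely changes the fit gets a weight near zero, while a site whose removal degrades the fit gets a large positive weight.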

  8. Conserved cell cycle regulatory properties within the amino terminal domain of the Epstein-Barr virus nuclear antigen 3C

    International Nuclear Information System (INIS)

    Sharma, Nikhil; Knight, Jason S.; Robertson, Erle S.

    2006-01-01

    The gammaherpesviruses Rhesus lymphocryptovirus (LCV) and Epstein-Barr virus (EBV) are closely related phylogenetically. Rhesus LCV efficiently immortalizes Rhesus B cells in vitro. However, despite a high degree of conservation between the Rhesus LCV and EBV genomes, Rhesus LCV fails to immortalize human B cells in vitro. This species restriction may, at least in part, be linked to the EBV nuclear antigens (EBNAs) and latent membrane proteins (LMPs), known to be essential for B cell transformation. We compared specific properties of EBNA3C, a well-characterized and essential EBV protein, with its Rhesus counterpart to determine whether EBNA3C phenotypes which contribute to cell cycle regulation are conserved in the Rhesus LCV. We show that both EBNA3C and Rhesus EBNA3C bind to a conserved region of mammalian cyclins, regulate pRb stability, and modulate SCF(Skp2)-dependent ubiquitination. These results suggest that Rhesus LCV restriction from human B cell immortalization is independent of the conserved cell cycle regulatory functions of the EBNA3C protein.

  9. kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

    Science.gov (United States)

    Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

    2013-01-01

    Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
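
    To make the notion of k-mer sequence features concrete, the sketch below counts k-mers in a sequence and computes the inner product of two count vectors (a spectrum-style kernel). It is only an illustration: the abstract does not specify the exact feature set or kernel used by the kmer-SVM server, and the function names are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(seq_a, seq_b, k=3):
    """Inner product of the k-mer count vectors of two sequences."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for k-mers absent from seq_b.
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())
```

    A matrix of such pairwise values over bound and unbound regions could, in principle, be passed to any kernel-based classifier.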

  10. Identification of microRNAs from Eugenia uniflora by high-throughput sequencing and bioinformatics analysis.

    Science.gov (United States)

    Guzman, Frank; Almerão, Mauricio P; Körbes, Ana P; Loss-Morais, Guilherme; Margis, Rogerio

    2012-01-01

    microRNAs or miRNAs are small non-coding regulatory RNAs that play important functions in the regulation of gene expression at the post-transcriptional level by targeting mRNAs for degradation or inhibiting protein translation. Eugenia uniflora is a plant native to tropical America with pharmacological and ecological importance, and there have been no previous studies concerning its gene expression and regulation. To date, no miRNAs have been reported in Myrtaceae species. Small RNA and RNA-seq libraries were constructed to identify miRNAs and pre-miRNAs in Eugenia uniflora. Solexa technology was used to perform high throughput sequencing of the library, and the data obtained were analyzed using bioinformatics tools. From 14,489,131 small RNA clean reads, we obtained 1,852,722 mature miRNA sequences representing 45 conserved families that have been identified in other plant species. Further analysis using contigs assembled from RNA-seq allowed the prediction of secondary structures of 25 known and 17 novel pre-miRNAs. The expression of twenty-seven identified miRNAs was also validated using RT-PCR assays. Potential targets were predicted for the most abundant mature miRNAs in the identified pre-miRNAs based on sequence homology. This study is the first large scale identification of miRNAs and their potential targets from a species of the Myrtaceae family without genomic sequence resources. Our study provides more information about the evolutionary conservation of the regulatory network of miRNAs in plants and highlights species-specific miRNAs.

  11. Discovery of cis-elements between sorghum and rice using co-expression and evolutionary conservation

    Directory of Open Access Journals (Sweden)

    Haberer Georg

    2009-06-01

    Background: The spatiotemporal regulation of gene expression largely depends on the presence and absence of cis-regulatory sites in the promoter. In the economically highly important grass family, our knowledge of transcription factor binding sites and transcriptional networks is still very limited. With the completion of the sorghum genome and the available rice genome sequence, comparative promoter analyses now allow genome-scale detection of conserved cis-elements. Results: In this study, we identified thousands of phylogenetic footprints conserved between orthologous rice and sorghum upstream regions that are supported by co-expression information derived from three different rice expression data sets. In a complementary approach, cis-motifs were discovered by their highly conserved co-occurrence in syntenic promoter pairs. Sequence conservation and matches to known plant motifs support our findings. Expression similarities of gene pairs positively correlate with the number of motifs that are shared by gene pairs and corroborate the importance of similar promoter architectures for concerted regulation. This strongly suggests that these motifs function in the regulation of transcript levels in rice and, presumably, also in sorghum. Conclusion: Our work provides the first large-scale collection of cis-elements for rice and sorghum and can serve as a paradigm for cis-element analysis through comparative genomics in grasses in general.
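
    A very simplified picture of a phylogenetic footprint is an exact word shared by two orthologous upstream regions. The sketch below finds such shared k-mers; it is not the authors' pipeline, and the exact-match criterion, the default k, and the function name are assumptions made for illustration.

```python
def shared_kmers(upstream_a, upstream_b, k=8):
    """Return {kmer: (positions in upstream_a, positions in upstream_b)}
    for every k-mer that occurs in both upstream regions (0-based starts)."""
    upstream_a = upstream_a.upper()
    upstream_b = upstream_b.upper()
    # Index every k-mer of the second sequence by its start positions.
    index = {}
    for j in range(len(upstream_b) - k + 1):
        index.setdefault(upstream_b[j:j + k], []).append(j)
    hits = {}
    for i in range(len(upstream_a) - k + 1):
        kmer = upstream_a[i:i + k]
        if kmer in index:
            hits.setdefault(kmer, ([], index[kmer]))[0].append(i)
    return hits
```

    Real footprinting methods typically tolerate mismatches and work from alignments; exact matching is used here only to keep the example short.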

  12. GANN: Genetic algorithm neural networks for the detection of conserved combinations of features in DNA

    Directory of Open Access Journals (Sweden)

    Beiko Robert G

    2005-02-01

    Background: The multitude of motif detection algorithms developed to date have largely focused on the detection of patterns in primary sequence. Since sequence-dependent DNA structure and flexibility may also play a role in protein-DNA interactions, the simultaneous exploration of sequence- and structure-based hypotheses about the composition of binding sites and the ordering of features in a regulatory region should be considered as well. The consideration of structural features requires the development of new detection tools that can deal with data types other than primary sequence. Results: GANN (available at http://bioinformatics.org.au/gann) is a machine learning tool for the detection of conserved features in DNA. The software suite contains programs to extract different regions of genomic DNA from flat files and convert these sequences to indices that reflect sequence and structural composition or the presence of specific protein binding sites. The machine learning component allows the classification of different types of sequences based on subsamples of these indices, and can identify the best combinations of indices and machine learning architecture for sequence discrimination. Another key feature of GANN is the replicated splitting of data into training and test sets, and the implementation of negative controls. In validation experiments, GANN successfully merged important sequence and structural features to yield good predictive models for synthetic and real regulatory regions. Conclusion: GANN is a flexible tool that can search through large sets of sequence and structural feature combinations to identify those that best characterize a set of sequences.
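
    The replicated splitting of data into training and test sets mentioned above can be sketched generically. This is not GANN's code: the number of replicates, the test fraction, and the function name are hypothetical choices for illustration.

```python
import random


def replicated_splits(items, n_replicates=5, test_fraction=0.25, seed=0):
    """Yield (training, test) lists from repeated random shuffles of `items`."""
    rng = random.Random(seed)  # fixed seed so the replicates are reproducible
    n_test = max(1, round(len(items) * test_fraction))
    for _ in range(n_replicates):
        shuffled = list(items)
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]
```

    Evaluating a model on each replicate and reporting the spread of test performance guards against conclusions that depend on one lucky split.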

  13. Brachyury, Foxa2 and the cis-Regulatory Origins of the Notochord.

    Directory of Open Access Journals (Sweden)

    Diana S José-Edwards

    2015-12-01

    A main challenge of modern biology is to understand how specific constellations of genes are activated to differentiate cells and give rise to distinct tissues. This study focuses on elucidating how gene expression is initiated in the notochord, an axial structure that provides support and patterning signals to embryos of humans and all other chordates. Although numerous notochord genes have been identified, the regulatory DNAs that orchestrate development and propel evolution of this structure by eliciting notochord gene expression remain mostly uncharted, and the information on their configuration and recurrence is still quite fragmentary. Here we used the simple chordate Ciona for a systematic analysis of notochord cis-regulatory modules (CRMs), and investigated their composition, architectural constraints, predictive ability and evolutionary conservation. We found that most Ciona notochord CRMs relied upon variable combinations of binding sites for the transcription factors Brachyury and/or Foxa2, which can act either synergistically or independently from one another. Notably, one of these CRMs contains a Brachyury binding site juxtaposed to an (AC) microsatellite, an unusual arrangement also found in Brachyury-bound regulatory regions in mouse. In contrast, different subsets of CRMs relied upon binding sites for transcription factors of widely diverse families. Surprisingly, we found that neither intra-genomic nor interspecific conservation of binding sites were reliably predictive hallmarks of notochord CRMs. We propose that rather than obeying a rigid sequence-based cis-regulatory code, most notochord CRMs are rather unique. Yet, this study uncovered essential elements recurrently used by divergent chordates as basic building blocks for notochord CRMs.

  14. Brachyury, Foxa2 and the cis-Regulatory Origins of the Notochord.

    Science.gov (United States)

    José-Edwards, Diana S; Oda-Ishii, Izumi; Kugler, Jamie E; Passamaneck, Yale J; Katikala, Lavanya; Nibu, Yutaka; Di Gregorio, Anna

    2015-12-01

    A main challenge of modern biology is to understand how specific constellations of genes are activated to differentiate cells and give rise to distinct tissues. This study focuses on elucidating how gene expression is initiated in the notochord, an axial structure that provides support and patterning signals to embryos of humans and all other chordates. Although numerous notochord genes have been identified, the regulatory DNAs that orchestrate development and propel evolution of this structure by eliciting notochord gene expression remain mostly uncharted, and the information on their configuration and recurrence is still quite fragmentary. Here we used the simple chordate Ciona for a systematic analysis of notochord cis-regulatory modules (CRMs), and investigated their composition, architectural constraints, predictive ability and evolutionary conservation. We found that most Ciona notochord CRMs relied upon variable combinations of binding sites for the transcription factors Brachyury and/or Foxa2, which can act either synergistically or independently from one another. Notably, one of these CRMs contains a Brachyury binding site juxtaposed to an (AC) microsatellite, an unusual arrangement also found in Brachyury-bound regulatory regions in mouse. In contrast, different subsets of CRMs relied upon binding sites for transcription factors of widely diverse families. Surprisingly, we found that neither intra-genomic nor interspecific conservation of binding sites were reliably predictive hallmarks of notochord CRMs. We propose that rather than obeying a rigid sequence-based cis-regulatory code, most notochord CRMs are rather unique. Yet, this study uncovered essential elements recurrently used by divergent chordates as basic building blocks for notochord CRMs.

  15. Optimal packaging of FIV genomic RNA depends upon a conserved long-range interaction and a palindromic sequence within gag.

    Science.gov (United States)

    Rizvi, Tahir A; Kenyon, Julia C; Ali, Jahabar; Aktar, Suriya J; Phillip, Pretty S; Ghazawi, Akela; Mustafa, Farah; Lever, Andrew M L

    2010-10-15

    The feline immunodeficiency virus (FIV) is a lentivirus that is related to human immunodeficiency virus (HIV), causing a similar pathology in cats. It is a potential small animal model for AIDS and the FIV-based vectors are also being pursued for human gene therapy. Previous studies have mapped the FIV packaging signal (ψ) to two or more discontinuous regions within the 5' 511 nt of the genomic RNA and structural analyses have determined its secondary structure. The 5' and 3' sequences within the ψ region interact through extensive long-range interactions (LRIs), including a conserved heptanucleotide interaction between R/U5 and gag. Other secondary structural elements identified include a conserved 150 nt stem-loop (SL2) and a small palindromic stem-loop within the gag open reading frame that might act as a viral dimerization initiation site. We have performed extensive mutational analysis of these sequences and structures and ascertained their importance in FIV packaging using a trans-complementation assay. Disrupting the conserved heptanucleotide LRI to prevent base pairing between R/U5 and gag reduced packaging by 2.8-5.5-fold. Restoration of pairing using an alternative, non-wild type (wt) LRI sequence restored RNA packaging and propagation to wt levels, suggesting that it is the structure of the LRI, rather than its sequence, that is important for FIV packaging. Disrupting the palindrome within gag reduced packaging by 1.5-3-fold, but substitution with a different palindromic sequence did not restore packaging completely, suggesting that the sequence of this region as well as its palindromic nature is important. Mutation of individual regions of SL2 did not have a pronounced effect on FIV packaging, suggesting that either it is the structure of SL2 as a whole that is necessary for optimal packaging, or that there is redundancy within this structure. The mutational analysis presented here has further validated the previously predicted RNA secondary structure of FIV
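
    The logic of the compensatory-mutation experiments can be illustrated with a few lines of code that test whether two RNA segments can base-pair and whether a segment is palindromic. The sketch assumes Watson-Crick pairs only (G-U wobble pairs are ignored), and any sequences passed to it are made-up examples, not the actual FIV heptanucleotide or palindrome.

```python
# Watson-Crick complements for RNA; G-U wobble pairing is deliberately ignored.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(rna):
    """True if the sequence equals its own reverse complement."""
    return rna.upper() == reverse_complement(rna)


def can_pair(five_prime_segment, three_prime_segment):
    """True if the two segments are perfectly complementary (antiparallel)."""
    return five_prime_segment.upper() == reverse_complement(three_prime_segment)
```

    Mutating one partner breaks can_pair, and mutating the other partner to match restores it with a different sequence, which mirrors the structure-versus-sequence argument made for the LRI.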

  16. Expression profiling and comparative sequence derived insights into lipid metabolism

    Energy Technology Data Exchange (ETDEWEB)

    Callow, Matthew J.; Rubin, Edward M.

    2001-12-19

    Expression profiling and genomic DNA sequence comparisons are increasingly being applied to the identification and analysis of the genes involved in lipid metabolism. Not only has genome-wide expression profiling aided in the identification of novel genes involved in important processes in lipid metabolism such as sterol efflux, but the utilization of information from these studies has added to our understanding of the regulation of pathways participating in the process. Coupled with these gene expression studies, cross species comparison, searching for sequences conserved through evolution, has proven to be a powerful tool to identify important non-coding regulatory sequences as well as the discovery of novel genes relevant to lipid biology. An example of the value of this approach was the recent chance discovery of a new apolipoprotein gene (apo AV) that has dramatic effects upon triglyceride metabolism in mice and humans.

  17. The identification and functional annotation of RNA structures conserved in vertebrates.

    Science.gov (United States)

    Seemann, Stefan E; Mirza, Aashiq H; Hansen, Claus; Bang-Berthelsen, Claus H; Garde, Christian; Christensen-Dalsgaard, Mikkel; Torarinsson, Elfar; Yao, Zizhen; Workman, Christopher T; Pociot, Flemming; Nielsen, Henrik; Tommerup, Niels; Ruzzo, Walter L; Gorodkin, Jan

    2017-08-01

    Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization, and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for conserved RNA structures (CRSs), leveraging structure-based, rather than sequence-based, alignments. After careful correction for sequence identity and GC content, we predict ∼516,000 human genomic regions containing CRSs. We find that a substantial fraction of human-mouse CRS regions (1) colocalize consistently with binding sites of the same RNA binding proteins (RBPs) or (2) are transcribed in corresponding tissues. Additionally, a CaptureSeq experiment revealed expression of many of our CRS regions in human fetal brain, including 662 novel ones. For selected human and mouse candidate pairs, qRT-PCR and in vitro RNA structure probing supported both shared expression and shared structure despite low abundance and low sequence identity. About 30,000 CRS regions are located near coding or long noncoding RNA genes or within enhancers. Structured (CRS overlapping) enhancer RNAs and extended 3' ends have significantly increased expression levels over their nonstructured counterparts. Our findings of transcribed uncharacterized regulatory regions that contain CRSs support their RNA-mediated functionality. © 2017 Seemann et al.; Published by Cold Spring Harbor Laboratory Press.

  18. Lariat sequencing in a unicellular yeast identifies regulated alternative splicing of exons that are evolutionarily conserved with humans.

    Science.gov (United States)

    Awan, Ali R; Manfredo, Amanda; Pleiss, Jeffrey A

    2013-07-30

    Alternative splicing is a potent regulator of gene expression that vastly increases proteomic diversity in multicellular eukaryotes and is associated with organismal complexity. Although alternative splicing is widespread in vertebrates, little is known about the evolutionary origins of this process, in part because of the absence of phylogenetically conserved events that cross major eukaryotic clades. Here we describe a lariat-sequencing approach, which offers high sensitivity for detecting splicing events, and its application to the unicellular fungus, Schizosaccharomyces pombe, an organism that shares many of the hallmarks of alternative splicing in mammalian systems but for which no previous examples of exon-skipping had been demonstrated. Over 200 previously unannotated splicing events were identified, including examples of regulated alternative splicing. Remarkably, an evolutionary analysis of four of the exons identified here as subject to skipping in S. pombe reveals high sequence conservation and perfect length conservation with their homologs in scores of plants, animals, and fungi. Moreover, alternative splicing of two of these exons has been documented in multiple vertebrate organisms, making these the first demonstrations of identical alternative-splicing patterns in species that are separated by over 1 billion years of evolution.

  19. Regulatory sequences driving expression of the sea urchin Otp homeobox gene in oral ectoderm cells.

    Science.gov (United States)

    Cavalieri, Vincenzo; Bernardo, Maria Di; Spinelli, Giovanni

    2007-01-01

    PlOtp (Orthopedia), a homeodomain-containing transcription factor, has been recently characterized as a key regulator of the morphogenesis of the skeletal system in the embryo of the sea urchin Paracentrotus lividus. Otp acts as a positive regulator in a subset of oral ectodermal cells which transmit short-range signals to the underlying primary mesenchyme cells where skeletal synthesis is initiated. To shed some light on the molecular mechanisms involved in such a process, we began a functional analysis of the cis-regulatory sequences of the Otp gene. Congruent with the spatial expression profile of the endogenous Otp gene, we found that a DNA region from -494 to +358 drives in vivo GFP reporter expression in the oral ectoderm but also in the foregut, whereas a larger region spanning from -2044 to +358 is needed for firmly established tissue specificity. Microinjection of PCR-amplified DNA constructs, truncated in the 5' regulatory region, and determination of GFP mRNA level in injected embryos allowed the identification of a 5'-flanking fragment of 184 bp in length, essential for expression of the transgene in the oral ectoderm of pluteus stage embryos. Finally, we conducted DNase I footprinting assays in nuclear extracts for the 184 bp region and detected two protected sequences. A data bank search indicates that these sites contain consensus binding sites for transcription factors.

  20. Detecting remote sequence homology in disordered proteins: discovery of conserved motifs in the N-termini of Mononegavirales phosphoproteins.

    Directory of Open Access Journals (Sweden)

    David Karlin

    Paramyxovirinae are a large group of viruses that includes measles virus and parainfluenza viruses. The viral Phosphoprotein (P) plays a central role in viral replication. It is composed of a highly variable, disordered N-terminus and a conserved C-terminus. A second viral protein alternatively expressed, the V protein, also contains the N-terminus of P, fused to a zinc finger. We suspected that, despite their high variability, the N-termini of P/V might all be homologous; however, using standard approaches, we could previously identify sequence conservation only in some Paramyxovirinae. We now compared the N-termini using sensitive sequence similarity search programs, able to detect residual similarities unnoticeable by conventional approaches. We discovered that all Paramyxovirinae share a short sequence motif in their first 40 amino acids, which we called soyuz1. Despite its short length (11-16 aa), several arguments allow us to conclude that soyuz1 probably evolved by homologous descent, unlike linear motifs. Conservation across such evolutionary distances suggests that soyuz1 plays a crucial role and experimental data suggest that it binds the viral nucleoprotein to prevent its illegitimate self-assembly. In some Paramyxovirinae, the N-terminus of P/V contains a second motif, soyuz2, which might play a role in blocking interferon signaling. Finally, we discovered that the P of related Mononegavirales contain similarly overlooked motifs in their N-termini, and that their C-termini share a previously unnoticed structural similarity suggesting a common origin. Our results suggest several testable hypotheses regarding the replication of Mononegavirales and suggest that disordered regions with little overall sequence similarity, common in viral and eukaryotic proteins, might contain currently overlooked motifs (intermediate in length between linear motifs and disordered domains) that could be detected simply by comparing orthologous proteins.

  1. CONDOR: a database resource of developmentally associated conserved non-coding elements

    Directory of Open Access Journals (Sweden)

    Smith Sarah

    2007-08-01

    Background: Comparative genomics is currently one of the most popular approaches to study the regulatory architecture of vertebrate genomes. Fish-mammal genomic comparisons have proved powerful in identifying conserved non-coding elements likely to be distal cis-regulatory modules such as enhancers, silencers or insulators that control the expression of genes involved in the regulation of early development. The scientific community is showing increasing interest in characterizing the function, evolution and language of these sequences. Despite this, there remains little in the way of user-friendly access to a large dataset of such elements in conjunction with the analysis and the visualization tools needed to study them. Description: Here we present CONDOR (COnserved Non-coDing Orthologous Regions), available at http://condor.fugu.biology.qmul.ac.uk. In an interactive and intuitive way the website displays data on >6800 non-coding elements associated with over 120 early developmental genes and conserved across vertebrates. The database regularly incorporates results of ongoing in vivo zebrafish enhancer assays of the CNEs carried out in-house, which currently number ~100. Included and highlighted within this set are elements derived from duplication events both at the origin of vertebrates and more recently in the teleost lineage, thus providing valuable data for studying the divergence of regulatory roles between paralogs. CONDOR therefore provides a number of tools and facilities to allow scientists to progress in their own studies on the function and evolution of developmental cis-regulation. Conclusion: By providing access to data with an approachable graphics interface, the CONDOR database presents a rich resource for further studies into the regulation and evolution of genes involved in early development.

  2. Identification of lignin genes and regulatory sequences involved in secondary cell wall formation in Acacia auriculiformis and Acacia mangium via de novo transcriptome sequencing

    Directory of Open Access Journals (Sweden)

    Cannon Charles H

    2011-07-01

    Background: Acacia auriculiformis × Acacia mangium hybrids are commercially important trees for the timber and pulp industry in Southeast Asia. Increasing pulp yield while reducing pulping costs are major objectives of tree breeding programs. The general monolignol biosynthesis and secondary cell wall formation pathways are well-characterized but genes in these pathways are poorly characterized in Acacia hybrids. RNA-seq on short-read platforms is a rapid approach for obtaining comprehensive transcriptomic data and discovering informative sequence variants. Results: We sequenced transcriptomes of A. auriculiformis and A. mangium from non-normalized cDNA libraries synthesized from pooled young stem and inner bark tissues using paired-end libraries and a single lane of an Illumina GAII machine. De novo assembly produced a total of 42,217 and 35,759 contigs with an average length of 496 bp and 498 bp for A. auriculiformis and A. mangium respectively. The assemblies of A. auriculiformis and A. mangium had a total length of 21,022,649 bp and 17,838,260 bp, respectively, with the largest contig 15,262 bp long. We detected all ten monolignol biosynthetic genes using Blastx and further analysis revealed 18 lignin isoforms for each species. We also identified five contigs homologous to R2R3-MYB proteins in other plant species that are involved in transcriptional regulation of secondary cell wall formation and lignin deposition. We searched the contigs against a public microRNA database and predicted the stem-loop structures of six highly conserved microRNA families (miR319, miR396, miR160, miR172, miR162 and miR168) and one legume-specific family (miR2086). Three microRNA target genes were predicted to be involved in wood formation and flavonoid biosynthesis. By using the assemblies as a reference, we discovered 16,648 and 9,335 high quality putative Single Nucleotide Polymorphisms (SNPs) in the transcriptomes of A. auriculiformis and A. mangium

  3. The Evolution of the Secreted Regulatory Protein Progranulin.

    Directory of Open Access Journals (Sweden)

    Roger G E Palfree

    Progranulin is a secreted growth factor that is active in tumorigenesis, wound repair, and inflammation. Haploinsufficiency of the human progranulin gene, GRN, causes frontotemporal dementia. Progranulins are composed of chains of cysteine-rich granulin modules. Modules may be released from progranulin by proteolysis as 6 kDa granulin polypeptides. Both intact progranulin and some of the granulin polypeptides are biologically active. The granulin module occurs in certain plant proteases and progranulins are present in early diverging metazoan clades such as the sponges, indicating their ancient evolutionary origin. There is only one Grn gene in mammalian genomes. More gene-rich Grn families occur in teleost fish with between 3 and 6 members per species including short-form Grns that have no tetrapod counterparts. Our goals are to elucidate progranulin and granulin module evolution by investigating (i) the origins of metazoan progranulins; (ii) the evolutionary relationships between the single Grn of tetrapods and the multiple Grn genes of fish; (iii) the evolution of granulin module architectures of vertebrate progranulins; and (iv) the conservation of mammalian granulin polypeptide sequences and how the conserved granulin amino acid sequences map to the known three-dimensional structures of granulin modules. We report that progranulin-like proteins are present in unicellular eukaryotes that are closely related to metazoa, suggesting that progranulin is among the earliest extracellular regulatory proteins still employed by multicellular animals. From the genomes of the elephant shark and coelacanth we identified contemporary representatives of a precursor for short-form Grn genes of ray-finned fish that is lost in tetrapods. In vertebrate Grns pathways of exon duplication resulted in a conserved module architecture at the amino-terminus that is frequently accompanied by an unusual pattern of tandem nearly identical module repeats near the carboxyl

  4. 14 CFR 313.4 - Major regulatory actions.

    Science.gov (United States)

    2010-01-01

    ...) PROCEDURAL REGULATIONS IMPLEMENTATION OF THE ENERGY POLICY AND CONSERVATION ACT § 313.4 Major regulatory... of actions shall not be deemed as major regulatory actions requiring an energy statement: (1) Tariff...

  5. Structure-Related Roles for the Conservation of the HIV-1 Fusion Peptide Sequence Revealed by Nuclear Magnetic Resonance.

    Science.gov (United States)

    Serrano, Soraya; Huarte, Nerea; Rujas, Edurne; Andreu, David; Nieva, José L; Jiménez, María Angeles

    2017-10-17

    Despite extensive characterization of the human immunodeficiency virus type 1 (HIV-1) hydrophobic fusion peptide (FP), the structure-function relationships underlying its extraordinary degree of conservation remain poorly understood. Specifically, the fact that the tandem repeat of the FLGFLG tripeptide is absolutely conserved suggests that high hydrophobicity may not suffice to unleash FP function. Here, we have compared the nuclear magnetic resonance (NMR) structures adopted in nonpolar media by two FP surrogates, wtFP-tag and scrFP-tag, which had equal hydrophobicity but contained wild-type and scrambled core sequences LFLGFLG and FGLLGFL, respectively. In addition, these peptides were tagged at their C-termini with an epitope sequence that folded independently, thereby allowing Western blot detection without interfering with FP structure. We observed similar α-helical FP conformations for both specimens dissolved in the low-polarity medium 25% (v/v) 1,1,1,3,3,3-hexafluoro-2-propanol (HFIP), but important differences in contact with micelles of the membrane mimetic dodecylphosphocholine (DPC). Thus, whereas wtFP-tag preserved a helix displaying a Gly-rich ridge, the scrambled sequence lost in great part the helical structure upon being solubilized in DPC. Western blot analyses further revealed the capacity of wtFP-tag to assemble trimers in membranes, whereas membrane oligomers were not observed in the case of the scrFP-tag sequence. We conclude that, beyond hydrophobicity, preserving sequence order is an important feature for defining the secondary structures and oligomeric states adopted by the HIV FP in membranes.

  6. Multi-species sequence comparison reveals conservation of ghrelin gene-derived splice variants encoding a truncated ghrelin peptide.

    Science.gov (United States)

    Seim, Inge; Jeffery, Penny L; Thomas, Patrick B; Walpole, Carina M; Maugham, Michelle; Fung, Jenny N T; Yap, Pei-Yi; O'Keeffe, Angela J; Lai, John; Whiteside, Eliza J; Herington, Adrian C; Chopin, Lisa K

    2016-06-01

    The peptide hormone ghrelin is a potent orexigen produced predominantly in the stomach. It has a number of other biological actions, including roles in appetite stimulation, energy balance, the stimulation of growth hormone release and the regulation of cell proliferation. Recently, several ghrelin gene splice variants have been described. Here, we attempted to identify conserved alternative splicing of the ghrelin gene by cross-species sequence comparisons. We identified a novel human exon 2-deleted variant and provide preliminary evidence that this splice variant and in1-ghrelin encode a C-terminally truncated form of the ghrelin peptide, termed minighrelin. These variants are expressed in humans and mice, demonstrating conservation of alternative splicing spanning 90 million years. Minighrelin appears to have similar actions to full-length ghrelin, as treatment with exogenous minighrelin peptide stimulates appetite and feeding in mice. Forced expression of the exon 2-deleted preproghrelin variant mirrors the effect of the canonical preproghrelin, stimulating cell proliferation and migration in the PC3 prostate cancer cell line. This is the first study to characterise an exon 2-deleted preproghrelin variant and to demonstrate sequence conservation of ghrelin gene-derived splice variants that encode a truncated ghrelin peptide. This adds further impetus for studies into the alternative splicing of the ghrelin gene and the function of novel ghrelin peptides in vertebrates.

  7. Massive contribution of transposable elements to mammalian regulatory sequences.

    Science.gov (United States)

    Rayan, Nirmala Arul; Del Rosario, Ricardo C H; Prabhakar, Shyam

    2016-09-01

    Barbara McClintock discovered the existence of transposable elements (TEs) in the late 1940s and initially proposed that they contributed to the gene regulatory program of higher organisms. This controversial idea gained acceptance only much later in the 1990s, when the first examples of TE-derived promoter sequences were uncovered. It is now known that half of the human genome is recognizably derived from TEs. It is thus important to understand the scope and nature of their contribution to gene regulation. Here, we provide a timeline of major discoveries in this area and discuss how transposons have revolutionized our understanding of mammalian genomes, with a special emphasis on the massive contribution of TEs to primate evolution. Our analysis of primate-specific functional elements supports a simple model for the rate at which new functional elements arise in unique and TE-derived DNA. Finally, we discuss some of the challenges and unresolved questions in the field, which need to be addressed in order to fully characterize the impact of TEs on gene regulation, evolution and disease processes. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. Defining the plasticity of transcription factor binding sites by Deconstructing DNA consensus sequences: the PhoP-binding sites among gamma/enterobacteria.

    Directory of Open Access Journals (Sweden)

    Oscar Harari

    2010-07-01

    Full Text Available Transcriptional regulators recognize specific DNA sequences. Because these sequences are embedded in the background of genomic DNA, it is hard to identify the key cis-regulatory elements that determine disparate patterns of gene expression. The detection of the intra- and inter-species differences among these sequences is crucial for understanding the molecular basis of both differential gene expression and evolution. Here, we address this problem by investigating the target promoters controlled by the DNA-binding PhoP protein, which governs virulence and Mg(2+) homeostasis in several bacterial species. PhoP is particularly interesting; it is highly conserved in different gamma/enterobacteria, regulating not only ancestral genes but also governing the expression of dozens of horizontally acquired genes that differ from species to species. Our approach consists of decomposing the DNA binding site sequences for a given regulator into families of motifs (i.e., termed submotifs) using a machine learning method inspired by the "Divide & Conquer" strategy. By partitioning a motif into sub-patterns, computational advantages for classification were produced, resulting in the discovery of new members of a regulon, and alleviating the problem of distinguishing functional sites in chromatin immunoprecipitation and DNA microarray genome-wide analysis. Moreover, we found that certain partitions were useful in revealing biological properties of binding site sequences, including modular gains and losses of PhoP binding sites through evolutionary turnover events, as well as conservation in distant species. The high conservation of PhoP submotifs within gamma/enterobacteria, as well as the regulatory protein that recognizes them, suggests that the major cause of divergence between related species is not due to the binding sites, as was previously suggested for other regulators. Instead, the divergence may be attributed to the fast evolution of orthologous target

  9. Analysis of 90 Mb of the potato genome reveals conservation of gene structures and order with tomato but divergence in repetitive sequence composition

    Directory of Open Access Journals (Sweden)

    O'Brien Kimberly

    2008-06-01

    Full Text Available Background: The Solanaceae family contains a number of important crop species including potato (Solanum tuberosum), which is grown for its underground storage organ known as a tuber. Although potato is the 4th most important food crop in the world, limited genomic sequence information other than a collection of ~220,000 Expressed Sequence Tags is currently available for it, and advances in potato yield and nutrition content would be greatly assisted through access to a complete genome sequence. While morphologically diverse, Solanaceae species such as potato, tomato, pepper, and eggplant share not only genes but also gene order, thereby permitting highly informative comparative genomic analyses. Results: In this study, we report on the analysis of 89.9 Mb of potato genomic sequence representing 10.2% of the genome, generated through end sequencing of a potato bacterial artificial chromosome (BAC) clone library (87 Mb) and sequencing of 22 potato BAC clones (2.9 Mb). The GC content of potato is very similar to Solanum lycopersicon (tomato) and other dicotyledonous species yet distinct from the monocotyledonous grass species, Oryza sativa. Parallel analyses of repetitive sequences in potato and tomato revealed substantial differences in their abundance, 34.2% in potato versus 46.3% in tomato, which is consistent with the increased genome size per haploid genome of these two Solanum species. Specific classes and types of repetitive sequences were also differentially represented between these two species including a telomeric-related repetitive sequence, ribosomal DNA, and a number of unclassified repetitive sequences. Comparative analyses between tomato and potato at the gene level revealed a high level of conservation of gene content, genic feature, and gene order although discordances in synteny were observed. Conclusion: Genomic level analyses of potato and tomato confirm that gene sequence and gene order are conserved between these solanaceous species and that

  10. Search for 5'-leader regulatory RNA structures based on gene annotation aided by the RiboGap database.

    Science.gov (United States)

    Naghdi, Mohammad Reza; Smail, Katia; Wang, Joy X; Wade, Fallou; Breaker, Ronald R; Perreault, Jonathan

    2017-03-15

    The discovery of noncoding RNAs (ncRNAs) and their importance for gene regulation led us to develop bioinformatics tools to pursue the discovery of novel ncRNAs. Finding ncRNAs de novo is challenging, first due to the difficulty of retrieving large numbers of sequences for given gene activities, and second due to exponential demands on calculation needed for comparative genomics on a large scale. Recently, several tools for the prediction of conserved RNA secondary structure were developed, but many of them are not designed to uncover new ncRNAs, or are too slow for conducting analyses on a large scale. Here we present various approaches using the database RiboGap as a primary tool for finding known ncRNAs and for uncovering simple sequence motifs with regulatory roles. This database also can be used to easily extract intergenic sequences of eubacteria and archaea to find conserved RNA structures upstream of given genes. We also show how to extend analysis further to choose the best candidate ncRNAs for experimental validation. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. Properties of non-coding DNA and identification of putative cis-regulatory elements in Theileria parva

    Directory of Open Access Journals (Sweden)

    Guo Xiang

    2008-12-01

    regulatory motifs in other species. These results suggest that these two motifs are likely to represent transcription factor binding sites in Theileria. Conclusion: Theileria genomes are highly compact, with selection seemingly favoring short introns and intergenic regions. Three over-represented sequence motifs were independently identified in intergenic regions of both Theileria species, and the evidence suggests that at least two of them play a role in transcriptional control in T. parva. These are prime candidates for experimental validation of transcription factor binding sites in this single-celled eukaryotic parasite. Sequences similar to two of these Theileria motifs are conserved in Plasmodium, hinting at the possibility of common regulatory machinery across the phylum Apicomplexa.

  12. Functional region prediction with a set of appropriate homologous sequences-an index for sequence selection by integrating structure and sequence information with spatial statistics

    Science.gov (United States)

    2012-01-01

    Background: The detection of conserved residue clusters on a protein structure is one of the effective strategies for the prediction of functional protein regions. Various methods, such as Evolutionary Trace, have been developed based on this strategy. In such approaches, the conserved residues are identified through comparisons of homologous amino acid sequences. Therefore, the selection of homologous sequences is a critical step. It is empirically known that a certain degree of sequence divergence in the set of homologous sequences is required for the identification of conserved residues. However, the development of a method to select homologous sequences appropriate for the identification of conserved residues has not been sufficiently addressed. An objective and general method to select appropriate homologous sequences is desired for the efficient prediction of functional regions. Results: We have developed a novel index to select the sequences appropriate for the identification of conserved residues, and implemented the index within our method to predict the functional regions of a protein. The implementation of the index improved the performance of the functional region prediction. The index represents the degree of conserved residue clustering on the tertiary structure of the protein. For this purpose, the structure and sequence information were integrated within the index by the application of spatial statistics. Spatial statistics is a field of statistics in which not only the attributes but also the geometrical coordinates of the data are considered simultaneously. Higher degrees of clustering generate larger index scores. We adopted the set of homologous sequences with the highest index score, under the assumption that the best prediction accuracy is obtained when the degree of clustering is the maximum. The set of sequences selected by the index led to higher functional region prediction performance than the sets of sequences selected by other sequence

  13. Functional conservation of the Drosophila gooseberry gene and its evolutionary alleles.

    Directory of Open Access Journals (Sweden)

    Wei Liu

    Full Text Available The Drosophila Pax gene gooseberry (gsb) is required for development of the larval cuticle and CNS, survival to adulthood, and male fertility. These functions can be rescued in gsb mutants by two gsb evolutionary alleles, gsb-Prd and gsb-Pax3, which express the Drosophila Paired and mouse Pax3 proteins under the control of the gooseberry cis-regulatory region. Therefore, both Paired and Pax3 proteins have conserved all the Gsb functions that are required for survival of embryos to fertile adults, despite the divergent primary sequences in their C-terminal halves. As gsb-Prd and gsb-Pax3 uncover a gsb function involved in male fertility, construction of evolutionary alleles may provide a powerful strategy to dissect hitherto unknown gene functions. Our results provide further evidence for the essential role of cis-regulatory regions in the functional diversification of duplicated genes during evolution.

  14. African wildlife conservation and the evolution of hunting institutions

    Science.gov (United States)

    't Sas-Rolfes, Michael

    2017-11-01

    Hunting regulation presents a significant challenge for contemporary global conservation governance. Motivated by various incentives, hunters may act legally or illegally, for or against the interests of conservation. Hunter incentives are shaped by the interactions between unevenly evolving formal and informal institutions, embedded in socio-ecological systems. To work effectively for conservation, regulatory interventions must take these evolving institutional interactions into account. Drawing on analytical tools from evolutionary institutional economics, this article examines the trajectory of African hunting regulation and its consequences. Concepts of institutional dynamics, fit, scale, and interplay are applied to case studies of rhinoceros and lion hunting to highlight issues of significance to conservation outcomes. These include important links between different forms of hunting and dynamic interplay with institutions of trade. The case studies reveal that inappropriate formal regulatory approaches may be undermined by adaptive informal market responses. Poorly regulated hunting may lead to calls for stricter regulations or bans, but such legal restrictions may in turn perversely lead to more intensified and organised illegal hunting activity, further undermining conservation objectives. I conclude by offering insights and recommendations to guide more effective future regulatory interventions and priorities for further research. Specifically, I advocate approaches that move beyond simplistic regulatory interventions toward more complex, but supportive, institutional arrangements that align formal and informal institutions through inclusive stakeholder engagement.

  15. Evolution of Cis-Regulatory Elements and Regulatory Networks in Duplicated Genes of Arabidopsis.

    Science.gov (United States)

    Arsovski, Andrej A; Pradinuk, Julian; Guo, Xu Qiu; Wang, Sishuo; Adams, Keith L

    2015-12-01

    Plant genomes contain large numbers of duplicated genes that contribute to the evolution of new functions. Following duplication, genes can exhibit divergence in their coding sequence and their expression patterns. Changes in the cis-regulatory element landscape can result in changes in gene expression patterns. High-throughput methods developed recently can identify potential cis-regulatory elements on a genome-wide scale. Here, we use a recent comprehensive data set of DNase I sequencing-identified cis-regulatory binding sites (footprints) at single-base-pair resolution to compare binding sites and network connectivity in duplicated gene pairs in Arabidopsis (Arabidopsis thaliana). We found that duplicated gene pairs vary greatly in their cis-regulatory element architecture, resulting in changes in regulatory network connectivity. Whole-genome duplicates (WGDs) have approximately twice as many footprints in their promoters left by potential regulatory proteins than do tandem duplicates (TDs). The WGDs have a greater average number of footprint differences between paralogs than TDs. The footprints, in turn, result in more regulatory network connections between WGDs and other genes, forming denser, more complex regulatory networks than shown by TDs. When comparing regulatory connections between duplicates, WGDs had more pairs in which the two genes are either partially or fully diverged in their network connections, but fewer genes with no network connections than the TDs. There is evidence of younger TDs and WGDs having fewer unique connections compared with older duplicates. This study provides insights into cis-regulatory element evolution and network divergence in duplicated genes. © 2015 American Society of Plant Biologists. All Rights Reserved.

  16. Conservation and diversification of Msx protein in metazoan evolution.

    Science.gov (United States)

    Takahashi, Hirokazu; Kamiya, Akiko; Ishiguro, Akira; Suzuki, Atsushi C; Saitou, Naruya; Toyoda, Atsushi; Aruga, Jun

    2008-01-01

    Msx (/msh) family genes encode homeodomain (HD) proteins that control ontogeny in many animal species. We compared the structures of Msx genes from a wide range of Metazoa (Porifera, Cnidaria, Nematoda, Arthropoda, Tardigrada, Platyhelminthes, Mollusca, Brachiopoda, Annelida, Echiura, Echinodermata, Hemichordata, and Chordata) to gain an understanding of the role of these genes in phylogeny. Exon-intron boundary analysis suggested that the position of the intron located N-terminally to the HDs was widely conserved in all the genes examined, including those of cnidarians. Amino acid (aa) sequence comparison revealed 3 new evolutionarily conserved domains, as well as very strong conservation of the HDs. Two of the three domains were associated with Groucho-like protein binding in both a vertebrate and a cnidarian Msx homolog, suggesting that the interaction between Groucho-like proteins and Msx proteins was established in eumetazoan ancestors. Pairwise comparison among the collected HDs and their C-flanking aa sequences revealed that the degree of sequence conservation varied depending on the animal taxa from which the sequences were derived. Highly conserved Msx genes were identified in the Vertebrata, Cephalochordata, Hemichordata, Echinodermata, Mollusca, Brachiopoda, and Anthozoa. The wide distribution of the conserved sequences in the animal phylogenetic tree suggested that metazoan ancestors had already acquired a set of conserved domains of the current Msx family genes. Interestingly, although strongly conserved sequences were recovered from the Vertebrata, Cephalochordata, and Anthozoa, the sequences from the Urochordata and Hydrozoa showed weak conservation. Because the Vertebrata-Cephalochordata-Urochordata and Anthozoa-Hydrozoa represent sister groups in the Chordata and Cnidaria, respectively, Msx sequence diversification may have occurred differentially in the course of evolution. We speculate that selective loss of the conserved domains in Msx family

  17. Evolutionary conservation of nuclear and nucleolar targeting sequences in yeast ribosomal protein S6A

    International Nuclear Information System (INIS)

    Lipsius, Edgar; Walter, Korden; Leicher, Torsten; Phlippen, Wolfgang; Bisotti, Marc-Angelo; Kruppa, Joachim

    2005-01-01

    Over 1 billion years ago, the animal kingdom diverged from the fungi. Nevertheless, a high sequence homology of 62% exists between human ribosomal protein S6 and S6A of Saccharomyces cerevisiae. To investigate whether this similarity in primary structure is mirrored in corresponding functional protein domains, the nuclear and nucleolar targeting signals were delineated in yeast S6A and compared to the known human S6 signals. The complete sequence of S6A and cDNA fragments was fused to the 5'-end of the LacZ gene, the constructs were transiently expressed in COS cells, and the subcellular localization of the fusion proteins was detected by indirect immunofluorescence. One bipartite and two monopartite nuclear localization signals as well as two nucleolar binding domains were identified in yeast S6A, which are located at homologous regions in human S6 protein. Remarkably, the number, nature, and position of these targeting signals have been conserved, albeit their amino acid sequences have presumably undergone a process of co-evolution with their corresponding rRNAs

  18. The PAZAR database of gene regulatory information coupled to the ORCA toolkit for the study of regulatory sequences

    Science.gov (United States)

    Portales-Casamar, Elodie; Arenillas, David; Lim, Jonathan; Swanson, Magdalena I.; Jiang, Steven; McCallum, Anthony; Kirov, Stefan; Wasserman, Wyeth W.

    2009-01-01

    The PAZAR database unites independently created and maintained data collections of transcription factor and regulatory sequence annotation. The flexible PAZAR schema permits the representation of diverse information derived from experiments ranging from biochemical protein–DNA binding to cellular reporter gene assays. Data collections can be made available to the public, or restricted to specific system users. The data ‘boutiques’ within the shopping-mall-inspired system facilitate the analysis of genomics data and the creation of predictive models of gene regulation. Since its initial release, PAZAR has grown in terms of data, features and through the addition of an associated package of software tools called the ORCA toolkit (ORCAtk). ORCAtk allows users to rapidly develop analyses based on the information stored in the PAZAR system. PAZAR is available at http://www.pazar.info. ORCAtk can be accessed through convenient buttons located in the PAZAR pages or via our website at http://www.cisreg.ca/ORCAtk. PMID:18971253

  19. Optimizing multiple sequence alignments using a genetic algorithm based on three objectives: structural information, non-gaps percentage and totally conserved columns.

    Science.gov (United States)

    Ortuño, Francisco M; Valenzuela, Olga; Rojas, Fernando; Pomares, Hector; Florido, Javier P; Urquiza, Jose M; Rojas, Ignacio

    2013-09-01

    Multiple sequence alignments (MSAs) are widely used approaches in bioinformatics to carry out other tasks such as structure predictions, biological function analyses or phylogenetic modeling. However, current tools usually provide partially optimal alignments, as each one is focused on specific biological features. Thus, the same set of sequences can produce different alignments, above all when sequences are less similar. Consequently, researchers and biologists do not agree about which is the most suitable way to evaluate MSAs. Recent evaluations tend to use more complex scores including further biological features. Among them, 3D structures are increasingly being used to evaluate alignments. Because structures are more conserved in proteins than sequences, scores with structural information are better suited to evaluate more distant relationships between sequences. The proposed multiobjective algorithm, based on the non-dominated sorting genetic algorithm, aims to jointly optimize three objectives: STRIKE score, non-gaps percentage and totally conserved columns. It was significantly assessed on the BAliBASE benchmark according to the Kruskal-Wallis test (P [...] algorithm also outperforms other aligners, such as ClustalW, Multiple Sequence Alignment Genetic Algorithm (MSA-GA), PRRP, DIALIGN, Hidden Markov Model Training (HMMT), Pattern-Induced Multi-sequence Alignment (PIMA), MULTIALIGN, Sequence Alignment Genetic Algorithm (SAGA), PILEUP, Rubber Band Technique Genetic Algorithm (RBT-GA) and Vertical Decomposition Genetic Algorithm (VDGA), according to the Wilcoxon signed-rank test (P [...] 0.05) with the advantage of being able to use fewer structures. Structural information is included within the objective function to evaluate more accurately the obtained alignments. The source code is available at http://www.ugr.es/~fortuno/MOSAStrE/MO-SAStrE.zip.
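    Two of the three objectives named in this abstract, non-gaps percentage and totally conserved columns, are simple enough to illustrate. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names and the toy alignment are hypothetical, gaps are assumed to be written as "-", and the exact definitions used in the paper may differ (for example, in how the column count is normalized). The STRIKE score is omitted because it requires 3D structure data.

```python
# Illustrative sketch only: function names and the toy alignment are
# hypothetical and are not taken from the MO-SAStrE source code.

def non_gap_percentage(alignment):
    """Percentage of alignment cells that are residues rather than gaps ('-')."""
    total = sum(len(row) for row in alignment)
    gaps = sum(row.count("-") for row in alignment)
    return 100.0 * (total - gaps) / total


def totally_conserved_columns(alignment):
    """Number of columns with no gap and the same residue in every sequence."""
    return sum(
        1
        for column in zip(*alignment)
        if "-" not in column and len(set(column)) == 1
    )


# Toy alignment of three equal-length rows (assumed data).
toy_alignment = [
    "MKT-LV",
    "MKTALV",
    "MRT-LV",
]

print(non_gap_percentage(toy_alignment))
print(totally_conserved_columns(toy_alignment))
```

    In a multiobjective setting such as the one described, each candidate alignment would be scored on every objective separately and candidates compared by dominance rather than by a single weighted sum.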

  20. Assessing the structural conservation of protein pockets to study functional and allosteric sites: implications for drug discovery

    Directory of Open Access Journals (Sweden)

    Daura Xavier

    2010-03-01

    Full Text Available Background: With the classical, active-site oriented drug-development approach reaching its limits, protein ligand-binding sites in general and allosteric sites in particular are increasingly attracting the interest of medicinal chemists in the search for new types of targets and strategies to drug development. Given that allostery represents one of the most common and powerful means to regulate protein function, the traditional drug discovery approach of targeting active sites can be extended by targeting allosteric or regulatory protein pockets that may allow the discovery of not only novel drug-like inhibitors, but activators as well. The wealth of available protein structural data can be exploited to further increase our understanding of allosterism, which in turn may have therapeutic applications. A first step in this direction is to identify and characterize putative effector sites that may be present in already available structural data. Results: We performed a large-scale study of protein cavities as potential allosteric and functional sites, by integrating publicly available information on protein sequences, structures and active sites for more than a thousand protein families. By identifying common pockets across different structures of the same protein family we developed a method to measure the pocket's structural conservation. The method was first parameterized using known active sites. We characterized the predicted pockets in terms of sequence and structural conservation, backbone flexibility and electrostatic potential. Although these different measures do not tend to correlate, their combination is useful in selecting functional and regulatory sites, as a detailed analysis of a handful of protein families shows. We finally estimated the numbers of potential allosteric or regulatory pockets that may be present in the data set, finding that pockets with putative functional and effector characteristics are widespread across

  1. In Vivo Characterization of a Vertebrate Ultra-conserved Enhancer

    Energy Technology Data Exchange (ETDEWEB)

    Poulin, Francis; Nobrega, Marcelo A.; Plajzer-Frick, Ingrid; Holt, Amy; Afzal, Veena; Rubin, Edward M.; Pennacchio, Len

    2004-10-01

    Genomic sequence comparisons between human, mouse and pufferfish (Takifugu rubripes (Fugu)) have revealed a set of extremely conserved noncoding sequences. While this high degree of sequence conservation suggests severe evolutionary constraint and predicts a lack of tolerance to change in order to retain in vivo functionality, such elements have been minimally explored experimentally. In this study, we describe the in-depth characterization of an ancient conserved enhancer, Dc2, located near the dachshund gene, which displays a human-Fugu identity of 84 percent over 424 basepairs (bp). In addition to this large overall conservation, we find that Dc2 is characterized by the presence of a large block of sequence (144 bp) that is completely identical between human, mouse, chicken, zebrafish and Fugu. Through the testing of reporter vector constructs in transgenic mice, we observed that the 424 bp Dc2 conserved element is necessary and sufficient for brain tissue enhancer activity. In vivo analyses also revealed that the 144 bp 100 percent conserved sequence is necessary, but not sufficient, to replicate Dc2 enhancer function. However, the introduction of two separate 16 bp insertions into the highly conserved enhancer core did not cause any detectable modification of its in vivo activity. Our observations indicate that the 144 bp 100 percent conserved element is tolerant of change at least at the resolution of this transgenic mouse assay and suggest that purifying selection on Dc2 sequence might not be as strong as we predicted or that some unknown property also constrains this highly conserved enhancer sequence.
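    The two conservation measures quoted in this abstract, pairwise percent identity over an aligned region and the length of a block that is identical in every species, can be sketched as follows. This is a hedged illustration only: the function names and the short toy sequences are hypothetical, the sequences are assumed to be pre-aligned to equal length with "-" for gaps, and the study's own identity calculation may treat gaps or alignment length differently.

```python
# Illustrative sketch only: names and toy sequences are hypothetical and do
# not reproduce the Dc2 alignment described in the abstract.

def percent_identity(seq_a, seq_b):
    """Percent of aligned columns where both sequences carry the same base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


def longest_identical_block(alignment):
    """Length of the longest run of gap-free columns identical in all rows."""
    best = run = 0
    for column in zip(*alignment):
        if "-" not in column and len(set(column)) == 1:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


# Toy pre-aligned sequences (assumed data, 10 columns each).
human = "ACGTACGTAA"
fugu = "ACGTTCGTAA"
mouse = "ACGAACGTAA"

print(percent_identity(human, fugu))
print(longest_identical_block([human, fugu, mouse]))
```

    Applied to a real multi-species alignment, the first function would correspond to a figure such as the reported human-Fugu identity and the second to the length of a fully identical core, subject to the gap-handling assumptions above.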

  2. Functional Conservation of the Glide/Gcm Regulatory Network Controlling Glia, Hemocyte, and Tendon Cell Differentiation in Drosophila

    Science.gov (United States)

    Cattenoz, Pierre B.; Popkova, Anna; Southall, Tony D.; Aiello, Giuseppe; Brand, Andrea H.; Giangrande, Angela

    2016-01-01

    High-throughput screens allow us to understand how transcription factors trigger developmental processes, including cell specification. A major challenge is identification of their binding sites because feedback loops and homeostatic interactions may mask the direct impact of those factors in transcriptome analyses. Moreover, this approach dissects the downstream signaling cascades and facilitates identification of conserved transcriptional programs. Here we show the results and the validation of a DNA adenine methyltransferase identification (DamID) genome-wide screen that identifies the direct targets of Glide/Gcm, a potent transcription factor that controls glia, hemocyte, and tendon cell differentiation in Drosophila. The screen identifies many genes that had not been previously associated with Glide/Gcm and highlights three major signaling pathways interacting with Glide/Gcm: Notch, Hedgehog, and JAK/STAT, which all involve feedback loops. Furthermore, the screen identifies effector molecules that are necessary for cell-cell interactions during late developmental processes and/or in ontogeny. Typically, immunoglobulin (Ig) domain–containing proteins control cell adhesion and axonal navigation. This shows that early and transiently expressed fate determinants not only control other transcription factors that, in turn, implement a specific developmental program but also directly affect late developmental events and cell function. Finally, while the mammalian genome contains two orthologous Gcm genes, their function has been demonstrated in vertebrate-specific tissues, placenta, and parathyroid glands, begging questions on the evolutionary conservation of the Gcm cascade in higher organisms. Here we provide the first evidence for the conservation of Gcm direct targets in humans. In sum, this work uncovers novel aspects of cell specification and sets the basis for further understanding of the role of conserved Gcm gene regulatory cascades. PMID:26567182

  3. Structural and functional conservation of CLEC-2 with the species-specific regulation of transcript expression in evolution.

    Science.gov (United States)

    Wang, Lan; Ren, Shifang; Zhu, Haiyan; Zhang, Dongmei; Hao, Yuqing; Ruan, Yuanyuan; Zhou, Lei; Lee, Chiayu; Qiu, Lin; Yun, Xiaojing; Xie, Jianhui

    2012-08-01

    CLEC-2 was first identified by sequence similarity to C-type lectin-like molecules with immune functions and has been reported as a receptor for the platelet-aggregating snake venom toxin rhodocytin and the endogenous sialoglycoprotein podoplanin. Recent research indicates that CLEC-2 deficiency in mice is lethal at the embryonic stage and is associated with disorganized, blood-filled lymphatic vessels and severe edema. Given the essential role of CLEC-2 in development, it is of interest to investigate its phylogenetic homology and highly conserved functional regions. In this work, we report that sequence alignment and phylogenetic tree analysis show CLEC-2 to be extraordinarily conserved across species. The functional structures, including the N-linked oligosaccharide sites and the ligand-binding domain, are structurally and functionally conserved in a variety of species. The glycosylation sites (N120 and N134) are necessary for the surface expression of CLEC-2. CLEC-2 from different species is able to bind mouse podoplanin. Nevertheless, the expression of CLEC-2 is regulated in a species-specific manner. The alternative splicing of pre-mRNA, a regulatory mechanism of gene expression, and the promoter binding sites for several key transcription factors vary between species. Therefore, CLEC-2 shares high sequence homology and functional identity across species. However, its transcript expression might be tightly regulated by different mechanisms across evolution.

  4. In situ detection of a heat-shock regulatory element binding protein using a soluble short synthetic enhancer sequence

    Energy Technology Data Exchange (ETDEWEB)

    Harel-Bellan, A; Brini, A T; Farrar, W L [National Cancer Institute, Frederick, MD (USA)]; Ferris, D K [Program Resources, Inc., Frederick, MD (USA)]; Robin, P [Institut Gustave Roussy, Villejuif (France)]

    1989-06-12

    In various studies, enhancer binding proteins have been successfully absorbed out by competing sequences inserted into plasmids, resulting in the inhibition of plasmid expression. Theoretically, such a result could be achieved using synthetic enhancer sequences not inserted into plasmids. In this study, a double-stranded DNA sequence corresponding to the human heat shock regulatory element was chemically synthesized. By in vitro retardation assays, the synthetic sequence was shown to specifically bind a protein in extracts from the human T cell line Jurkat. When the synthetic enhancer was electroporated into Jurkat cells, it was shown not only to remain undegraded in the cells for up to 2 days but also to bind a protein intracellularly. The binding was specific and was modulated upon heat shock. Furthermore, the binding protein was shown to be of the expected molecular weight by UV crosslinking. However, when the synthetic enhancer element was co-electroporated with an HSP 70-CAT reporter construct, the expression of the reporter plasmid was consistently enhanced in the presence of the exogenous synthetic enhancer.

  5. Mutational robustness of gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Aalt D J van Dijk

    Mutational robustness of gene regulatory networks refers to their ability to generate constant biological output upon mutations that change network structure. Such networks contain regulatory interactions (transcription factor-target gene interactions) but often also protein-protein interactions between transcription factors. Using computational modeling, we study factors that influence robustness and we infer several network properties governing it. These include the type of mutation, i.e. whether a regulatory interaction or a protein-protein interaction is mutated, and in the case of mutation of a regulatory interaction, the sign of the interaction (activating vs. repressive). In addition, we analyze the effect of combinations of mutations and we compare networks containing monomeric transcription factors with those containing dimeric transcription factors. Our results are consistent with available data on biological networks, for example based on evolutionary conservation of network features. As a novel and remarkable property, we predict that networks are more robust against mutations in monomeric than in dimeric transcription factors, a prediction for which analysis of conservation of DNA binding residues in monomeric vs. dimeric transcription factors provides indirect evidence.

  6. CORECLUST: identification of the conserved CRM grammar together with prediction of gene regulation.

    Science.gov (United States)

    Nikulova, Anna A; Favorov, Alexander V; Sutormin, Roman A; Makeev, Vsevolod J; Mironov, Andrey A

    2012-07-01

    Identification of transcriptional regulatory regions and tracing their internal organization are important for understanding the eukaryotic cell machinery. Cis-regulatory modules (CRMs) of higher eukaryotes are believed to possess a regulatory 'grammar', or preferred arrangement of binding sites, that is crucial for proper regulation and thus tends to be evolutionarily conserved. Here, we present a method CORECLUST (COnservative REgulatory CLUster STructure) that predicts CRMs based on a set of positional weight matrices. Given regulatory regions of orthologous and/or co-regulated genes, CORECLUST constructs a CRM model by revealing the conserved rules that describe the relative location of binding sites. The constructed model may be consequently used for the genome-wide prediction of similar CRMs, and thus detection of co-regulated genes, and for the investigation of the regulatory grammar of the system. Compared with related methods, CORECLUST shows better performance at identification of CRMs conferring muscle-specific gene expression in vertebrates and early-developmental CRMs in Drosophila.
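
    The abstract above says that CRM prediction starts from a set of positional weight matrices. As a rough illustration of that first step only, the sketch below scores every window of a sequence against a toy log-odds matrix. The count matrix, the threshold, and the names `to_log_odds` and `scan` are illustrative assumptions and are not taken from CORECLUST, whose grammar model is not reproduced here.

```python
import math

# Hypothetical count matrix for a 4-bp site (one dict per motif position).
COUNTS = [
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]

def to_log_odds(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above the threshold.

    Assumes an upper-case sequence containing only A, C, G and T.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

hits = scan("TTACGTGGACGTAA", to_log_odds(COUNTS), threshold=4.0)
print([start for start, _ in hits])
```

    A grammar-aware method would then look at the spacing and order of such hits across orthologous or co-regulated regions; that part is beyond this sketch.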

  7. SNPs in Multi-Species Conserved Sequences (MCS) as useful markers in association studies: a practical approach

    Directory of Open Access Journals (Sweden)

    Pericak-Vance Margaret A

    2007-08-01

    Background: Although genes play a key role in many complex diseases, the specific genes involved in most complex diseases remain largely unidentified. Their discovery will hinge on the identification of key sequence variants that are conclusively associated with disease. While much attention has been focused on variants in protein-coding DNA, variants in noncoding regions may also play many important roles in complex disease by altering gene regulation. Since the vast majority of noncoding genomic sequence is of unknown function, this increases the challenge of identifying "functional" variants that cause disease. However, evolutionary conservation can be used as a guide to indicate regions of noncoding or coding DNA that are likely to have biological function, and thus may be more likely to harbor SNP variants with functional consequences. To help bias marker selection in favor of such variants, we devised a process that prioritizes annotated SNPs for genotyping studies based on their location within Multi-species Conserved Sequences (MCSs) and used this process to select SNPs in a region of linkage to a complex disease. This allowed us to evaluate the utility of the chosen SNPs for further association studies. Previously, a region of chromosome 1q43 was linked to Multiple Sclerosis (MS) in a genome-wide screen. We chose annotated SNPs in the region based on location within MCSs (termed MCS-SNPs). We then obtained genotypes for 478 MCS-SNPs in 989 individuals from MS families. Results: Analysis of our MCS-SNP genotypes from the 1q43 region and comparison to HapMap data confirmed that annotated SNPs in MCS regions are frequently polymorphic and show subtle signatures of selective pressure, consistent with previous reports of genome-wide variation in conserved regions. We also present an online tool that allows MCS data to be directly exported to the UCSC genome browser so that MCS-SNPs can be easily identified within genomic regions of
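
    The selection step described in this record, keeping annotated SNPs that fall inside conserved intervals, can be sketched as a simple interval lookup. The function name `snps_in_mcs`, the SNP identifiers, and the coordinates below are hypothetical, and the sketch assumes sorted, non-overlapping, inclusive intervals on a single chromosome; it is not the authors' tool.

```python
from bisect import bisect_right

def snps_in_mcs(snps, mcs_intervals):
    """Return IDs of SNPs whose position lies inside any conserved interval.

    snps: list of (snp_id, position) tuples.
    mcs_intervals: sorted, non-overlapping (start, end) tuples, both ends inclusive.
    """
    starts = [start for start, _ in mcs_intervals]
    selected = []
    for snp_id, position in snps:
        # Index of the last interval that starts at or before this position.
        i = bisect_right(starts, position) - 1
        if i >= 0 and position <= mcs_intervals[i][1]:
            selected.append(snp_id)
    return selected

# Made-up coordinates for illustration only.
mcs = [(100, 200), (500, 650)]
snps = [("snpA", 150), ("snpB", 300), ("snpC", 500), ("snpD", 651), ("snpE", 99)]
print(snps_in_mcs(snps, mcs))
```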

  8. Repetitive sequences: the hidden diversity of heterochromatin in prochilodontid fish

    Directory of Open Access Journals (Sweden)

    Maria L. Terencio

    2015-08-01

    The structure and organization of repetitive elements in fish genomes are still relatively poorly understood, although most of these elements are believed to be located in heterochromatic regions. Repetitive elements are considered essential in evolutionary processes as hotspots for mutations and chromosomal rearrangements, among other functions, thus providing new genomic alternatives and regulatory sites for gene expression. The present study sought to characterize repetitive DNA sequences in the genomes of Semaprochilodus insignis (Jardine & Schomburgk, 1841) and Semaprochilodus taeniurus (Valenciennes, 1817) and to identify regions of conserved syntenic blocks in this genome fraction of three species of Prochilodontidae (S. insignis, S. taeniurus, and Prochilodus lineatus (Valenciennes, 1836)) by cross-FISH using Cot-1 DNA (renaturation kinetics) probes. We found that the repetitive fractions of the genomes of S. insignis and S. taeniurus have significant amounts of conserved syntenic blocks in hybridization sites, but with low degrees of similarity between them and the genome of P. lineatus, especially in relation to B chromosomes. The cloning and sequencing of the repetitive genomic elements of S. insignis and S. taeniurus using Cot-1 DNA identified 48 fragments that displayed high similarity with repetitive sequences deposited in public DNA databases and classified as microsatellites, transposons, and retrotransposons. The repetitive fractions of the S. insignis and S. taeniurus genomes exhibited high degrees of conserved syntenic blocks in terms of both the structures and locations of hybridization sites, but a low degree of similarity with the syntenic blocks of the P. lineatus genome. Future comparative analyses of other Prochilodontidae species will be needed to advance our understanding of the organization and evolution of the genomes in this group of fish.

  9. Overlapping positive and negative regulatory domains of the human β-interferon gene

    International Nuclear Information System (INIS)

    Goodbourn, S.; Maniatis, T.

    1988-01-01

    Virus or poly(I)·poly(C) induction of human β-interferon gene expression requires a 40-base-pair DNA sequence designated the interferon gene regulatory element (IRE). Previous studies have shown that the IRE contains both positive and negative regulatory DNA sequences. To localize these sequences and study their interactions, the authors have examined the effects of a large number of single-base mutations within the IRE on β-interferon gene regulation. They find that the IRE consists of two genetically separable positive regulatory domains and an overlapping negative control sequence. They propose that the β-interferon gene is switched off in uninduced cells by a repressor that blocks the interaction between one of the two positive regulatory sequences and a specific transcription factor. Induction would then lead to inactivation or displacement of the repressor and binding of transcription factors to both positive regulatory domains.

  10. Conserved PCR primer set designing for closely-related species to complete mitochondrial genome sequencing using a sliding window-based PSO algorithm.

    Directory of Open Access Journals (Sweden)

    Cheng-Hong Yang

    BACKGROUND: Complete mitochondrial (mt) genome sequencing is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. For long-template sequencing, such as that of the entire mtDNA, it is essential to design primers for Polymerase Chain Reaction (PCR) amplicons that partly overlap each other. The presented chromosome walking strategy provides the overlapping design to solve the problem of unreliable sequencing data at the 5' end and provides effective sequencing. However, current algorithms and tools are mostly focused on primer design for a local region in the genomic sequence. Accordingly, it is still challenging to provide the primer sets for the entire mtDNA. METHODOLOGY/PRINCIPAL FINDINGS: The purpose of this study is to develop an integrated primer design algorithm for the entire mt genome in general, and for the common primer sets for closely-related species in particular. We introduce ClustalW to generate the multiple sequence alignment needed to find the conserved sequences in closely-related species. These conserved sequences are suitable for designing the common primers for the entire mtDNA. Using a heuristic algorithm, particle swarm optimization (PSO), all the designed primers were computationally validated to fit the common primer design constraints, such as the melting temperature, primer length and GC content, PCR product length, secondary structure, specificity, and terminal limitation. The overlap requirement for PCR amplicons in the entire mtDNA is satisfied by defining the overlapping region with the sliding window technique. Finally, primer sets were designed within the overlapping region. The primer sets for the entire mtDNA sequences were successfully demonstrated in the example of two closely-related fish species. The pseudo code for the primer design algorithm is provided. CONCLUSIONS/SIGNIFICANCE: In conclusion, it can be said that our proposed sliding window-based PSO
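
    Two ingredients named in this record, per-primer constraint checks and overlapping amplicon windows, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the melting temperature uses the simple Wallace rule rather than whatever model the authors used, the numeric ranges are placeholder defaults, all function names are hypothetical, and the PSO search itself is not implemented.

```python
def gc_content(primer):
    """Fraction of G and C bases in the primer."""
    return (primer.count("G") + primer.count("C")) / len(primer)

def wallace_tm(primer):
    """Rough melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc

def passes_constraints(primer, length_range=(18, 25),
                       gc_range=(0.40, 0.60), tm_range=(50, 65)):
    """Check length, GC content and Tm against placeholder ranges (assumptions)."""
    return (length_range[0] <= len(primer) <= length_range[1]
            and gc_range[0] <= gc_content(primer) <= gc_range[1]
            and tm_range[0] <= wallace_tm(primer) <= tm_range[1])

def overlapping_windows(template_len, amplicon_len, overlap):
    """Tile a template with amplicon windows that overlap by at least `overlap` bases.

    Returns half-open (start, end) pairs; the last window is shifted back so
    that it ends exactly at the end of the template.
    """
    step = amplicon_len - overlap
    windows = []
    start = 0
    while start + amplicon_len < template_len:
        windows.append((start, start + amplicon_len))
        start += step
    windows.append((max(0, template_len - amplicon_len), template_len))
    return windows

print(passes_constraints("ATGCATGCATGCATGCATGC"))
print(overlapping_windows(1000, 400, 100))
```

    In a full pipeline, candidate primers would be searched for inside each overlap region and scored against all constraints together; the checks above only show what a single candidate must satisfy.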

  11. Comparison of C. elegans and C. briggsae genome sequences reveals extensive conservation of chromosome organization and synteny.

    Directory of Open Access Journals (Sweden)

    LaDeana W Hillier

    2007-07-01

    To determine whether the distinctive features of Caenorhabditis elegans chromosomal organization are shared with the C. briggsae genome, we constructed a single nucleotide polymorphism-based genetic map to order and orient the whole genome shotgun assembly along the six C. briggsae chromosomes. Although these species are of the same genus, their most recent common ancestor existed 80-110 million years ago, and thus they are more evolutionarily distant than, for example, human and mouse. We found that, like C. elegans chromosomes, C. briggsae chromosomes exhibit high levels of recombination on the arms along with higher repeat density, a higher fraction of intronic sequence, and a lower fraction of exonic sequence compared with chromosome centers. Despite extensive intrachromosomal rearrangements, 1:1 orthologs tend to remain in the same region of the chromosome, and colinear blocks of orthologs tend to be longer in chromosome centers compared with arms. More strikingly, the two species show an almost complete conservation of synteny, with 1:1 orthologs present on a single chromosome in one species also found on a single chromosome in the other. The conservation of both chromosomal organization and synteny between these two distantly related species suggests roles for chromosome organization in the fitness of an organism that are only poorly understood presently.

  12. Determination of 5'-leader sequences from radically disparate strains of porcine reproductive and respiratory syndrome virus reveals the presence of highly conserved sequence motifs

    DEFF Research Database (Denmark)

    Oleksiewicz, M.B.; Bøtner, Anette; Nielsen, Jens

    1999-01-01

    We determined the untranslated 5'-leader sequence for three different isolates of porcine reproductive and respiratory syndrome virus (PRRSV): pathogenic European- and American-types, as well as an American-type vaccine strain. 5'-leader from European- and American-type PRRSV differed in length...... (220 and 190 nt, respectively), and exhibited only approximately 50% nucleotide homology. Nevertheless, highly conserved areas were identified in the leader of all 3 PRRSV isolates, which constitute candidate motifs for binding of protein(s) involved in viral replication. These comparative data provide...

  13. The interplay of sequence conservation and T cell immune recognition

    DEFF Research Database (Denmark)

    Bresciani, Anne Gøther; Sette, Alessandro; Greenbaum, Jason

    2014-01-01

    examined the hypothesis that conservation of a peptide in bacteria that are part of the healthy human microbiome leads to a reduced level of immunogenicity due to tolerization of T cells to the commensal bacteria. This was done by comparing experimentally characterized T cell epitope recognition data from...... the Immune Epitope Database with their conservation in the human microbiome. Indeed, we did see a lower immunogenicity for conserved peptides. While many aspects of how this conservation comparison is done require further optimization, this is a first step towards a better understanding T cell...... recognition of peptides in bacterial pathogens is influenced by their conservation in commensal bacteria. If further work proves that this approach is successful, the degree of overlap of a peptide with the human proteome or microbiome could be added to the arsenal of tools available to assess peptide...

  14. Domain architecture conservation in orthologs

    Science.gov (United States)

    2011-01-01

    Background: As orthologous proteins are expected to retain function more often than other homologs, they are often used for functional annotation transfer between species. However, ortholog identification methods do not take into account changes in domain architecture, which are likely to modify a protein's function. By domain architecture we refer to the sequential arrangement of domains along a protein sequence. To assess the level of domain architecture conservation among orthologs, we carried out a large-scale study of such events between human and 40 other species spanning the entire evolutionary range. We designed a score to measure domain architecture similarity and used it to analyze differences in domain architecture conservation between orthologs and paralogs relative to the conservation of primary sequence. We also statistically characterized the extents of different types of domain swapping events across pairs of orthologs and paralogs. Results: The analysis shows that orthologs exhibit greater domain architecture conservation than paralogous homologs, even when differences in average sequence divergence are compensated for, for homologs that have diverged beyond a certain threshold. We interpret this as an indication of a stronger selective pressure on orthologs than paralogs to retain the domain architecture required for the proteins to perform a specific function. In general, orthologs as well as the closest paralogous homologs have very similar domain architectures, even at large evolutionary separation. The most common domain architecture changes observed in both ortholog and paralog pairs involved insertion/deletion of new domains, while domain shuffling and segment duplication/deletion were very infrequent. Conclusions: On the whole, our results support the hypothesis that function conservation between orthologs demands higher domain architecture conservation than other types of homologs, relative to primary sequence conservation. This supports the

  15. 5' Region of the human interleukin 4 gene: structure and potential regulatory elements

    Energy Technology Data Exchange (ETDEWEB)

    Eder, A; Krafft-Czepa, H; Krammer, P H

    1988-01-25

    The lymphokine Interleukin 4 (IL-4) is secreted by antigen- or mitogen-activated T lymphocytes. IL-4 stimulates activation and differentiation of B lymphocytes and growth of T lymphocytes and mast cells. The authors isolated the human IL-4 gene from a lambda EMBL3 genomic library. As a probe they used a synthetic oligonucleotide spanning position 40 to 79 of the published IL-4 cDNA sequence. The 5' promoter region contains several sequence elements which may have a cis-acting regulatory function for IL-4 gene expression. These elements include a TATA-box, three CCAAT-elements (two are on the non-coding strand) and an octamer motif. A comparison of the 5' flanking regions of the human and murine IL-4 genes (4) shows that the region between position -306 and +44 is highly conserved (83% homology).

  16. LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms.

    Science.gov (United States)

    Yang, Peng; Wu, Min; Guo, Jing; Kwoh, Chee Keong; Przytycka, Teresa M; Zheng, Jie

    2014-02-17

    As a fundamental genomic element, the meiotic recombination hotspot plays important roles in the life sciences. Thus, uncovering its regulatory mechanisms has a broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators of meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, an approach that ignores the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Recently, an algorithm called "LDsplit" has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference in population recombination rates at a hotspot between the two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit to SNPs inside an established 7-mer motif bound by PRDM9, we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots, each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs. It is the first computational method that utilizes the genetic variation of

  17. Position-specific prediction of methylation sites from sequence conservation based on information theory.

    Science.gov (United States)

    Shi, Yinan; Guo, Yanzhi; Hu, Yayun; Li, Menglong

    2015-07-23

    Protein methylation plays vital roles in many biological processes and has been implicated in various human diseases. To fully understand the mechanisms underlying methylation for use in drug design and work in methylation-related diseases, an initial but crucial step is to identify methylation sites. The use of high-throughput bioinformatics methods has become imperative to predict methylation sites. In this study, we developed a novel method that is based only on sequence conservation to predict protein methylation sites. Conservation difference profiles between methylated and non-methylated peptides were constructed by the information entropy (IE) in a wider neighbor interval around the methylation sites that fully incorporated all of the environmental information. Then, the distinctive neighbor residues were identified by the importance scores of information gain (IG). The most representative model was constructed by support vector machine (SVM) for Arginine and Lysine methylation, respectively. This model yielded a promising result on both the benchmark dataset and independent test set. The model was used to screen the entire human proteome, and many unknown substrates were identified. These results indicate that our method can serve as a useful supplement to elucidate the mechanism of protein methylation and facilitate hypothesis-driven experimental design and validation.
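
    The conservation difference profile mentioned in this record rests on per-position information entropy. The sketch below computes Shannon entropy for each column of two sets of aligned, equal-length peptides and subtracts one profile from the other. The peptides are made-up toy data and the function names are hypothetical; the information-gain ranking and the SVM step are not shown.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (bits) of the residues observed at one aligned position."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def entropy_profile(peptides):
    """Per-position entropy for a list of equal-length peptide windows."""
    return [column_entropy(column) for column in zip(*peptides)]

def difference_profile(methylated, non_methylated):
    """Entropy of the non-methylated set minus entropy of the methylated set.

    Positive values mark positions that are more conserved (lower entropy)
    around methylated sites than around non-methylated ones.
    """
    pos = entropy_profile(methylated)
    neg = entropy_profile(non_methylated)
    return [n - p for p, n in zip(pos, neg)]

# Toy peptide windows, invented for illustration only.
methylated = ["AKR", "AKG", "AKR", "AKG"]
non_methylated = ["AKR", "GRR", "LKR", "SRR"]
print(difference_profile(methylated, non_methylated))
```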

  18. Conservation and reclamation at Alberta's mineable oil sands

    Energy Technology Data Exchange (ETDEWEB)

    Purdy, B.; Richens, T. [Alberta Environment, Edmonton, AB (Canada)

    2010-07-01

    The regulatory foundation for oil sands in this region is established by the Energy Resources Conservation Board, the Environmental Protection and Enhancement Act (EPEA), and the Water Act. This presentation discussed the regulatory foundation for conservation and reclamation in the mineable oil sands region. EPEA requirements and conservation objectives were identified. EPEA conservation and reclamation requirements stipulate that an operator must conserve and reclaim and obtain a reclamation certificate. The EPEA approvals presented compared prescriptive standards with meeting outcomes at certification. Operational and management challenges as well as the role of research networks and multi-stakeholder organizations were also addressed. Challenges facing the industry include progressive reclamation; tailings management and process-affected water; reclamation certification; integrated landscapes; soil handling and revegetation; and monitoring and research. The presentation demonstrated that reclamation begins with mine planning and ends with certification. figs.

  19. Functional promoter upstream p53 regulatory sequence of IGFBP3 that is silenced by tumor specific methylation

    International Nuclear Information System (INIS)

    Hanafusa, Tadashi; Shinji, Toshiyuki; Shiraha, Hidenori; Nouso, Kazuhiro; Iwasaki, Yoshiaki; Yumoto, Eichiro; Ono, Toshiro; Koide, Norio

    2005-01-01

    Insulin-like growth factor binding protein (IGFBP)-3 functions as a carrier of insulin-like growth factors (IGFs) in circulation and a mediator of the growth suppression signal in cells. There are two reported p53 regulatory regions in the IGFBP3 gene: one upstream of the promoter and one intronic. We previously reported a hot spot of promoter hypermethylation of IGFBP-3 in human hepatocellular carcinomas and derivative cell lines. Because the hot spot is located at the putative upstream p53 consensus sequences, whether these p53 consensus sequences are actually functional is a question to be answered. In this study, we examined the p53 consensus sequences upstream of the IGFBP-3 promoter for the p53-induced expression of IGFBP-3. Deletion, mutagenesis, and methylation constructs of the IGFBP-3 promoter were assessed in the human hepatoblastoma cell line HepG2 for promoter activity. Deletions and mutations of these sequences completely abolished the expression of IGFBP-3 in the presence of p53 overexpression. In vitro methylation of these p53 consensus sequences also suppressed IGFBP-3 expression. In contrast, the expression of IGFBP-3 was not affected in the absence of p53 overexpression. Further, we observed by electrophoretic mobility shift assay that p53 binding to the promoter region was diminished when methylated. From these observations, we conclude that four out of eleven p53 consensus sequences upstream of the IGFBP-3 promoter are essential for the p53-induced expression of IGFBP-3, and hypermethylation of these sequences selectively suppresses p53-induced IGFBP-3 expression in HepG2 cells.

  20. Cis-acting regulatory sequences promote high-frequency gene conversion between repeated sequences in mammalian cells.

    Science.gov (United States)

    Raynard, Steven J; Baker, Mark D

    2004-01-01

    In mammalian cells, little is known about the nature of recombination-prone regions of the genome. Previously, we reported that the immunoglobulin heavy chain (IgH) mu locus behaved as a hotspot for mitotic, intrachromosomal gene conversion (GC) between repeated mu constant (Cmu) regions in mouse hybridoma cells. To investigate whether elements within the mu gene regulatory region were required for hotspot activity, gene targeting was used to delete a 9.1 kb segment encompassing the mu gene promoter (Pmu), enhancer (Emu) and switch region (Smu) from the locus. In these cell lines, GC between the Cmu repeats was significantly reduced, indicating that this 'recombination-enhancing sequence' (RES) is necessary for GC hotspot activity at the IgH locus. Importantly, the RES fragment stimulated GC when appended to the same Cmu repeats integrated at ectopic genomic sites. We also show that deletion of Emu and flanking matrix attachment regions (MARs) from the RES abolishes GC hotspot activity at the IgH locus. However, no stimulation of ectopic GC was observed with the Emu/MARs fragment alone. Finally, we provide evidence that no correlation exists between the level of transcription and GC promoted by the RES. We suggest a model whereby Emu/MARs enhances mitotic GC at the endogenous IgH mu locus by effecting chromatin modifications in adjacent DNA.

  1. Identification of microRNAs and their targets in Finger millet by high throughput sequencing.

    Science.gov (United States)

    Usha, S; Jyothi, M N; Sharadamma, N; Dixit, Rekha; Devaraj, V R; Nagesh Babu, R

    2015-12-15

    MicroRNAs are short non-coding RNAs which play an important role in regulating gene expression by mRNA cleavage or by translational repression. The majority of identified miRNAs are evolutionarily conserved; however, others are expressed in a species-specific manner. Finger millet is an important cereal crop; nonetheless, no practical information has been available on its microRNAs to date. In this study, we identified 95 conserved microRNAs belonging to 39 families and 3 novel microRNAs by high throughput sequencing. For the identified conserved and novel miRNAs, a total of 507 targets were predicted. Eleven miRNAs were validated, and their tissue specificity was determined by stem-loop RT-qPCR and Northern blot. GO analyses revealed that the miRNA targets were involved in a wide range of regulatory functions. This study indicates that Finger millet harbors a large number of known and novel miRNAs which may play an important role in growth and development.

  2. Human T-cell recognition of synthetic peptides representing conserved and variant sequences from the merozoite surface protein 2 of Plasmodium falciparum

    DEFF Research Database (Denmark)

    Theander, T G; Hviid, L; Dodoo, D

    1997-01-01

    Merozoite surface protein 2 (MSP2) is a malaria vaccine candidate currently undergoing clinical trials. We analyzed the peripheral blood mononuclear cell (PBMC) response to synthetic peptides corresponding to conserved and variant regions of the FCQ-27 allelic form of MSP2 in Ghanaian individuals....... The findings are encouraging for the development of a vaccine based on these T-epitope containing regions of MSP2, as the peptides were broadly recognized suggesting that they can bind to diverse HLA alleles and also because they include conserved MSP2 sequences. Immunisation with a vaccine construct...

  3. Identification of functional SNPs in the 5-prime flanking sequences of human genes

    Directory of Open Access Journals (Sweden)

    Lenhard Boris

    2005-02-01

    Background: Over 4 million single nucleotide polymorphisms (SNPs) are currently reported to exist within the human genome. Only a small fraction of these SNPs alter gene function or expression, and therefore might be associated with a cell phenotype. These functional SNPs are consequently important in understanding human health. Information related to functional SNPs in candidate disease genes is critical for cost effective genetic association studies, which attempt to understand the genetics of complex diseases like diabetes, Alzheimer's, etc. Robust methods for the identification of functional SNPs are therefore crucial. We report one such experimental approach. Results: Sequences conserved between mouse and human genomes, within 5 kilobases of the 5-prime end of 176 GPCR genes, were screened for SNPs. Sequences flanking these SNPs were scored for transcription factor binding sites. Allelic pairs resulting in a significant score difference were predicted to influence the binding of transcription factors (TFs). Ten such SNPs were selected for electrophoretic mobility shift assays (EMSA), and 7 of them exhibited a reproducible shift. The full-length promoter regions with 4 of the 7 SNPs were cloned in a luciferase-based plasmid reporter system. Two out of the 4 SNPs exhibited differential promoter activity in several human cell lines. Conclusions: We propose a method for effective selection of functional, regulatory SNPs that are located in evolutionarily conserved 5-prime flanking regions (5'-FR) of human genes and influence the activity of the transcriptional regulatory region. Some SNPs behave differently in different cell types.

  4. Mapping the transcription start points of the Staphylococcus aureus eap, emp, and vwb promoters reveals a conserved octanucleotide sequence that is essential for expression of these genes.

    Science.gov (United States)

    Harraghy, Niamh; Homerova, Dagmar; Herrmann, Mathias; Kormanec, Jan

    2008-01-01

    Mapping the transcription start points of the eap, emp, and vwb promoters revealed a conserved octanucleotide sequence (COS). Deleting this sequence abolished the expression of eap, emp, and vwb. However, electrophoretic mobility shift assays gave no evidence that this sequence was a binding site for SarA or SaeR, known regulators of eap and emp.

  5. [Analysis of cis-regulatory element distribution in gene promoters of Gossypium raimondii and Arabidopsis thaliana].

    Science.gov (United States)

    Sun, Gao-Fei; He, Shou-Pu; Du, Xiong-Ming

    2013-10-01

    Cotton genomic studies have boomed since the release of the Gossypium raimondii draft genome. In this study, cis-regulatory elements (CREs) in the 1 kb sequences upstream of the 5' UTR of annotated genes were scanned in the Arabidopsis thaliana (At) and Gossypium raimondii (Gr) genomes, based on the PLACE (Plant cis-acting Regulatory DNA Elements) database. According to the definition used in this study, 44 (12.3%) and 57 (15.5%) CREs presented a "peak-like" distribution in the 1 kb selected sequences of the two genomes, respectively. Thirty-four of them were peak-like distributed in both genomes and could be further categorized into 4 types based on their core sequences. The coincidence of the TATABOX peak position with its actual position (approximately -30 bp) indicated that the position of a common CRE was conserved across different genes, which suggested that the peak positions of these CREs were the likely positions at which transcription factors actually bind. The position of a common CRE also differed between the two genomes because of stronger length variation of the 5' UTR in Gr than in At. Furthermore, most of the peak-like CREs were located in the region from -110 bp to 0 bp, which suggested that a concentrated distribution might be conducive to the interaction of transcription factors, which then regulate the expression of downstream genes.

  6. Exploring the miRNA regulatory network using evolutionary correlations.

    Directory of Open Access Journals (Sweden)

    Benedikt Obermayer

    2014-10-01

    Full Text Available Post-transcriptional regulation by miRNAs is a widespread and highly conserved phenomenon in metazoans, with several hundreds to thousands of conserved binding sites for each miRNA, and up to two thirds of all genes under miRNA regulation. At the same time, the effect of miRNA regulation on mRNA and protein levels is usually quite modest and associated phenotypes are often weak or subtle. This has given rise to the notion that the highly interconnected miRNA regulatory network exerts its function less through any individual link and more via collective effects that lead to a functional interdependence of network links. We present a Bayesian framework to quantify conservation of miRNA target sites using vertebrate whole-genome alignments. The increased statistical power of our phylogenetic model allows detection of evolutionary correlation in the conservation patterns of site pairs. Such correlations could result from collective functions in the regulatory network. For instance, co-conservation of target site pairs supports a selective benefit of combinatorial regulation by multiple miRNAs. We find that some miRNA families are under pronounced co-targeting constraints, indicating a high connectivity in the regulatory network, while others appear to function in a more isolated way. By analyzing coordinated targeting of different curated gene sets, we observe distinct evolutionary signatures for protein complexes and signaling pathways that could reflect differences in control strategies. Our method is easily scalable to analyze upcoming larger data sets, and readily adaptable to detect high-level selective constraints between other genomic loci. We thus provide a proof-of-principle method to understand regulatory networks from an evolutionary perspective.

  7. A DNA-binding-site landscape and regulatory network analysis for NAC transcription factors in Arabidopsis thaliana

    DEFF Research Database (Denmark)

    Lindemose, Søren; Jensen, Michael Krogh; de Velde, Jan Van

    2014-01-01

    Target gene identification for transcription factors is a prerequisite for the systems-wide understanding of organismal behaviour. NAM-ATAF1/2-CUC2 (NAC) transcription factors are amongst the largest transcription factor families in plants, yet limited data exist from unbiased approaches to resolve the DNA-binding preferences of individual members. Here, we present a TF-target gene identification workflow based on the integration of novel protein binding microarray data with gene expression and multi-species promoter sequence conservation to identify the DNA-binding specificities and the gene regulatory networks of 12 NAC transcription factors. Our data offer specific single-base resolution fingerprints for most TFs studied and indicate that NAC DNA-binding specificities might be predicted from their DNA-binding domain's sequence. The developed methodology, including the application...

  8. Cytoplasmic protein binding to highly conserved sequences in the 3' untranslated region of mouse protamine 2 mRNA, a translationally regulated transcript of male germ cells

    International Nuclear Information System (INIS)

    Kwon, Y.K.; Hecht, N.B.

    1991-01-01

    The expression of the protamines, the predominant nuclear proteins of mammalian spermatozoa, is regulated translationally during male germ-cell development. The 3' untranslated region (UTR) of protamine 1 mRNA has been reported to control its time of translation. To understand the mechanisms controlling translation of the protamine mRNAs, we have sought to identify cis elements of the 3' UTR of protamine 2 mRNA that are recognized by cytoplasmic factors. From gel retardation assays, two sequence elements are shown to form specific RNA-protein complexes. Protein binding sites of the two complexes were determined by RNase T1 mapping, by blocking the putative binding sites with antisense oligonucleotides, and by competition assays. The sequences of these elements, located between nucleotides + 537 and + 572 in protamine 2 mRNA, are highly conserved among postmeiotic translationally regulated nuclear proteins of the mammalian testis. Two closely linked protein binding sites were detected. UV-crosslinking studies revealed that a protein of about 18 kDa binds to one of the conserved sequences. These data demonstrate specific protein binding to a highly conserved 3' UTR of translationally regulated testicular mRNA

  9. BlockLogo: Visualization of peptide and sequence motif conservation

    DEFF Research Database (Denmark)

    Olsen, Lars Rønn; Kudahl, Ulrich Johan; Simon, Christian

    2013-01-01

    BlockLogo is a web-server application for the visualization of protein and nucleotide fragments, continuous protein sequence motifs, and discontinuous sequence motifs using calculation of block entropy from multiple sequence alignments. The user input consists of a multiple sequence alignment, se...

  10. Identification of cis-regulatory sequences that activate transcription in the suspensor of plant embryos.

    Science.gov (United States)

    Kawashima, Tomokazu; Wang, Xingjun; Henry, Kelli F; Bi, Yuping; Weterings, Koen; Goldberg, Robert B

    2009-03-03

    Little is known about the molecular mechanisms by which the embryo proper and suspensor of plant embryos activate specific gene sets shortly after fertilization. We analyzed the upstream region of the scarlet runner bean (Phaseolus coccineus) G564 gene to understand how genes are activated specifically within the suspensor during early embryo development. Previously, we showed that the G564 upstream region has a block of tandem repeats, which contain a conserved 10-bp motif (GAAAAG(C)/(T)GAA), and that deletion of these repeats results in a loss of suspensor transcription. Here, we use gain-of-function (GOF) experiments with transgenic globular-stage tobacco embryos to show that only 1 of the 5 tandem repeats is required to drive suspensor-specific transcription. Fine-scale deletion and scanning mutagenesis experiments with 1 tandem repeat uncovered a 54-bp region that contains all of the sequences required to activate transcription in the suspensor, including the 10-bp motif (GAAAAGCGAA) and a similar 10-bp-like motif (GAAAAACGAA). Site-directed mutagenesis and GOF experiments indicated that both the 10-bp and 10-bp-like motifs are necessary, but not sufficient to activate transcription in the suspensor, and that a sequence (TTGGT) between the 10-bp and the 10-bp-like motifs is also necessary for suspensor transcription. Together, these data identify sequences that are required to activate transcription in the suspensor of a plant embryo after fertilization.
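
    The sequence elements named in this record lend themselves to a simple pattern search. The sketch below is only an illustration, not the authors' method: the toy sequence and the function names are hypothetical, and finding the elements says nothing about activity, since the abstract reports the motifs as necessary but not sufficient.

```python
import re

# Motifs as written in the abstract; (C)/(T) becomes a character class.
TEN_BP = re.compile(r"GAAAAG[CT]GAA")
TEN_BP_LIKE = re.compile(r"GAAAAACGAA")
SPACER = re.compile(r"TTGGT")


def scan(seq):
    """Return 0-based start positions of each element in seq."""
    seq = seq.upper()
    return {
        "10-bp": [m.start() for m in TEN_BP.finditer(seq)],
        "10-bp-like": [m.start() for m in TEN_BP_LIKE.finditer(seq)],
        "TTGGT": [m.start() for m in SPACER.finditer(seq)],
    }


def has_all_elements(seq):
    """True if a TTGGT lies between a 10-bp motif and a 10-bp-like motif."""
    hits = scan(seq)
    for a in hits["10-bp"]:
        for b in hits["10-bp-like"]:
            lo, hi = min(a, b) + 10, max(a, b)  # gap between the two 10-bp sites
            if any(lo <= s and s + 5 <= hi for s in hits["TTGGT"]):
                return True
    return False


# Hypothetical toy sequence, not taken from the G564 upstream region.
TOY = "ccGAAAAGCGAAacTTGGTacGAAAAACGAAgg"
print(scan(TOY))
print(has_all_elements(TOY))
```

    A scan like this is a presence check only; the deletion and mutagenesis experiments described above are what establish function.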

  11. The effects of sequence and type of chemotherapy and radiation therapy on cosmesis and complications after breast conservation therapy

    International Nuclear Information System (INIS)

    Markiewicz, Deborah A.; Schultz, Delray J.; Haas, Jonathan A.; Harris, Eleanor E. R.; Fox, Kevin R.; Glick, John H.; Solin, Lawrence J.

    1996-01-01

    Purpose: Chemotherapy plays an increasingly important role in the treatment of both node-negative and node-positive breast cancer patients, but the optimal sequencing of chemotherapy and radiation therapy is not well established. The purpose of this study is to evaluate the interaction of sequence and type of chemotherapy and hormonal therapy given with radiation therapy on the cosmetic outcome and the incidence of complications of Stage I and II breast cancer patients treated with breast-conserving therapy. Methods and Materials: The records of 1053 Stage I and II breast cancer patients treated with curative intent with breast-conserving surgery, axillary dissection, and radiation therapy between 1977-1991 were reviewed. Median follow-up after treatment was 6.7 years. Two hundred fourteen patients received chemotherapy alone, 141 patients received hormonal therapy alone, 86 patients received both, and 612 patients received no adjuvant therapy. Patients who received chemotherapy ± hormonal therapy were grouped according to sequence of chemotherapy: (a) concurrent = concurrent chemotherapy with radiation therapy followed by chemotherapy; (b) sequential = radiation followed by chemotherapy or chemotherapy followed by radiation; and (c) sandwich = chemotherapy followed by concurrent chemotherapy and radiation followed by chemotherapy. Compared to node negative patients, node-positive patients more commonly received chemotherapy (77 vs. 9%, p < 0.0001) and/or hormonal therapy (40 vs. 14%, p < 0.0001). Among patients who received chemotherapy, the majority (243 patients) received concurrent chemotherapy and radiation therapy with two cycles of cytoxan and 5-fluorouracil (5-FU) administered during radiation followed by six cycles of chemotherapy with cytoxan, 5-fluorouracil and either methotrexate(CMF) or doxorubicin(CAF). For analysis of cosmesis, patients included were relapse free with 3 years minimum follow-up. 
Results: The use of chemotherapy had an adverse effect

  12. Sequence conservation between porcine and human LRRK2

    DEFF Research Database (Denmark)

    Larsen, Knud; Madsen, Lone Bruhn

    2009-01-01

     Leucine-rich repeat kinase 2 (LRRK2) is a member of the ROCO protein superfamily (Ras of complex proteins (Roc) with a C-terminal Roc domain). Mutations in the LRRK2 gene lead to autosomal dominant Parkinsonism. We have cloned the porcine LRRK2 cDNA in an attempt to characterize conserved...... and expression patterns are conserved across species. The porcine LRRK2 gene was mapped to chromosome 5q25. The results obtained suggest that the LRRK2 gene might be of particular interest in our attempt to generate a transgenic porcine model for Parkinson's disease...

  13. Genomic regulatory landscapes and chromosomal rearrangements

    DEFF Research Database (Denmark)

    Ladegaard, Elisabete L Engenheiro

    2008-01-01

    The main objectives of the PhD study are to identify and characterise chromosomal rearrangements within evolutionarily conserved regulatory landscapes around genes involved in the regulation of transcription and/or development (trans-dev genes). A frequent feature of trans-dev genes is that they ...... the complex spatio-temporal expression of the associated trans-dev gene. Rare chromosomal breakpoints that disrupt the integrity of these regulatory landscapes may be used as a tool, not only to make genotype-phenotype associations, but also to link the associated phenotype with the position and tissue...... specificity of the individual CNEs. In this PhD study I have studied several chromosomal rearrangements with breakpoints in the vicinity of trans-dev genes. This included chromosomal rearrangements compatible with known phenotype-genotype associations (Rieger syndrome-PITX2, Mowat-Wilson syndrome-ZEB2...

  14. 78 FR 44275 - Semiannual Regulatory Agenda

    Science.gov (United States)

    2013-07-23

    ... Rights. National Park Service--Completed Actions Regulation Sequence No. Title Identifier No. 200 Winter.... Timetable: Action Date FR Cite NPRM 07/00/13 Final Action 05/00/14 Regulatory Flexibility Analysis Required...: Action Date FR Cite NPRM 10/00/14 Final Action 10/00/14 Regulatory Flexibility Analysis Required: Yes...

  15. Conservation and variability of dengue virus proteins: implications for vaccine design.

    Directory of Open Access Journals (Sweden)

    Asif M Khan

    2008-08-01

    Full Text Available Genetic variation and rapid evolution are hallmarks of RNA viruses, the result of high mutation rates in RNA replication and selection of mutants that enhance viral adaptation, including the escape from host immune responses. Variability is uneven across the genome because mutations resulting in a deleterious effect on viral fitness are restricted. RNA viruses are thus marked by protein sites permissive to multiple mutations and sites critical to viral structure-function that are evolutionarily robust and highly conserved. Identification and characterization of the historical dynamics of the conserved sites have relevance to multiple applications, including potential targets for diagnosis, and prophylactic and therapeutic purposes. We describe a large-scale identification and analysis of evolutionarily highly conserved amino acid sequences of the entire dengue virus (DENV) proteome, with a focus on sequences of 9 amino acids or more, and thus immune-relevant as potential T-cell determinants. DENV protein sequence data were collected from the NCBI Entrez protein database in 2005 (9,512 sequences) and again in 2007 (12,404 sequences). Forty-four (44) sequences (pan-DENV sequences), mainly those of nonstructural proteins and representing approximately 15% of the DENV polyprotein length, were identical in 80% or more of all recorded DENV sequences. Of these 44 sequences, 34 (approximately 77%) were present in ≥95% of sequences of each DENV type, and 27 (approximately 61%) were conserved in other Flaviviruses. The frequencies of variants of the pan-DENV sequences were low (0 to approximately 5%), as compared to variant frequencies of approximately 60 to approximately 85% in the non pan-DENV sequence regions. We further showed that the majority of the conserved sequences were immunologically relevant: 34 contained numerous predicted human leukocyte antigen (HLA) supertype-restricted peptide sequences, and 26 contained T-cell determinants identified by
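
    The idea of sequences that are identical in 80% or more of all records can be sketched as a k-mer presence count. This is an assumption-laden simplification, not the study's procedure (which may well have worked from alignments): the function name, the exact-substring criterion, and the toy sequences are all hypothetical.

```python
from collections import Counter


def conserved_kmers(sequences, k=9, min_fraction=0.8):
    """Return k-mers that occur in at least min_fraction of the sequences."""
    counts = Counter()
    for seq in sequences:
        # A set ensures each k-mer is counted at most once per sequence.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    n_seqs = len(sequences)
    return sorted(kmer for kmer, n in counts.items() if n / n_seqs >= min_fraction)


# Toy input with k=4 so the result is easy to check by eye.
toy = ["AAAACCCC", "AAAAGGGG", "AAAATTTT", "AAAACCGG", "TTTTGGGG"]
print(conserved_kmers(toy, k=4, min_fraction=0.8))
```

    With real data the default k=9 mirrors the minimum length quoted in the abstract; overlapping conserved 9-mers would still need to be merged into longer conserved stretches, which this sketch does not attempt.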

  16. Comparative transcriptome analysis within the Lolium/Festuca species complex reveals high sequence conservation

    DEFF Research Database (Denmark)

    Czaban, Adrian; Sharma, Sapna; Byrne, Stephen

    2015-01-01

    species from the Lolium-Festuca complex, ranging from 52,166 to 72,133 transcripts per assembly. We have also predicted a set of proteins and validated it with a high-confidence protein database from three closely related species (H. vulgare, B. distachyon and O. sativa). We have obtained gene family...... clusters for the four species using OrthoMCL and analyzed their inferred phylogenetic relationships. Our results indicate that VRN2 is a candidate gene for differentiating vernalization and non-vernalization types in the Lolium-Festuca complex. Grouping of the gene families based on their BLAST identity...... enabled us to divide ortholog groups into those that are very conserved and those that are more evolutionarily relaxed. The ratio of the non-synonymous to synonymous substitutions enabled us to pinpoint protein sequences evolving in response to positive selection. These proteins may explain some...

  17. An effective approach for annotation of protein families with low sequence similarity and conserved motifs: identifying GDSL hydrolases across the plant kingdom.

    Science.gov (United States)

    Vujaklija, Ivan; Bielen, Ana; Paradžik, Tina; Biđin, Siniša; Goldstein, Pavle; Vujaklija, Dušica

    2016-02-18

    The massive accumulation of protein sequences arising from the rapid development of high-throughput sequencing, coupled with automatic annotation, results in high levels of incorrect annotations. In this study, we describe an approach to decrease annotation errors of protein families characterized by low overall sequence similarity. The GDSL lipolytic family comprises proteins with multifunctional properties and high potential for pharmaceutical and industrial applications. The number of proteins assigned to this family has increased rapidly over the last few years. In particular, the natural abundance of GDSL enzymes reported recently in plants indicates that they could be a good source of novel GDSL enzymes. We noticed that a significant proportion of annotated sequences lack specific GDSL motif(s) or catalytic residue(s). Here, we applied motif-based sequence analyses to identify enzymes possessing conserved GDSL motifs in selected proteomes across the plant kingdom. Motif-based HMM scanning (Viterbi decoding-VD and posterior decoding-PD) and the here described PD/VD protocol were successfully applied on 12 selected plant proteomes to identify sequences with GDSL motifs. A significant number of identified GDSL sequences were novel. Moreover, our scanning approach successfully detected protein sequences lacking at least one of the essential motifs (171/820) annotated by Pfam profile search (PfamA) as GDSL. Based on these analyses we provide a curated list of GDSL enzymes from the selected plants. CLANS clustering and phylogenetic analysis helped us to gain a better insight into the evolutionary relationship of all identified GDSL sequences. Three novel GDSL subfamilies as well as unreported variations in GDSL motifs were discovered in this study. In addition, analyses of selected proteomes showed a remarkable expansion of GDSL enzymes in the lycophyte, Selaginella moellendorffii. 
Finally, we provide a general motif-HMM scanner which is easily accessible through

  18. Genome Analysis of Conserved Dehydrin Motifs in Vascular Plants

    Directory of Open Access Journals (Sweden)

    Ahmad A. Malik

    2017-05-01

    Full Text Available Dehydrins, a large family of abiotic stress proteins, are defined by the presence of a mostly conserved motif known as the K-segment, and may also contain two other conserved motifs known as the Y-segment and S-segment. Using the dehydrin literature, we developed a sequence motif definition of the K-segment, which we used to create a large dataset of dehydrin sequences by searching the Pfam00257 dehydrin dataset and the Phytozome 10 sequences of vascular plants. A comprehensive analysis of these sequences reveals that lysine residues are highly conserved in the K-segment, while the amino acid type is often conserved at other positions. Despite the Y-segment name, the central tyrosine is somewhat conserved, but can be substituted with two other small aromatic amino acids (phenylalanine or histidine). The S-segment contains a series of serine residues, but in some proteins is also preceded by a conserved LHR sequence. In many dehydrins containing all three of these motifs the S-segment is linked to the K-segment by a GXGGRRKK motif (where X can be any amino acid), suggesting a functional linkage between these two motifs. An analysis of the sequences shows that the dehydrin architecture and several biochemical properties (isoelectric point, molecular mass, and hydrophobicity score) are dependent on each other, and that some dehydrin architectures are overexpressed during certain abiotic stress, suggesting that they may be optimized for a specific abiotic stress while others are involved in all forms of dehydration stress (drought, cold, and salinity).

  19. The origins and evolutionary history of human non-coding RNA regulatory networks.

    Science.gov (United States)

    Sherafatian, Masih; Mowla, Seyed Javad

    2017-04-01

    The evolutionary history and origin of the regulatory function of animal non-coding RNAs are not well understood. The lack of conservation of long non-coding RNAs and the small size of microRNAs have been major obstacles to their phylogenetic analysis. In this study, we tried to shed more light on the evolution of ncRNA regulatory networks by changing our phylogenetic strategy to focus on the evolutionary pattern of their protein-coding targets. We used available target databases of miRNAs and lncRNAs to find their protein-coding targets in human. We were able to recognize evolutionary hallmarks of ncRNA targets by phylostratigraphic analysis. We found the conventional 3'-UTR and lesser-known 5'-UTR targets of miRNAs to be enriched at three consecutive phylostrata. Firstly, in the eukaryota phylostratum, corresponding to the emergence of miRNAs, our study revealed that miRNA targets function primarily in cell cycle processes. Moreover, the same overrepresentation of the targets observed in the next two consecutive phylostrata, opisthokonta and eumetazoa, corresponded to the expansion periods of miRNAs in animal evolution. Coding sequence targets of miRNAs showed a delayed rise at the opisthokonta phylostratum, compared to the 3' and 5' UTR targets of miRNAs. The lncRNA regulatory network was the latest to evolve, at eumetazoa.

  20. Molecular dissection of a contiguous gene syndrome: Frequent submicroscopic deletions, evolutionarily conserved sequences, and a hypomethylated island in the Miller-Dieker chromosome region

    International Nuclear Information System (INIS)

    Ledbetter, D.H.; Ledbetter, S.A.; vanTuinen, P.

    1989-01-01

    The Miller-Dieker syndrome (MDS), composed of characteristic facial abnormalities and a severe neuronal migration disorder affecting the cerebral cortex, is caused by visible or submicroscopic deletions of chromosome band 17p13. Twelve anonymous DNA markers were tested against a panel of somatic cell hybrids containing 17p deletions from seven MDS patients. All patients, including three with normal karyotypes, are deleted for a variable set of 5-12 markers. Two highly polymorphic VNTR (variable number of tandem repeats) probes, YNZ22 and YNH37, are codeleted in all patients tested and make molecular diagnosis for this disorder feasible. By pulsed-field gel electrophoresis, YNZ22 and YNH37 were shown to be within 30 kilobases (kb) of each other. Cosmid clones containing both VNTR sequences were identified, and restriction mapping showed them to be 100 kb were completely deleted in all patients, providing a minimum estimate of the size of the MDS critical region. A hypomethylated island and evolutionarily conserved sequences were identified within this 100-kb region, indications of the presence of one or more expressed sequences potentially involved in the pathophysiology of this disorder. The conserved sequences were mapped to mouse chromosome 11 by using mouse-rat somatic cell hybrids, extending the remarkable homology between human chromosome 17 and mouse chromosome 11 by 30 centimorgans, into the 17p telomere region

  1. RTA, a candidate G protein-coupled receptor: Cloning, sequencing, and tissue distribution

    International Nuclear Information System (INIS)

    Ross, P.C.; Figler, R.A.; Corjay, M.H.; Barber, C.M.; Adam, N.; Harcus, D.R.; Lynch, K.R.

    1990-01-01

    Genomic and cDNA clones, encoding a protein that is a member of the guanine nucleotide-binding regulatory protein (G protein)-coupled receptor superfamily, were isolated by screening rat genomic and thoracic aorta cDNA libraries with an oligonucleotide encoding a highly conserved region of the M1 muscarinic acetylcholine receptor. Sequence analyses of these clones showed that they encode a 343-amino acid protein (named RTA). The RTA gene is single copy, as demonstrated by restriction mapping and Southern blotting of genomic clones and rat genomic DNA. RTA RNA sequences are relatively abundant throughout the gut, vas deferens, uterus, and aorta but are only barely detectable (on Northern blots) in liver, kidney, lung, and salivary gland. In the rat brain, RTA sequences are markedly abundant in the cerebellum. RTA is most closely related to the mas oncogene (34% identity), which has been suggested to be a forebrain angiotensin receptor. They conclude that RTA is not an angiotensin receptor; to date, they have been unable to identify its ligand.

  2. REDfly: a Regulatory Element Database for Drosophila.

    Science.gov (United States)

    Gallo, Steven M; Li, Long; Hu, Zihua; Halfon, Marc S

    2006-02-01

    Bioinformatics studies of transcriptional regulation in the metazoa are significantly hindered by the absence of readily available data on large numbers of transcriptional cis-regulatory modules (CRMs). Even the richly annotated Drosophila melanogaster genome lacks extensive CRM information. We therefore present here a database of Drosophila CRMs curated from the literature complete with both DNA sequence and a searchable description of the gene expression pattern regulated by each CRM. This resource should greatly facilitate the development of computational approaches to CRM discovery as well as bioinformatics analyses of regulatory sequence properties and evolution.

  3. RegRNA: an integrated web server for identifying regulatory RNA motifs and elements

    OpenAIRE

    Huang, Hsi-Yuan; Chien, Chia-Hung; Jen, Kuan-Hua; Huang, Hsien-Da

    2006-01-01

    Numerous regulatory structural motifs have been identified as playing essential roles in transcriptional and post-transcriptional regulation of gene expression. RegRNA is an integrated web server for identifying the homologs of regulatory RNA motifs and elements against an input mRNA sequence. Both sequence homologs and structural homologs of regulatory RNA motifs can be recognized. The regulatory RNA motifs supported in RegRNA are categorized into several classes: (i) motifs in mRNA 5′-untra...

  4. Further results on universal properties in conservative dynamical systems

    Energy Technology Data Exchange (ETDEWEB)

    Benettin, G. [Padua Univ. (Italy). Ist. di Fisica]; Galgani, L.; Giorgilli, A. [Milan Univ. (Italy). Ist. di Fisica; Milan Univ. (Italy). Ist. di Matematica]

    1980-10-11

    In conservative dynamical systems depending on a parameter, sequences of period-doubling bifurcations can be observed by varying the parameter, starting from a stable fixed point. These sequences are analogous to those already known for dissipative systems. The paper shows some new results obtained for two-dimensional conservative mappings.
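
    As a rough illustration of how a stable fixed point of a two-dimensional conservative mapping can lose stability through period doubling as a parameter varies, the sketch below uses the area-preserving Hénon map. That choice is an assumption made here for illustration; the record does not say which mappings the paper studied. For an area-preserving map (Jacobian determinant 1), a fixed point is linearly stable while the absolute value of the Jacobian trace stays below 2, and a period-doubling bifurcation occurs where the trace reaches -2.

```python
import math


def henon_map(x, y, a):
    """One step of the area-preserving Henon map (Jacobian determinant = 1)."""
    return 1.0 - a * x * x + y, -x


def fixed_point(a):
    """Fixed point on the branch that is stable for small a (needs a > -1, a != 0)."""
    x = (-1.0 + math.sqrt(1.0 + a)) / a
    return x, -x


def jacobian_trace(a):
    """Trace of the Jacobian [[-2*a*x, 1], [-1, 0]] evaluated at the fixed point."""
    x, _ = fixed_point(a)
    return -2.0 * a * x


def is_linearly_stable(a):
    """Elliptic (stable) fixed point when |trace| < 2."""
    return abs(jacobian_trace(a)) < 2.0


for a in (1.0, 2.0, 3.0, 3.5):
    print(a, round(jacobian_trace(a), 4), is_linearly_stable(a))
```

    For this particular map the trace equals 2 - 2*sqrt(1 + a), so the first period doubling sits at a = 3; following the subsequent doublings would require locating the period-2, period-4, and higher orbits numerically.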

  5. Biodiversity conservation and climate mitigation: What role can economic instruments play?

    NARCIS (Netherlands)

    Ring, I.; Drechsler, M.; Teeffelen, van A.J.A.; Irawan, S.; Venter, O.

    2010-01-01

    Tradable permits and intergovernmental fiscal transfers play an increasing role in both biodiversity conservation and climate mitigation. In comparison to regulatory and planning approaches these economic instruments offer a more flexible and cost-effective approach to biodiversity conservation.

  6. Delay-independent stability of genetic regulatory networks.

    Science.gov (United States)

    Wu, Fang-Xiang

    2011-11-01

    Genetic regulatory networks can be described by nonlinear differential equations with time delays. In this paper, we study both local and global delay-independent stability of genetic regulatory networks, taking messenger ribonucleic acid alternative splicing into consideration. Based on nonnegative matrix theory, we first develop necessary and sufficient conditions for local delay-independent stability of genetic regulatory networks with multiple time delays. Compared to the previous results, these conditions are easy to verify. Then we develop sufficient conditions for global delay-independent stability of genetic regulatory networks. Compared to the previous results, these sufficient conditions are less conservative. To illustrate the theorems developed in this paper, we analyze delay-independent stability of two genetic regulatory networks: a real-life repressilatory network with three genes and three proteins, and a synthetic gene regulatory network with five genes and seven proteins. The simulation results show that the theorems developed in this paper can effectively determine the delay-independent stability of genetic regulatory networks.

  7. The First Myriapod Genome Sequence Reveals Conservative Arthropod Gene Content and Genome Organisation in the Centipede Strigamia maritima

    Science.gov (United States)

    Chipman, Ariel D.; Ferrier, David E. K.; Brena, Carlo; Qu, Jiaxin; Hughes, Daniel S. T.; Schröder, Reinhard; Torres-Oliva, Montserrat; Znassi, Nadia; Jiang, Huaiyang; Almeida, Francisca C.; Alonso, Claudio R.; Apostolou, Zivkos; Aqrawi, Peshtewani; Arthur, Wallace; Barna, Jennifer C. J.; Blankenburg, Kerstin P.; Brites, Daniela; Capella-Gutiérrez, Salvador; Coyle, Marcus; Dearden, Peter K.; Du Pasquier, Louis; Duncan, Elizabeth J.; Ebert, Dieter; Eibner, Cornelius; Erikson, Galina; Evans, Peter D.; Extavour, Cassandra G.; Francisco, Liezl; Gabaldón, Toni; Gillis, William J.; Goodwin-Horn, Elizabeth A.; Green, Jack E.; Griffiths-Jones, Sam; Grimmelikhuijzen, Cornelis J. P.; Gubbala, Sai; Guigó, Roderic; Han, Yi; Hauser, Frank; Havlak, Paul; Hayden, Luke; Helbing, Sophie; Holder, Michael; Hui, Jerome H. L.; Hunn, Julia P.; Hunnekuhl, Vera S.; Jackson, LaRonda; Javaid, Mehwish; Jhangiani, Shalini N.; Jiggins, Francis M.; Jones, Tamsin E.; Kaiser, Tobias S.; Kalra, Divya; Kenny, Nathan J.; Korchina, Viktoriya; Kovar, Christie L.; Kraus, F. Bernhard; Lapraz, François; Lee, Sandra L.; Lv, Jie; Mandapat, Christigale; Manning, Gerard; Mariotti, Marco; Mata, Robert; Mathew, Tittu; Neumann, Tobias; Newsham, Irene; Ngo, Dinh N.; Ninova, Maria; Okwuonu, Geoffrey; Ongeri, Fiona; Palmer, William J.; Patil, Shobha; Patraquim, Pedro; Pham, Christopher; Pu, Ling-Ling; Putman, Nicholas H.; Rabouille, Catherine; Ramos, Olivia Mendivil; Rhodes, Adelaide C.; Robertson, Helen E.; Robertson, Hugh M.; Ronshaugen, Matthew; Rozas, Julio; Saada, Nehad; Sánchez-Gracia, Alejandro; Scherer, Steven E.; Schurko, Andrew M.; Siggens, Kenneth W.; Simmons, DeNard; Stief, Anna; Stolle, Eckart; Telford, Maximilian J.; Tessmar-Raible, Kristin; Thornton, Rebecca; van der Zee, Maurijn; von Haeseler, Arndt; Williams, James M.; Willis, Judith H.; Wu, Yuanqing; Zou, Xiaoyan; Lawson, Daniel; Muzny, Donna M.; Worley, Kim C.; Gibbs, Richard A.; Akam, Michael; Richards, Stephen

    2014-01-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations to its specific

  8. Identification of putative cis-regulatory elements in Cryptosporidium parvum by de novo pattern finding

    Directory of Open Access Journals (Sweden)

    Kissinger Jessica C

    2007-01-01

    Full Text Available. Abstract. Background: Cryptosporidium parvum is a unicellular eukaryote in the phylum Apicomplexa. It is an obligate intracellular parasite that causes diarrhea and is a significant AIDS-related pathogen. Cryptosporidium parvum is not amenable to long-term laboratory cultivation or classical molecular genetic analysis. The parasite exhibits a complex life cycle and a broad host range, and its fundamental mechanisms of gene regulation remain unknown. We have used data from the recently sequenced genome of this organism to uncover clues about gene regulation in C. parvum. We have applied two pattern-finding algorithms, MEME and AlignACE, to identify conserved, over-represented motifs in the 5' upstream regions of genes in C. parvum. To support our findings, we have established comparative real-time PCR expression profiles for the groups of genes examined computationally. Results: We find that groups of genes that share a function or belong to a common pathway share upstream motifs. Different motifs are conserved upstream of different groups of genes. Comparative real-time PCR studies show co-expression of genes within each group (in subsets) during the life cycle of the parasite, suggesting co-regulation of these genes may be driven by the use of conserved upstream motifs. Conclusion: This is one of the first attempts to characterize cis-regulatory elements in the absence of any previously characterized elements and with very limited expression data (seven genes only). Using de novo pattern-finding algorithms, we have identified specific DNA motifs that are conserved upstream of genes belonging to the same metabolic pathway or gene family. We have demonstrated the co-expression of these genes (often in subsets) using comparative real-time PCR experiments, thus establishing evidence for these conserved motifs as putative cis-regulatory elements. Given the lack of prior information concerning expression patterns and organization of promoters in C. parvum we

  9. Strong minor groove base conservation in sequence logos implies DNA distortion or base flipping during replication and transcription initiation | Center for Cancer Research

    Science.gov (United States)

    Dubbed "Tom's T" by Dhruba Chattoraj, the unusually conserved thymine at position +7 in bacteriophage P1 plasmid RepA DNA binding sites rises above repressor and acceptor sequence logos. The T appears to represent base flipping prior to helix opening in this DNA replication initiation protein.

  10. Short sequence motifs, overrepresented in mammalian conserved non-coding sequences

    Energy Technology Data Exchange (ETDEWEB)

    Minovitsky, Simon; Stegmaier, Philip; Kel, Alexander; Kondrashov, Alexey S.; Dubchak, Inna

    2007-02-21

    Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5 percent of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure. Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of short motifs, overrepresented in all human CNSs, are similar to binding sites of transcription factors from the FOX family. Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by
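
    The comparison of motif abundance between CNSs and a background set can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the pseudocount, and the toy sequences are all assumptions made for illustration. It counts overlapping k-mers in a foreground and a background set and reports the ratio of their relative frequencies.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count every overlapping k-mer across a list of DNA strings."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def kmer_enrichment(foreground, background, k, pseudocount=1.0):
    """Return {kmer: foreground relative frequency / background relative frequency}.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for k-mers that are absent from one of the two sets.
    """
    fg = kmer_counts(foreground, k)
    bg = kmer_counts(background, k)
    kmers = set(fg) | set(bg)
    fg_total = sum(fg.values()) + pseudocount * len(kmers)
    bg_total = sum(bg.values()) + pseudocount * len(kmers)
    return {
        kmer: ((fg[kmer] + pseudocount) / fg_total)
        / ((bg[kmer] + pseudocount) / bg_total)
        for kmer in kmers
    }


# Toy, made-up sequences; not data from the study.
cns_like = ["AAAATTTTAAAA", "TTTTAAAATTTT"]
background = ["GCGCGCATGCGC", "CGCGATCGCGCG"]
ratios = kmer_enrichment(cns_like, background, k=4)
top = sorted(ratios, key=ratios.get, reverse=True)[:3]
print(top)
```

    A ratio above 1 marks a k-mer that is more frequent in the foreground. As the abstract notes, such ratios largely reflect overall base composition, which is why the choice of background set matters.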

  11. A novel method for in silico identification of regulatory SNPs in human genome.

    Science.gov (United States)

    Li, Rong; Zhong, Dexing; Liu, Ruiling; Lv, Hongqiang; Zhang, Xinman; Liu, Jun; Han, Jiuqiang

    2017-02-21

    Regulatory single nucleotide polymorphisms (rSNPs), a kind of functional noncoding genetic variant, can affect gene expression in a regulatory way, and they are thought to be associated with increased susceptibility to complex diseases. Here a novel computational approach to identify potential rSNPs is presented. Unlike most other rSNP-finding methods, which are based on the hypothesis that SNPs causing large allele-specific changes in transcription factor binding affinities are more likely to play regulatory functions, we use a set of documented, experimentally verified rSNPs and nonfunctional background SNPs to train classifiers, so that the discriminating features are found. To characterize variants, an extensive range of characteristics, such as sequence context, DNA structure and evolutionary conservation, are analyzed. A support vector machine is adopted to build the classifier model together with an ensemble method to deal with unbalanced data. The 10-fold cross-validation result shows that our method can achieve accuracy with sensitivity of ~78% and specificity of ~82%. Furthermore, our method performs better than some other algorithms based on the aforementioned hypothesis in handling false positives. The original data and the source MATLAB code involved are available at https://sourceforge.net/projects/rsnppredict/. Copyright © 2016 Elsevier Ltd. All rights reserved.
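
    The evaluation metrics quoted above can be made concrete with a short sketch. It does not reproduce the authors' MATLAB code or their SVM ensemble: the function names, the contiguous fold layout, and the toy labels are assumptions. It shows how sensitivity and specificity are computed from binary predictions and how samples can be partitioned into k folds.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = rSNP, 0 = background)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous test folds of near-equal size."""
    folds = []
    start = 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
print(k_fold_indices(len(y_true), 3))
```

    With unbalanced data, as in the abstract, reporting sensitivity and specificity separately is more informative than a single accuracy figure, because a classifier that labels everything as background can still score a high accuracy.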

  12. Regulatory sequence of cupin family gene

    Science.gov (United States)

    Hood, Elizabeth; Teoh, Thomas

    2017-07-25

    This invention is in the field of plant biology and agriculture and relates to novel seed-specific promoter regions. The present invention further provides methods of producing proteins and other products of interest and methods of controlling expression of nucleic acid sequences of interest using the seed-specific promoter regions.

  13. Sequence analysis of the MYC oncogene involved in the t(8;14)(q24;q11) chromosome translocation in a human leukemia T-cell line indicates that putative regulatory regions are not altered

    International Nuclear Information System (INIS)

    Finver, S.N.; Nishikura, K.; Finger, L.R.; Haluska, F.G.; Finan, J.; Nowell, P.C.; Croce, C.M.

    1988-01-01

    The authors cloned the translocation-associated and homologous normal MYC alleles from SKW-3, a leukemia T-cell line with the t(8; 14)(q24; q11) translocation, and determined the sequence of the MYC oncogene first exon and flanking 5' putative regulatory regions. S1 nuclease protection experiments utilizing a MYC first exon probe demonstrated transcriptional deregulation of the MYC gene associated with the T-cell receptor α locus on the 8q + chromosome of SKW-3 cells. Nucleotide sequence analysis of the translocation-associated (8q +) MYC allele identified a single base substitution within the upstream flanking region; the homologous nontranslocated allele contained an additional substitution and a two-base deletion. None of the deletions or substitutions localized to putative 5' regulatory regions. The MYC first exon sequence was germ line in both alleles. These results demonstrate that alterations within the putative 5' MYC regulatory regions are not necessarily involved in MYC deregulation in T-cell leukemias, and they show that juxtaposition of the T-cell receptor α locus to a germ-line MYC oncogene results in MYC deregulation

  14. The upstream regulatory sequence of the light harvesting complex Lhcf2 gene of the marine diatom Phaeodactylum tricornutum enhances transcription in an orientation- and distance-independent fashion.

    Science.gov (United States)

    Russo, Monia Teresa; Annunziata, Rossella; Sanges, Remo; Ferrante, Maria Immacolata; Falciatore, Angela

    2015-12-01

    Diatoms are a key phytoplankton group in the contemporary ocean, showing extraordinary adaptation capacities to rapidly changing environments. The recent availability of whole genome sequences from representative species has revealed distinct features in their genomes, like novel combinations of genes encoding distinct metabolisms and a significant number of diatom-specific genes. However, the regulatory mechanisms driving diatom gene expression are still largely uncharacterized. Considering the wide variety of fields of study orbiting diatoms, ranging from ecology and evolutionary biology to biotechnology, it is thus essential to increase our understanding of fundamental gene regulatory processes such as transcriptional regulation. To this aim, we explored the functional properties of the 5'-flanking region of the Phaeodactylum tricornutum Lhcf2 gene, encoding a member of the Light Harvesting Complex superfamily, and we showed that this region enhances transcription of a GUS reporter gene in an orientation- and distance-independent fashion. This represents the first example of a cis-regulatory sequence with enhancer-like features discovered in diatoms and it is instrumental for the generation of novel genetic tools and diatom exploitation in different areas of study. Copyright © 2015 Elsevier B.V. All rights reserved.

  15. Identification of direct regulatory targets of the transcription factor Sox10 based on function and conservation

    Directory of Open Access Journals (Sweden)

    Lee Sanghyuk

    2008-09-01

    Full Text Available. Abstract. Background: Sox10, a member of the Sry-related HMG-Box gene family, is a critical transcription factor for several important cell lineages, most notably the neural crest stem cells and the derivative peripheral glial cells and melanocytes. Thus far, only a handful of direct target genes are known for this transcription factor, limiting our understanding of the biological network it governs. Results: We describe identification of multiple direct regulatory target genes of Sox10 through a procedure based on function and conservation. By combining RNA interference and DNA microarray technology, we have identified a set of genes that show significant down-regulation upon introduction of Sox10-specific siRNA into Schwannoma cells. Subsequent comparative genomics analyses led to potential binding sites for Sox10 protein conserved across several mammalian species within the genomic region proximal to these genes. Multiple sites belonging to 4 different genes (proteolipid protein, Sox10, extracellular superoxide dismutase, and pleiotrophin) were shown to directly interact with Sox10 by chromatin immunoprecipitation assay. We further confirmed the direct regulation through the identified cis-element for one of the genes, extracellular superoxide dismutase, using electrophoretic mobility shift assay and reporter assay. Conclusion: In sum, the process of combining differential expression profiling and comparative genomics successfully led to further defining the role of Sox10, a critical transcription factor for the development of peripheral glia. Our strategy, utilizing relatively accessible techniques and tools, should be applicable to studying the function of other transcription factors.

  16. Two estrogen response element sequences near the PCNA gene are not responsible for its estrogen-enhanced expression in MCF7 cells.

    Science.gov (United States)

    Wang, Cheng; Yu, Jie; Kallen, Caleb B

    2008-01-01

    The proliferating cell nuclear antigen (PCNA) is an essential component of DNA replication, cell cycle regulation, and epigenetic inheritance. High expression of PCNA is associated with poor prognosis in patients with breast cancer. The 5'-region of the PCNA gene contains two computationally-detected estrogen response element (ERE) sequences, one of which is evolutionarily conserved. Both of these sequences are of undocumented cis-regulatory function. We recently demonstrated that estradiol (E2) enhances PCNA mRNA expression in MCF7 breast cancer cells. MCF7 cells proliferate in response to E2. Here, we demonstrate that E2 rapidly enhanced PCNA mRNA and protein expression in a process that requires ERalpha as well as de novo protein synthesis. One of the two upstream ERE sequences was specifically bound by ERalpha-containing protein complexes, in vitro, in gel shift analysis. Yet, each ERE sequence, when cloned as a single copy, or when engineered as two tandem copies of the ERE-containing sequence, was not capable of activating a luciferase reporter construct in response to E2. In MCF7 cells, neither ERE-containing genomic region demonstrated E2-dependent recruitment of ERalpha by sensitive ChIP-PCR assays. We conclude that E2 enhances PCNA gene expression by an indirect process and that computational detection of EREs, even when evolutionarily conserved and when near E2-responsive genes, requires biochemical validation.

  17. Sequence and Expression Analysis of Interferon Regulatory Factor 10 (IRF10) in Three Diverse Teleost Fish Reveals Its Role in Antiviral Defense.

    Directory of Open Access Journals (Sweden)

    Qiaoqing Xu

    Full Text Available Interferon regulatory factor (IRF) 10 was first found in birds and is present in the genome of other tetrapods (but not humans and mice), as well as in teleost fish. The functional role of IRF10 in vertebrate immunity is relatively unknown compared to IRF1-9. The target of this research was to clone and characterize the IRF10 genes in three economically important fish species that will facilitate future evaluation of this molecule in fish innate and adaptive immunity. In the present study, a single IRF10 gene was cloned in grass carp Ctenopharyngodon idella and Asian swamp eel Monopterus albus, and two, named IRF10a and IRF10b, in rainbow trout Oncorhynchus mykiss. The fish IRF10 molecules share highest identities to other vertebrate IRF10s, and have a well conserved DNA binding domain, IRF-associated domain, and an 8 exon/7 intron structure with conserved intron phase. The presence of an upstream ATG or open reading frame (ORF) in the 5'-untranslated region of different fish IRF10 cDNA sequences suggests potential regulation at the translational level, and this has been verified by in vitro transcription/translation experiments of the trout IRF10a cDNA, but would still need to be validated in fish cells. Both trout IRF10 paralogues are highly expressed in thymus, blood and spleen but are relatively low in head kidney and caudal kidney. Trout IRF10b expression is significantly higher than IRF10a in integumentary tissues, i.e. gills, scales, skin, intestine, adipose fin and tail fins, suggesting that IRF10b may be more important in mucosal immunity. The expression of both trout IRF10 paralogues is up-regulated by recombinant IFN-γ. The expression of the IRF10 genes is highly induced by Poly I:C in vitro and in vivo, and by viral infection, but is less responsive to peptidoglycan and bacterial infection, suggesting an important role of fish IRF10 in antiviral defense.

  18. 77 FR 58022 - Montana Regulatory Program

    Science.gov (United States)

    2012-09-19

    ... precludes in situ gasification projects from including carbon capture and sequestration (CCS) under the... Conservation as the regulatory authority for CCS activities within the State. SB498 generally established that..., the Board would regulate any proposed CCS activities appropriately. CCS operations have potential...

  19. Conserved genomic organisation of Group B Sox genes in insects.

    Directory of Open Access Journals (Sweden)

    Woerfel Gertrud

    2005-05-01

    Full Text Available. Abstract. Background: Sox domain containing genes are important metazoan transcriptional regulators implicated in a wide range of developmental processes. The vertebrate B subgroup contains the Sox1, Sox2 and Sox3 genes that have early functions in neural development. Previous studies show that Drosophila Group B genes have been functionally conserved since they play essential roles in early neural specification and mutations in the Drosophila Dichaete and SoxN genes can be rescued with mammalian Sox genes. Despite their importance, the extent and organisation of the Group B family in Drosophila has not been fully characterised, an important step in using Drosophila to examine conserved aspects of Group B Sox gene function. Results: We have used directed cDNA sequencing along with the output from the publicly available genome sequencing projects to examine the structure of Group B Sox domain genes in Drosophila melanogaster, Drosophila pseudoobscura, Anopheles gambiae and Apis mellifera. All of the insect genomes contain four genes encoding Group B proteins, two of which are intronless, as is the case with vertebrate group B genes. As has been previously reported and unusually for Group B genes, two of the insect group B genes, Sox21a and Sox21b, contain introns within their DNA-binding domains. We find that the highly unusual multi-exon structure of the Sox21b gene is common to the insects. In addition, we find that three of the group B Sox genes are organised in a linked cluster in the insect genomes. By in situ hybridisation we show that the pattern of expression of each of the four group B genes during embryogenesis is conserved between D. melanogaster and D. pseudoobscura. Conclusion: The DNA-binding domain sequences and genomic organisation of the group B genes have been conserved over 300 My of evolution since the last common ancestor of the Hymenoptera and the Diptera. Our analysis suggests insects have two Group B1 genes, SoxN and

  20. Fused Regression for Multi-source Gene Regulatory Network Inference.

    Directory of Open Access Journals (Sweden)

    Kari Y Lam

    2016-12-01

    Full Text Available Understanding gene regulatory networks is critical to understanding cellular differentiation and response to external stimuli. Methods for global network inference have been developed and applied to a variety of species. Most approaches consider the problem of network inference independently in each species, despite evidence that gene regulation can be conserved even in distantly related species. Further, network inference is often confined to single data types (single platforms and single cell types). We introduce a method for multi-source network inference that allows simultaneous estimation of gene regulatory networks in multiple species or biological processes through the introduction of priors based on known gene relationships, such as orthology, incorporated using fused regression. This approach improves network inference performance even when orthology mapping and conservation are incomplete. We refine this method by presenting an algorithm that extracts the true conserved subnetwork from a larger set of potentially conserved interactions and demonstrate the utility of our method in cross-species network inference. Last, we demonstrate our method's utility in learning from data collected on different experimental platforms.

  1. Identification of Conserved and Novel MicroRNAs in Blueberry

    Directory of Open Access Journals (Sweden)

    Junyang Yue

    2017-06-01

    Full Text Available MicroRNAs (miRNAs) are a class of small endogenous RNAs that play important regulatory roles in cells by negatively affecting gene expression at both transcriptional and post-transcriptional levels. There have been extensive studies aiming to identify miRNAs and to elucidate their functions in various plant species. In the present study, we employed high-throughput sequencing technology to profile miRNAs in blueberry fruits. A total of 9,992,446 small RNA tags with sizes ranging from 18 to 30 nt were obtained, indicating that blueberry fruits have a large and diverse small RNA population. Bioinformatic analysis identified 412 conserved miRNAs belonging to 29 families, and 35 predicted novel miRNAs that are likely to be unique to blueberries. Among them, expression profiles of five conserved miRNAs were validated by stem-loop qRT-PCR. Furthermore, the potential target genes of conserved and novel miRNAs were predicted and subjected to Gene Ontology (GO) annotation. Enrichment analysis of the GO-represented biological processes and molecular functions revealed that these target genes were potentially involved in a wide range of metabolic pathways and developmental processes. Particularly, anthocyanin biosynthesis has been predicted to be directly or indirectly regulated by diverse miRNA families. This study is the first report on genome-wide miRNA profile analysis in blueberry and it provides a useful resource for further elucidation of the functional roles of miRNAs during fruit development and ripening.

  2. Alu-mediated deletion of SOX10 regulatory elements in Waardenburg syndrome type 4.

    Science.gov (United States)

    Bondurand, Nadège; Fouquet, Virginie; Baral, Viviane; Lecerf, Laure; Loundon, Natalie; Goossens, Michel; Duriez, Benedicte; Labrune, Philippe; Pingault, Veronique

    2012-09-01

    Waardenburg syndrome type 4 (WS4) is a rare neural crest disorder defined by the combination of Waardenburg syndrome (sensorineural hearing loss and pigmentation defects) and Hirschsprung disease (intestinal aganglionosis). Three genes are known to be involved in this syndrome, that is, EDN3 (endothelin-3), EDNRB (endothelin receptor type B), and SOX10. However, 15-35% of WS4 remains unexplained at the molecular level, suggesting that other genes could be involved and/or that mutations within known genes may have escaped previous screenings. Here, we searched for deletions within recently identified SOX10 regulatory sequences and describe the first characterization of a WS4 patient presenting with a large deletion encompassing three of these enhancers. Analysis of the breakpoint region suggests a complex rearrangement involving three Alu sequences that could be mediated by a FoSTeS/MMBIR replication mechanism. Taken together with recent reports, our results demonstrate that the disruption of highly conserved non-coding elements located within or at a long distance from the coding sequences of key genes can result in several neurocristopathies. This opens up new routes to the molecular dissection of neural crest disorders.

  3. Rapid sequence divergence rates in the 5 prime regulatory regions of young Drosophila melanogaster duplicate gene pairs

    Directory of Open Access Journals (Sweden)

    Michael H. Kohn

    2008-01-01

    Full Text Available While it remains a matter of some debate, rapid sequence evolution of the coding sequences of duplicate genes is characteristic for early phases past duplication, but long-established duplicates generally evolve under constraint, much like the rest of the coding genome. As for coding sequences, it may be possible to infer evolutionary rate, selection, and constraint via contrasts between duplicate gene divergence in the 5 prime regions and in the corresponding synonymous site divergence in the coding regions. Finding elevated rates for the 5 prime regions of duplicated genes, in addition to the coding regions, would enable statements regarding the early processes of duplicate gene evolution. Here, 1 kb of each of the 5 prime regulatory regions of Drosophila melanogaster duplicate gene pairs were mapped onto one another to isolate shared sequence blocks. Genetic distances within shared sequence blocks (d5’) were found to increase as a function of synonymous (dS), and to a lesser extent, amino-acid (dA) site divergence between duplicates. The rate d5’/dS was found to rapidly decay from values > 1 in young duplicate pairs (dS 0.8. Such rapid rates of 5 prime evolution exceeding 1 (~neutral) predominantly were found to occur in duplicate pairs with low amino-acid site divergence and that tended to be co-regulated when assayed on microarrays. Conceivably, functional redundancy and relaxation of selective constraint facilitate subsequent positive selection on the 5 prime regions of young duplicate genes. This might promote the evolution of new functions (neofunctionalization) or division of labor among duplicate genes (subfunctionalization). In contrast, similar to the vast portion of the non-coding genome, the 5 prime regions of long-established gene duplicates appear to evolve under selective constraint, indicating that these long-established gene duplicates have assumed critical functions.
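
    The rate ratio discussed above can be sketched in a few lines. This is only an illustration: the distance values are invented, and the tolerance band around 1 is an assumption of this sketch, not a threshold from the study. The code divides the 5 prime distance by the synonymous distance for each duplicate pair and reports whether the ratio falls above, near, or below 1.

```python
def rate_ratio(d5_prime, d_s):
    """Return d5'/dS, or None when dS is not positive and the ratio is undefined."""
    if d_s <= 0:
        return None
    return d5_prime / d_s


def describe(ratio, tolerance=0.1):
    """Label a ratio relative to 1; the tolerance band is a hypothetical choice."""
    if ratio is None:
        return "undefined"
    if ratio > 1 + tolerance:
        return "above 1"
    if ratio < 1 - tolerance:
        return "below 1"
    return "near 1"


# Invented example distances (name, d5', dS); not values from the paper.
pairs = [("young_pair", 0.12, 0.08), ("old_pair", 0.30, 1.20)]
for name, d5, ds in pairs:
    r = rate_ratio(d5, ds)
    print(name, round(r, 2), describe(r))
```

    Under the abstract's reading, a ratio near 1 is roughly what neutral evolution would give, a ratio well below 1 is consistent with constraint on the 5 prime region, and a ratio above 1 indicates that the 5 prime region is diverging faster than synonymous sites.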

  4. Conservation of lipid metabolic gene transcriptional regulatory networks in fish and mammals.

    Science.gov (United States)

    Carmona-Antoñanzas, Greta; Tocher, Douglas R; Martinez-Rubio, Laura; Leaver, Michael J

    2014-01-15

    Lipid content and composition in aquafeeds have changed rapidly as a result of the recent drive to replace ecologically limited marine ingredients, fishmeal and fish oil (FO). Terrestrial plant products are the most economic and sustainable alternative; however, plant meals and oils are devoid of physiologically important cholesterol and long-chain polyunsaturated fatty acids (LC-PUFA), eicosapentaenoic (EPA), docosahexaenoic (DHA) and arachidonic (ARA) acids. Although replacement of dietary FO with vegetable oil (VO) has little effect on growth in Atlantic salmon (Salmo salar), several studies have shown major effects on the activity and expression of genes involved in lipid homeostasis. In vertebrates, sterols and LC-PUFA play crucial roles in lipid metabolism by direct interaction with lipid-sensing transcription factors (TFs) and consequent regulation of target genes. The primary aim of the present study was to elucidate the role of key TFs in the transcriptional regulation of lipid metabolism in fish by transfection and overexpression of TFs. The results show that the expression of genes of LC-PUFA biosynthesis (elovl and fads2) and cholesterol metabolism (abca1) are regulated by Lxr and Srebp TFs in salmon, indicating highly conserved regulatory mechanism across vertebrates. In addition, srebp1 and srebp2 mRNA respond to replacement of dietary FO with VO. Thus, Atlantic salmon adjust lipid metabolism in response to dietary lipid composition through the transcriptional regulation of gene expression. It may be possible to further increase efficient and effective use of sustainable alternatives to marine products in aquaculture by considering these important molecular interactions when formulating diets. © 2013.

  5. A New Approach to Sequence Analysis Exemplified by Identification of cis-Elements in Abscisic Acid Inducible Promoters

    DEFF Research Database (Denmark)

    Busk, Peter Kamp; Hallin, Peter Fischer; Salomon, Jesper

    -regulatory elements. We have developed a method for identifying short, conserved motifs in biological sequences such as proteins, DNA and RNA5. This method was used for analysis of approximately 2000 Arabidopsis thaliana promoters that have been shown by DNA array analysis to be induced by abscisic acid6....... These promoters were compared to 28000 promoters that are not induced by abscisic acid. The analysis identified previously described ABA-inducible promoter elements such as ABRE, CE3 and CRT1 but also new cis-elements were found. Furthermore, the list of DNA elements could be used to predict ABA...

  6. Combinatorial binding in human and mouse embryonic stem cells identifies conserved enhancers active in early embryonic development.

    Directory of Open Access Journals (Sweden)

    Jonathan Göke

    2011-12-01

    Full Text Available Transcription factors are proteins that regulate gene expression by binding to cis-regulatory sequences such as promoters and enhancers. In embryonic stem (ES) cells, binding of the transcription factors OCT4, SOX2 and NANOG is essential to maintain the capacity of the cells to differentiate into any cell type of the developing embryo. It is known that transcription factors interact to regulate gene expression. In this study we show that combinatorial binding is strongly associated with co-localization of the transcriptional co-activator Mediator, H3K27ac and increased expression of nearby genes in embryonic stem cells. We observe that the same loci bound by Oct4, Nanog and Sox2 in ES cells frequently drive expression in early embryonic development. Comparison of mouse and human ES cells shows that less than 5% of individual binding events for OCT4, SOX2 and NANOG are shared between species. In contrast, about 15% of combinatorial binding events and even between 53% and 63% of combinatorial binding events at enhancers active in early development are conserved. Our analysis suggests that the combination of OCT4, SOX2 and NANOG binding is critical for transcription in ES cells and likely plays an important role for embryogenesis by binding at conserved early developmental enhancers. Our data suggest that the fast evolutionary rewiring of regulatory networks mainly affects individual binding events, whereas "gene regulatory hotspots", which are bound by multiple factors and active in multiple tissues throughout early development, are under stronger evolutionary constraints.
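
    The distinction between individual and combinatorial binding conservation can be shown with a small sketch. It assumes that binding sites in both species have already been mapped to shared orthologous locus identifiers, which is an assumption of this example rather than a description of the authors' method. The locus IDs and function names are invented.

```python
def conserved_fraction(human_loci, mouse_loci):
    """Fraction of human-bound loci whose orthologous locus is also bound in mouse."""
    human_loci = set(human_loci)
    if not human_loci:
        return 0.0
    return len(human_loci & set(mouse_loci)) / len(human_loci)


def combinatorial_loci(binding):
    """Loci bound by every factor in `binding` (dict: factor name -> locus ids)."""
    sets = [set(loci) for loci in binding.values()]
    return set.intersection(*sets) if sets else set()


# Invented orthologous locus ids; not data from the study.
human = {"OCT4": {1, 2, 3, 4}, "SOX2": {2, 3, 4, 5}, "NANOG": {3, 4, 6}}
mouse = {"OCT4": {4, 7, 8}, "SOX2": {4, 9}, "NANOG": {4, 10}}

for factor in human:
    print(factor, conserved_fraction(human[factor], mouse[factor]))

print("combinatorial",
      conserved_fraction(combinatorial_loci(human), combinatorial_loci(mouse)))
```

    In this toy data the combinatorially bound loci are conserved at a higher fraction than the OCT4 or SOX2 sites taken individually, which is the qualitative pattern the abstract reports.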

  7. Characterization of novel precursor miRNAs using next generation sequencing and prediction of miRNA targets in Atlantic halibut.

    Directory of Open Access Journals (Sweden)

    Teshome Tilahun Bizuayehu

    BACKGROUND: microRNAs (miRNAs) are implicated in regulation of many cellular processes. miRNAs are processed to their mature functional form in a step-wise manner by multiple proteins and cofactors in the nucleus and cytoplasm. Many miRNAs are conserved across vertebrates. Mature miRNAs have recently been characterized in Atlantic halibut (Hippoglossus hippoglossus L.). The aim of this study was to identify and characterize precursor miRNAs (pre-miRNAs) and miRNA targets in this non-model flatfish. Discovery of miRNA precursor forms and targets in non-model organisms is difficult because of the limited source information available. Therefore, we have developed a methodology to overcome this limitation. METHODS: Genomic DNA and small transcriptome of Atlantic halibut were sequenced using Roche 454 pyrosequencing and SOLiD next generation sequencing (NGS), respectively. Identified pre-miRNAs were further validated with reverse-transcription PCR. miRNA targets were identified using the miRanda and RNAhybrid target prediction tools with sequences from public databases. Some miRNA targets were also identified using RACE-PCR. miRNA binding sites were validated with luciferase assay using the RTS34st cell line. RESULTS: We obtained more than 1.3 M and 92 M sequence reads from 454 genomic DNA sequencing and SOLiD small RNA sequencing, respectively. We identified 34 known and 9 novel pre-miRNAs. We predicted a number of miRNA target genes involved in various biological pathways. miR-24 binding to kisspeptin 1 receptor-2 (kiss1-r2) was confirmed using luciferase assay. CONCLUSION: This study demonstrates that identification of conserved and novel pre-miRNAs in a non-model vertebrate lacking substantial genomic resources can be performed by combining different next generation sequencing technologies. Our results indicate a wide conservation of miRNA precursors and involvement of miRNA in multiple regulatory pathways, and provide resources for further research on mi...

  8. Regulatory elements of the floral homeotic gene AGAMOUS identified by phylogenetic footprinting and shadowing.

    Energy Technology Data Exchange (ETDEWEB)

    Hong, R. L., Hamaguchi, L., Busch, M. A., and Weigel, D.

    2003-06-01

    OAK-B135 In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.

  9. Characterization of Putative cis-Regulatory Elements in Genes Preferentially Expressed in Arabidopsis Male Meiocytes

    Directory of Open Access Journals (Sweden)

    Junhua Li

    2014-01-01

    Meiosis is essential for plant reproduction because it is the process during which homologous chromosome pairing, synapsis, and meiotic recombination occur. The meiotic transcriptome is difficult to investigate because of the size of meiocytes and the confines of anther lobes. The recent development of isolation techniques has enabled the characterization of transcriptional profiles in male meiocytes of Arabidopsis. Gene expression in male meiocytes shows unique features. The direct interaction of transcription factors (TFs) with DNA regulatory sequences forms the basis for the specificity of transcriptional regulation. Here, we identified putative cis-regulatory elements (CREs) associated with male meiocyte-expressed genes using in silico tools. The upstream regions (1 kb) of the top 50 genes preferentially expressed in Arabidopsis meiocytes possessed conserved motifs. These motifs are putative binding sites of TFs, some of which share common functions, such as roles in cell division. In combination with cell-type-specific analysis, our findings could be a substantial aid for the identification and experimental verification of the protein-DNA interactions for the specific TFs that drive gene expression in meiocytes.

  10. Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

    Science.gov (United States)

    Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn

    2009-01-01

    Background: Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to helping distinguish between duplicate genome regions and determine correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results: High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, to characterize salmon Kozak motifs and polyadenylation signal variation, and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion: This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). This suggests that the remaining c...

  11. Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

    Directory of Open Access Journals (Sweden)

    Lunner Sigbjørn

    2009-10-01

    Background: Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to helping distinguish between duplicate genome regions and determine correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results: High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, to characterize salmon Kozak motifs and polyadenylation signal variation, and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion: This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). This...

  12. Local synteny and codon usage contribute to asymmetric sequence divergence of Saccharomyces cerevisiae gene duplicates

    Directory of Open Access Journals (Sweden)

    Bergthorsson Ulfar

    2011-09-01

    Background: Duplicated genes frequently experience asymmetric rates of sequence evolution. Relaxed selective constraints and positive selection have both been invoked to explain the observation that one paralog within a gene-duplicate pair exhibits an accelerated rate of sequence evolution. In the majority of studies where asymmetric divergence has been established, there is no indication as to which gene copy, ancestral or derived, is evolving more rapidly. In this study we investigated the effect of local synteny (gene-neighborhood conservation) and codon usage on the sequence evolution of gene duplicates in the S. cerevisiae genome. We further distinguish the gene duplicates into those that originated from a whole-genome duplication (WGD) event (ohnologs) versus small-scale duplications (SSD) to determine if there exist any differences in their patterns of sequence evolution. Results: For SSD pairs, the derived copy evolves faster than the ancestral copy. However, there is no relationship between rate asymmetry and synteny conservation (ancestral-like versus derived-like) in ohnologs. mRNA abundance and optimal codon usage as measured by the CAI are lower in the derived SSD copies relative to ancestral paralogs. Moreover, in the case of ohnologs, the faster-evolving copy has lower CAI and lowered expression. Conclusions: Together, these results suggest that relaxation of selection for codon usage and gene expression contributes to rate asymmetry in the evolution of duplicated genes and that in SSD pairs, the relaxation of selection stems from the loss of ancestral regulatory information in the derived copy.

  13. In Silico Analysis of Gene Expression Network Components Underlying Pigmentation Phenotypes in the Python Identified Evolutionarily Conserved Clusters of Transcription Factor Binding Sites

    Directory of Open Access Journals (Sweden)

    Kristopher J. L. Irizarry

    2016-01-01

    Color variation provides the opportunity to investigate the genetic basis of evolution and selection. Reptiles are less studied than mammals. Comparative genomics approaches allow for knowledge gained in one species to be leveraged for use in another species. We describe a comparative vertebrate analysis of conserved regulatory modules in pythons aimed at assessing bioinformatics evidence that transcription factors important in mammalian pigmentation phenotypes may also be important in python pigmentation phenotypes. We identified 23 python orthologs of mammalian genes associated with variation in coat color phenotypes for which we assessed the extent of pairwise protein sequence identity between pythons and mouse, dog, horse, cow, chicken, anole lizard, and garter snake. We next identified a set of melanocyte/pigment associated transcription factors (CREB, FOXD3, LEF-1, MITF, POU3F2, and USF-1) that exhibit relatively conserved sequence similarity within their DNA binding regions across species based on orthologous alignments across multiple species. Finally, we identified 27 evolutionarily conserved clusters of transcription factor binding sites within ~200-nucleotide intervals of the 1500-nucleotide upstream regions of AIM1, DCT, MC1R, MITF, MLANA, OA1, PMEL, RAB27A, and TYR from Python bivittatus. Our results provide insight into pigment phenotypes in pythons.

  14. Conservation of HIV-1 T cell epitopes across time and clades

    DEFF Research Database (Denmark)

    Levitz, Lauren; Koita, Ousmane A; Sangare, Kotou

    2012-01-01

    HIV genomic sequence variability has complicated efforts to generate an effective globally relevant vaccine. Regions of the viral genome conserved in sequence and across time may represent the "Achilles' heel" of HIV. In this study, highly conserved T-cell epitopes were selected using immunoinfor...

  15. Thermodynamics-based models of transcriptional regulation with gene sequence.

    Science.gov (United States)

    Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing

    2015-12-01

    Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous time, differential equation description of transcriptional dynamics. The sequence features of the promoter are exploited to derive the binding affinity, which is obtained from statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model is more biologically meaningful.

  16. CONREAL web server: identification and visualization of conserved transcription factor binding sites

    NARCIS (Netherlands)

    Berezikov, E.; Guryev, V.; Cuppen, E.

    2005-01-01

    The use of orthologous sequences and phylogenetic footprinting approaches has become popular for the recognition of conserved and potentially functional sequences. Several algorithms have been developed for the identification of conserved transcription factor binding sites (TFBSs), which are...

  17. AKAP18:PKA-RIIα structure reveals crucial anchor points for recognition of regulatory subunits of PKA.

    Science.gov (United States)

    Götz, Frank; Roske, Yvette; Schulz, Maike Svenja; Autenrieth, Karolin; Bertinetti, Daniela; Faelber, Katja; Zühlke, Kerstin; Kreuchwig, Annika; Kennedy, Eileen J; Krause, Gerd; Daumke, Oliver; Herberg, Friedrich W; Heinemann, Udo; Klussmann, Enno

    2016-07-01

    A-kinase anchoring proteins (AKAPs) interact with the dimerization/docking (D/D) domains of regulatory subunits of the ubiquitous protein kinase A (PKA). AKAPs tether PKA to defined cellular compartments establishing distinct pools to increase the specificity of PKA signalling. Here, we elucidated the structure of an extended PKA-binding domain of AKAP18β bound to the D/D domain of the regulatory RIIα subunits of PKA. We identified three hydrophilic anchor points in AKAP18β outside the core PKA-binding domain, which mediate contacts with the D/D domain. Such anchor points are conserved within AKAPs that bind regulatory RII subunits of PKA. We derived a different set of anchor points in AKAPs binding regulatory RI subunits of PKA. In vitro and cell-based experiments confirm the relevance of these sites for the interaction of RII subunits with AKAP18 and of RI subunits with the RI-specific smAKAP. Thus we report a novel mechanism governing interactions of AKAPs with PKA. The sequence specificity of each AKAP around the anchor points and the requirement of these points for the tight binding of PKA allow the development of selective inhibitors to unequivocally ascribe cellular functions to the AKAP18-PKA and other AKAP-PKA interactions. © 2016 The Author(s). published by Portland Press Limited on behalf of the Biochemical Society.

  18. cDNA cloning and sequencing of human fibrillarin, a conserved nucleolar protein recognized by autoimmune antisera

    International Nuclear Information System (INIS)

    Aris, J.P.; Blobel, G.

    1991-01-01

    The authors have isolated a 1.1-kilobase cDNA clone that encodes human fibrillarin by screening a hepatoma library in parallel with DNA probes derived from the fibrillarin genes of Saccharomyces cerevisiae (NOP1) and Xenopus laevis. RNA blot analysis indicates that the corresponding mRNA is ∼1,300 nucleotides in length. Human fibrillarin expressed in vitro migrates on SDS gels as a 36-kDa protein that is specifically immunoprecipitated by antisera from humans with scleroderma autoimmune disease. Human fibrillarin contains an amino-terminal repetitive domain ∼75-80 amino acids in length that is rich in glycine and arginine residues and is similar to amino-terminal domains in the yeast and Xenopus fibrillarins. The occurrence of a putative RNA-binding domain and an RNP consensus sequence within the protein is consistent with the association of fibrillarin with small nucleolar RNAs. Protein sequence alignments show that 67% of amino acids from human fibrillarin are identical to those in yeast fibrillarin and that 81% are identical to those in Xenopus fibrillarin. This identity suggests the evolutionary conservation of an important function early in the pathway for ribosome biosynthesis.
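
    The percent-identity figures quoted in this abstract are the kind of value that can be read off a pairwise alignment. The following is a minimal sketch, not the authors' procedure: it assumes the two sequences are already aligned to equal length with '-' marking gaps, and it counts identity relative to the residues of the first sequence. The function name and the toy sequences are hypothetical and are not fibrillarin data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of residues in aligned_a that are identical in aligned_b.

    Both arguments are rows of one pairwise alignment: equal length,
    with '-' marking gap positions. Gaps in aligned_a are not counted
    as residues; a residue aligned to a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    residues = sum(1 for a in aligned_a if a != "-")
    if residues == 0:
        raise ValueError("first sequence contains no residues")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a != "-" and a == b
    )
    return 100.0 * matches / residues


if __name__ == "__main__":
    # Toy alignment (hypothetical sequences, for illustration only).
    print(percent_identity("MKPG-FRGG", "MRPGAFRG-"))
```

    Using the first sequence's residue count as the denominator mirrors the abstract's phrasing ("% of amino acids from human fibrillarin"); other conventions, such as dividing by alignment length, give different values.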

  19. Deletions Involving Long-Range Conserved Nongenic Sequences Upstream and Downstream of FOXL2 as a Novel Disease-Causing Mechanism in Blepharophimosis Syndrome

    OpenAIRE

    Beysen, D.; Raes, J.; Leroy, B. P.; Lucassen, A.; Yates, J. R. W.; Clayton-Smith, J.; Ilyina, H.; Brooks, S. Sklower; Christin-Maitre, S.; Fellous, M.; Fryns, J. P.; Kim, J. R.; Lapunzina, P.; Lemyre, E.; Meire, F.

    2005-01-01

    The expression of a gene requires not only a normal coding sequence but also intact regulatory regions, which can be located at large distances from the target genes, as demonstrated for an increasing number of developmental genes. In previous mutation studies of the role of FOXL2 in blepharophimosis syndrome (BPES), we identified intragenic mutations in 70% of our patients. Three translocation breakpoints upstream of FOXL2 in patients with BPES suggested a position effect. Here, we identifie...

  20. Extensive evolutionary changes in regulatory element activity during human origins are associated with altered gene expression and positive selection.

    Directory of Open Access Journals (Sweden)

    Yoichiro Shibata

    2012-06-01

    Understanding the molecular basis for phenotypic differences between humans and other primates remains an outstanding challenge. Mutations in non-coding regulatory DNA that alter gene expression have been hypothesized as a key driver of these phenotypic differences. This has been supported by differential gene expression analyses in general, but not by the identification of specific regulatory elements responsible for changes in transcription and phenotype. To identify the genetic source of regulatory differences, we mapped DNaseI hypersensitive (DHS) sites, which mark all types of active gene regulatory elements, genome-wide in the same cell type isolated from human, chimpanzee, and macaque. Most DHS sites were conserved among all three species, as expected based on their central role in regulating transcription. However, we found evidence that several hundred DHS sites were gained or lost on the lineages leading to modern human and chimpanzee. Species-specific DHS site gains are enriched near differentially expressed genes, are positively correlated with increased transcription, show evidence of branch-specific positive selection, and overlap with active chromatin marks. Species-specific sequence differences in transcription factor motifs found within these DHS sites are linked with species-specific changes in chromatin accessibility. Together, these indicate that the regulatory elements identified here are genetic contributors to transcriptional and phenotypic differences among primate species.

  1. Biodiversity conservation in a changing climate: a review of threats and implications for conservation planning in Myanmar.

    Science.gov (United States)

    Rao, Madhu; Saw Htun; Platt, Steven G; Tizard, Robert; Poole, Colin; Than Myint; Watson, James E M

    2013-11-01

    High levels of species richness and endemism make Myanmar a regional priority for conservation. However, decades of economic and political sanctions have resulted in conservation investment too low to effectively tackle threats to biodiversity. Recent sweeping political reforms have placed Myanmar on the fast track to economic development; the expectation is increased economic investment focused on the exploitation of the country's rich, and relatively intact, natural resources. Within a context of weak regulatory capacity and inadequate environmental safeguards, rapid economic development is likely to have far-reaching negative implications for already threatened biodiversity and natural-resource-dependent human communities. Climate change will further exacerbate prevailing threats given Myanmar's high exposure and vulnerability. The aim of this review is to examine the implications of increased economic growth and a changing climate within the larger context of biodiversity conservation in Myanmar. We summarize conservation challenges, assess direct climatological impacts on biodiversity and conclude with recommendations for long-term adaptation approaches for biodiversity conservation.

  2. Two estrogen response element sequences near the PCNA gene are not responsible for its estrogen-enhanced expression in MCF7 cells.

    Directory of Open Access Journals (Sweden)

    Cheng Wang

    The proliferating cell nuclear antigen (PCNA) is an essential component of DNA replication, cell cycle regulation, and epigenetic inheritance. High expression of PCNA is associated with poor prognosis in patients with breast cancer. The 5'-region of the PCNA gene contains two computationally-detected estrogen response element (ERE) sequences, one of which is evolutionarily conserved. Both of these sequences are of undocumented cis-regulatory function. We recently demonstrated that estradiol (E2) enhances PCNA mRNA expression in MCF7 breast cancer cells. MCF7 cells proliferate in response to E2. Here, we demonstrate that E2 rapidly enhanced PCNA mRNA and protein expression in a process that requires ERalpha as well as de novo protein synthesis. One of the two upstream ERE sequences was specifically bound by ERalpha-containing protein complexes, in vitro, in gel shift analysis. Yet, each ERE sequence, when cloned as a single copy, or when engineered as two tandem copies of the ERE-containing sequence, was not capable of activating a luciferase reporter construct in response to E2. In MCF7 cells, neither ERE-containing genomic region demonstrated E2-dependent recruitment of ERalpha by sensitive ChIP-PCR assays. We conclude that E2 enhances PCNA gene expression by an indirect process and that computational detection of EREs, even when evolutionarily conserved and when near E2-responsive genes, requires biochemical validation.

  3. Cis-regulatory control of the nuclear receptor Coup-TF gene in the sea urchin Paracentrotus lividus embryo.

    Directory of Open Access Journals (Sweden)

    Lamprini G Kalampoki

    Coup-TF, an orphan member of the nuclear receptor superfamily, has a fundamental role in the development of metazoan embryos. The study of the gene's regulatory circuit in the sea urchin embryo will facilitate the placement of this transcription factor in the well-studied embryonic Gene Regulatory Network (GRN). The Paracentrotus lividus Coup-TF gene (PlCoup-TF) is expressed throughout embryonic development, preferentially in the oral ectoderm of the gastrula and the ciliary band of the pluteus stage. Two overlapping λ genomic clones, containing three exons and upstream sequences of PlCoup-TF, were isolated from a genomic library. The transcription initiation site was determined, and 5' deletions and individual segments of a 1930 bp upstream region were placed ahead of a GFP reporter cassette and injected into fertilized P. lividus eggs. Module a (-532 to -232) was necessary and sufficient to confer ciliary band expression to the reporter. Comparison of P. lividus and Strongylocentrotus purpuratus upstream Coup-TF sequences revealed considerable conservation, but none within module a. 5' and internal deletions into module a defined a smaller region that confers ciliary band specific expression. Putative regulatory cis-acting elements (RE1, RE2 and RE3) within module a were specifically bound by proteins in sea urchin embryonic nuclear extracts. Site-specific mutagenesis of these elements resulted in loss of reporter activity (RE1) or ectopic expression (RE2, RE3). It is proposed that sea urchin transcription factors, which bind these three regulatory sites, are necessary for spatial and quantitative regulation of the PlCoup-TF gene in pluteus-stage sea urchin embryos. These findings lead to the future identification of these factors and to the hierarchical positioning of PlCoup-TF within the embryonic GRN.

  4. Exploration of noncoding sequences in metagenomes.

    Directory of Open Access Journals (Sweden)

    Fabián Tobar-Tosse

    Environment-dependent genomic features have been defined for different metagenomes, whose genes and their associated processes are related to specific environments. Identification of ORFs and their functional categories are the most common methods for association between functional and environmental features. However, this analysis based on finding ORFs misses noncoding sequences and, therefore, some metagenome regulatory or structural information could be discarded. In this work we analyzed 23 whole metagenomes, including coding and noncoding sequences, using the following sequence patterns: (G+C) content, Codon Usage (Cd), Trinucleotide Usage (Tn), and functional assignments for ORF prediction. Herein, we present evidence of a high proportion of noncoding sequences discarded in common similarity-based methods in metagenomics, and the kind of relevant information present in those. We found a high density of trinucleotide repeat sequences (TRS) in noncoding sequences, with a regulatory and adaptive function for metagenome communities. We present associations between trinucleotide values and gene function, where metagenome clustering correlates with microorganism adaptations and kinds of metagenomes. We propose here that noncoding sequences have relevant information to describe metagenomes that could be considered in a whole metagenome analysis in order to improve their organization, classification protocols, and their relation with the environment.
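
    Two of the sequence patterns named in this abstract, (G+C) content and trinucleotide usage, are simple composition statistics. The sketch below illustrates one way to compute them and is not the authors' pipeline: the function names are hypothetical, trinucleotides are counted in overlapping windows (an assumption, since the abstract does not state the windowing), and windows containing anything other than A, C, G or T are skipped.

```python
from collections import Counter
from typing import Dict

BASES = frozenset("ACGT")


def gc_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in BASES)
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def trinucleotide_usage(seq: str) -> Dict[str, float]:
    """Relative frequency of each trinucleotide, using overlapping windows.

    Assumption: every position starts a window (step of 1); windows with
    ambiguous symbols such as N are ignored.
    """
    seq = seq.upper()
    windows = (seq[i:i + 3] for i in range(len(seq) - 2))
    counts = Counter(w for w in windows if BASES.issuperset(w))
    total = sum(counts.values())
    return {tri: n / total for tri, n in counts.items()} if total else {}


if __name__ == "__main__":
    toy = "CAGCAGCAGNTTA"  # hypothetical sequence, not data from the study
    print(gc_content(toy))
    print(trinucleotide_usage(toy))
```

    Counting in overlapping windows treats the sequence as frame-free, which suits noncoding regions; codon usage for predicted ORFs would instead step through the reading frame three bases at a time.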

  5. Uniform, optimal signal processing of mapped deep-sequencing data.

    Science.gov (United States)

    Kumar, Vibhor; Muratani, Masafumi; Rayan, Nirmala Arul; Kraus, Petra; Lufkin, Thomas; Ng, Huck Hui; Prabhakar, Shyam

    2013-07-01

    Despite their apparent diversity, many problems in the analysis of high-throughput sequencing data are merely special cases of two general problems, signal detection and signal estimation. Here we adapt formally optimal solutions from signal processing theory to analyze signals of DNA sequence reads mapped to a genome. We describe DFilter, a detection algorithm that identifies regulatory features in ChIP-seq, DNase-seq and FAIRE-seq data more accurately than assay-specific algorithms. We also describe EFilter, an estimation algorithm that accurately predicts mRNA levels from as few as 1-2 histone profiles (R ∼0.9). Notably, the presence of regulatory motifs in promoters correlates more with histone modifications than with mRNA levels, suggesting that histone profiles are more predictive of cis-regulatory mechanisms. We show by applying DFilter and EFilter to embryonic forebrain ChIP-seq data that regulatory protein identification and functional annotation are feasible despite tissue heterogeneity. The mathematical formalism underlying our tools facilitates integrative analysis of data from virtually any sequencing-based functional profile.

  6. RNA SURVEILLANCE – AN EMERGING ROLE FOR RNA REGULATORY NETWORKS IN AGING

    OpenAIRE

    Montano, Monty; Long, Kimberly

    2010-01-01

    In this review, we describe recent advances in the field of RNA regulatory biology and relate these advances to aging science. We introduce a new term, RNA surveillance, an RNA regulatory process that is conserved in metazoans, and describe how RNA surveillance represents molecular cross-talk between two emerging RNA regulatory systems – RNA interference and RNA editing. We discuss how RNA surveillance mechanisms influence mRNA and microRNA expression and activity during lifespan. Additionall...

  7. Enrichment of conserved synaptic activity-responsive element in neuronal genes predicts a coordinated response of MEF2, CREB and SRF.

    Directory of Open Access Journals (Sweden)

    Fernanda M Rodríguez-Tornos

    A unique synaptic activity-responsive element (SARE) sequence, composed of the consensus binding sites for SRF, MEF2 and CREB, is necessary for control of transcriptional upregulation of the Arc gene in response to synaptic activity. We hypothesize that this sequence is a broad mechanism that regulates gene expression in response to synaptic activation and during plasticity; and that analysis of SARE-containing genes could identify molecular mechanisms involved in brain disorders. To search for conserved SARE sequences in the mammalian genome, we used the SynoR in silico tool, and found the SARE cluster predominantly in the regulatory regions of genes expressed specifically in the nervous system; most were related to neural development and homeostatic maintenance. Two of these SARE sequences were tested in luciferase assays and proved to promote transcription in response to neuronal activation. Supporting the predictive capacity of our candidate list, up-regulation of several SARE-containing genes in response to neuronal activity was validated using external data and also experimentally using primary cortical neurons and quantitative real time RT-PCR. The list of SARE-containing genes includes several linked to mental retardation and cognitive disorders, and is significantly enriched in genes that encode mRNA targeted by FMRP (fragile X mental retardation protein). Our study thus supports the idea that SARE sequences are relevant transcriptional regulatory elements that participate in plasticity. In addition, it offers a comprehensive view of how activity-responsive transcription factors coordinate their actions and increase the selectivity of their targets. Our data suggest that analysis of SARE-containing genes will reveal yet-undescribed pathways of synaptic plasticity and additional candidate genes disrupted in mental disease.

  8. Paradigms for parasite conservation.

    Science.gov (United States)

    Dougherty, Eric R; Carlson, Colin J; Bueno, Veronica M; Burgio, Kevin R; Cizauskas, Carrie A; Clements, Christopher F; Seidel, Dana P; Harris, Nyeema C

    2016-08-01

    Parasitic species, which depend directly on host species for their survival, represent a major regulatory force in ecosystems and a significant component of Earth's biodiversity. Yet the negative impacts of parasites observed at the host level have motivated a conservation paradigm of eradication, moving us farther from attainment of taxonomically unbiased conservation goals. Despite a growing body of literature highlighting the importance of parasite-inclusive conservation, most parasite species remain understudied, underfunded, and underappreciated. We argue the protection of parasitic biodiversity requires a paradigm shift in the perception and valuation of their role as consumer species, similar to that of apex predators in the mid-20th century. Beyond recognizing parasites as vital trophic regulators, existing tools available to conservation practitioners should explicitly account for the unique threats facing dependent species. We built upon concepts from epidemiology and economics (e.g., host-density threshold and cost-benefit analysis) to devise novel metrics of margin of error and minimum investment for parasite conservation. We define margin of error as the risk of accidental host extinction from misestimating equilibrium population sizes and predicted oscillations, while minimum investment represents the cost associated with conserving the additional hosts required to maintain viable parasite populations. This framework will aid in the identification of readily conserved parasites that present minimal health risks. To establish parasite conservation, we propose an extension of population viability analysis for host-parasite assemblages to assess extinction risk. In the direst cases, ex situ breeding programs for parasites should be evaluated to maximize success without undermining host protection. 
    Though parasitic species pose a considerable conservation challenge, adaptations to conservation tools will help protect parasite biodiversity in the face of

  9. Identification and Functional Analysis of Gene Regulatory Sequences Interacting with Colorectal Tumor Suppressors

    DEFF Research Database (Denmark)

    Dahlgaard, Katja; Troelsen, Jesper

    2018-01-01

    Several tumor suppressors possess gene regulatory activity. Here, we describe how promoter and promoter/enhancer reporter assays can be used to characterize a colorectal tumor suppressor protein's gene regulatory activity on possible target genes. In the first part, a bioinformatic approach to identify relevant gene regulatory regions of potential target genes is presented. In the second part, it is demonstrated how to prepare and carry out the functional assay. We explain how to clone the bioinformatically identified gene regulatory regions into luciferase reporter plasmids by the use of the quick and efficient In-Fusion cloning method, and how to carry out transient transfections of Caco-2 colon cancer cells with the produced luciferase reporter plasmids using polyethyleneimine (PEI). A plan describing how to set up and carry out the luciferase expression assay is presented. The luciferase......

  10. In silico discovery of transcription regulatory elements in Plasmodium falciparum

    Directory of Open Access Journals (Sweden)

    Le Roch Karine G

    2008-02-01

    Full Text Available Abstract Background: With the sequence of the Plasmodium falciparum genome and several global mRNA and protein life cycle expression profiling projects now completed, elucidating the underlying networks of transcriptional control important for the progression of the parasite life cycle is highly pertinent to the development of new anti-malarials. To date, relatively little is known regarding the specific mechanisms the parasite employs to regulate gene expression at the mRNA level, with studies of the P. falciparum genome sequence having revealed few cis-regulatory elements and associated transcription factors. Although it is possible the parasite may evoke mechanisms of transcriptional control drastically different from those used by other eukaryotic organisms, the extreme AT-rich nature of P. falciparum intergenic regions (~90% AT) presents significant challenges to in silico cis-regulatory element discovery. Results: We have developed an algorithm called Gene Enrichment Motif Searching (GEMS) that uses a hypergeometric-based scoring function and a position-weight matrix optimization routine to identify regulatory elements with high confidence in the nucleotide-biased and repeat sequence-rich P. falciparum genome. When applied to promoter regions of genes contained within 21 co-expression gene clusters generated from P. falciparum life cycle microarray data using the semi-supervised clustering algorithm Ontology-based Pattern Identification, GEMS identified 34 putative cis-regulatory elements associated with a variety of parasite processes including sexual development, cell invasion, antigenic variation and protein biosynthesis. Among these candidates were novel motifs, as well as many of the elements for which biological experimental evidence already exists in the Plasmodium literature. To provide evidence for the biological relevance of a cell invasion-related element predicted by GEMS, reporter gene and electrophoretic mobility shift assays

  11. Conserved hypothetical protein Rv1977 in Mycobacterium tuberculosis strains contains sequence polymorphisms and might be involved in ongoing immune evasion.

    Science.gov (United States)

    Jiang, Yi; Liu, Haican; Wang, Xuezhi; Li, Guilian; Qiu, Yan; Dou, Xiangfeng; Wan, Kanglin

    2015-01-01

    Host immune pressure and associated parasite immune evasion are key features of host-pathogen co-evolution. A previous study showed that human T cell epitopes of Mycobacterium tuberculosis are evolutionarily hyperconserved and thus it was deduced that M. tuberculosis lacks antigenic variation and immune evasion. Here, we selected 151 clinical Mycobacterium tuberculosis isolates from China, amplified the gene encoding Rv1977 and compared the sequences. The results showed that Rv1977, a conserved hypothetical protein, is not conserved in M. tuberculosis strains and that polymorphisms exist in the protein. Some mutations, especially one frameshift mutation, occurred in the antigen Rv1977, which is uncommon in M. tuberculosis strains and may alter the protein's function. The mutations and deletion in the gene all affect one of three T cell epitopes, and the changed T cell epitope contained more than one variable position, which may suggest ongoing immune evasion.

  12. Regulatory Mechanisms of a Highly Pectinolytic Mutant of Penicillium occitanis and Functional Analysis of a Candidate Gene in the Plant Pathogen Fusarium oxysporum

    Directory of Open Access Journals (Sweden)

    Gustavo Bravo-Ruiz

    2017-09-01

    Full Text Available Penicillium occitanis is a model system for enzymatic regulation. A mutant strain exhibiting constitutive overproduction of different pectinolytic enzymes under both inducing (pectin) and repressing (glucose) conditions was previously isolated after chemical mutagenesis. In order to identify the molecular basis of this regulatory mechanism, the genomes of the wild type and the derived mutant strain were sequenced and compared, providing the first reference genome for this species. We used a phylogenomic approach to compare P. occitanis with other pectinolytic fungi and to trace expansions of gene families involved in carbohydrate degradation. Genome comparison between wild type and mutant identified seven mutations associated with predicted proteins. The most likely candidate was a mutation in a highly conserved serine residue of a conserved fungal protein containing a GAL4-like Zn2Cys6 binuclear cluster DNA-binding domain and a fungus-specific transcription factor regulatory middle homology region. To functionally characterize the role of this candidate gene, the mutation was recapitulated in the predicted orthologue in Fusarium oxysporum, a vascular wilt pathogen which secretes a wide array of plant cell wall degrading enzymes, including polygalacturonases, pectate lyases, xylanases and proteases, all of which contribute to infection. However, neither the null mutant nor a mutant carrying the analogous point mutation exhibited a deregulation of pectinolytic enzymes. The availability, annotation and phylogenomic analysis of the P. occitanis genome sequence represent an important resource for understanding the evolution and biology of this species, and set the basis for the discovery of new genes of biotechnological interest for the degradation of complex polysaccharides.

  13. Nomadic enhancers: tissue-specific cis-regulatory elements of yellow have divergent genomic positions among Drosophila species.

    Directory of Open Access Journals (Sweden)

    Gizem Kalay

    2010-11-01

    Full Text Available cis-regulatory DNA sequences known as enhancers control gene expression in space and time. They are central to metazoan development and are often responsible for changes in gene regulation that contribute to phenotypic evolution. Here, we examine the sequence, function, and genomic location of enhancers controlling tissue- and cell-type specific expression of the yellow gene in six Drosophila species. yellow is required for the production of dark pigment, and its expression has evolved largely in concert with divergent pigment patterns. Using Drosophila melanogaster as a transgenic host, we examined the expression of reporter genes in which either 5' intergenic or intronic sequences of yellow from each species controlled the expression of Green Fluorescent Protein. Surprisingly, we found that sequences controlling expression in the wing veins, as well as sequences controlling expression in epidermal cells of the abdomen, thorax, and wing, were located in different genomic regions in different species. By contrast, sequences controlling expression in bristle-associated cells were located in the intron of all species. Differences in the precise pattern of spatial expression within the developing epidermis of D. melanogaster transformants usually correlated with adult pigmentation in the species from which the cis-regulatory sequences were derived, which is consistent with cis-regulatory evolution affecting yellow expression playing a central role in Drosophila pigmentation divergence. Sequence comparisons among species favored a model in which sequential nucleotide substitutions were responsible for the observed changes in cis-regulatory architecture. Taken together, these data demonstrate frequent changes in yellow cis-regulatory architecture among Drosophila species. Similar analyses of other genes, combining in vivo functional tests of enhancer activity with in silico comparative genomics, are needed to determine whether the pattern of

  14. The Regulatory Independence of FANR

    International Nuclear Information System (INIS)

    ALNuaimi, Fatema; Choi, Kwang Shik

    2012-01-01

    Regulatory independence is meant to provide a conservative system of policy making in order to comply with the problems that are forecasted upon the basis of assumptions. The Federal Authority for Nuclear Regulation (FANR) is a regulatory commission that was formed to be the regulatory body that governs the generation of nuclear power in the United Arab Emirates. It was established under the UAE nuclear law (9/2009) as an independent regulatory body tasked with the regulation of all nuclear activities in the United Arab Emirates. As an independent body, FANR was tasked with ensuring that the regulation of the nuclear sector is done in an effective and transparent manner to ensure its accountability to the people. Being independent, the regulatory body develops national nuclear regulations based on safety standards laid down by the International Atomic Energy Agency, ensuring that they are based on scientific and proven technologies. The role of FANR is to ensure that all corporations that undertake nuclear activities follow the laid-down procedures and objectives and take safety measures seriously, to ensure the safety of the workers and the general public while at the same time keeping the environment free from nuclear radiation.

  15. The Regulatory Independence of FANR

    Energy Technology Data Exchange (ETDEWEB)

    ALNuaimi, Fatema; Choi, Kwang Shik [Korea Advanced Institute of Science and Technology, Daejeon (Korea, Republic of)

    2012-05-15

    Regulatory independence is meant to provide a conservative system of policy making in order to comply with the problems that are forecasted upon the basis of assumptions. The Federal Authority for Nuclear Regulation (FANR) is a regulatory commission that was formed to be the regulatory body that governs the generation of nuclear power in the United Arab Emirates. It was established under the UAE nuclear law (9/2009) as an independent regulatory body tasked with the regulation of all nuclear activities in the United Arab Emirates. As an independent body, FANR was tasked with ensuring that the regulation of the nuclear sector is done in an effective and transparent manner to ensure its accountability to the people. Being independent, the regulatory body develops national nuclear regulations based on safety standards laid down by the International Atomic Energy Agency, ensuring that they are based on scientific and proven technologies. The role of FANR is to ensure that all corporations that undertake nuclear activities follow the laid-down procedures and objectives and take safety measures seriously, to ensure the safety of the workers and the general public while at the same time keeping the environment free from nuclear radiation.

  16. In silico detection of sequence variations modifying transcriptional regulation.

    Directory of Open Access Journals (Sweden)

    Malin C Andersen

    2008-01-01

    Full Text Available Identification of functional genetic variation associated with increased susceptibility to complex diseases can elucidate genes and underlying biochemical mechanisms linked to disease onset and progression. For genes linked to genetic diseases, most identified causal mutations alter an encoded protein sequence. Technological advances for measuring RNA abundance suggest that a significant number of undiscovered causal mutations may alter the regulation of gene transcription. However, it remains a challenge to separate causal genetic variations from linked neutral variations. Here we present an in silico driven approach to identify possible genetic variation in regulatory sequences. The approach combines phylogenetic footprinting and transcription factor binding site prediction to identify variation in candidate cis-regulatory elements. The bioinformatics approach has been tested on a set of SNPs that are reported to have a regulatory function, as well as background SNPs. In the absence of additional information about an analyzed gene, the poor specificity of binding site prediction is prohibitive to its application. However, when additional data is available that can give guidance on which transcription factor is involved in the regulation of the gene, the in silico binding site prediction improves the selection of candidate regulatory polymorphisms for further analyses. The bioinformatics software generated for the analysis has been implemented as a Web-based application system entitled RAVEN (regulatory analysis of variation in enhancers). The RAVEN system is available at http://www.cisreg.ca for all researchers interested in the detection and characterization of regulatory sequence variation.

  17. In Silico Detection of Sequence Variations Modifying Transcriptional Regulation

    Science.gov (United States)

    Andersen, Malin C; Engström, Pär G; Lithwick, Stuart; Arenillas, David; Eriksson, Per; Lenhard, Boris; Wasserman, Wyeth W; Odeberg, Jacob

    2008-01-01

    Identification of functional genetic variation associated with increased susceptibility to complex diseases can elucidate genes and underlying biochemical mechanisms linked to disease onset and progression. For genes linked to genetic diseases, most identified causal mutations alter an encoded protein sequence. Technological advances for measuring RNA abundance suggest that a significant number of undiscovered causal mutations may alter the regulation of gene transcription. However, it remains a challenge to separate causal genetic variations from linked neutral variations. Here we present an in silico driven approach to identify possible genetic variation in regulatory sequences. The approach combines phylogenetic footprinting and transcription factor binding site prediction to identify variation in candidate cis-regulatory elements. The bioinformatics approach has been tested on a set of SNPs that are reported to have a regulatory function, as well as background SNPs. In the absence of additional information about an analyzed gene, the poor specificity of binding site prediction is prohibitive to its application. However, when additional data is available that can give guidance on which transcription factor is involved in the regulation of the gene, the in silico binding site prediction improves the selection of candidate regulatory polymorphisms for further analyses. The bioinformatics software generated for the analysis has been implemented as a Web-based application system entitled RAVEN (regulatory analysis of variation in enhancers). The RAVEN system is available at http://www.cisreg.ca for all researchers interested in the detection and characterization of regulatory sequence variation. PMID:18208319
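
    The binding-site prediction step described in this abstract can be illustrated with a small position weight matrix (PWM) comparison of two alleles. This is only a sketch under stated assumptions: it is not the RAVEN implementation, the function names, the pseudocount and the uniform background are hypothetical choices, and the motif counts in the usage note are invented for illustration.

```python
import math

BASES = "ACGT"

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into a log2-odds matrix.

    `counts` is a list of dicts (one per motif position) mapping each base
    to its observed count. The pseudocount and uniform background are
    illustrative assumptions.
    """
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2((column[b] + pseudocount) / total / background)
            for b in BASES
        })
    return pwm

def best_score(seq, pwm):
    """Best PWM score over all forward-strand windows of `seq`.

    Assumes len(seq) >= len(pwm) and that seq contains only A, C, G, T.
    """
    width = len(pwm)
    return max(
        sum(pwm[k][seq[i + k]] for k in range(width))
        for i in range(len(seq) - width + 1)
    )

def allele_score_change(ref_seq, alt_seq, pwm):
    """Score difference (alternate minus reference) for one candidate SNP."""
    return best_score(alt_seq, pwm) - best_score(ref_seq, pwm)
```

    For a toy motif with consensus TGAC, calling allele_score_change("AATGACAA", "AATGCCAA", pwm) returns a negative value, meaning the alternate allele weakens the predicted site. In practice such a score would be combined with conservation evidence, as the abstract notes that binding-site prediction alone has poor specificity.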

  18. Lanthanum-Based Metal-Organic Frameworks for Specific Detection of Sudan Virus RNA Conservative Sequences down to Single-Base Mismatch.

    Science.gov (United States)

    Yang, Shui-Ping; Zhao, Wei; Hu, Pei-Pei; Wu, Ke-Yang; Jiang, Zhi-Hong; Bai, Li-Ping; Li, Min-Min; Chen, Jin-Xiang

    2017-12-18

    Reactions of La(NO3)3·6H2O with the polar, tritopic quaternized carboxylate ligands N-carboxymethyl-3,5-dicarboxylpyridinium bromide (H3CmdcpBr) and N-(4-carboxybenzyl)-3,5-dicarboxylpyridinium bromide (H3CbdcpBr) afford two water-stable metal-organic frameworks (MOFs) of {[La4(Cmdcp)6(H2O)9]}n (1, 3D) and {[La2(Cbdcp)3(H2O)10]}n (2, 2D). MOFs 1 and 2 absorb the carboxyfluorescein (FAM)-tagged probe DNA (P-DNA) and quench the fluorescence of FAM via a photoinduced electron transfer (PET) process. The nonemissive P-DNA@MOF hybrids thus formed in turn function as sensing platforms to distinguish conservative linear, single-stranded RNA sequences of Sudan virus with high selectivity and low detection limits of 112 and 67 pM, respectively (at a signal-to-noise ratio of 3). These hybrids also exhibit high specificity and discriminate down to single-base mismatch RNA sequences.

  19. Water conservation and allocation guideline for oilfield injection

    International Nuclear Information System (INIS)

    2006-01-01

    This paper was prepared as a guide for regulatory agencies and developers using non-saline water sources in enhanced oil recovery (EOR) schemes. A systems approach was used to achieve specific environmental outcomes that adhered to the Water Conservation and Allocation Policy for Oilfield Injection. The guide was applicable to licence renewal applications for projects operating and licensed to use non-saline water resources, as well as new licence applications for oilfield injection use. The guide provided recommended water conservation practices and application requirements, and outlined regulatory procedures and steps for obtaining a Water Act licence. The guideline was prepared to eliminate the use of non-saline water in EOR projects where feasible alternatives existed, as well as to identify areas with water shortages and reduce the use of non-saline water. The guide included monitoring and reporting requirements to improve the evaluation of water use practices and outlined current initiatives to address water conservation and research. It was concluded that outcomes from the program will include reliable quality water supplies for a sustainable economy, healthy aquatic ecosystems, and safe, secure drinking water supplies for Albertans. 3 tabs., 5 figs

  20. Comparative Evolution of Morphological Regulatory Functions in Candida Species

    Science.gov (United States)

    Lackey, Erika; Vipulanandan, Geethanjali; Childers, Delma S.

    2013-01-01

    Morphological transitions play an important role in virulence and virulence-related processes in a wide variety of pathogenic fungi, including the most commonly isolated human fungal pathogen Candida albicans. While environmental signals, transcriptional regulators, and target genes associated with C. albicans morphogenesis are well-characterized, considerably little is known about morphological regulatory mechanisms and the extent to which they are evolutionarily conserved in less pathogenic and less filamentous non-albicans Candida species (NACS). We have identified specific optimal filament-inducing conditions for three NACS (C. tropicalis, C. parapsilosis, and C. guilliermondii), which are very limited, suggesting that these species may be adapted for niche-specific filamentation in the host. Only a subset of evolutionarily conserved C. albicans filament-specific target genes were induced upon filamentation in C. tropicalis, C. parapsilosis, and C. guilliermondii. One of the genes showing conserved expression was UME6, a key filament-specific regulator of C. albicans hyphal development. Constitutive high-level expression of UME6 was sufficient to drive increased filamentation as well as biofilm formation and partly restore conserved filament-specific gene expression in both C. tropicalis and C. parapsilosis, suggesting that evolutionary differences in filamentation ability among pathogenic Candida species may be partially attributed to alterations in the expression level of a conserved filamentous growth machinery. In contrast to UME6, NRG1, an important repressor of C. albicans filamentation, showed only a partly conserved role in controlling NACS filamentation. Overall, our results suggest that C. albicans morphological regulatory functions are partially conserved in NACS and have evolved to respond to more specific sets of host environmental cues. PMID:23913541

  1. Comparative anatomy of the human APRT gene and enzyme: nucleotide sequence divergence and conservation of a nonrandom CpG dinucleotide arrangement

    International Nuclear Information System (INIS)

    Broderick, T.P.; Schaff, D.A.; Bertino, A.M.; Dush, M.K.; Tischfield, J.A.; Stambrook, P.J.

    1987-01-01

    The functional human adenine phosphoribosyltransferase (APRT) gene is <2.6 kilobases in length and contains five exons. The amino acid sequences of APRTs have been highly conserved throughout evolution. The human enzyme is 82%, 90%, and 40% identical to the mouse, hamster, and Escherichia coli enzymes, respectively. The promoter region of the human APRT gene, like that of several other housekeeping genes, lacks TATA and CCAAT boxes but contains five GC boxes that are potential binding sites for the Sp1 transcription factor. The distal three, however, are dispensable for gene expression. Comparison between human and mouse APRT gene nucleotide sequences reveals a high degree of homology within protein coding regions but an absence of significant homology in 5' flanking, 3' untranslated, and intron sequences, except for similarly positioned GC boxes in the promoter region and a 26-base-pair region in intron 3. This 26-base-pair sequence is 92% identical with a similarly positioned sequence in the mouse gene and is also found in intron 3 of the hamster gene, suggesting that its retention may be a consequence of stringent selection. The positions of all introns have been precisely retained in the human and both rodent genes. Retention of an elevated CpG dinucleotide content, despite loss of sequence homology, suggests that there may be selection for CpG dinucleotides in these regions and that their maintenance may be important for APRT gene function
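
    The "elevated CpG dinucleotide content" mentioned in this abstract is commonly quantified as an observed/expected CpG ratio. The sketch below uses the widely used formula (number of CpG times sequence length, divided by the product of the C and G counts); the function name is hypothetical and the abstract does not state which measure the authors used.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio of a DNA sequence.

    Expected CpG count under independence is (#C * #G) / length, so the
    ratio is (#CpG * length) / (#C * #G). Returns 0.0 when the sequence
    has no C or no G, where the ratio is undefined.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c_count == 0 or g_count == 0:
        return 0.0
    return cpg_count * len(seq) / (c_count * g_count)
```

    A ratio near 1 means CpG occurs about as often as base composition predicts; bulk mammalian DNA is typically well below 1, so regions that retain a high ratio stand out.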

  2. Inferring the conservative causal core of gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Emmert-Streib Frank

    2010-09-01

    Full Text Available Abstract Background: Inferring gene regulatory networks from large-scale expression data is an important problem that has received much attention in recent years. These networks have the potential to provide insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. Results: In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it also has a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. Conclusions: For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.

  3. Inferring the conservative causal core of gene regulatory networks.

    Science.gov (United States)

    Altay, Gökmen; Emmert-Streib, Frank

    2010-09-28

    Inferring gene regulatory networks from large-scale expression data is an important problem that has received much attention in recent years. These networks have the potential to provide insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it also has a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.
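
    The phrase "mutual information values in conjunction with a maximization step" can be read as: for each gene, keep only the edge to the partner with which it shares the most mutual information. The sketch below illustrates that reading only. It is not the published C3NET code; the function name is hypothetical, the input is assumed to be a precomputed symmetric mutual-information matrix, and the fixed threshold is a stand-in for whatever significance filtering the authors apply.

```python
import numpy as np

def max_mi_core(mi, threshold=0.0):
    """Keep, for each gene, only its strongest mutual-information partner.

    `mi` is a symmetric (n x n) matrix of pairwise mutual-information
    estimates. Returns a symmetric boolean adjacency matrix. The
    `threshold` cut-off is an illustrative assumption.
    """
    mi = np.array(mi, dtype=float)       # work on a copy
    np.fill_diagonal(mi, -np.inf)        # a gene cannot pair with itself
    n = mi.shape[0]
    adjacency = np.zeros((n, n), dtype=bool)
    for i in range(n):
        j = int(np.argmax(mi[i]))        # maximization step for gene i
        if mi[i, j] > threshold:
            adjacency[i, j] = adjacency[j, i] = True
    return adjacency
```

    Because every gene contributes at most one edge, the resulting network has at most n edges, which is consistent with the "conservative core" idea in the title.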

  4. The highly conserved codon following the slippery sequence supports -1 frameshift efficiency at the HIV-1 frameshift site.

    Directory of Open Access Journals (Sweden)

    Suneeth F Mathew

    Full Text Available HIV-1 utilises -1 programmed ribosomal frameshifting to translate structural and enzymatic domains in a defined proportion required for replication. A slippery sequence, U UUU UUA, and a stem-loop are well-defined RNA features modulating -1 frameshifting in HIV-1. The GGG glycine codon immediately following the slippery sequence (the 'intercodon') contributes structurally to the start of the stem-loop but has no defined role in current models of the frameshift mechanism, as slippage is inferred to occur before the intercodon has reached the ribosomal decoding site. This GGG codon is highly conserved in natural isolates of HIV. When the natural intercodon was replaced with a stop codon, two different decoding molecules (eRF1 protein or a cognate suppressor tRNA) were able to access and decode the intercodon prior to -1 frameshifting. This implies significant slippage occurs when the intercodon is in the (perhaps distorted) ribosomal A site. We accommodate the influence of the intercodon in a model of frame maintenance versus frameshifting in HIV-1.

  5. Violation of an evolutionarily conserved immunoglobulin diversity gene sequence preference promotes production of dsDNA-specific IgG antibodies.

    Directory of Open Access Journals (Sweden)

    Aaron Silva-Sanchez

    Full Text Available Variability in the developing antibody repertoire is focused on the third complementarity determining region of the H chain (CDR-H3), which lies at the center of the antigen binding site where it often plays a decisive role in antigen binding. The power of VDJ recombination and N nucleotide addition has led to the common conception that the sequence of CDR-H3 is unrestricted in its variability and random in its composition. Under this view, the immune response is solely controlled by somatic positive and negative clonal selection mechanisms that act on individual B cells to promote production of protective antibodies and prevent the production of self-reactive antibodies. This concept of a repertoire of random antigen binding sites is inconsistent with the observation that diversity (DH) gene segment sequence content by reading frame (RF) is evolutionarily conserved, creating biases in the prevalence and distribution of individual amino acids in CDR-H3. For example, arginine, which is often found in the CDR-H3 of dsDNA binding autoantibodies, is under-represented in the commonly used DH RFs rearranged by deletion, but is a frequent component of rarely used inverted RF1 (iRF1), which is rearranged by inversion. To determine the effect of altering this germline bias in DH gene segment sequence on autoantibody production, we generated mice that by genetic manipulation are forced to utilize an iRF1 sequence encoding two arginines. Over a one year period we collected serial serum samples from these unimmunized, specific pathogen-free mice and found that more than one-fifth of them contained elevated levels of dsDNA-binding IgG, but not IgM; whereas mice with a wild type DH sequence did not. Thus, germline bias against the use of arginine enriched DH sequence helps to reduce the likelihood of producing self-reactive antibodies.

  6. High-Throughput Sequencing Reveals Diverse Sets of Conserved, Nonconserved, and Species-Specific miRNAs in Jute

    Directory of Open Access Journals (Sweden)

    Md. Tariqul Islam

    2015-01-01

    Full Text Available MicroRNAs play a pivotal role in regulating a broad range of biological processes, acting by cleaving mRNAs or by translational repression. A group of plant microRNAs are evolutionarily conserved; however, others are expressed in a species-specific manner. Jute is an agroeconomically important fibre crop; nonetheless, no practical information is available for microRNAs in jute to date. In this study, Illumina sequencing revealed a total of 227 known microRNAs and 17 potential novel microRNA candidates in jute, of which 164 belong to 23 conserved families and the remaining 63 belong to 58 nonconserved families. Among a total of 81 identified microRNA families, 116 potential target genes were predicted for 39 families and 11 targets were predicted for 4 among the 17 identified novel microRNAs. To better understand the functions of microRNAs, target genes were analyzed by Gene Ontology and their pathways illustrated by KEGG pathway analyses. The presence of microRNAs identified in jute was validated by stem-loop RT-PCR followed by end point PCR and qPCR for 20 randomly selected known and novel microRNAs. This study exhaustively identifies microRNAs and their target genes in jute which will ultimately pave the way for understanding their role in this crop and other crops.

  7. Origin and spread of photosynthesis based upon conserved sequence features in key bacteriochlorophyll biosynthesis proteins.

    Science.gov (United States)

    Gupta, Radhey S

    2012-11-01

    The origin of photosynthesis and how this capability has spread to other bacterial phyla remain important unresolved questions. I describe here a number of conserved signature indels (CSIs) in key proteins involved in bacteriochlorophyll (Bchl) biosynthesis that provide important insights in these regards. The proteins BchL and BchX, which are essential for Bchl biosynthesis, are derived by gene duplication in a common ancestor of all phototrophs. More ancient gene duplication gave rise to the BchX-BchL proteins and the NifH protein of the nitrogenase complex. The sequence alignment of NifH-BchX-BchL proteins contains two CSIs that are uniquely shared by all NifH and BchX homologs, but not by any BchL homologs. These CSIs and phylogenetic analysis of NifH-BchX-BchL protein sequences strongly suggest that the BchX homologs are ancestral to BchL and that the Bchl-based anoxygenic photosynthesis originated prior to the chlorophyll (Chl)-based photosynthesis in cyanobacteria. Another CSI in the BchX-BchL sequence alignment that is uniquely shared by all BchX homologs and the BchL sequences from Heliobacteriaceae, but absent in all other BchL homologs, suggests that the BchL homologs from Heliobacteriaceae are primitive in comparison to all other photosynthetic lineages. Several other identified CSIs in the BchN homologs are commonly shared by all proteobacterial homologs and a clade consisting of the marine unicellular Cyanobacteria (Clade C). These CSIs, in conjunction with the results of phylogenetic analyses and pair-wise sequence similarity on the BchL, BchN, and BchB proteins, where the homologs from Clade C Cyanobacteria and Proteobacteria exhibited a close relationship, provide strong evidence that these two groups have incurred lateral gene transfers. Additionally, phylogenetic analyses and several CSIs in the BchL-N-B proteins that are uniquely shared by all Chlorobi and Chloroflexi homologs provide evidence that the genes for these proteins have also been

  8. Facilitating genome navigation : survey sequencing and dense radiation-hybrid gene mapping

    NARCIS (Netherlands)

    Hitte, C; Madeoy, J; Kirkness, EF; Priat, C; Lorentzen, TD; Senger, F; Thomas, D; Derrien, T; Ramirez, C; Scott, C; Evanno, G; Pullar, B; Cadieu, E; Oza; Lourgant, K; Jaffe, DB; Tacher, S; Dreano, S; Berkova, N; Andre, C; Deloukas, P; Fraser, C; Lindblad-Toh, K; Ostrander, EA; Galibert, F

    Accurate and comprehensive sequence coverage for large genomes has been restricted to only a few species of specific interest. Lower sequence coverage (survey sequencing) of related species can yield a wealth of information about gene content and putative regulatory elements. But survey sequences

  9. A Conserved MicroRNA Regulatory Circuit Is Differentially Controlled during Limb/Appendage Regeneration.

    Directory of Open Access Journals (Sweden)

    Benjamin L King

    Full Text Available Although regenerative capacity is evident throughout the animal kingdom, it is not equally distributed throughout evolution. For instance, complex limb/appendage regeneration is muted in mammals but enhanced in amphibians and teleosts. The defining characteristic of limb/appendage regenerative systems is the formation of a dedifferentiated tissue, termed blastema, which serves as the progenitor reservoir for regenerating tissues. In order to identify a genetic signature that accompanies blastema formation, we employ next-generation sequencing to identify shared, differentially regulated mRNAs and noncoding RNAs in three different, highly regenerative animal systems: zebrafish caudal fins, bichir pectoral fins and axolotl forelimbs. These studies identified a core group of 5 microRNAs (miRNAs) that were commonly upregulated and 5 miRNAs that were commonly downregulated, as well as 4 novel tRNA fragments with sequences conserved with humans. To understand the potential function of these miRNAs, we built a network of 1,550 commonly differentially expressed mRNAs that had functional relationships to 11 orthologous blastema-associated genes. As miR-21 was the most highly upregulated and most highly expressed miRNA in all three models, we validated the expression of known target genes, including the tumor suppressor, pdcd4, and TGFβ receptor subunit, tgfbr2, and novel putative target genes such as the anti-apoptotic factor, bcl2l13, Choline kinase alpha, chka, and the regulator of G-protein signaling, rgs5. Our extensive analysis of RNA-seq transcriptome profiling studies in three regenerative animal models that diverged in evolution ~420 million years ago reveals a common miRNA-regulated genetic network of blastema genes. These comparative studies extend our current understanding of limb/appendage regeneration by identifying previously unassociated blastema genes and the extensive regulation by miRNAs, which could serve as a foundation for future

  10. The miRNAome of globe artichoke: conserved and novel micro RNAs and target analysis

    Directory of Open Access Journals (Sweden)

    De Paola Domenico

    2012-01-01

    Full Text Available Abstract Background Plant microRNAs (miRNAs) are involved in post-transcriptional regulatory mechanisms of several processes, including the response to biotic and abiotic stress, often contributing to the adaptive response of the plant to adverse conditions. In addition to conserved miRNAs, found in a wide range of plant species, a number of novel species-specific miRNAs displaying lower levels of expression can be found. Due to low abundance, non-conserved miRNAs are difficult to identify and isolate using conventional approaches. Conversely, deep sequencing of small RNA (sRNA) libraries can detect even poorly expressed miRNAs. No miRNAs from globe artichoke have been described to date. We analyzed the miRNAome from artichoke by deep sequencing four sRNA libraries obtained from NaCl stressed and control leaves and roots. Results Conserved and novel miRNAs were discovered using accepted criteria. The expression level of selected miRNAs was monitored by quantitative real-time PCR. Targets were predicted and validated for their cleavage site. A total of 122 artichoke miRNAs were identified, 98 (25 families) of which were conserved with other plant species, and 24 were novel. Some miRNAs were differentially expressed according to tissue or condition, the magnitude of variation after salt stress being more pronounced in roots. Target function was predicted by comparison to Arabidopsis proteins; the 43 targets (23 for novel miRNAs) identified included transcription factors and other genes, most of which are involved in the response to various stresses. An unusual cleaved transcript was detected for the miR393 target, transport inhibitor response 1. Conclusions The miRNAome from artichoke, including novel miRNAs, was unveiled, providing useful information on the expression in different organs and conditions. New target genes were identified. We suggest that the generation of secondary short-interfering RNAs from miR393 target can be a general rule in the plant

  11. Structural Conservation Despite Huge Sequence Diversity Allows EPCR Binding by the PfEMP1 Family Implicated in Severe Childhood Malaria

    DEFF Research Database (Denmark)

    Lau, Clinton K.Y.; Turner, Louise; Jespersen, Jakob S.

    2015-01-01

    The PfEMP1 family of surface proteins is central for Plasmodium falciparum virulence and must retain the ability to bind to host receptors while also diversifying to aid immune evasion. The interaction between CIDRα1 domains of PfEMP1 and endothelial protein C receptor (EPCR) is associated with severe childhood malaria. We combine crystal structures of CIDRα1:EPCR complexes with analysis of 885 CIDRα1 sequences, showing that the EPCR-binding surfaces of CIDRα1 domains are conserved in shape and bonding potential, despite dramatic sequence diversity. Additionally, these domains mimic features of the natural EPCR ligand and can block this ligand interaction. Using peptides corresponding to the EPCR-binding region, antibodies can be purified from individuals in malaria-endemic regions that block EPCR binding of diverse CIDRα1 variants. This highlights the extent to which such a surface protein family...

  12. Relative Stabilities of Conserved and Non-Conserved Structures in the OB-Fold Superfamily

    Directory of Open Access Journals (Sweden)

    Andrei T. Alexandrescu

    2009-05-01

    Full Text Available The OB-fold is a diverse structure superfamily based on a β-barrel motif that is often supplemented with additional non-conserved secondary structures. Previous deletion mutagenesis and NMR hydrogen exchange studies of three OB-fold proteins showed that the structural stabilities of sites within the conserved β-barrels were larger than sites in non-conserved segments. In this work we examined a database of 80 representative domain structures currently classified as OB-folds, to establish the basis of this effect. Residue-specific values were obtained for the number of Cα-Cα distance contacts, sequence hydrophobicities, crystallographic B-factors, and theoretical B-factors calculated from a Gaussian Network Model. All four parameters point to a larger average flexibility for the non-conserved structures compared to the conserved β-barrels. The theoretical B-factors and contact densities show the highest sensitivity. Our results suggest a model of protein structure evolution in which novel structural features develop at the periphery of conserved motifs. Core residues are more resistant to structural changes during evolution since their substitution would disrupt a larger number of interactions. Similar factors are likely to account for the differences in stability to unfolding between conserved and non-conserved structures.
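
    As a rough illustration of one of the residue-specific measures named in the abstract above, the number of Cα-Cα distance contacts per residue can be counted as sketched below. The abstract does not state a distance cutoff or how coordinates were obtained, so the 8.0 Å default, the function name `contact_counts`, and the toy coordinates are all assumptions made for illustration only.

```python
import math

def contact_counts(ca_coords, cutoff=8.0):
    """Count, for each residue, how many other C-alpha atoms lie within `cutoff`.

    ca_coords: list of (x, y, z) tuples, one per residue (hypothetical input).
    cutoff: contact distance in angstroms; 8.0 is an assumed value, not taken
            from the abstract.
    """
    counts = [0] * len(ca_coords)
    for i in range(len(ca_coords)):
        for j in range(i + 1, len(ca_coords)):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                counts[i] += 1
                counts[j] += 1
    return counts

# Toy coordinates: three residues close together and one far away.
toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
counts = contact_counts(toy)
```

    Under this sketch, residues with a higher count sit in a denser environment, which is the sense in which core residues would be expected to have more contacts than peripheral ones.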

  13. Current status of herbal product: Regulatory overview

    Science.gov (United States)

    Sharma, Sanjay

    2015-01-01

    A review of the regulatory status of herbal drugs/products was done for a few countries in Asia, Africa, America, Europe, and Australia, to understand the various categories under which the trade of herbal products is permitted and their premarketing requirements. A critical assessment was done to identify the hindrances in the process of harmonization of herbal products. It has been found that there is a lack of harmonization in the regulatory requirements of herbal products internationally, besides the issues of availability of herbs and their conservation. These are hindering the international trade and growth of the herbal products segment. PMID:26681886

  14. Inhibition of Hepatitis C Virus in Mice by a Small Interfering RNA Targeting a Highly Conserved Sequence in Viral IRES Pseudoknot.

    Directory of Open Access Journals (Sweden)

    Jae-Su Moon

    Full Text Available The hepatitis C virus (HCV) internal ribosome entry site (IRES) that directs cap-independent viral translation is a primary target for small interfering RNA (siRNA)-based HCV antiviral therapy. However, identification of potent siRNAs against HCV IRES by bioinformatics-based siRNA design is a challenging task given the complexity of HCV IRES secondary and tertiary structures and association with multiple proteins, which can also dynamically change the structure of this cis-acting RNA element. In this work, we utilized an siRNA tiling approach whereby siRNAs were tiled with overlapping sequences that were shifted by one or two nucleotides over the HCV IRES stem-loop structures III and IV spanning nucleotides (nts) 277-343. Based on their antiviral activity, we mapped a druggable region (nts 313-343) where the targets of potent siRNAs were enriched. siIE22, which showed the greatest anti-HCV potency, targeted a highly conserved sequence across diverse HCV genotypes, locating within the IRES subdomain IIIf involved in pseudoknot formation. Stepwise target shifting toward the 5' or 3' direction by 1 or 2 nucleotides reduced the antiviral potency of siIE22, demonstrating the importance of siRNA accessibility to this highly structured and sequence-conserved region of HCV IRES for RNA interference. Nanoparticle-mediated systemic delivery of the stability-improved siIE22 derivative gs_PS1 siIE22, which contains a single phosphorothioate linkage on the guide strand, reduced the serum HCV genome titer by more than 4 log10 in a xenograft mouse model for HCV replication without generation of resistant variants. Our results provide a strategy for identifying potent siRNA species against a highly structured RNA target and offer a potential pan-HCV genotypic siRNA therapy that might be beneficial for patients resistant to current treatment regimens.
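
    The tiling step described in the abstract above (overlapping target windows shifted by one or two nucleotides across nts 277-343) can be sketched as follows. This is only an illustration: the abstract does not give the siRNA target length, so the 19-nt window is an assumption, the function name `tile_targets` is hypothetical, and the sequence is a placeholder rather than the actual HCV IRES sequence.

```python
def tile_targets(region, start_pos, window, step):
    """Yield (first_nt, last_nt, target_sequence) for each tiled window.

    region: RNA sequence of the region to tile.
    start_pos: genome coordinate of the first nucleotide in `region`.
    window: target length in nt (assumed value, not given in the abstract).
    step: shift between consecutive targets (1 or 2 nt in the abstract).
    """
    for offset in range(0, len(region) - window + 1, step):
        first = start_pos + offset
        yield first, first + window - 1, region[offset:offset + window]

# Placeholder standing in for IRES nts 277-343 (67 nt); NOT the real sequence.
REGION_START, REGION_END = 277, 343
placeholder = ("ACGU" * 17)[: REGION_END - REGION_START + 1]

# Windows shifted by 2 nt; use step=1 for single-nucleotide shifts.
tiles = list(tile_targets(placeholder, REGION_START, window=19, step=2))
```

    Each tuple records the coordinates a candidate siRNA would target, so potency measurements could later be mapped back onto positions such as the nts 313-343 region mentioned in the abstract.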

  15. Comparative analysis of function and interaction of transcription factors in nematodes: Extensive conservation of orthology coupled to rapid sequence evolution

    Directory of Open Access Journals (Sweden)

    Singh Rama S

    2008-08-01

    Full Text Available Abstract Background Much of the morphological diversity in eukaryotes results from differential regulation of gene expression in which transcription factors (TFs) play a central role. The nematode Caenorhabditis elegans is an established model organism for the study of the roles of TFs in controlling the spatiotemporal pattern of gene expression. Using the fully sequenced genomes of three Caenorhabditid nematode species as well as genome information from additional more distantly related organisms (fruit fly, mouse, and human), we sought to identify orthologous TFs and characterize their patterns of evolution. Results We identified 988 TF genes in C. elegans, and inferred corresponding sets in C. briggsae and C. remanei, containing 995 and 1093 TF genes, respectively. Analysis of the three gene sets revealed 652 3-way reciprocal 'best hit' orthologs (nematode TF set), approximately half of which are zinc finger (ZF-C2H2 and ZF-C4/NHR types) and HOX family members. Examination of the TF genes in C. elegans and C. briggsae identified the presence of significant tandem clustering on chromosome V, the majority of which belong to the ZF-C4/NHR family. We also found evidence for lineage-specific duplications and rapid evolution of many of the TF genes in the two species. A search of the TFs conserved among nematodes in Drosophila melanogaster, Mus musculus and Homo sapiens revealed 150 reciprocal orthologs, many of which are associated with important biological processes and human diseases. Finally, a comparison of the sequence, gene interactions and function indicates that nematode TFs conserved across phyla exhibit significantly more interactions and are enriched in genes with annotated mutant phenotypes compared to those that lack orthologs in other species. Conclusion Our study represents the first comprehensive genome-wide analysis of TFs across three nematode species and other organisms. The findings indicate substantial conservation of transcription
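
    The idea of a 3-way reciprocal 'best hit' ortholog set mentioned in the abstract above can be sketched as below. This is a minimal sketch under stated assumptions: best hits are taken as already computed (for example from pairwise similarity searches, which the abstract does not detail), the species labels and gene identifiers are invented, and the function names are hypothetical.

```python
def reciprocal_pairs(best_ab, best_ba):
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}

def three_way_rbh(best, sp1, sp2, sp3):
    """Return gene triples that are reciprocal best hits in all three pairings.

    best[(X, Y)] maps each gene of species X to its best hit in species Y.
    """
    pairs_12 = reciprocal_pairs(best[(sp1, sp2)], best[(sp2, sp1)])
    pairs_23 = dict(reciprocal_pairs(best[(sp2, sp3)], best[(sp3, sp2)]))
    pairs_13 = reciprocal_pairs(best[(sp1, sp3)], best[(sp3, sp1)])
    return sorted(
        (a, b, pairs_23[b])
        for a, b in pairs_12
        if b in pairs_23 and (a, pairs_23[b]) in pairs_13
    )

# Invented best-hit tables for three species labelled ele, bri and rem.
best_hits = {
    ("ele", "bri"): {"e1": "b1", "e2": "b2"},
    ("bri", "ele"): {"b1": "e1", "b2": "e2"},
    ("bri", "rem"): {"b1": "r1", "b2": "r2"},
    ("rem", "bri"): {"r1": "b1", "r2": "b2"},
    ("ele", "rem"): {"e1": "r1", "e2": "r3"},
    ("rem", "ele"): {"r1": "e1", "r3": "e2"},
}
triples = three_way_rbh(best_hits, "ele", "bri", "rem")
```

    In the toy data only the first gene family is consistent across all three pairings; the second fails because the best hits between the first and third species point at a different gene, which is the kind of case a 3-way requirement filters out.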

  16. Decoupling mechanisms-paying for conservation

    Energy Technology Data Exchange (ETDEWEB)

    Cross, P.S.

    1993-07-15

    In 1988, the National Association of Regulatory Utility Commissioners issued a policy statement that said "ratemaking practices should align utilities' pursuit of profit with least-cost planning." This policy coincided with then-current thinking at a number of state commissions about the much-touted goal of encouraging utilities to invest in conservation, or demand-side management (DSM) programs, rather than in generating resources to meet system load requirements. Besides utility concerns about recovering conservation program investments, regulators also noticed a built-in "disincentive" to investment in the traditional ratemaking format: If profit is tied to sales, then utilities will always shy away from aggressively promoting conservation. Or so the thinking went. "Decoupling mechanisms" were born to remove this disincentive. A number of states have implemented these mechanisms, while several others are investigating the issue. One chief drawback of the mechanisms is that if sales go down, rates go up to cover the shortfall. (Of course, rates go down if sales exceed forecasted levels.) A major problem has been that rate increases have occurred at exactly the wrong time, during economic slowdowns when utilities are struggling to retain price-sensitive customers and residential ratepayers are least likely to bear with quiet stoicism the burden placed on family budgets. Decoupling is seen by some as a step backwards in the move to competitive regulatory reforms that seek to encourage utilities to behave like free-market companies. Indeed, the newest decoupling mechanisms face serious challenge.
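
    The rate effect described in the abstract above (rates rise when sales fall short of forecast and fall when sales exceed it) can be illustrated with a simple revenue true-up. This is a hypothetical sketch, not the mechanism of any particular state: the formula, the function name `trued_up_rate`, and all numbers are assumptions chosen only to show the direction of the adjustment.

```python
def trued_up_rate(base_rate, forecast_sales, actual_sales, next_forecast_sales):
    """Return next period's rate after a simple revenue true-up (assumed formula).

    The authorized revenue is the base rate times forecast sales. Any shortfall
    (or surplus) against revenue actually collected is spread over next
    period's forecast sales as a surcharge (or credit).
    """
    authorized_revenue = base_rate * forecast_sales
    collected_revenue = base_rate * actual_sales
    shortfall = authorized_revenue - collected_revenue
    return base_rate + shortfall / next_forecast_sales

# Illustrative numbers only: rate in dollars per kWh, sales in kWh.
rate_after_low_sales = trued_up_rate(0.10, 1000, 900, 1000)
rate_after_high_sales = trued_up_rate(0.10, 1000, 1100, 1000)
```

    With these made-up inputs the rate moves above the base rate after a sales shortfall and below it after a surplus, which is the timing problem the abstract raises for economic slowdowns.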

  17. HLA-E regulatory and coding region variability and haplotypes in a Brazilian population sample.

    Science.gov (United States)

    Ramalho, Jaqueline; Veiga-Castelli, Luciana C; Donadi, Eduardo A; Mendes-Junior, Celso T; Castelli, Erick C

    2017-11-01

    The HLA-E gene is characterized by low but wide expression in different tissues. HLA-E is considered a conserved gene, being one of the least polymorphic class I HLA genes. The HLA-E molecule interacts with Natural Killer cell receptors and T lymphocyte receptors, and might activate or inhibit immune responses depending on the peptide associated with HLA-E and on the receptors with which HLA-E interacts. Variable sites within the HLA-E regulatory and coding segments may influence the gene function by modifying its expression pattern or encoded molecule, thus influencing its interaction with receptors and the peptide. Here we propose an approach to evaluate the gene structure, haplotype pattern and the complete HLA-E variability, including regulatory (promoter and 3'UTR) and coding segments (with introns), by using massively parallel sequencing. We investigated the variability of 420 samples from a very admixed population such as Brazilians by using this approach. Considering a segment of about 7kb, 63 variable sites were detected, arranged into 75 extended haplotypes. We detected 37 different promoter sequences (but few frequent ones), 27 different coding sequences (15 representing new HLA-E alleles) and 12 haplotypes at the 3'UTR segment, two of them presenting a summed frequency of 90%. Despite the number of coding alleles, they encode mainly two different full-length molecules, known as E*01:01 and E*01:03, which correspond to about 90% of all. In addition, differently from what has been previously observed for other non-classical HLA genes, the relationship among the HLA-E promoter, coding and 3'UTR haplotypes is not straightforward because the same promoter and 3'UTR haplotypes were many times associated with different HLA-E coding haplotypes. This data reinforces the presence of only two main full-length HLA-E molecules encoded by the many HLA-E alleles detected in our population sample. In addition, this data does indicate that the distal HLA-E promoter is by

  18. Simple connection between conservation laws in the Korteweg--de Vries and sine-Gordon systems

    International Nuclear Information System (INIS)

    Chodos, A.

    1980-01-01

    An infinite sequence of conserved quantities follows from the Lax representation in both the Korteweg--de Vries and sine-Gordon systems. We show that these two sequences are related by a simple substitution. In an appendix, two different methods of deriving conservation laws from the Lax representation are presented

  19. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  20. PipY, a Member of the Conserved COG0325 Family of PLP-Binding Proteins, Expands the Cyanobacterial Nitrogen Regulatory Network

    Directory of Open Access Journals (Sweden)

    José I. Labella

    2017-07-01

    Full Text Available Synechococcus elongatus PCC 7942 is a paradigmatic model organism for nitrogen regulation in cyanobacteria. Expression of genes involved in nitrogen assimilation is positively regulated by the 2-oxoglutarate receptor and global transcriptional regulator NtcA. Maximal activation requires the subsequent binding of the co-activator PipX. PII, a protein found in all three domains of life as an integrator of signals of the nitrogen and carbon balance, binds to PipX to counteract NtcA activity at low 2-oxoglutarate levels. PII-PipX complexes can also bind to the transcriptional regulator PlmA, whose regulon remains unknown. Here we expand the nitrogen regulatory network to PipY, encoded by the bicistronic operon pipXY in S. elongatus. Work with PipY, the cyanobacterial member of the widespread family of COG0325 proteins, confirms the conserved roles in vitamin B6 and amino/keto acid homeostasis and reveals new PLP-related phenotypes, including sensitivity to antibiotics targeting essential PLP-holoenzymes or synthetic lethality with cysK. In addition, the related phenotypes of pipY and pipX mutants are consistent with genetic interactions in the contexts of survival to PLP-targeting antibiotics and transcriptional regulation. We also showed that PipY overexpression increased the length of S. elongatus cells. Taken together, our results support a universal regulatory role for COG0325 proteins, paving the way to a better understanding of these proteins and of their connections with other biological processes.

  1. Identification and characterization of putative conserved IAM ...

    African Journals Online (AJOL)

    Available putative AMI sequences from a wide array of monocot and dicot plants were identified and the phylogenetic tree was constructed and analyzed. We identified in this tree, a clade that contained sequences from species across the plant kingdom suggesting that AMI is conserved and may have a primary role in plant ...

  2. In silico modeling of epigenetic-induced changes in photoreceptor cis-regulatory elements.

    Science.gov (United States)

    Hossain, Reafa A; Dunham, Nicholas R; Enke, Raymond A; Berndsen, Christopher E

    2018-01-01

    DNA methylation is a well-characterized epigenetic repressor of mRNA transcription in many plant and vertebrate systems. However, the mechanism of this repression is not fully understood. The process of transcription is controlled by proteins that regulate recruitment and activity of RNA polymerase by binding to specific cis-regulatory sequences. Cone-rod homeobox (CRX) is a well-characterized mammalian transcription factor that controls photoreceptor cell-specific gene expression. Although much is known about the functions and DNA binding specificity of CRX, little is known about how DNA methylation modulates CRX binding affinity to genomic cis-regulatory elements. We used bisulfite pyrosequencing of human ocular tissues to measure DNA methylation levels of the regulatory regions of RHO, PDE6B, PAX6, and LINE1 retrotransposon repeats. To describe the molecular mechanism of repression, we used molecular modeling to illustrate the effect of DNA methylation on human RHO regulatory sequences. In this study, we demonstrate an inverse correlation between DNA methylation in regulatory regions adjacent to the human RHO and PDE6B genes and their subsequent transcription in human ocular tissues. Docking of CRX to the DNA models shows that CRX interacts with the grooves of these sequences, suggesting changes in groove structure could regulate binding. Molecular dynamics simulations of the RHO promoter and enhancer regions show changes in the flexibility and groove width upon epigenetic modification. Models also demonstrate changes in the local dynamics of CRX binding sites within RHO regulatory sequences which may account for the repression of CRX-dependent transcription. Collectively, these data demonstrate epigenetic regulation of CRX binding sites in human retinal tissue and provide insight into the mechanism of this mode of epigenetic regulation to be tested in future experiments.

  3. H-2RIIBP, a member of the nuclear hormone receptor superfamily that binds to both the regulatory element of major histocompatibility class I genes and the estrogen response element.

    OpenAIRE

    Hamada, K; Gleason, S L; Levi, B Z; Hirschfeld, S; Appella, E; Ozato, K

    1989-01-01

    Transcription of major histocompatibility complex (MHC) class I genes is regulated by the conserved MHC class I regulatory element (CRE). The CRE has two factor-binding sites, region I and region II, both of which elicit enhancer function. By screening a mouse lambda gt 11 library with the CRE as a probe, we isolated a cDNA clone that encodes a protein capable of binding to region II of the CRE. This protein, H-2RIIBP (H-2 region II binding protein), bound to the native region II sequence, bu...

  4. The putative Leishmania telomerase RNA (LeishTER) undergoes trans-splicing and contains a conserved template sequence.

    Directory of Open Access Journals (Sweden)

    Elton J R Vasconcelos

    Full Text Available Telomerase RNAs (TERs) are highly divergent between species, varying in size and sequence composition. Here, we identify a candidate for the telomerase RNA component of the Leishmania genus, which includes species that cause leishmaniasis, a neglected tropical disease. Combining a thorough computational screening with RNA-seq evidence, we mapped a non-coding RNA gene localized in a syntenic locus on chromosome 25 of five Leishmania species that shares partial synteny with both the Trypanosoma brucei TER locus and a putative TER candidate-containing locus of Crithidia fasciculata. Using target-driven molecular biology approaches, we detected a ∼2,100 nt transcript (LeishTER) that contains a 5' spliced leader (SL) cap, a putative 3' polyA tail and a predicted C/D box snoRNA domain. LeishTER is expressed at similar levels in the logarithmic and stationary growth phases of promastigote forms. A 5'SL capped LeishTER co-immunoprecipitated and co-localized with the telomerase protein component (TERT) in a cell cycle-dependent manner. Prediction of its secondary structure strongly suggests the existence of a bona fide single-stranded template sequence and a conserved C[U/C]GUCA motif-containing helix II, representing the template boundary element. This study paves the way for further investigations on the biogenesis of parasite TERT ribonucleoproteins (RNPs) and its role in parasite telomere biology.
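
    The C[U/C]GUCA notation in the abstract above denotes a six-nucleotide motif whose second position is either U or C. A short sketch of how such a motif could be located in a transcript follows; the function name `find_motif` and the example sequence are invented for illustration and are not taken from the LeishTER sequence.

```python
import re

# C[U/C]GUCA written as a regular expression: second position is U or C.
MOTIF = re.compile(r"C[UC]GUCA")

def find_motif(sequence):
    """Return (1-based position, matched text) for each motif occurrence.

    Accepts RNA or DNA letters in any case; T is converted to U first.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(rna)]

# Invented test sequence containing both variants of the motif.
hits = find_motif("aaCUGUCAggCCGTCAuu")
```

    A pattern match of this kind only flags candidate sites; whether a site lies in a helix acting as a template boundary element depends on the secondary-structure prediction described in the abstract.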

  5. Distinct forms of the β subunit of GTP-binding regulatory proteins identified by molecular cloning

    International Nuclear Information System (INIS)

    Fong, H.K.W.; Amatruda, T.T. III; Birren, B.W.; Simon, M.I.

    1987-01-01

    Two distinct β subunits of guanine nucleotide-binding regulatory proteins have been identified by cDNA cloning and are referred to as β1 and β2 subunits. The bovine transducin β subunit (β1) has been cloned previously. The authors have now isolated and analyzed cDNA clones that encode the β2 subunit from bovine adrenal, bovine brain, and a human myeloid leukemia cell line, HL-60. The 340-residue M_r 37,329 β2 protein is 90% identical with β1 in predicted amino acid sequence, and it is also organized as a series of repetitive homologous segments. The major mRNA that encodes the bovine β2 subunit is 1.7 kilobases in length. It is expressed at lower levels than β1 subunit mRNA in all tissues examined. The β1 and β2 messages are expressed in cloned human cell lines. Hybridization of cDNA probes to bovine DNA showed that β1 and β2 are encoded by separate genes. The amino acid sequences for the bovine and human β2 subunit are identical, as are the amino acid sequences for the bovine and human β1 subunit. This evolutionary conservation suggests that the two β subunits have different roles in the signal transduction process

  6. Evolutionary conservation of P-selectin glycoprotein ligand-1 primary structure and function

    Directory of Open Access Journals (Sweden)

    Schapira Marc

    2007-09-01

    Full Text Available Abstract Background P-selectin glycoprotein ligand-1 (PSGL-1) plays a critical role in recruiting leukocytes in inflammatory lesions by mediating leukocyte rolling on selectins. Core-2 O-glycosylation of an N-terminal threonine and sulfation of at least one tyrosine residue of PSGL-1 are required for L- and P-selectin binding. Little information is available on the intra- and inter-species evolution of PSGL-1 primary structure. In addition, the evolutionary conservation of the selectin binding site on PSGL-1 has not been previously examined in detail. Therefore, we performed multiple sequence alignment of PSGL-1 amino acid sequences of 14 mammals (human, chimpanzee, rhesus monkey, bovine, pig, rat, tree-shrew, bushbaby, mouse, bat, horse, cat, sheep and dog) and examined mammalian PSGL-1 interactions with human selectins. Results A signal peptide was predicted in each sequence and a propeptide cleavage site was found in 9/14 species. The PSGL-1 N-terminus is poorly conserved. However, each species exhibits at least one tyrosine sulfation site and, except in horse and dog, a T[D/E]PP[D/E] motif associated with the core-2 O-glycosylation of an N-terminal threonine. A mucin-like domain 250–280 amino acids long was disclosed in all studied species. It lies between the conserved N-terminal O-glycosylated threonine (Thr-57 in human) and the transmembrane domain, and contains a central region exhibiting a variable number of decameric repeats (DR). Interspecies and intraspecies polymorphisms were observed. Transmembrane and cytoplasmic domain sequences are well conserved. The moesin binding residues that serve as adaptor between PSGL-1 and Syk, and are involved in regulating PSGL-1-dependent rolling on P-selectin, are perfectly conserved in all analyzed mammalian sequences. Despite a poor conservation of PSGL-1 N-terminal sequence, CHO cells co-expressing human glycosyltransferases and human, bovine, pig or rat PSGL-1 efficiently rolled on human L- or P

  7. Model uncertainty from a regulatory point of view

    International Nuclear Information System (INIS)

    Abramson, L.R.

    1994-01-01

    This paper discusses model uncertainty in the larger context of knowledge and random uncertainty. It explores some regulatory implications of model uncertainty and argues that, from a regulator's perspective, a conservative approach must be taken. As a consequence of this perspective, averaging over model results is ruled out

  8. Genome-wide analysis of the regulatory function mediated by the small regulatory psm-mec RNA of methicillin-resistant Staphylococcus aureus.

    Science.gov (United States)

    Cheung, Gordon Y C; Villaruz, Amer E; Joo, Hwang-Soo; Duong, Anthony C; Yeh, Anthony J; Nguyen, Thuan H; Sturdevant, Daniel E; Queck, S Y; Otto, M

    2014-07-01

    Several methicillin resistance (SCCmec) clusters characteristic of hospital-associated methicillin-resistant Staphylococcus aureus (MRSA) strains harbor the psm-mec locus. In addition to encoding the cytolysin, phenol-soluble modulin (PSM)-mec, this locus has been attributed gene regulatory functions. Here we employed genome-wide transcriptional profiling to define the regulatory function of the psm-mec locus. The immune evasion factor protein A emerged as the primary conserved and strongly regulated target of psm-mec, an effect we show is mediated by the psm-mec RNA. Furthermore, the psm-mec locus exerted regulatory effects that were more moderate in extent. For example, expression of PSM-mec limited expression of mecA, thereby decreasing methicillin resistance. Our study shows that the psm-mec locus has a rare dual regulatory RNA and encoded cytolysin function. Furthermore, our findings reveal a specific mechanism underscoring the recently emerging concept that S. aureus strains balance pronounced virulence and high expression of antibiotic resistance. Published by Elsevier GmbH.

  9. Conservation genetics of Iberian raptors

    Directory of Open Access Journals (Sweden)

    Martinez–Cruz, B.

    2011-12-01

    Full Text Available In this paper I provide an overview of conservation genetics and describe the management actions in the wild that can benefit from conservation genetic studies. I describe the genetic risk factors for the survival of wild species, the consequences of loss of genetic diversity, inbreeding and outbreeding depression, and the use of genetic tools to delimit units of conservation. Then I introduce the most common applications of conservation genetics in the management of wild populations. In the second part of the paper I review the conservation genetic studies carried out on Iberian raptors. I introduce several studies on the Spanish imperial eagle, the bearded vulture, the black vulture and the red kite that were carried out using autosomal microsatellite markers and mitochondrial DNA (mtDNA) sequencing. I describe studies on the lesser kestrel and Egyptian vulture that additionally applied major histocompatibility complex (MHC) markers, with the purpose of incorporating the study of non-neutral variation. For every species I explain how these studies can be and/or are applied in the conservation strategy in the wild.

  10. A unified architecture of transcriptional regulatory elements

    DEFF Research Database (Denmark)

    Andersson, Robin; Sandelin, Albin Gustav; Danko, Charles G.

    2015-01-01

    Gene expression is precisely controlled in time and space through the integration of signals that act at gene promoters and gene-distal enhancers. Classically, promoters and enhancers are considered separate classes of regulatory elements, often distinguished by histone modifications. However...... and enhancers are considered a single class of functional element, with a unified architecture for transcription initiation. The context of interacting regulatory elements and the surrounding sequences determine local transcriptional output as well as the enhancer and promoter activities of individual elements....

  11. The phenotypic and molecular assessment of the non-conserved Arabidopsis MICRORNA163/S-ADENOSYL-METHYLTRANSFERASE regulatory module during biotic stress.

    Science.gov (United States)

    Litholdo, Celso Gaspar; Eamens, Andrew Leigh; Waterhouse, Peter Michael

    2018-04-01

    In plants, microRNAs (miRNAs) have evolved in parallel to the protein-coding genes that they target for expression regulation, and miRNA-directed gene expression regulation is central to almost every cellular process. MicroRNA, miR163, is unique to the Arabidopsis genus and is processed into a 24-nucleotide (nt) mature small regulatory RNA (sRNA) from a single precursor transcript transcribed from a single locus, the MIR163 gene. The MIR163 locus is a result of a recent inverted duplication event of one of the five closely related S-ADENOSYL-METHYLTRANSFERASE genes that the mature miR163 sRNA targets for expression regulation. Currently, however, little is known about the role of the miR163/S-ADENOSYL-METHYLTRANSFERASE regulatory module in response to biotic stress. Here, we document the expression domains of MIR163 and the S-ADENOSYL-METHYLTRANSFERASE target genes following fusion of their putative promoter sequences to the β-glucuronidase (GUS) reporter gene and subsequent in planta expression. Further, we report on our phenotypic and molecular assessment of Arabidopsis thaliana plants with altered miR163 accumulation, namely the mir163-1 and mir163-2 insertion knockout mutants and the miR163 overexpression line, the MIR163-OE plant. Finally, we reveal miR163 accumulation and S-ADENOSYL-METHYLTRANSFERASE target gene expression post treatment with the defence elicitors, salicylic acid and jasmonic acid, and following Fusarium oxysporum infection, wounding, and herbivory attack. Together, the work presented here provides a comprehensive new biological insight into the role played by the Arabidopsis genus-specific miR163/S-ADENOSYL-METHYLTRANSFERASE regulatory module in normal A. thaliana development and during the exposure of A. thaliana plants to biotic stress.

  12. Cloning and bioinformatic analysis of lovastatin biosynthesis regulatory gene lovE.

    Science.gov (United States)

    Huang, Xin; Li, Hao-ming

    2009-08-05

    Lovastatin is an effective drug for the treatment of hyperlipidemia. This study aimed to clone the lovastatin biosynthesis regulatory gene lovE and analyze the structure and function of its encoded protein. Based on the lovastatin synthase gene sequence from GenBank, primers were designed to amplify and clone lovE from Aspergillus terreus genomic DNA. Bioinformatic analysis of lovE and its encoded amino acid sequence was performed using internet resources and software such as DNAMAN. The target fragment lovE, almost 1500 bp in length, was amplified from Aspergillus terreus genomic DNA, and the secondary and three-dimensional structures of the LovE protein were predicted. In the lovastatin biosynthesis process, lovE is a regulatory gene and the LovE protein is a GAL4-like transcription factor.

  13. Public Utility Regulatory Policies Act of 1978. Annual report to Congress

    Energy Technology Data Exchange (ETDEWEB)

    None,

    1980-05-01

    Titles I and III of the Public Utility Regulatory Policies Act of 1978 (PURPA) establish retail regulatory policies for electric and natural gas utilities, respectively, aimed at achieving three purposes: conservation of energy supplied by electric and gas utilities; efficiency in the use of facilities and resources by these utilities; and equitable rates to electricity and natural gas consumers. PURPA also continues the pilot utility implementation program, authorized under Title II of the Energy Conservation and Production Act (ECPA), to encourage adoption of cost-based rates and efficient energy-management practices. The purpose of this report is twofold: (1) to summarize and analyze the progress that state regulatory authorities and certain nonregulated utilities have made in their consideration of the PURPA standards; and (2) to summarize the Department of Energy (DOE) activities relating to PURPA and ECPA. The report provides a broad overview and assessment of the status of electric and gas regulation nationwide, and thus helps provide the basis for congressional and DOE actions targeted on the utility industry to address pressing national energy problems.

  14. Analysis of Pteridium ribosomal RNA sequences by rapid direct sequencing.

    Science.gov (United States)

    Tan, M K

    1991-08-01

    A total of 864 bases from 5 regions interspersed in the 18S and 26S rRNA molecules from various clones of Pteridium covering the general geographical distribution of the genus was analysed using a rapid rRNA sequencing technique. No base difference has been detected amongst the three major lineages, two of which apparently separated before the breakup of the ancient supercontinent, Pangaea. These regions of the rRNA sequences have thus been conserved for at least 160 million years and are here compared with other eukaryotic, especially plant rRNAs.

  15. Evolutionarily conserved regulation of TOR signalling.

    Science.gov (United States)

    Takahara, Terunao; Maeda, Tatsuya

    2013-07-01

    The target of rapamycin (TOR) is an evolutionarily conserved protein kinase that regulates cell growth in response to various environmental as well as intracellular cues through the formation of 2 distinct TOR complexes (TORC), TORC1 and TORC2. Dysregulation of TORC1 and TORC2 activity is closely associated with various diseases, including diabetes, cancer and neurodegenerative disorders. Over the past few years, new regulatory mechanisms of TORC1 and TORC2 activity have been elucidated. Furthermore, recent advances in the study of TOR inhibitors have revealed previously unrecognized cellular functions of TORC1. In this review, we briefly summarize the current understanding of the evolutionarily conserved TOR signalling from upstream regulators to downstream events.

  16. Technical support document: Energy conservation standards for consumer products: Dishwashers, clothes washers, and clothes dryers including: Environmental impacts; regulatory impact analysis

    Energy Technology Data Exchange (ETDEWEB)

    1990-12-01

    The Energy Policy and Conservation Act as amended (P.L. 94-163) establishes energy conservation standards for 12 of the 13 types of consumer products specifically covered by the Act. The legislation requires the Department of Energy (DOE) to consider new or amended standards for these and other types of products at specified times. This Technical Support Document presents the methodology, data and results from the analysis of the energy and economic impacts of standards on dishwashers, clothes washers, and clothes dryers. The economic impact analysis is performed in five major areas: An Engineering Analysis, which establishes technical feasibility and product attributes including costs of design options to improve appliance efficiency. A Consumer Analysis at two levels: national aggregate impacts, and impacts on individuals. The national aggregate impacts include forecasts of appliance sales, efficiencies, energy use, and consumer expenditures. The individual impacts are analyzed by Life-Cycle Cost (LCC), Payback Periods, and Cost of Conserved Energy (CCE), which evaluate the savings in operating expenses relative to increases in purchase price. A Manufacturer Analysis, which provides an estimate of manufacturers' response to the proposed standards. Their response is quantified by changes in several measures of financial performance for a firm. An Industry Impact Analysis, which shows financial and competitive impacts on the appliance industry. A Utility Analysis, which measures the impacts of the altered energy-consumption patterns on electric utilities. An Environmental Effects analysis, which estimates changes in emissions of carbon dioxide, sulfur oxides, and nitrogen oxides, due to reduced energy consumption in the home and at the power plant. A Regulatory Impact Analysis collects the results of all the analyses into the net benefits and costs from a national perspective. 47 figs., 171 tabs. (JF)
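    The three consumer measures named in this abstract (Life-Cycle Cost, Payback Period, Cost of Conserved Energy) can be illustrated with a minimal sketch. The abstract does not give the report's exact formulas, so the textbook definitions below, the function names, and all numbers are assumptions for illustration only.

```python
# Illustrative sketch only: textbook definitions, not the report's own method.

def simple_payback_years(extra_price, annual_savings):
    """Years of operating savings needed to recover the higher purchase price."""
    return extra_price / annual_savings


def capital_recovery_factor(rate, lifetime_years):
    """Factor that spreads an up-front cost into equal annual payments."""
    if rate == 0:
        return 1.0 / lifetime_years
    return rate / (1.0 - (1.0 + rate) ** -lifetime_years)


def cost_of_conserved_energy(extra_price, annual_energy_saved, rate, lifetime_years):
    """Annualized extra cost per unit of energy saved (e.g. dollars per kWh)."""
    annualized = extra_price * capital_recovery_factor(rate, lifetime_years)
    return annualized / annual_energy_saved


def life_cycle_cost(price, annual_operating_cost, rate, lifetime_years):
    """Purchase price plus discounted operating costs over the lifetime."""
    discounted = sum(
        annual_operating_cost / (1.0 + rate) ** year
        for year in range(1, lifetime_years + 1)
    )
    return price + discounted


if __name__ == "__main__":
    # Hypothetical appliance: 60 extra dollars buys 100 kWh/year of savings
    # worth 20 dollars/year, over a 10-year life at a 5% discount rate.
    print(simple_payback_years(60.0, 20.0))
    print(cost_of_conserved_energy(60.0, 100.0, 0.05, 10))
    print(life_cycle_cost(400.0, 50.0, 0.05, 10))
```

    Under these definitions, a design option is attractive to a consumer when its cost of conserved energy is below the energy price, or equivalently when it lowers the life-cycle cost.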

  17. Functional comparison of the nematode Hox gene lin-39 in C. elegans and P. pacificus reveals evolutionary conservation of protein function despite divergence of primary sequences.

    Science.gov (United States)

    Grandien, K; Sommer, R J

    2001-08-15

    Hox transcription factors have been implicated in playing a central role in the evolution of animal morphology. Many studies indicate the evolutionary importance of regulatory changes in Hox genes, but little is known about the role of functional changes in Hox proteins. In the nematodes Pristionchus pacificus and Caenorhabditis elegans, developmental processes can be compared at the cellular, genetic, and molecular levels and differences in gene function can be identified. The Hox gene lin-39 is involved in the regulation of nematode vulva development. Comparison of known lin-39 mutations in P. pacificus and C. elegans revealed both conservation and changes of gene function. Here, we study evolutionary changes of lin-39 function using hybrid transgenes and site-directed mutagenesis in an in vivo assay using C. elegans lin-39 mutants. Our data show that despite the functional differences of LIN-39 between the two species, Ppa-LIN-39, when driven by Cel-lin-39 regulatory elements, can functionally replace Cel-lin-39. Furthermore, we show that the MAPK docking and phosphorylation motifs unique for Cel-LIN-39 are dispensable for Cel-lin-39 function. Therefore, the evolution of lin-39 function is driven by changes in regulatory elements rather than changes in the protein itself.

  18. Universal sequence map (USM) of arbitrary discrete sequences

    Directory of Open Access Journals (Sweden)

    Almeida Jonas S

    2002-02-01

    Full Text Available Abstract Background: For over a decade the idea of representing biological sequences in a continuous coordinate space has maintained its appeal but not been fully realized. The basic idea is that any sequence of symbols may define trajectories in the continuous space, conserving all its statistical properties. Ideally, such a representation would allow scale-independent sequence analysis, without the context of fixed memory length. A simple example would consist of being able to infer the homology between two sequences solely by comparing the coordinates of any two homologous units. Results: We have successfully identified such an iterative function for bijective mapping ψ of discrete sequences into objects of continuous state space that enable scale-independent sequence analysis. The technique, named Universal Sequence Mapping (USM), is applicable to sequences with an arbitrary length and arbitrary number of unique units and generates a representation where map distance estimates sequence similarity. The novel USM procedure is based on earlier work by these and other authors on the properties of Chaos Game Representation (CGR). The latter enables the representation of 4 unit type sequences (like DNA) as an order-free Markov Chain transition table. The properties of USM are illustrated with test data and can be verified for other data by using the accompanying web-based tool: http://bioinformatics.musc.edu/~jonas/usm/. Conclusions: USM is shown to enable a statistical mechanics approach to sequence analysis. The scale-independent representation frees sequence analysis from the need to assume a memory length in the investigation of syntactic rules.
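    The abstract does not spell out the USM mapping itself, so the sketch below shows only the classic Chaos Game Representation for DNA that USM is said to build on: each symbol moves the current point halfway toward that symbol's corner of the unit square. The corner assignment, the starting point, and the function names are assumptions chosen for illustration.

```python
# Minimal sketch of the classic Chaos Game Representation (CGR) for DNA.
# Corner assignment is one common convention and is an assumption here.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence, start=(0.5, 0.5)):
    """Return one (x, y) point per symbol, each halfway to that symbol's corner."""
    x, y = start
    points = []
    for symbol in sequence.upper():
        corner_x, corner_y = CORNERS[symbol]
        x = (x + corner_x) / 2.0
        y = (y + corner_y) / 2.0
        points.append((x, y))
    return points


def map_distance(p, q):
    """Largest coordinate difference between two CGR points."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


if __name__ == "__main__":
    first = cgr_points("GGGGACGT")[-1]
    second = cgr_points("TTTTACGT")[-1]
    # The two sequences share their last 4 symbols, so the final points
    # lie within 2**-4 of each other in every coordinate.
    print(map_distance(first, second), 2 ** -4)
```

    Because every step halves the distance between two trajectories that read the same symbol, a shared suffix of length k brings the end points within 2^-k of each other. This is the sense in which distance on the map can act as an estimate of local sequence similarity.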

  19. Sequence analysis of cereal sucrose synthase genes and isolation ...

    African Journals Online (AJOL)

    SERVER

    2007-10-18

    Oct 18, 2007 ... sequencing of sucrose synthase gene fragment from sorghum using primers designed at their conserved exons. MATERIALS AND METHODS. Multiple sequence alignment. Sucrose synthase gene sequences of various cereals like rice, maize, and barley were accessed from NCBI Genbank database.

  20. Evaluating brief motivational and self-regulatory hand hygiene interventions: a cross-over longitudinal design.

    Science.gov (United States)

    Lhakhang, Pempa; Lippke, Sonia; Knoll, Nina; Schwarzer, Ralf

    2015-02-04

    Frequent handwashing can prevent infections, but non-compliance to hand hygiene is pervasive. Few theory- and evidence-based interventions to improve regular handwashing are available. Therefore, two intervention modules, a motivational and a self-regulatory one, were designed and evaluated. In a longitudinal study, 205 young adults, aged 18 to 26 years, were randomized into two intervention groups. The Mot-SelfR group received first a motivational intervention (Mot; risk perception and outcome expectancies) followed by a self-regulatory intervention (SelfR; perceived self-efficacy and planning) 17 days later. The SelfR-Mot group received the same two intervention modules in the opposite order. Follow-up data were assessed 17 and 34 days after the baseline. Both intervention sequences led to an increase in handwashing frequency, intention, self-efficacy, and planning. Also, overall gains were found for the self-regulatory module (increased planning and self-efficacy levels) and the motivational module (intention). Within groups, the self-regulatory module appeared to be more effective than the motivational module, independent of sequence. Self-regulatory interventions can help individuals to exhibit more handwashing. Sequencing may be important as a motivation module (Mot) first helps to set the goal and a self-regulatory module (SelfR) then helps to translate this goal into actual behavior, but further research is needed to evaluate mechanisms.

  1. Identifying noncoding risk variants using disease-relevant gene regulatory networks.

    Science.gov (United States)

    Gao, Long; Uzun, Yasin; Gao, Peng; He, Bing; Ma, Xiaoke; Wang, Jiahui; Han, Shizhong; Tan, Kai

    2018-02-16

    Identifying noncoding risk variants remains a challenging task. Because noncoding variants exert their effects in the context of a gene regulatory network (GRN), we hypothesize that explicit use of disease-relevant GRNs can significantly improve the inference accuracy of noncoding risk variants. We describe Annotation of Regulatory Variants using Integrated Networks (ARVIN), a general computational framework for predicting causal noncoding variants. It employs a set of novel regulatory network-based features, combined with sequence-based features to infer noncoding risk variants. Using known causal variants in gene promoters and enhancers in a number of diseases, we show ARVIN outperforms state-of-the-art methods that use sequence-based features alone. Additional experimental validation using reporter assay further demonstrates the accuracy of ARVIN. Application of ARVIN to seven autoimmune diseases provides a holistic view of the gene subnetwork perturbed by the combinatorial action of the entire set of risk noncoding mutations.

  2. Nsite, NsiteH and NsiteM Computer Tools for Studying Transcription Regulatory Elements

    KAUST Repository

    Shahmuradov, Ilham

    2015-07-02

    Summary: Gene transcription is mostly conducted through interactions of various transcription factors and their binding sites on DNA (regulatory elements, REs). Today, we are still far from understanding the real regulatory content of promoter regions. Computer methods for identification of REs remain a widely used tool for studying and understanding transcriptional regulation mechanisms. The Nsite, NsiteH and NsiteM programs perform searches for statistically significant (non-random) motifs of known human, animal and plant one-box and composite REs in a single genomic sequence, in a pair of aligned homologous sequences and in a set of functionally related sequences, respectively.

  3. DNA Barcoding: Amplification and sequence analysis of rbcl and matK genome regions in three divergent plant species

    Directory of Open Access Journals (Sweden)

    Javed Iqbal Wattoo

    2016-11-01

    Full Text Available Background: DNA barcoding is a novel method of species identification based on nucleotide diversity of conserved sequences. The establishment and refining of plant DNA barcoding systems is more challenging due to high genetic diversity among different species. Therefore, targeting the conserved nuclear transcribed regions would be more reliable for plant scientists to reveal genetic diversity, species discrimination and phylogeny. Methods: In this study, we amplified and sequenced the chloroplast DNA regions (matK+rbcL) of Solanum nigrum, Euphorbia helioscopia and Dalbergia sissoo to study the functional annotation, homology modeling and sequence analysis to allow a more efficient utilization of these sequences among different plant species. These three species represent three families: Solanaceae, Euphorbiaceae and Fabaceae, respectively. Biological sequence homology and divergence of amplified sequences were studied using the Basic Local Alignment Search Tool (BLAST). Results: Both primers (matK+rbcL) showed good amplification in the three species. The sequenced regions revealed conserved genome information for future identification of different medicinal plants belonging to these species. The amplified conserved barcodes revealed different levels of biological homology after sequence analysis. The results clearly showed that the use of these conserved DNA sequences as barcode primers would be an accurate way for species identification and discrimination. Conclusion: The amplification and sequencing of conserved genome regions identified a novel sequence of matK in native species of Solanum nigrum. The findings of the study would be applicable in the medicinal industry to establish DNA-based identification of different medicinal plant species to monitor adulteration.

  4. The utility of transcriptomics in fish conservation.

    Science.gov (United States)

    Connon, Richard E; Jeffries, Ken M; Komoroske, Lisa M; Todgham, Anne E; Fangue, Nann A

    2018-01-29

    There is growing recognition of the need to understand the mechanisms underlying organismal resilience (i.e. tolerance, acclimatization) to environmental change to support the conservation management of sensitive and economically important species. Here, we discuss how functional genomics can be used in conservation biology to provide a cellular-level understanding of organismal responses to environmental conditions. In particular, the integration of transcriptomics with physiological and ecological research is increasingly playing an important role in identifying functional physiological thresholds predictive of compensatory responses and detrimental outcomes, transforming the way we can study issues in conservation biology. Notably, with technological advances in RNA sequencing, transcriptome-wide approaches can now be applied to species where no prior genomic sequence information is available to develop species-specific tools and investigate sublethal impacts that can contribute to population declines over generations and undermine prospects for long-term conservation success. Here, we examine the use of transcriptomics as a means of determining organismal responses to environmental stressors and use key study examples of conservation concern in fishes to highlight the added value of transcriptome-wide data to the identification of functional response pathways. Finally, we discuss the gaps between the core science and policy frameworks and how thresholds identified through transcriptomic evaluations provide evidence that can be more readily used by resource managers. © 2018. Published by The Company of Biologists Ltd.

  5. Iterative reconstruction of transcriptional regulatory networks: an algorithmic approach.

    Directory of Open Access Journals (Sweden)

    Christian L Barrett

    2006-05-01

    Full Text Available The number of complete, publicly available genome sequences is now greater than 200, and this number is expected to rapidly grow in the near future as metagenomic and environmental sequencing efforts escalate and the cost of sequencing drops. In order to make use of this data for understanding particular organisms and for discerning general principles about how organisms function, it will be necessary to reconstruct their various biochemical reaction networks. Principal among these will be transcriptional regulatory networks. Given the physical and logical complexity of these networks, the various sources of (often noisy) data that can be utilized for their elucidation, the monetary costs involved, and the huge number of potential experiments (approximately 10^12) that can be performed, experiment design algorithms will be necessary for synthesizing the various computational and experimental data to maximize the efficiency of regulatory network reconstruction. This paper presents an algorithm for experimental design to systematically and efficiently reconstruct transcriptional regulatory networks. It is meant to be applied iteratively in conjunction with an experimental laboratory component. The algorithm is presented here in the context of reconstructing transcriptional regulation for metabolism in Escherichia coli, and, through a retrospective analysis with previously performed experiments, we show that the produced experiment designs conform to how a human would design experiments. The algorithm is able to utilize probability estimates based on a wide range of computational and experimental sources to suggest experiments with the highest potential of discovering the greatest amount of new regulatory knowledge.

  6. Tissue-specific expression and regulatory networks of pig microRNAome.

    Directory of Open Access Journals (Sweden)

    Paolo Martini

    Full Text Available BACKGROUND: Despite the economic and medical importance of the pig, knowledge about its genome organization, gene expression regulation, and molecular mechanisms involved in physiological processes is far from that achieved for mouse and rat, the two most used model organisms in biomedical research. MicroRNAs (miRNAs) are a wide class of molecules that exert a recognized role in gene expression modulation, but only 280 miRNAs in pig have been characterized to date. RESULTS: We applied a novel computational approach to predict species-specific and conserved miRNAs in the pig genome, which were then subjected to experimental validation. We experimentally identified candidate miRNA sequences grouped into high-confidence (424) and medium-confidence (353) miRNAs according to RNA-seq results. A group of miRNAs was also validated by PCR experiments. We established the subtle variability in expression of isomiRs and miRNA-miRNA star couples, supporting a biological function for these molecules. Finally, miRNA and mRNA expression profiles produced from the same sample of 20 different tissues of the animal were combined, using a correlation threshold to filter miRNA-target predictions, to identify tissue-specific regulatory networks. CONCLUSIONS: Our data represent significant progress in the current understanding of the miRNAome in pig. The identification of miRNAs, their target mRNAs, and the construction of regulatory circuits will provide new insights into the complex biological networks in several tissues of this important animal model.
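    The filtering step mentioned in this abstract, combining miRNA and mRNA profiles from the same tissues and applying a correlation threshold to predicted miRNA-target pairs, might look roughly like the sketch below. The abstract gives neither the correlation measure nor the cutoff, so Pearson correlation, the requirement of a negative correlation (miRNAs usually repress their targets), the threshold of -0.5, and all names and data are assumptions.

```python
import math

# Hedged sketch: Pearson correlation and the -0.5 cutoff are assumptions,
# not values taken from the study.


def pearson(xs, ys):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def filter_predictions(predicted_pairs, mirna_expr, mrna_expr, threshold=-0.5):
    """Keep predicted (miRNA, target) pairs whose profiles anti-correlate."""
    kept = []
    for mirna, target in predicted_pairs:
        r = pearson(mirna_expr[mirna], mrna_expr[target])
        if r <= threshold:
            kept.append((mirna, target, r))
    return kept


if __name__ == "__main__":
    # Hypothetical expression values across four tissues.
    mirna_expr = {"miR-x": [1.0, 2.0, 3.0, 4.0]}
    mrna_expr = {"geneA": [4.0, 3.0, 2.0, 1.0], "geneB": [1.0, 2.0, 3.0, 4.0]}
    pairs = [("miR-x", "geneA"), ("miR-x", "geneB")]
    print(filter_predictions(pairs, mirna_expr, mrna_expr))
```

    In this toy input only the miR-x/geneA pair survives, because geneB rises together with the miRNA across tissues rather than falling.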

  7. Elucidating the Small Regulatory RNA Repertoire of the Sea Anemone Anemonia viridis Based on Whole Genome and Small RNA Sequencing.

    Science.gov (United States)

    Urbarova, Ilona; Patel, Hardip; Forêt, Sylvain; Karlsen, Bård Ove; Jørgensen, Tor Erik; Hall-Spencer, Jason M; Johansen, Steinar D

    2018-02-01

    Cnidarians harbor a variety of small regulatory RNAs that include microRNAs (miRNAs) and PIWI-interacting RNAs (piRNAs), but detailed information is limited. Here, we report the identification and expression of novel miRNAs and putative piRNAs, as well as their genomic loci, in the symbiotic sea anemone Anemonia viridis. We generated a draft assembly of the A. viridis genome with putative size of 313 Mb that appeared to be composed of about 36% repeats, including known transposable elements. We detected approximately equal fractions of DNA transposons and retrotransposons. Deep sequencing of small RNA libraries constructed from A. viridis adults sampled at a natural CO2 gradient off Vulcano Island, Italy, identified 70 distinct miRNAs. Eight were homologous to previously reported miRNAs in cnidarians, whereas 62 appeared novel. Nine miRNAs were recognized as differentially expressed along the natural seawater pH gradient. We found a highly abundant and diverse population of piRNAs, with a substantial fraction showing ping-pong signatures. We identified nearly 22% putative piRNAs potentially targeting transposable elements within the A. viridis genome. The A. viridis genome appeared similar in size to that of other hexacorals with a very high divergence of transposable elements resembling that of the sea anemone genus Exaiptasia. The genome encodes and expresses a high number of small regulatory RNAs, which include novel miRNAs and piRNAs. Differentially expressed small RNAs along the seawater pH gradient indicated regulatory gene responses to environmental stressors. © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  8. Inducible nitric oxide synthase (iNOS) regulatory region variation in non-human primates.

    Science.gov (United States)

    Roodgar, Morteza; Ross, Cody T; Kenyon, Nicholas J; Marcelino, Gretchen; Smith, David Glenn

    2015-04-01

    Inducible nitric oxide synthase (iNOS) is an enzyme that plays a key role in intracellular immune response against respiratory infections. Since various species of nonhuman primates exhibit different levels of susceptibility to infectious respiratory diseases, and since variation in regulatory regions of genes is thought to play a key role in expression levels of genes, two candidate regulatory regions of iNOS were mapped, sequenced, and compared across five species of nonhuman primates: African green monkeys (Chlorocebus sabaeus), pig-tailed macaques (Macaca nemestrina), cynomolgus macaques (Macaca fascicularis), Indian rhesus macaques (Macaca mulatta), and Chinese rhesus macaques (M. mulatta). In addition, we conducted an in silico analysis of the transcription factor binding sites associated with genetic variation in these two candidate regulatory regions across species. We found that only one of the two candidate regions showed strong evidence of involvement in iNOS regulation. Specifically, we found evidence of 13 conserved binding site candidates linked to iNOS regulation: AP-1, C/EBPB, CREB, GATA-1, GATA-3, NF-AT, NF-AT5, NF-κB, KLF4, Oct-1, PEA3, SMAD3, and TCF11. Additionally, we found evidence of interspecies variation in binding sites for several regulatory elements linked to iNOS (GATA-3, GATA-4, KLF6, SRF, STAT-1, STAT-3, OLF-1 and HIF-1) across species, especially in African green monkeys relative to other species. Given the key role of iNOS in respiratory immune response, the findings of this study might help guide the direction of future studies aimed to uncover the molecular mechanisms underlying the increased susceptibility of African green monkeys to several viral and bacterial respiratory infections. Copyright © 2015 Elsevier B.V. All rights reserved.

  9. Radiation and the regulatory landscape of neo2-Darwinism

    International Nuclear Information System (INIS)

    Rollo, C. David

    2006-01-01

    Several recently revealed features of eukaryotic genomes were not predicted by earlier evolutionary paradigms, including the relatively small number of genes, the very large amounts of non-functional code and its quarantine in heterochromatin, the remarkable conservation of many functionally important genes across relatively enormous phylogenetic distances, and the prevalence of extra-genomic information associated with chromatin structure and histone proteins. All of these emphasize a paramount role for regulatory evolution, which is further reinforced by recent perspectives highlighting even higher-order regulation governing epigenetics and development (EVO-DEVO). Modern neo2-Darwinism, with its emphasis on regulatory mechanisms and regulatory evolution provides new vision for understanding radiation biology, particularly because free radicals and redox states are central to many regulatory mechanisms and free radicals generated by radiation mimic and amplify endogenous signalling. This paper explores some of these aspects and their implications for low-dose radiation biology.

  10. Radiation and the regulatory landscape of neo2-Darwinism.

    Science.gov (United States)

    Rollo, C David

    2006-05-11

    Several recently revealed features of eukaryotic genomes were not predicted by earlier evolutionary paradigms, including the relatively small number of genes, the very large amounts of non-functional code and its quarantine in heterochromatin, the remarkable conservation of many functionally important genes across relatively enormous phylogenetic distances, and the prevalence of extra-genomic information associated with chromatin structure and histone proteins. All of these emphasize a paramount role for regulatory evolution, which is further reinforced by recent perspectives highlighting even higher-order regulation governing epigenetics and development (EVO-DEVO). Modern neo2-Darwinism, with its emphasis on regulatory mechanisms and regulatory evolution provides new vision for understanding radiation biology, particularly because free radicals and redox states are central to many regulatory mechanisms and free radicals generated by radiation mimic and amplify endogenous signalling. This paper explores some of these aspects and their implications for low-dose radiation biology.

  11. Comparative sequence analysis of Solanum and Arabidopsis in a hot spot for pathogen resistance on potato chromosome V reveals a patchwork of conserved and rapidly evolving genome segments

    Directory of Open Access Journals (Sweden)

    Bruggmann Rémy

    2007-05-01

    Background: Quantitative phenotypic variation of agronomic characters in crop plants is controlled by environmental and genetic factors (quantitative trait loci = QTL). To understand the molecular basis of such QTL, the identification of the underlying genes is of primary interest and DNA sequence analysis of the genomic regions harboring QTL is a prerequisite for that. QTL mapping in potato (Solanum tuberosum) has identified a region on chromosome V tagged by DNA markers GP21 and GP179, which contains a number of important QTL, among others QTL for resistance to late blight caused by the oomycete Phytophthora infestans and to root cyst nematodes. Results: To obtain genomic sequence for the targeted region on chromosome V, two local BAC (bacterial artificial chromosome) contigs were constructed and sequenced, which corresponded to parts of the homologous chromosomes of the diploid, heterozygous genotype P6/210. Two contiguous sequences of 417,445 and 202,781 base pairs were assembled and annotated. Gene-by-gene co-linearity was disrupted by non-allelic insertions of retrotransposon elements, stretches of diverged intergenic sequences, differences in gene content and gene order. The latter was caused by inversion of a 70 kbp genomic fragment. These features were also found in comparison to orthologous sequence contigs from three homeologous chromosomes of Solanum demissum, a wild tuber bearing species. Functional annotation of the sequence identified 48 putative open reading frames (ORF) in one contig and 22 in the other, with an average of one ORF every 9 kbp. Ten ORFs were classified as resistance-gene-like, 11 as F-box-containing genes, 13 as transposable elements and three as transcription factors. Comparing potato to Arabidopsis thaliana annotated proteins revealed five micro-syntenic blocks of three to seven ORFs with A. thaliana chromosomes 1, 3 and 5. Conclusion: Comparative sequence analysis revealed highly conserved collinear regions

  12. Genome-wide identification of regulatory elements and reconstruction of gene regulatory networks of the green alga Chlamydomonas reinhardtii under carbon deprivation.

    Directory of Open Access Journals (Sweden)

    Flavia Vischi Winck

    The unicellular green alga Chlamydomonas reinhardtii is a long-established model organism for studies on photosynthesis and carbon metabolism-related physiology. Under conditions of air-level carbon dioxide concentration [CO2], a carbon concentrating mechanism (CCM) is induced to facilitate cellular carbon uptake. CCM increases the availability of carbon dioxide at the site of cellular carbon fixation. To improve our understanding of the transcriptional control of the CCM, we employed FAIRE-seq (formaldehyde-assisted isolation of regulatory elements, followed by deep sequencing) to determine nucleosome-depleted chromatin regions of algal cells subjected to carbon deprivation. Our FAIRE data recapitulated the positions of known regulatory elements in the promoter of the periplasmic carbonic anhydrase (Cah1) gene, which is upregulated during CCM induction, and revealed new candidate regulatory elements at a genome-wide scale. In addition, time series expression patterns of 130 transcription factor (TF) and transcription regulator (TR) genes were obtained for cells cultured under photoautotrophic conditions and subjected to a shift from high to low [CO2]. Groups of co-expressed genes were identified and a putative directed gene-regulatory network underlying the CCM was reconstructed from the gene expression data using the recently developed IOTA (inner composition alignment) method. Among the candidate regulatory genes, two members of the MYB-related TF family, Lcr1 (Low-CO2 response regulator 1) and Lcr2 (Low-CO2 response regulator 2), may play an important role in down-regulating the expression of a particular set of TF and TR genes in response to low [CO2]. The results obtained provide new insights into the transcriptional control of the CCM and revealed more than 60 new candidate regulatory genes. Deep sequencing of nucleosome-depleted genomic regions indicated the presence of new, previously unknown regulatory elements in the C. reinhardtii genome.

  13. Isolation, sequence identification and tissue expression profile of a ...

    African Journals Online (AJOL)

    The complete coding sequence (CDS) of the Banna mini-pig inbred line (BMI) ribokinase gene (RBKS) was amplified using reverse transcription-polymerase chain reaction (RT-PCR) based on the conserved sequence information of cattle or other mammals and known highly homologous swine ESTs.

  14. Computational and molecular dissection of an X-box cis-Regulatory module

    OpenAIRE

    Warrington, Timothy Burton

    2015-01-01

    Ciliopathies are a class of human diseases marked by dysfunction of the cellular organelle, cilia. While many of the molecular components that make up cilia have been identified and studied, comparatively little is understood about the transcriptional regulation of genes encoding these components. The conserved transcription factor Regulatory Factor X (RFX)/DAF-19, which acts through binding to the cis-regulatory motif known as X-box, has been shown to regulate ciliary genes in many animals f...

  15. The transcriptional regulatory network mediated by banana (Musa acuminata) dehydration-responsive element binding (MaDREB) transcription factors in fruit ripening.

    Science.gov (United States)

    Kuang, Jian-Fei; Chen, Jian-Ye; Liu, Xun-Cheng; Han, Yan-Chao; Xiao, Yun-Yi; Shan, Wei; Tang, Yang; Wu, Ke-Qiang; He, Jun-Xian; Lu, Wang-Jin

    2017-04-01

    Fruit ripening is a complex, genetically programmed process involving the action of critical transcription factors (TFs). Despite the established significance of dehydration-responsive element binding (DREB) TFs in plant abiotic stress responses, the involvement of DREBs in fruit ripening is yet to be determined. Here, we identified four genes encoding ripening-regulated DREB TFs in banana (Musa acuminata), MaDREB1, MaDREB2, MaDREB3, and MaDREB4, and demonstrated that they play regulatory roles in fruit ripening. We showed that MaDREB1-MaDREB4 are nucleus-localized, induced by ethylene and encompass transcriptional activation activities. We performed a genome-wide chromatin immunoprecipitation and high-throughput sequencing (ChIP-Seq) experiment for MaDREB2 and identified 697 genomic regions as potential targets of MaDREB2. MaDREB2 binds to hundreds of loci with diverse functions and its binding sites are distributed in the promoter regions proximal to the transcriptional start site (TSS). Most of the MaDREB2-binding targets contain the conserved (A/G)CC(G/C)AC motif and MaDREB2 appears to directly regulate the expression of a number of genes involved in fruit ripening. In combination with transcriptome profiling (RNA sequencing) data, our results indicate that MaDREB2 may serve as both transcriptional activator and repressor during banana fruit ripening. In conclusion, our study suggests a hierarchical regulatory model of fruit ripening in banana and that the MaDREB TFs may act as transcriptional regulators in the regulatory network. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
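
    The conserved (A/G)CC(G/C)AC motif reported above can be searched for in a promoter sequence with a regular expression. The following is a minimal sketch, not the authors' pipeline: the function names and the example sequence are hypothetical, and the scan simply reports exact matches on both strands using 0-based forward-strand coordinates.

```python
import re

# (A/G)CC(G/C)AC written as a character-class pattern; the lookahead
# lets overlapping matches be reported as well.
MOTIF = re.compile(r"(?=([AG]CC[GC]AC))")
MOTIF_LEN = 6
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif_sites(seq):
    """List (start, strand, matched_sequence) for every motif occurrence.

    `start` is the 0-based position of the leftmost base of the site on
    the forward strand, regardless of which strand carries the motif.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group(1)) for m in MOTIF.finditer(seq)]
    for m in MOTIF.finditer(reverse_complement(seq)):
        # Convert a reverse-strand offset into a forward-strand start.
        hits.append((n - (m.start() + MOTIF_LEN), "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    example = "TTACCGACTTGTCGGTAA"  # hypothetical promoter fragment
    for start, strand, site in find_motif_sites(example):
        print(start, strand, site)
```

    In practice, ChIP-Seq motif analysis usually relies on position weight matrices rather than exact pattern matches; the exact-match scan is only meant to make the consensus notation concrete.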

  16. Total sequence decomposition distinguishes functional modules, "molegos" in apurinic/apyrimidinic endonucleases

    Directory of Open Access Journals (Sweden)

    Braun Werner

    2002-11-01

    Background: Total sequence decomposition, using the web-based MASIA tool, identifies areas of conservation in aligned protein sequences. By structurally annotating these motifs, the sequence can be parsed into individual building blocks, molecular legos ("molegos"), that can eventually be related to function. Here, the approach is applied to the apurinic/apyrimidinic endonuclease (APE) DNA repair proteins, essential enzymes that have been highly conserved throughout evolution. The APEs, DNase-1 and inositol 5'-polyphosphate phosphatases (IPP) form a superfamily that catalyze metal ion based phosphorolysis, but recognize different substrates. Results: MASIA decomposition of APE yielded 12 sequence motifs, 10 of which are also structurally conserved within the family and are designated as molegos. The 12 motifs include all the residues known to be essential for DNA cleavage by APE. Five of these molegos are sequentially and structurally conserved in DNase-1 and the IPP family. Correcting the sequence alignment to match the residues at the ends of two of the molegos that are absolutely conserved in each of the three families greatly improved the local structural alignment of APEs, DNase-1 and synaptojanin. Comparing substrate/product binding of molegos common to DNase-1 showed that those distinctive for APEs are not directly involved in cleavage, but establish protein-DNA interactions 3' to the abasic site. These additional bonds enhance both specific binding to damaged DNA and the processivity of APE1. Conclusion: A modular approach can improve structurally predictive alignments of homologous proteins with low sequence identity and reveal residues peripheral to the traditional "active site" that control the specificity of enzymatic activity.

  17. Conserving energy in new buildings: analysis of nonregulatory policies

    Energy Technology Data Exchange (ETDEWEB)

    Scheer, R.M.; Nieves, L.A.; Mazzucchi, R.P.

    1981-05-01

    The costs and effectiveness of non-regulatory options relative to those of a regulatory approach are analyzed. Nonregulatory program alternatives identified are: information and education programs, tax incentives and disincentives, and mortgage and finance programs. Chapter 2 briefly reviews survey data to assess present public awareness of energy issues and energy-efficient building design. Homebuyer and homebuilder surveys are reviewed and conservation motivations are discussed. Chapter 3 examines the provision of technical and economic information to various actors affecting building design decisions. This approach assumes that the economic incentives and technical means to achieve energy conservation goals already exist but that critical information is lacking. Chapter 4 examines how adjustments to the tax structure could enhance economic incentives and counter economic disincentives for energy conservation. Qualifying buildings for tax benefits would almost certainly require certification of design energy consumption. The effectiveness of tax incentives would depend in part on dissemination of public information regarding the incentives. Chapter 5 examines subsidies, such as subsidized mortgages and loan guarantees, which lower the cost of money or other costs but do not change the market structure facing the consumer. Certification that buildings qualify for such treatment would probably be required. Chapter 6 presents recommendations based on the study's findings. (MCW)

  18. In silico Analysis of 3′-End-Processing Signals in Aspergillus oryzae Using Expressed Sequence Tags and Genomic Sequencing Data

    Science.gov (United States)

    Tanaka, Mizuki; Sakai, Yoshifumi; Yamada, Osamu; Shintani, Takahiro; Gomi, Katsuya

    2011-01-01

    To investigate 3′-end-processing signals in Aspergillus oryzae, we created a nucleotide sequence data set of the 3′-untranslated region (3′ UTR) plus 100 nucleotides (nt) sequence downstream of the poly(A) site using A. oryzae expressed sequence tags and genomic sequencing data. This data set comprised 1065 sequences derived from 1042 unique genes. The average 3′ UTR length in A. oryzae was 241 nt, which is greater than that in yeast but similar to that in plants. The 3′ UTR and 100 nt sequence downstream of the poly(A) site is notably U-rich, while the region located 15–30 nt upstream of the poly(A) site is markedly A-rich. The most frequently found hexanucleotide in this A-rich region is AAUGAA, although this sequence accounts for only 6% of all transcripts. These data suggested that A. oryzae has no highly conserved sequence element equivalent to AAUAAA, a mammalian polyadenylation signal. We identified that putative 3′-end-processing signals in A. oryzae, while less well conserved than those in mammals, comprised four sequence elements: the furthest upstream U-rich element, A-rich sequence, cleavage site, and downstream U-rich element flanking the cleavage site. Although these putative 3′-end-processing signals are similar to those in yeast and plants, some notable differences exist between them. PMID:21586533
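
    The hexanucleotide survey described above (counting hexamers in the region 15–30 nt upstream of the poly(A) site) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the input format (a list of `(sequence, polya_position)` pairs with 0-based positions) and the function name are hypothetical, and each hexamer is counted at most once per transcript so that counts can be read as "number of transcripts containing it".

```python
from collections import Counter


def count_upstream_hexamers(records, near=15, far=30, k=6):
    """Count k-mers found between `far` and `near` nt upstream of the poly(A) site.

    records: iterable of (sequence, polya_position) pairs, where
             polya_position is the 0-based index of the cleavage site.
    Returns a Counter mapping each k-mer to the number of transcripts
    whose upstream window contains it at least once.
    """
    counts = Counter()
    for seq, polya_pos in records:
        start = max(0, polya_pos - far)
        end = max(0, polya_pos - near)
        window = seq[start:end].upper()
        # A set ensures one vote per transcript for each distinct k-mer.
        seen = {window[i:i + k] for i in range(len(window) - k + 1)}
        counts.update(seen)
    return counts


if __name__ == "__main__":
    # Two hypothetical transcripts (RNA alphabet), poly(A) site at index 45.
    records = [
        ("C" * 20 + "AAUGAA" + "C" * 34, 45),
        ("G" * 60, 45),
    ]
    counts = count_upstream_hexamers(records)
    for hexamer, n in counts.most_common(3):
        print(hexamer, n, f"{100 * n / len(records):.0f}% of transcripts")
```

    Dividing a hexamer's count by the number of transcripts gives the kind of percentage quoted in the abstract for AAUGAA.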

  19. Both positive and negative regulatory elements mediate expression of a photoregulated CAB gene from Nicotiana plumbaginifolia.

    Science.gov (United States)

    Castresana, C; Garcia-Luque, I; Alonso, E; Malik, V S; Cashmore, A R

    1988-01-01

    We have analyzed promoter regulatory elements from a photoregulated CAB gene (Cab-E) isolated from Nicotiana plumbaginifolia. These studies have been performed by introducing chimeric gene constructs into tobacco cells via Agrobacterium tumefaciens-mediated transformation. Expression studies on the regenerated transgenic plants have allowed us to characterize three positive and one negative cis-acting elements that influence photoregulated expression of the Cab-E gene. Within the upstream sequences we have identified two positive regulatory elements (PRE1 and PRE2) which confer maximum levels of photoregulated expression. These sequences contain multiple repeated elements related to the sequence -ACCGGCCCACTT-. We have also identified within the upstream region a negative regulatory element (NRE) extremely rich in AT sequences, which reduces the level of gene expression in the light. We have defined a light regulatory element (LRE) within the promoter region extending from -396 to -186 bp which confers photoregulated expression when fused to a constitutive nopaline synthase ('nos') promoter. Within this region there is a 132-bp element, extending from -368 to -234 bp, which on deletion from the Cab-E promoter reduces gene expression from high levels to undetectable levels. Finally, we have demonstrated for a full length Cab-E promoter conferring high levels of photoregulated expression, that sequences proximal to the Cab-E TATA box are not replaceable by corresponding sequences from a 'nos' promoter. This contrasts with the apparent equivalence of these Cab-E and 'nos' TATA box-proximal sequences in truncated promoters conferring low levels of photoregulated expression. PMID:2901343

  20. Steam Generator tube integrity -- US Nuclear Regulatory Commission perspective

    International Nuclear Information System (INIS)

    Murphy, E.L.; Sullivan, E.J.

    1997-01-01

    In the US, the current regulatory framework was developed in the 1970s when general wall thinning was the dominant degradation mechanism; and, as a result of changes in the forms of degradation being observed and improvements in inspection and tube repair technology, the regulatory framework needs to be updated. Operating experience indicates that the current U.S. requirements should be more stringent in some areas, while in other areas they are overly conservative. To date, this situation has been dealt with on a plant-specific basis in the US. However, the NRC staff is now developing a proposed steam generator rule as a generic framework for ensuring that the steam generator tubes are capable of performing their intended safety functions. This paper discusses the current U.S. regulatory framework for assuring steam generator (SG) tube integrity, the need to update this regulatory framework, the objectives of the new proposed rule, the US Nuclear Regulatory Commission (NRC) regulatory guide (RG) that will accompany the rule, how risk considerations affect the development of the new rule, and some outstanding issues relating to the rule that the NRC is still dealing with

  1. PlantCARE, a plant cis-acting regulatory element database

    OpenAIRE

    Rombauts, Stephane; Déhais, Patrice; Van Montagu, Marc; Rouzé, Pierre

    1999-01-01

    PlantCARE is a database of plant cis-acting regulatory elements, enhancers and repressors. Besides the transcription motifs found on a sequence, it also offers a link to the EMBL entry that contains the full gene sequence as well as a description of the conditions in which a motif becomes functional. The information on these sites is given by matrices, consensus and individual site sequences on particular genes, depending on the available information. PlantCARE is a relational database avail...

  2. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    Science.gov (United States)

    Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.

    2015-01-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030

  3. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    Directory of Open Access Journals (Sweden)

    Nathan D. Olson

    2015-03-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies.

  4. Sequence of a cDNA encoding turtle high mobility group 1 protein.

    Science.gov (United States)

    Zheng, Jifang; Hu, Bi; Wu, Duansheng

    2005-07-01

    To obtain sequence information about the turtle HMG1 gene, a cDNA encoding the HMG1 protein of the Chinese soft-shell turtle (Pelodiscus sinensis) was amplified by RT-PCR from kidney total RNA, and was cloned, sequenced and analyzed. The results revealed that the open reading frame (ORF) of turtle HMG1 cDNA is 606 bp long. The ORF encodes 202 amino acid residues, from which two DNA-binding domains and one polyacidic region are derived. The DNA-binding domains share higher amino acid identity with the homologous sequences of chicken (96.5%) and mammals (74%) than with the homologous sequence of rainbow trout (67%). The polyacidic region shows 84.6% amino acid homology with the equivalent region of chicken HMG1 cDNA. Turtle HMG1 protein contains 3 Cys residues located at completely conserved positions. Conservation in sequence and structure suggests that the functions of turtle HMG1 cDNA may be highly conserved during evolution. To our knowledge, this is the first report of an HMG1 cDNA sequence in any reptile.

  5. Alberta's petroleum industry and the Conservation Board

    Energy Technology Data Exchange (ETDEWEB)

    Breen, D.H.

    1993-12-31

    The history of Alberta's petroleum industry and Energy Resources Conservation Board (ERCB) was told. The conservation movement in Alberta was tracked from 1908 to the founding of the Petroleum and Natural Gas Conservation Board in 1938. Failure of Alberta's first proration, and the Turner Valley 'waste' gas conservation movement occurred during this period. The Leduc discovery and effects of the new regulatory environment on its development were discussed. The natural gas export debate, and the expansion of Alberta's crude oil market were recounted in detail. The organization and regulation of field development which occurred during the period from 1948 to 1959 was presented. Past actions of the Petroleum and Natural Gas Conservation Board were reviewed from today's perspective. The petroleum industry and the ERCB were said to have been jointly responsible for the creation of a prosperous and confident new Alberta, moving it further and further away from the Canadian economic and political mainstream, and reinforcing the sense of alienation that began to develop during the preceding agrarian decades. 53 figs., 48 tabs.

  6. Deep sequencing-based identification of small regulatory RNAs in Synechocystis sp. PCC 6803.

    Directory of Open Access Journals (Sweden)

    Wen Xu

    Synechocystis sp. PCC 6803 is a genetically tractable model organism for photosynthesis research. The genome of Synechocystis sp. PCC 6803 consists of a circular chromosome and seven plasmids. The importance of small regulatory RNAs (sRNAs) as mediators of a number of cellular processes in bacteria has begun to be recognized. However, little is known regarding sRNAs in Synechocystis sp. PCC 6803. To provide a comprehensive overview of sRNAs in this model organism, the sRNAs of Synechocystis sp. PCC 6803 were analyzed using deep sequencing, and 7,951,189 reads were obtained. High quality mapping reads (6,127,890) were mapped onto the genome and assembled into 16,192 transcribed regions (clusters) based on read overlap. A total number of 5211 putative sRNAs were revealed from the genome and the 4 megaplasmids, and 27 of these molecules, including four from plasmids, were confirmed by RT-PCR. In addition, possible target genes regulated by all of the putative sRNAs identified in this study were predicted by IntaRNA and analyzed for functional categorization and biological pathways, which provided evidence that sRNAs are indeed involved in many different metabolic pathways, including basic metabolic pathways, such as glycolysis/gluconeogenesis, the citrate cycle, fatty acid metabolism and adaptations to environmentally stress-induced changes. The information from this study provides a valuable reservoir for understanding the sRNA-mediated regulation of the complex physiology and metabolic processes of cyanobacteria.
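
    Assembling mapped reads into transcribed regions "based on read overlap", as described above, amounts to merging overlapping intervals. The sketch below illustrates that idea only; it is not the pipeline used in the study. It assumes reads are given as half-open `(start, end)` coordinates on a single strand of a single replicon, and the function name is hypothetical.

```python
def cluster_reads(reads):
    """Merge overlapping read intervals into clusters.

    reads: iterable of (start, end) half-open intervals on one strand
           of one replicon.
    Returns a list of (cluster_start, cluster_end, read_count) tuples,
    sorted by start coordinate.
    """
    clusters = []
    for start, end in sorted(reads):
        if clusters and start < clusters[-1][1]:
            # The read overlaps the current cluster: extend it.
            clusters[-1][1] = max(clusters[-1][1], end)
            clusters[-1][2] += 1
        else:
            # No overlap: open a new cluster.
            clusters.append([start, end, 1])
    return [tuple(c) for c in clusters]


if __name__ == "__main__":
    mapped = [(0, 50), (40, 90), (200, 250), (89, 120)]  # hypothetical reads
    for cluster in cluster_reads(mapped):
        print(cluster)
```

    Reads that merely touch end to start (for example `(0, 50)` and `(50, 80)`) are kept in separate clusters here; whether such reads should be joined is a design choice the abstract does not specify.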

  7. An Organismal Model for Gene Regulatory Networks in the Gut-Associated Immune Response

    Directory of Open Access Journals (Sweden)

    Katherine M. Buckley

    2017-10-01

    The gut epithelium is an ancient site of complex communication between the animal immune system and the microbial world. While elements of self-non-self receptors and effector mechanisms differ greatly among animal phyla, some aspects of recognition, regulation, and response are broadly conserved. A gene regulatory network (GRN) approach provides a means to investigate the nature of this conservation and divergence even as more peripheral functional details remain incompletely understood. The sea urchin embryo is an unparalleled experimental model for detangling the GRNs that govern embryonic development. By applying this theoretical framework to the free swimming, feeding larval stage of the purple sea urchin, it is possible to delineate the conserved regulatory circuitry that regulates the gut-associated immune response. This model provides a morphologically simple system in which to efficiently unravel regulatory connections that are phylogenetically relevant to immunity in vertebrates. Here, we review the organism-wide cellular and transcriptional immune response of the sea urchin larva. A large set of transcription factors and signal systems, including epithelial expression of interleukin 17 (IL17), are important mediators in the activation of the early gut-associated response. Many of these have homologs that are active in vertebrate immunity, while others are ancient in animals but absent in vertebrates or specific to echinoderms. This larval model provides a means to experimentally characterize immune function encoded in the sea urchin genome and the regulatory interconnections that control immune response and resolution across the tissues of the organism.

  8. Using hexamers to predict cis-regulatory motifs in Drosophila

    Directory of Open Access Journals (Sweden)

    Kibler Dennis

    2005-10-01

    Background: Cis-regulatory modules (CRMs) are short stretches of DNA that help regulate gene expression in higher eukaryotes. They have been found up to 1 megabase away from the genes they regulate and can be located upstream, downstream, and even within their target genes. Due to the difficulty of finding CRMs using biological and computational techniques, even well-studied regulatory systems may contain CRMs that have not yet been discovered. Results: We present a simple, efficient method (HexDiff) based only on hexamer frequencies of known CRMs and non-CRM sequence to predict novel CRMs in regulatory systems. On a data set of 16 gap and pair-rule genes containing 52 known CRMs, predictions made by HexDiff had a higher correlation with the known CRMs than several existing CRM prediction algorithms: Ahab, Cluster Buster, MSCAN, MCAST, and LWF. After combining the results of the different algorithms, 10 putative CRMs were identified and are strong candidates for future study. The hexamers used by HexDiff to distinguish between CRMs and non-CRM sequence were also analyzed and were shown to be enriched in regulatory elements. Conclusion: HexDiff provides an efficient and effective means for finding new CRMs based on known CRMs, rather than known binding sites.
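
    The general idea of discriminating CRM from non-CRM sequence by hexamer frequencies can be illustrated with a short sketch. This is not the published HexDiff algorithm, whose selection rule and scoring details are not given in the abstract; the enrichment ratio threshold, the pseudocount, the window size, and all function names below are assumptions chosen for illustration.

```python
from collections import Counter


def hexamer_counts(seqs, k=6):
    """Count every overlapping k-mer in a collection of sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        counts.update(s[i:i + k] for i in range(len(s) - k + 1))
    return counts, sum(counts.values())


def enriched_hexamers(crm_seqs, background_seqs, min_ratio=2.0, pseudocount=1.0, k=6):
    """Select k-mers whose smoothed frequency in CRMs is at least
    `min_ratio` times their smoothed frequency in background sequence."""
    crm, crm_total = hexamer_counts(crm_seqs, k)
    bg, bg_total = hexamer_counts(background_seqs, k)
    selected = set()
    for hexamer, n in crm.items():
        f_crm = (n + pseudocount) / (crm_total + pseudocount)
        f_bg = (bg[hexamer] + pseudocount) / (bg_total + pseudocount)
        if f_crm / f_bg >= min_ratio:
            selected.add(hexamer)
    return selected


def window_scores(seq, selected, window=20, k=6):
    """Score each sliding window by the number of selected k-mers it contains."""
    seq = seq.upper()
    hits = [1 if seq[i:i + k] in selected else 0 for i in range(len(seq) - k + 1)]
    span = window - k + 1  # number of k-mer start positions per window
    return [sum(hits[i:i + span]) for i in range(len(hits) - span + 1)]


if __name__ == "__main__":
    # Toy training data, purely illustrative.
    selected = enriched_hexamers(["ACGTACGTACGT"], ["TTTTTTTTTTTT"])
    scores = window_scores("T" * 10 + "ACGTACGT" + "T" * 10, selected, window=10)
    print(sorted(selected))
    print(scores)
```

    Windows with high scores would be the candidate CRMs; how scores are thresholded and how predictions are compared with known CRMs is left open here.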

  9. Mitochondrial genome sequences illuminate maternal lineages of conservation concern in a rare carnivore

    Science.gov (United States)

    Brian J. Knaus; Richard Cronn; Aaron Liston; Kristine Pilgrim; Michael K. Schwartz

    2011-01-01

    Science-based wildlife management relies on genetic information to infer population connectivity and identify conservation units. The most commonly used genetic marker for characterizing animal biodiversity and identifying maternal lineages is the mitochondrial genome. Mitochondrial genotyping figures prominently in conservation and management plans, with much of the...

  10. Identification of distal regulatory regions in the human alpha IIb gene locus necessary for consistent, high-level megakaryocyte expression.

    Science.gov (United States)

    Thornton, Michael A; Zhang, Chunyan; Kowalska, Maria A; Poncz, Mortimer

    2002-11-15

    The alphaIIb/beta3-integrin receptor is present at high levels only in megakaryocytes and platelets. Its presence on platelets is critical for hemostasis. The tissue-specific nature of this receptor's expression is secondary to the restricted expression of alphaIIb, and studies of the alphaIIb proximal promoter have served as a model of a megakaryocyte-specific promoter. We have examined the alphaIIb gene locus for distal regulatory elements. Sequence comparison between the human (h) and murine (m) alphaIIb loci revealed high levels of conservation at intergenic regions both 5' and 3' to the alphaIIb gene. Additionally, deoxyribonuclease (DNase) I sensitivity mapping defined tissue-specific hypersensitive (HS) sites that coincide, in part, with these conserved regions. Transgenic mice containing various lengths of the h(alpha)IIb gene locus, which included or excluded the various conserved/HS regions, demonstrated that the proximal promoter was sufficient for tissue specificity, but that a region 2.5 to 7.1 kb upstream of the h(alpha)IIb gene was necessary for consistent expression. Another region 2.2 to 7.4 kb downstream of the gene enhanced expression 1000-fold and led to levels of h(alpha)IIb mRNA that were about 30% of the native m(alpha)IIb mRNA level. These constructs also resulted in detectable h(alpha)IIb/m(beta)3 on the platelet surface. This work not only confirms the importance of the proximal promoter of the alphaIIb gene for tissue specificity, but also characterizes the distal organization of the alphaIIb gene locus and provides an initial localization of 2 important regulatory regions needed for the expression of the alphaIIb gene at high levels during megakaryopoiesis.

  11. Low-pass shotgun sequencing of the barley genome facilitates rapid identification of genes, conserved non-coding sequences and novel repeats

    Directory of Open Access Journals (Sweden)

    Graner Andreas

    2008-10-01

    Background: Barley has one of the largest and most complex genomes of all economically important food crops. The rise of new short read sequencing technologies such as Illumina/Solexa permits such large genomes to be effectively sampled at relatively low cost. Based on the corresponding sequence reads a Mathematically Defined Repeat (MDR) index can be generated to map repetitive regions in genomic sequences. Results: We have generated 574 Mbp of Illumina/Solexa sequences from barley total genomic DNA, representing about 10% of a genome equivalent. From these sequences we generated an MDR index which was then used to identify and mark repetitive regions in the barley genome. Comparison of the MDR plots with expert repeat annotation drawing on the information already available for known repetitive elements revealed a significant correspondence between the two methods. MDR-based annotation nevertheless allowed for the identification of dozens of novel repeat sequences that were not recognised by hand-annotation. The MDR data was also used to identify gene-containing regions by masking of repetitive sequences in eight de-novo sequenced bacterial artificial chromosome (BAC) clones. For half of the identified candidate gene islands, gene sequences could indeed be identified. MDR data were only of limited use when mapped on genomic sequences from the closely related species Triticum monococcum, as only a fraction of the repetitive sequences was recognised. Conclusion: An MDR index for barley, which was obtained by whole-genome Illumina/Solexa sequencing, proved as efficient in repeat identification as manual expert annotation. Circumventing the labour-intensive step of producing a specific repeat library for expert annotation, an MDR index provides an elegant and efficient resource for the identification of repetitive and low-copy (i.e. potentially gene-containing sequences) regions in uncharacterised genomic sequences.
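
    The idea of a read-derived repeat index can be illustrated with a small k-mer counting sketch. This is not the authors' pipeline: the k-mer length, the masking threshold, and the function names `build_kmer_index`, `mdr_profile` and `mask_repeats` are assumptions chosen for illustration, and only forward-strand k-mers are counted.

```python
from collections import Counter


def build_kmer_index(reads, k):
    """Count every k-mer observed in the sampled reads (forward strand only)."""
    index = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            index[read[i:i + k]] += 1
    return index


def mdr_profile(sequence, index, k):
    """For each k-mer start position in `sequence`, look up its count in the index."""
    return [index[sequence[i:i + k]] for i in range(len(sequence) - k + 1)]


def mask_repeats(sequence, index, k, threshold):
    """Lower-case every base covered by a k-mer seen at least `threshold` times."""
    masked = list(sequence)
    for i, count in enumerate(mdr_profile(sequence, index, k)):
        if count >= threshold:
            for j in range(i, i + k):
                masked[j] = masked[j].lower()
    return "".join(masked)


# Toy reads standing in for a low-coverage sample (hypothetical data).
reads = ["ACGTACGTAC", "GTACGTACGT", "TTGCA"]
index = build_kmer_index(reads, k=4)
print(mask_repeats("TTGCAACGTACG", index, k=4, threshold=3))  # prints TTGCAacgtacg
```

    In this toy case the unmasked upper-case stretch plays the role of a candidate low-copy region; real data would need much longer k-mers and a threshold scaled to sequencing depth.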

  12. Structural classification of endogenous regulatory oligopeptides.

    Science.gov (United States)

    Zamyatnin, A A

    1991-07-01

    Based on the criteria of 50% identity in the amino acid sequence, a new method for grouping endogenous regulatory oligopeptides into structural families is presented. Data from the EROP-Moscow data bank on 579 oligopeptides fitting a preset spectrum of functional activities revealed 73 structural oligopeptide groups, 36 of which were called families.

  13. 77 FR 7968 - Semiannual Regulatory Agenda

    Science.gov (United States)

    2012-02-13

    ... Regulation Sequence No. Title Identifier No. 392 Non-Federal Oil and Gas 1024-AD78 Rights. National Park.... Timetable: Action Date FR Cite NPRM 07/00/12 Regulatory Flexibility Analysis Required: Yes. Agency Contact... anaconda, and Beni anaconda. Timetable: Action Date FR Cite ANPRM 01/31/08 73 FR 5784 ANPRM Comment Period...

  14. Structural and Sequence Similarities of Hydra Xeroderma Pigmentosum A Protein to Human Homolog Suggest Early Evolution and Conservation

    Directory of Open Access Journals (Sweden)

    Apurva Barve

    2013-01-01

    Xeroderma pigmentosum group A (XPA) is a protein that binds to damaged DNA, verifies presence of a lesion, and recruits other proteins of the nucleotide excision repair (NER) pathway to the site. Though its homologs from yeast, Drosophila, humans, and so forth are well studied, XPA has not so far been reported from protozoa and lower animal phyla. Hydra is a fresh-water cnidarian with a remarkable capacity for regeneration and apparent lack of organismal ageing. Cnidarians are among the first metazoa with a defined body axis, tissue grade organisation, and nervous system. We report here for the first time presence of XPA gene in hydra. Putative protein sequence of hydra XPA contains nuclear localization signal and bears the zinc-finger motif. It contains two conserved Pfam domains and various characterized features of XPA proteins like regions for binding to excision repair cross-complementing protein-1 (ERCC1) and replication protein A 70 kDa subunit (RPA70) proteins. Hydra XPA shows a high degree of similarity with vertebrate homologs and clusters with deuterostomes in phylogenetic analysis. Homology modelling corroborates the very close similarity between hydra and human XPA. The protein thus most likely functions in hydra in the same manner as in other animals, indicating that it arose early in evolution and has been conserved across animal phyla.

  15. Genome-wide discovery of novel and conserved microRNAs in white shrimp (Litopenaeus vannamei).

    Science.gov (United States)

    Xi, Qian-Yun; Xiong, Yuan-Yan; Wang, Yuan-Mei; Cheng, Xiao; Qi, Qi-En; Shu, Gang; Wang, Song-Bo; Wang, Li-Na; Gao, Ping; Zhu, Xiao-Tong; Jiang, Qing-Yan; Zhang, Yong-Liang; Liu, Li

    2015-01-01

    In recent years, a large number of conserved and species-specific microRNAs (miRNAs) have been identified in species that are economically important but lack a full genome sequence. In this study, Solexa deep sequencing and cross-species miRNA microarray were used to detect miRNAs in white shrimp. We identified 239 conserved miRNAs, 14 miRNA* sequences and 20 novel miRNAs by bioinformatics analysis from 7,561,406 high-quality reads representing 325,370 distinct sequences. All 20 novel miRNAs were species-specific in white shrimp and had no homologs in other species. Using the conserved miRNAs from the miRBase database as a query set to search for homologs among shrimp expressed sequence tags (ESTs), 32 conserved computationally predicted miRNAs were discovered in shrimp. In addition, using microarray analysis in shrimp fed with Panax ginseng polysaccharide complex, 151 conserved miRNAs were identified, 18 of which were significantly up-regulated, while 49 were significantly down-regulated. qRT-PCR analysis was also performed for nine miRNAs in three shrimp tissues: muscle, gill and hepatopancreas. Results showed that the expression of these miRNAs is tissue specific. Combining results of the three methods, we detected 20 novel and 394 conserved miRNAs. Verification with quantitative reverse transcription PCR (qRT-PCR) and Northern blot showed that the data are highly reliable. The study provides the first comprehensive specific miRNA profile of white shrimp, which includes useful information for future investigations into the function of miRNAs in regulation of shrimp development and immunology.

  16. Computational Analysis of an Evolutionarily Conserved Vertebrate Muscle Alternative Splicing Program

    Energy Technology Data Exchange (ETDEWEB)

    Das, Debopriya; Clark, Tyson A.; Schweitzer, Anthony; Marr, Henry; Yamamoto, Miki L.; Parra, Marilyn K.; Arribere, Josh; Minovitsky, Simon; Dubchak, Inna; Blume, John E.; Conboy, John G.

    2006-06-15

    A novel exon microarray format that probes gene expression with single exon resolution was employed to elucidate critical features of a vertebrate muscle alternative splicing program. A dataset of 56 microarray-defined, muscle-enriched exons and their flanking introns were examined computationally in order to investigate coordination of the muscle splicing program. Candidate intron regulatory motifs were required to meet several stringent criteria: significant over-representation near muscle-enriched exons, correlation with muscle expression, and phylogenetic conservation among genomes of several vertebrate orders. Three classes of regulatory motifs were identified in the proximal downstream intron, within 200nt of the target exons: UGCAUG, a specific binding site for Fox-1 related splicing factors; ACUAAC, a novel branchpoint-like element; and UG-/UGC-rich elements characteristic of binding sites for CELF splicing factors. UGCAUG was remarkably enriched, being present in nearly one-half of all cases. These studies suggest that Fox and CELF splicing factors play a major role in enforcing the muscle-specific alternative splicing program, facilitating expression of a set of unique isoforms of cytoskeletal proteins that are critical to muscle cell differentiation. Supplementary materials: There are four supplementary tables and one supplementary figure. The tables provide additional detailed information concerning the muscle-enriched datasets, and about over-represented oligonucleotide sequences in the flanking introns. The supplementary figure shows RT-PCR data confirming the muscle-enriched expression of exons predicted from the microarray analysis.
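
    The over-representation criterion described above can be sketched as a comparison of how often a motif occurs in the first 200 nt of the downstream intron for muscle-enriched exons versus a background set. The function names `motif_fraction` and `enrichment` and the toy sequences are assumptions for illustration; the sketch only compares fractions and does not perform the statistical testing, expression correlation, or conservation filtering used in the study.

```python
def motif_fraction(introns, motif, window=200):
    """Fraction of intron sequences containing `motif` within the first `window` nt."""
    if not introns:
        return 0.0
    hits = sum(1 for seq in introns if motif in seq[:window])
    return hits / len(introns)


def enrichment(target_introns, background_introns, motif, window=200):
    """Return (target fraction, background fraction, ratio of the two)."""
    fg = motif_fraction(target_introns, motif, window)
    bg = motif_fraction(background_introns, motif, window)
    ratio = fg / bg if bg > 0 else float("inf")
    return fg, bg, ratio


# Hypothetical downstream intron starts (RNA alphabet), not data from the study.
muscle = ["GUAAGUUGCAUGCC", "GUGAGUACUAACUU", "GUAAGAAUGCAUGA", "GUAAGUCCCCCCCC"]
background = ["GUAAGUAAAAAAAA", "GUAAGUUGCAUGAA", "GUGAGUCCCCUUUU", "GUAAGGGGGGGGGG"]

fg, bg, ratio = enrichment(muscle, background, "UGCAUG")
print(f"target={fg:.2f} background={bg:.2f} ratio={ratio:.1f}")
```

    With the toy input, the motif appears in half of the target introns, which mirrors the "nearly one-half of all cases" observation only by construction.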

  17. Regulatory review of probabilistic safety assessment (PSA) level 1

    International Nuclear Information System (INIS)

    2000-02-01

    Probabilistic safety assessment (PSA) is increasingly being used as part of the decision making process to assess the level of safety of nuclear power plants. The methodologies in use are maturing and the insights gained from the PSAs are being used along with those from the deterministic analysis. Many regulatory authorities consider that the current state of the art in PSA (especially Level 1 PSA) is sufficiently well developed that it can be used centrally in the regulatory decision making process - referred to as 'risk informed regulation'. For these applications to be successful, it will be necessary for regulatory authorities to have a high degree of confidence in PSA. However, at the IAEA Technical Committee Meeting on Use of PSA in the Regulatory Process in 1994 and at the OECD Nuclear Energy Agency Committee for Nuclear Regulatory Activities (CNRA) 'Special Issues' Meeting in 1997 on Review Procedures and Criteria for Different Regulatory Applications of PSA, it was recognized that formal regulatory review guidance for PSA did not exist. The senior regulators noted that there was a need to produce some international guidance for reviewing PSAs to establish an agreed basis for assessing whether important technological and methodological issues in PSAs are treated adequately and to verify that conclusions reached are appropriate. In 1997 the IAEA and OECD Nuclear Energy Agency agreed to produce in co-operation a technical document on the regulatory review of PSA. This publication is intended to provide guidance to regulatory authorities on how to review the PSA for a nuclear power plant to gain confidence that it has been carried out to an acceptable standard so that it can be used as the basis for taking risk informed decisions within a regulatory decision making process. The document gives guidance on how to set about reviewing a PSA and on the technical issues that need to be addressed. This publication gives guidance for the review of Level 1 PSA for

  18. Identification of microRNAs from Amur grape (Vitis amurensis Rupr.) by deep sequencing and analysis of microRNA variations with bioinformatics.

    Science.gov (United States)

    Wang, Chen; Han, Jian; Liu, Chonghuai; Kibet, Korir Nicholas; Kayesh, Emrul; Shangguan, Lingfei; Li, Xiaoying; Fang, Jinggui

    2012-03-29

    MicroRNA (miRNA) is a class of functional non-coding small RNA with 19-25 nucleotides in length while Amur grape (Vitis amurensis Rupr.) is an important wild fruit crop with the strongest cold resistance among the Vitis species, is used as an excellent breeding parent for grapevine, and has elicited growing interest in wine production. To date, there is a relatively large number of grapevine miRNAs (vv-miRNAs) from cultivated grapevine varieties such as Vitis vinifera L. and hybrids of V. vinifera and V. labrusca, but there is no report on miRNAs from Vitis amurensis Rupr, a wild grapevine species. A small RNA library from Amur grape was constructed and Solexa technology used to perform deep sequencing of the library followed by subsequent bioinformatics analysis to identify new miRNAs. In total, 126 conserved miRNAs belonging to 27 miRNA families were identified, and 34 known but non-conserved miRNAs were also found. Significantly, 72 new potential Amur grape-specific miRNAs were discovered. The sequences of these new potential va-miRNAs were further validated through miR-RACE, and accumulation of 18 new va-miRNAs in seven tissues of grapevines confirmed by real time RT-PCR (qRT-PCR) analysis. The expression levels of va-miRNAs in flowers and berries were found to be basically consistent in identity to those from deep sequenced sRNAs libraries of combined corresponding tissues. We also describe the conservation and variation of va-miRNAs using miR-SNPs and miR-LDs during plant evolution based on comparison of orthologous sequences, and further reveal that the number and sites of miR-SNP in diverse miRNA families exhibit distinct divergence. Finally, 346 target genes for the new miRNAs were predicted and they include a number of Amur grape stress tolerance genes and many genes regulating anthocyanin synthesis and sugar metabolism. 
Deep sequencing of short RNAs from Amur grape flowers and berries identified 72 new potential miRNAs and 34 known but non-conserved mi

  19. Identification of microRNAs from Amur grape (Vitis amurensis Rupr.) by deep sequencing and analysis of microRNA variations with bioinformatics

    Directory of Open Access Journals (Sweden)

    Wang Chen

    2012-03-01

    Background: MicroRNA (miRNA) is a class of functional non-coding small RNA with 19-25 nucleotides in length while Amur grape (Vitis amurensis Rupr.) is an important wild fruit crop with the strongest cold resistance among the Vitis species, is used as an excellent breeding parent for grapevine, and has elicited growing interest in wine production. To date, there is a relatively large number of grapevine miRNAs (vv-miRNAs) from cultivated grapevine varieties such as Vitis vinifera L. and hybrids of V. vinifera and V. labrusca, but there is no report on miRNAs from Vitis amurensis Rupr, a wild grapevine species. Results: A small RNA library from Amur grape was constructed and Solexa technology used to perform deep sequencing of the library followed by subsequent bioinformatics analysis to identify new miRNAs. In total, 126 conserved miRNAs belonging to 27 miRNA families were identified, and 34 known but non-conserved miRNAs were also found. Significantly, 72 new potential Amur grape-specific miRNAs were discovered. The sequences of these new potential va-miRNAs were further validated through miR-RACE, and accumulation of 18 new va-miRNAs in seven tissues of grapevines confirmed by real time RT-PCR (qRT-PCR) analysis. The expression levels of va-miRNAs in flowers and berries were found to be basically consistent in identity to those from deep sequenced sRNAs libraries of combined corresponding tissues. We also describe the conservation and variation of va-miRNAs using miR-SNPs and miR-LDs during plant evolution based on comparison of orthologous sequences, and further reveal that the number and sites of miR-SNP in diverse miRNA families exhibit distinct divergence. Finally, 346 target genes for the new miRNAs were predicted and they include a number of Amur grape stress tolerance genes and many genes regulating anthocyanin synthesis and sugar metabolism.
    Conclusions: Deep sequencing of short RNAs from Amur grape flowers and berries identified 72

  20. Sequence walkers: a graphical method to display how binding proteins interact with DNA or RNA sequences

    Science.gov (United States)

    A graphical method is presented for displaying how binding proteins and other macromolecules interact with individual bases of nucleotide sequences. Characters representing the sequence are either oriented normally and placed above a line indicating favorable contact, or upside-down and placed below the line indicating unfavorable contact. The positive or negative height of each letter shows the contribution of that base to the average sequence conservation of the binding site, as represented by a sequence logo.
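
    The letter heights described above can be sketched numerically. The example assumes the usual individual-information weight of 2 + log2(frequency) bits per base for DNA, omits the small-sample correction, and adds a pseudocount so that unobserved bases get a finite negative height; the names `weight_matrix` and `walker`, the pseudocount value, and the aligned sites are illustrative assumptions, not taken from the record.

```python
import math

BASES = "ACGT"


def weight_matrix(sites, pseudocount=0.5):
    """Per-position weights 2 + log2(frequency), in bits, from aligned binding sites."""
    length = len(sites[0])
    matrix = []
    for pos in range(length):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append({b: 2 + math.log2(counts[b] / total) for b in BASES})
    return matrix


def walker(sequence, matrix):
    """Return (base, height) pairs; positive heights are drawn above the line,
    negative heights upside-down below it."""
    return [(base, matrix[i][base]) for i, base in enumerate(sequence)]


# Hypothetical aligned sites used to build the weights.
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
matrix = weight_matrix(sites)

for base, height in walker("GATAAT", matrix):
    side = "above" if height > 0 else "below"
    print(f"{base}: {height:+.2f} bits ({side} the line)")
```

    Summing the heights over one sequence gives that sequence's individual information under these assumptions; averaging that sum over the aligned sites approximates the total height of the corresponding sequence logo.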

  1. Systematic identification of regulatory variants associated with cancer risk.

    Science.gov (United States)

    Liu, Song; Liu, Yuwen; Zhang, Qin; Wu, Jiayu; Liang, Junbo; Yu, Shan; Wei, Gong-Hong; White, Kevin P; Wang, Xiaoyue

    2017-10-23

    Most cancer risk-associated single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) are noncoding and it is challenging to assess their functional impacts. To systematically identify the SNPs that affect gene expression by modulating activities of distal regulatory elements, we adapt the self-transcribing active regulatory region sequencing (STARR-seq) strategy, a high-throughput technique to functionally quantify enhancer activities. From 10,673 SNPs linked with 996 cancer risk-associated SNPs identified in previous GWAS studies, we identify 575 SNPs in the fragments that positively regulate gene expression, and 758 SNPs in the fragments with negative regulatory activities. Among them, 70 variants are regulatory variants for which the two alleles confer different regulatory activities. We analyze in depth two regulatory variants, breast cancer risk SNP rs11055880 and leukemia risk-associated SNP rs12142375, and demonstrate their endogenous regulatory activities on expression of ATF7IP and PDE4B genes, respectively, using a CRISPR-Cas9 approach. By identifying regulatory variants associated with cancer susceptibility and studying their molecular functions, we hope to help the interpretation of GWAS results and provide improved information for cancer risk assessment.

  2. Regulatory heterochronies and loose temporal scaling between sea star and sea urchin regulatory circuits.

    Science.gov (United States)

    Gildor, Tsvia; Hinman, Veronica; Ben-Tabou-De-Leon, Smadar

    2017-01-01

    It has long been argued that heterochrony, a change in relative timing of a developmental process, is a major source of evolutionary innovation. Heterochronic changes of regulatory gene activation could be the underlying molecular mechanism driving heterochronic changes through evolution. Here, we compare the temporal expression profiles of key regulatory circuits between sea urchin and sea star, representative of two classes of Echinoderms that shared a common ancestor about 500 million years ago. The morphologies of the sea urchin and sea star embryos are largely comparable, yet, differences in certain mesodermal cell types and ectodermal patterning result in distinct larval body plans. We generated high resolution temporal profiles of 17 mesodermally-, endodermally- and ectodermally-expressed regulatory genes in the sea star, Patiria miniata, and compared these to their orthologs in the Mediterranean sea urchin, Paracentrotus lividus. We found that the maternal to zygotic transition is delayed in the sea star compared to the sea urchin, in agreement with the longer cleavage stage in the sea star. Interestingly, the order of gene activation shows the highest variation in the relatively diverged mesodermal circuit, while the correlations of expression dynamics are the highest in the strongly conserved endodermal circuit. We detected loose scaling of the developmental rates of these species and observed interspecies heterochronies within all studied regulatory circuits. Thus, after 500 million years of parallel evolution, mild heterochronies between the species are frequently observed and the tight temporal scaling observed for closely related species no longer holds.

  3. Highly accessible AU-rich regions in 3’ untranslated regions are hotspots for binding of regulatory factors

    Science.gov (United States)

    2017-01-01

    Post-transcriptional regulation is regarded as one of the major processes involved in the regulation of gene expression. It is mainly performed by RNA binding proteins and microRNAs, which target RNAs and typically affect their stability. Recent efforts from the scientific community have aimed at understanding post-transcriptional regulation at a global scale by using high-throughput sequencing techniques such as cross-linking and immunoprecipitation (CLIP), which facilitates identification of binding sites of these regulatory factors. However, the diversity in the experimental procedures and bioinformatics analyses has hindered the integration of multiple datasets and thus limited the development of an integrated view of post-transcriptional regulation. In this work, we have performed a comprehensive analysis of 107 CLIP datasets from 49 different RBPs in HEK293 cells to shed light on the complex interactions that govern post-transcriptional regulation. By developing a more stringent CLIP analysis pipeline we have discovered the existence of conserved regulatory AU-rich regions in the 3’UTRs where miRNAs and RBPs that regulate several processes such as polyadenylation or mRNA stability bind. Analogous to promoters, many factors have binding sites overlapping or in close proximity in these hotspots and hence the regulation of the mRNA may depend on their relative concentrations. This hypothesis is supported by RBP knockdown experiments that alter the relative concentration of RBPs in the cell. Upon AGO2 knockdown (KD), transcripts containing “free” target sites show increased expression levels compared to those containing target sites in hotspots, which suggests that target sites within hotspots are less available for miRNAs to bind. Interestingly, these hotspots appear enriched in genes with regulatory functions such as DNA binding and RNA binding. Taken together, our results suggest that hotspots are functional regulatory elements that define an extra layer

  4. Transcription factor trapping by RNA in gene regulatory elements.

    Science.gov (United States)

    Sigova, Alla A; Abraham, Brian J; Ji, Xiong; Molinie, Benoit; Hannett, Nancy M; Guo, Yang Eric; Jangi, Mohini; Giallourakis, Cosmas C; Sharp, Phillip A; Young, Richard A

    2015-11-20

    Transcription factors (TFs) bind specific sequences in promoter-proximal and -distal DNA elements to regulate gene transcription. RNA is transcribed from both of these DNA elements, and some DNA binding TFs bind RNA. Hence, RNA transcribed from regulatory elements may contribute to stable TF occupancy at these sites. We show that the ubiquitously expressed TF Yin-Yang 1 (YY1) binds to both gene regulatory elements and their associated RNA species across the entire genome. Reduced transcription of regulatory elements diminishes YY1 occupancy, whereas artificial tethering of RNA enhances YY1 occupancy at these elements. We propose that RNA makes a modest but important contribution to the maintenance of certain TFs at gene regulatory elements and suggest that transcription of regulatory elements produces a positive-feedback loop that contributes to the stability of gene expression programs. Copyright © 2015, American Association for the Advancement of Science.

  5. Sequence of human protamine 2 cDNA

    Energy Technology Data Exchange (ETDEWEB)

    Domenjoud, L; Fronia, C; Uhde, F; Engel, W [Universitaet Goettingen (West Germany)]

    1988-08-11

    The authors report the cloning and sequencing of a cDNA clone for human protamine 2 (hp2), isolated from a human testis cDNA library cloned in the vector λ-gt11. A 66mer oligonucleotide that corresponds to an amino acid sequence which is highly conserved between hp2 and mouse protamine 2 (mp2) served as hybridization probe. The homology between the amino acid sequence deduced from our cDNA and the published amino acid sequence for hp2 is 100%.

  6. Comparative analysis of catfish BAC end sequences with the zebrafish genome

    Directory of Open Access Journals (Sweden)

    Abernathy Jason

    2009-12-01

    Background: Comparative mapping is a powerful tool to transfer genomic information from sequenced genomes to closely related species for which whole genome sequence data are not yet available. However, such an approach is still very limited in catfish, the most important aquaculture species in the United States. This project was initiated to generate additional BAC end sequences and demonstrate their applications in comparative mapping in catfish. Results: We reported the generation of 43,000 BAC end sequences and their applications for comparative genome analysis in catfish. Using these and the additional 20,000 existing BAC end sequences as a resource along with linkage mapping and existing physical map, conserved syntenic regions were identified between the catfish and zebrafish genomes. A total of 10,943 catfish BAC end sequences (17.3%) had significant BLAST hits to the zebrafish genome (cutoff value ≤ e-5), of which 3,221 were unique gene hits, providing a platform for comparative mapping based on locations of these genes in catfish and zebrafish. Genetic linkage mapping of microsatellites associated with contigs allowed identification of large conserved genomic segments and construction of super scaffolds. Conclusion: BAC end sequences and their associated polymorphic markers are great resources for comparative genome analysis in catfish. Highly conserved chromosomal regions were identified between catfish and zebrafish. However, it appears that the level of conservation at local genomic regions is high, while a high level of chromosomal shuffling and rearrangement exists between the catfish and zebrafish genomes. Orthologous regions established through comparative analysis should facilitate both structural and functional genome analysis in catfish.

  7. Radiation and the regulatory landscape of neo²-Darwinism

    Energy Technology Data Exchange (ETDEWEB)

    Rollo, C. David [Department of Biology, Life Sciences Building, 1280 Main St. West, Hamilton, Ont., Canada L8S 4K1 (Canada)]. E-mail: rollocd@mcmaster.ca

    2006-05-11

    Several recently revealed features of eukaryotic genomes were not predicted by earlier evolutionary paradigms, including the relatively small number of genes, the very large amounts of non-functional code and its quarantine in heterochromatin, the remarkable conservation of many functionally important genes across relatively enormous phylogenetic distances, and the prevalence of extra-genomic information associated with chromatin structure and histone proteins. All of these emphasize a paramount role for regulatory evolution, which is further reinforced by recent perspectives highlighting even higher-order regulation governing epigenetics and development (EVO-DEVO). Modern neo²-Darwinism, with its emphasis on regulatory mechanisms and regulatory evolution, provides new vision for understanding radiation biology, particularly because free radicals and redox states are central to many regulatory mechanisms and free radicals generated by radiation mimic and amplify endogenous signalling. This paper explores some of these aspects and their implications for low-dose radiation biology.

  8. Cis-regulatory element based targeted gene finding: genome-wide identification of abscisic acid- and abiotic stress-responsive genes in Arabidopsis thaliana.

    Science.gov (United States)

    Zhang, Weixiong; Ruan, Jianhua; Ho, Tuan-Hua David; You, Youngsook; Yu, Taotao; Quatrano, Ralph S

    2005-07-15

    A fundamental problem of computational genomics is identifying the genes that respond to certain endogenous cues and environmental stimuli. This problem can be referred to as targeted gene finding. Since gene regulation is mainly determined by the binding of transcription factors and cis-regulatory DNA sequences, most existing gene annotation methods, which exploit the conservation of open reading frames, are not effective in finding target genes. A viable approach to targeted gene finding is to exploit the cis-regulatory elements that are known to be responsible for the transcription of target genes. Given such cis-elements, putative target genes whose promoters contain the elements can be identified. As a case study, we apply the above approach to predict the genes in the model plant Arabidopsis thaliana which are inducible by a phytohormone, abscisic acid (ABA), and abiotic stress, such as drought, cold and salinity. We first construct and analyze two ABA-specific cis-elements, ABA-responsive element (ABRE) and its coupling element (CE), in A. thaliana, based on their conservation in rice and other cereal plants. We then use the ABRE-CE module to identify putative ABA-responsive genes in A. thaliana. Based on RT-PCR verification and the results from literature, this method has an accuracy rate of 67.5% for the top 40 predictions. The cis-element based targeted gene finding approach is expected to be widely applicable since a large number of cis-elements in many species are available.
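
    The promoter-scanning step described above can be sketched as follows. The two regular expressions are placeholder patterns chosen for illustration and are not the ABRE and CE models constructed in the study; the spacing limit `max_gap`, the function names, and the toy promoters are likewise assumptions, and only the forward strand is scanned.

```python
import re

# Placeholder patterns for illustration only (assumptions, not the study's models).
ABRE_PATTERN = re.compile(r"ACGTG[GT]C")
CE_PATTERN = re.compile(r"CACCG")


def has_module(promoter, max_gap=50):
    """True if an ABRE-like and a CE-like match start within `max_gap` nt of each other."""
    abre_starts = [m.start() for m in ABRE_PATTERN.finditer(promoter)]
    ce_starts = [m.start() for m in CE_PATTERN.finditer(promoter)]
    return any(abs(a - c) <= max_gap for a in abre_starts for c in ce_starts)


def candidate_genes(promoters, max_gap=50):
    """Names of genes whose promoter carries both elements close together."""
    return sorted(g for g, seq in promoters.items() if has_module(seq, max_gap))


def precision_at_k(ranked_genes, confirmed, k):
    """Share of the top-k predictions that appear in the confirmed set."""
    top = ranked_genes[:k]
    if not top:
        return 0.0
    return sum(1 for g in top if g in confirmed) / len(top)


# Hypothetical promoter fragments.
promoters = {
    "geneA": "TTTACGTGGCAAAACACCGTT",  # both elements present
    "geneB": "TTTACGTGGCAAAA",         # ABRE-like match only
    "geneC": "GGGCACCGAAAACGTGTCTT",   # both elements present
}
print(candidate_genes(promoters))  # prints ['geneA', 'geneC']
```

    For reference, an accuracy of 67.5% over the top 40 predictions corresponds to 27 confirmed genes out of 40, which is what `precision_at_k` would return for such a ranking.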

  9. Evidence for deep regulatory similarities in early developmental programs across highly diverged insects.

    Science.gov (United States)

    Kazemian, Majid; Suryamohan, Kushal; Chen, Jia-Yu; Zhang, Yinan; Samee, Md Abul Hassan; Halfon, Marc S; Sinha, Saurabh

    2014-09-01

    Many genes familiar from Drosophila development, such as the so-called gap, pair-rule, and segment polarity genes, play important roles in the development of other insects and in many cases appear to be deployed in a similar fashion, despite the fact that Drosophila-like "long germband" development is highly derived and confined to a subset of insect families. Whether or not these similarities extend to the regulatory level is unknown. Identification of regulatory regions beyond the well-studied Drosophila has been challenging as even within the Diptera (flies, including mosquitoes) regulatory sequences have diverged past the point of recognition by standard alignment methods. Here, we demonstrate that methods we previously developed for computational cis-regulatory module (CRM) discovery in Drosophila can be used effectively in highly diverged (250-350 Myr) insect species including Anopheles gambiae, Tribolium castaneum, Apis mellifera, and Nasonia vitripennis. In Drosophila, we have successfully used small sets of known CRMs as "training data" to guide the search for other CRMs with related function. We show here that although species-specific CRM training data do not exist, training sets from Drosophila can facilitate CRM discovery in diverged insects. We validate in vivo over a dozen new CRMs, roughly doubling the number of known CRMs in the four non-Drosophila species. Given the growing wealth of Drosophila CRM annotation, these results suggest that extensive regulatory sequence annotation will be possible in newly sequenced insects without recourse to costly and labor-intensive genome-scale experiments. We develop a new method, Regulus, which computes a probabilistic score of similarity based on binding site composition (despite the absence of nucleotide-level sequence alignment), and demonstrate similarity between functionally related CRMs from orthologous loci. 
Our work represents an important step toward being able to trace the evolutionary history of gene

  10. Evidence for Deep Regulatory Similarities in Early Developmental Programs across Highly Diverged Insects

    Science.gov (United States)

    Zhang, Yinan; Samee, Md. Abul Hassan; Halfon, Marc S.; Sinha, Saurabh

    2014-01-01

    Many genes familiar from Drosophila development, such as the so-called gap, pair-rule, and segment polarity genes, play important roles in the development of other insects and in many cases appear to be deployed in a similar fashion, despite the fact that Drosophila-like “long germband” development is highly derived and confined to a subset of insect families. Whether or not these similarities extend to the regulatory level is unknown. Identification of regulatory regions beyond the well-studied Drosophila has been challenging as even within the Diptera (flies, including mosquitoes) regulatory sequences have diverged past the point of recognition by standard alignment methods. Here, we demonstrate that methods we previously developed for computational cis-regulatory module (CRM) discovery in Drosophila can be used effectively in highly diverged (250–350 Myr) insect species including Anopheles gambiae, Tribolium castaneum, Apis mellifera, and Nasonia vitripennis. In Drosophila, we have successfully used small sets of known CRMs as “training data” to guide the search for other CRMs with related function. We show here that although species-specific CRM training data do not exist, training sets from Drosophila can facilitate CRM discovery in diverged insects. We validate in vivo over a dozen new CRMs, roughly doubling the number of known CRMs in the four non-Drosophila species. Given the growing wealth of Drosophila CRM annotation, these results suggest that extensive regulatory sequence annotation will be possible in newly sequenced insects without recourse to costly and labor-intensive genome-scale experiments. We develop a new method, Regulus, which computes a probabilistic score of similarity based on binding site composition (despite the absence of nucleotide-level sequence alignment), and demonstrate similarity between functionally related CRMs from orthologous loci. Our work represents an important step toward being able to trace the evolutionary

  11. High-Throughput Sequencing Reveals Hypothalamic MicroRNAs as Novel Partners Involved in Timing the Rapid Development of Chicken (Gallus gallus) Gonads.

    Science.gov (United States)

    Han, Wei; Zou, Jianmin; Wang, Kehua; Su, Yijun; Zhu, Yunfen; Song, Chi; Li, Guohui; Qu, Liang; Zhang, Huiyong; Liu, Honglin

    2015-01-01

    Onset of rapid gonad growth is a milestone in sexual development that involves many genes and regulatory factors. Observations in model organisms and mammals, including humans, have shown a potential link between miRNAs and developmental timing. To determine whether miRNAs play roles in this process in the chicken (Gallus gallus), Solexa deep sequencing was performed to analyze the profiles of miRNA expression in the hypothalamus of hens from two different pubertal stages, before onset of the rapid gonad development (BO) and after onset of the rapid gonad development (AO). 374 conserved and 46 novel miRNAs were identified as hypothalamus-expressed miRNAs in the chicken. 144 conserved miRNAs were shown to be differentially expressed (reads > 10, P [...] time quantitative RT-PCR (qRT-PCR) method. 2013 putative genes were predicted as the targets of the 15 most differentially expressed miRNAs (fold-change > 4.0, P [...] times by the miRNAs. qRT-PCR revealed the basic transcription levels of these clock genes were much higher (P [...] development of chicken gonads. Considering the characteristics of miRNA functional conservation, the results will contribute to the research on puberty onset in humans.

  12. Evaluation of the conserve flavin reductase gene from three ...

    African Journals Online (AJOL)

    2009-12-15

    Dec 15, 2009 ... means of PCR technique. The nucleic acid sequences of the PCR primers were designed using conserved nucleic acid sequences of the flavin reductase enzyme from Rhodococcus sp. strain IGTS8. The oligonucleotide primers were as follows: 5'-GAA TTC ATG TCT GAC AAG CCG AAT GCC-3' (forward) ...

  13. H-2RIIBP, a member of the nuclear hormone receptor superfamily that binds to both the regulatory element of major histocompatibility class I genes and the estrogen response element.

    Science.gov (United States)

    Hamada, K; Gleason, S L; Levi, B Z; Hirschfeld, S; Appella, E; Ozato, K

    1989-11-01

    Transcription of major histocompatibility complex (MHC) class I genes is regulated by the conserved MHC class I regulatory element (CRE). The CRE has two factor-binding sites, region I and region II, both of which elicit enhancer function. By screening a mouse lambda gt 11 library with the CRE as a probe, we isolated a cDNA clone that encodes a protein capable of binding to region II of the CRE. This protein, H-2RIIBP (H-2 region II binding protein), bound to the native region II sequence, but not to other MHC cis-acting sequences or to mutant region II sequences, similar to the naturally occurring region II factor in mouse cells. The deduced amino acid sequence of H-2RIIBP revealed two putative zinc fingers homologous to the DNA-binding domain of steroid/thyroid hormone receptors. Although sequence similarity in other regions was minimal, H-2RIIBP has apparent modular domains characteristic of the nuclear hormone receptors. Further analyses showed that both H-2RIIBP and the natural region II factor bind to the estrogen response element (ERE) of the vitellogenin A2 gene. The ERE is composed of a palindrome, and half of this palindrome resembles the region II binding site of the MHC CRE. These results indicate that H-2RIIBP (i) is a member of the superfamily of nuclear hormone receptors and (ii) may regulate not only MHC class I genes but also genes containing the ERE and related sequences. Sequences homologous to the H-2RIIBP gene are widely conserved in the animal kingdom. H-2RIIBP mRNA is expressed in many mouse tissues, in agreement with the distribution of the natural region II factor.

  14. Inverted repeats in the promoter as an autoregulatory sequence for TcrX in Mycobacterium tuberculosis

    International Nuclear Information System (INIS)

    Bhattacharya, Monolekha; Das, Amit Kumar

    2011-01-01

    Highlights: ► The regulatory sequences recognized by TcrX have been identified. ► The regulatory region comprises inverted repeats separated by a 30 bp region. ► The mode of binding of TcrX with the regulatory sequence is unique. ► In silico TcrX–DNA docked model binds one of the inverted repeats. ► Both phosphorylated and unphosphorylated TcrX bind the regulatory sequence in vitro. -- Abstract: TcrY, a histidine kinase, and TcrX, a response regulator, constitute a two-component system in Mycobacterium tuberculosis. tcrX, which is expressed during iron scarcity, is instrumental in the survival of iron-dependent M. tuberculosis. However, the regulator of tcrX/Y has not been fully characterized. Crosslinking studies of TcrX reveal that it can form oligomers in vitro. Electrophoretic mobility shift assays (EMSAs) show that TcrX recognizes two regions in the promoter that are comprised of inverted repeats separated by ∼30 bp. The dimeric in silico model of TcrX predicts binding to one of these inverted repeat regions. Site-directed mutagenesis and radioactive phosphorylation indicate that D54 of TcrX is phosphorylated by H256 of TcrY. However, phosphorylated and unphosphorylated TcrX bind the regulatory sequence with equal efficiency, which was shown with an EMSA using the D54A TcrX mutant.

  15. Evidence for widespread degradation of gene control regions in hominid genomes.

    Directory of Open Access Journals (Sweden)

    Peter D Keightley

    2005-02-01

    Although sequences containing regulatory elements located close to protein-coding genes are often only weakly conserved during evolution, comparisons of rodent genomes have implied that these sequences are subject to some selective constraints. Evolutionary conservation is particularly apparent upstream of coding sequences and in first introns, regions that are enriched for regulatory elements. By comparing the human and chimpanzee genomes, we show here that there is almost no evidence for conservation in these regions in hominids. Furthermore, we show that gene expression is diverging more rapidly in hominids than in murids per unit of neutral sequence divergence. By combining data on polymorphism levels in human noncoding DNA and the corresponding human-chimpanzee divergence, we show that the proportion of adaptive substitutions in these regions in hominids is very low. It therefore seems likely that the lack of conservation and increased rate of gene expression divergence are caused by a reduction in the effectiveness of natural selection against deleterious mutations because of the low effective population sizes of hominids. This has resulted in the accumulation of a large number of deleterious mutations in sequences containing gene control elements and hence a widespread degradation of the genome during the evolution of humans and chimpanzees.

  16. Analysis of tomato plasma membrane H(+)-ATPase gene family suggests a mycorrhiza-mediated regulatory mechanism conserved in diverse plant species.

    Science.gov (United States)

    Liu, Junli; Liu, Jianjian; Chen, Aiqun; Ji, Minjie; Chen, Jiadong; Yang, Xiaofeng; Gu, Mian; Qu, Hongye; Xu, Guohua

    2016-10-01

    In plants, the plasma membrane H(+)-ATPase (HA) is considered to play a crucial role in regulating plant growth and responding to environmental stresses. Multiple paralogous genes encoding different isozymes of HA have been identified and characterized in several model plants, while limited information on the HA gene family is available to date for tomato. Here, we describe the molecular and expression features of eight HA-encoding genes (SlHA1-8) from tomato. All these genes are interrupted by multiple introns with conserved positions. SlHA1, 2, and 4 were widely expressed in all tissues, while SlHA5, 6, and 7 were almost only expressed in flowers. SlHA8, the transcripts of which were barely detectable under normal or nutrient-/salt-stress growth conditions, was strongly activated in arbuscular mycorrhizal (AM) fungal-colonized roots. Extreme lack of SlHA8 expression in M161, a mutant defective in AM fungal colonization, provided genetic evidence towards the dependence of its expression on AM symbiosis. A 1521-bp SlHA8 promoter could direct the GUS reporter expression specifically in colonized cells of transgenic tobacco, soybean, and rice mycorrhizal roots. Promoter deletion assay revealed a 223-bp promoter fragment of SlHA8 containing a variant of AM-specific cis-element MYCS (vMYCS) sufficient to confer the AM-induced activity. Targeted deletion of this motif in the corresponding promoter region causes complete abolishment of GUS staining in mycorrhizal roots. Together, these results lend cogent evidence towards the evolutionary conservation of a potential regulatory mechanism mediating the activation of AM-responsive HA genes in diverse mycorrhizal plant species.

  17. Single nucleotide polymorphism in transcriptional regulatory regions and expression of environmentally responsive genes

    International Nuclear Information System (INIS)

    Wang, Xuting; Tomso, Daniel J.; Liu Xuemei; Bell, Douglas A.

    2005-01-01

    Single nucleotide polymorphisms (SNPs) in the human genome are DNA sequence variations that can alter an individual's response to environmental exposure. SNPs in gene coding regions can lead to changes in the biological properties of the encoded protein. In contrast, SNPs in non-coding gene regulatory regions may affect gene expression levels in an allele-specific manner, and these functional polymorphisms represent an important but relatively unexplored class of genetic variation. The main challenge in analyzing these SNPs is a lack of robust computational and experimental methods. Here, we first outline mechanisms by which genetic variation can impact gene regulation, and review recent findings in this area; then, we describe a methodology for bioinformatic discovery and functional analysis of regulatory SNPs in cis-regulatory regions using the assembled human genome sequence and databases on sequence polymorphism and gene expression. Our method integrates SNP and gene databases and uses a set of computer programs that allow us to: (1) select SNPs, from among the >9 million human SNPs in the NCBI dbSNP database, that are similar to cis-regulatory element (RE) consensus sequences; (2) map the selected dbSNP entries to the human genome assembly in order to identify polymorphic REs near gene start sites; (3) prioritize the candidate polymorphic RE containing genes by searching the existing genotype and gene expression data sets. The applicability of this system has been demonstrated through studies on p53 responsive elements and is being extended to additional pathways and environmentally responsive genes.
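
    Step (1) of the method above, selecting SNPs whose flanking sequence resembles a regulatory element consensus, can be illustrated with a minimal sketch. This is not the authors' software. The function names, the example flanking sequences and the mismatch-count scoring are hypothetical, and the consensus RRRCWWGYYY is the commonly cited p53 half-site, used here only as an assumed example input.

```python
# Hypothetical sketch: score each SNP allele against an IUPAC consensus.
# Not the authors' programs; names and scoring are illustrative assumptions.

# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def mismatches(window: str, consensus: str) -> int:
    """Count positions where the window violates the consensus."""
    return sum(base not in IUPAC[code] for base, code in zip(window, consensus))


def best_match(seq: str, consensus: str) -> int:
    """Smallest mismatch count over all windows of consensus length."""
    k = len(consensus)
    return min(mismatches(seq[i:i + k], consensus)
               for i in range(len(seq) - k + 1))


def allele_effect(left: str, alleles, right: str, consensus: str) -> dict:
    """Best mismatch count per allele, using only windows that overlap the SNP."""
    k = len(consensus)
    # Keep at most k-1 flanking bases on each side so every window covers the SNP.
    left_flank = left[max(0, len(left) - (k - 1)):]
    right_flank = right[:k - 1]
    return {a: best_match(left_flank + a + right_flank, consensus)
            for a in alleles}


if __name__ == "__main__":
    # Made-up flanks; the C allele completes a perfect half-site match.
    print(allele_effect("AAGGG", ("C", "T"), "ATGTCCAA", "RRRCWWGYYY"))
```

    A SNP would be a candidate when at least one allele scores close to zero mismatches and the alleles differ in score. A real pipeline would more likely use position weight matrices and both strands, which this sketch omits.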

  18. A compact, in vivo screen of all 6-mers reveals drivers of tissue-specific expression and guides synthetic regulatory element design.

    Science.gov (United States)

    Smith, Robin P; Riesenfeld, Samantha J; Holloway, Alisha K; Li, Qiang; Murphy, Karl K; Feliciano, Natalie M; Orecchia, Lorenzo; Oksenberg, Nir; Pollard, Katherine S; Ahituv, Nadav

    2013-07-18

    Large-scale annotation efforts have improved our ability to coarsely predict regulatory elements throughout vertebrate genomes. However, it is unclear how complex spatiotemporal patterns of gene expression driven by these elements emerge from the activity of short, transcription factor binding sequences. We describe a comprehensive promoter extension assay in which the regulatory potential of all 6 base-pair (bp) sequences was tested in the context of a minimal promoter. To enable this large-scale screen, we developed algorithms that use a reverse-complement aware decomposition of the de Bruijn graph to design a library of DNA oligomers incorporating every 6-bp sequence exactly once. Our library multiplexes all 4,096 unique 6-mers into 184 double-stranded 15-bp oligomers, which is sufficiently compact for in vivo testing. We injected each multiplexed construct into zebrafish embryos and scored GFP expression in 15 tissues at two developmental time points. Twenty-seven constructs produced consistent expression patterns, with the majority doing so in only one tissue. Functional sequences are enriched near biologically relevant genes, match motifs for developmental transcription factors, and are required for enhancer activity. By concatenating tissue-specific functional sequences, we generated completely synthetic enhancers for the notochord, epidermis, spinal cord, forebrain and otic lateral line, and show that short regulatory sequences do not always function modularly. This work introduces a unique in vivo catalog of short, functional regulatory sequences and demonstrates several important principles of regulatory element organization. Furthermore, we provide resources for designing compact, reverse-complement aware k-mer libraries.
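
    The idea of a reverse-complement aware k-mer library can be made concrete with a small sketch. It only enumerates 6-mers and collapses each one with its reverse complement; it does not reproduce the authors' de Bruijn graph decomposition or their oligomer design, and the function names are hypothetical.

```python
# Hypothetical sketch: count 6-mers that are distinct up to reverse complement.
# This is an illustration of the concept, not the authors' library design code.
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Pick one representative for a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def canonical_kmers(k: int) -> set:
    """All k-mers, with each reverse-complement pair collapsed to one entry."""
    return {canonical("".join(p)) for p in product("ACGT", repeat=k)}


if __name__ == "__main__":
    k = 6
    distinct = canonical_kmers(k)
    palindromes = [s for s in distinct if s == reverse_complement(s)]
    # 4096 6-mers in total; 64 are their own reverse complement,
    # so 2080 remain once reverse-complement pairs are merged.
    print(4 ** k, len(palindromes), len(distinct))
```

    Because a double-stranded oligomer presents every k-mer together with its reverse complement, a design that tracks canonical forms avoids testing the same sequence twice.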

  19. Perception Enhancement using Visual Attributes in Sequence Motif Visualization

    OpenAIRE

    Oon, Yin; Lee, Nung; Kok, Wei

    2016-01-01

    Sequence logo is a well-accepted scientific method to visualize the conservation characteristics of biological sequence motifs. Previous studies found that using sequence logo graphical representation for scientific evidence reports or arguments could seriously cause biases and misinterpretation by users. This study investigates on the visual attributes performance of a sequence logo in helping users to perceive and interpret the information based on preattentive theories and Gestalt principl...

  20. A Sequence and Structure Based Method to Predict Putative Substrates, Functions and Regulatory Networks of Endo Proteases

    Science.gov (United States)

    Venkatraman, Prasanna; Balakrishnan, Satish; Rao, Shashidhar; Hooda, Yogesh; Pol, Suyog

    2009-01-01

    Background: Proteases play a central role in cellular homeostasis and are responsible for the spatio-temporal regulation of function. Many putative proteases have been recently identified through genomic approaches, leading to a surge in global profiling attempts to characterize their function. Through such efforts and others it has become evident that many proteases play non-traditional roles. Accordingly, the number and the variety of the substrate repertoire of proteases are expected to be much larger than previously assumed. In line with such global profiling attempts, we present here a method for the prediction of natural substrates of endo proteases (human proteases used as an example) by employing short peptide sequences as specificity determinants. Methodology/Principal Findings: Our method incorporates specificity determinants unique to individual enzymes and physiologically relevant dual filters, namely solvent accessible surface area (a parameter dependent on protein three-dimensional structure) and subcellular localization. By incorporating such hitherto unused principles in prediction methods, a novel ligand docking strategy to mimic substrate binding at the active site of the enzyme, and GO functions, we identify and perform subjective validation on putative substrates of matriptase and highlight new functions of the enzyme. Using relative solvent accessibility to rank order we show how new protease regulatory networks and enzyme cascades can be created. Conclusion: We believe that our physiologically relevant computational approach would be a very useful complementary method in the current day attempts to profile proteases (endo proteases in particular) and their substrates. In addition, by using functional annotations, we have demonstrated how normal and unknown functions of a protease can be envisaged. We have developed a network which can be integrated to create a proteolytic world. This network can in turn be extended to integrate other regulatory

  1. De novo prediction of structured RNAs from genomic sequences

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.; Þórarinsson, Elfar

    2010-01-01

    currently available, because evolutionary conservation highlights functionally important regions. Conserved secondary structure, rather than primary sequence, is the hallmark of many functionally important RNAs, because compensatory substitutions in base-paired regions preserve structure. Unfortunately...

  2. Repertoire of bovine miRNA and miRNA-like small regulatory RNAs expressed upon viral infection.

    Directory of Open Access Journals (Sweden)

    Evgeny A Glazov

    MicroRNA (miRNA) and other types of small regulatory RNAs play a crucial role in the regulation of gene expression in eukaryotes. Several distinct classes of small regulatory RNAs have been discovered in recent years. To extend the repertoire of small RNAs characterized in mammals and to examine the relationship between host miRNA expression and viral infection, we used Illumina's ultrahigh throughput sequencing approach. We sequenced three small RNA libraries prepared from a cell line derived from the adult bovine kidney under normal conditions and upon infection of the cell line with Bovine herpesvirus 1. We used a bioinformatics approach to distinguish authentic mature miRNA sequences from other classes of small RNAs and short RNA fragments represented in the sequencing data. Using this approach we detected 219 out of 356 known bovine miRNAs and 115 respective miRNA* sequences. In addition we identified five new bovine orthologs of known mammalian miRNAs and discovered 268 new cow miRNAs, many of which are not identifiable in other mammalian genomes and thus might be specific to the ruminant lineage. In addition we found seven new bovine mirtron candidates. We also discovered 10 small nucleolar RNA (snoRNA) loci that give rise to small RNA with possible miRNA-like function. Results presented in this study extend our knowledge of the biology and evolution of small regulatory RNAs in mammals and illuminate mechanisms of small RNA biogenesis and function. New miRNA sequences and the original sequencing data have been submitted to the miRNA repository (miRBase) and the NCBI GEO archive, respectively. We envisage that these resources will facilitate functional annotation of the bovine genome and promote further functional and comparative genomics studies of small regulatory RNA in mammals.

  3. Case studies in residual use and energy conservation at wastewater treatment plants

    Energy Technology Data Exchange (ETDEWEB)

    Stewart, D. [Science Applications International Corp., Los Altos, CA (United States)

    1995-06-01

    The US Environmental Protection Agency (EPA) and the National Renewable Energy Laboratory (NREL) for the US Department of Energy (DOE) funded a study to document energy conservation activities and their effects on operation costs, regulatory compliance, and process optimization at several wastewater treatment plants (WWTPs). The purpose of this report is to review the efforts of wastewater treatment facilities that use residuals as fuels. Case histories are presented for facilities that have taken measures to reduce energy consumption during wastewater treatment. Most of the WWTPs discussed in this report have retrofitted existing facilities to achieve energy conservation. The case studies of energy conservation measures found no effects on the facilities' ability to comply with NPDES permits. Indeed, energy conservation activities enhance environmental compliance in several ways.

  4. Genome-wide prediction of cis-regulatory regions using supervised deep learning methods.

    Science.gov (United States)

    Li, Yifeng; Shi, Wenqiang; Wasserman, Wyeth W

    2018-05-31

    In the human genome, 98% of DNA sequences are non-protein-coding regions that were previously disregarded as junk DNA. In fact, non-coding regions host a variety of cis-regulatory regions which precisely control the expression of genes. Thus, identifying active cis-regulatory regions in the human genome is critical for understanding gene regulation and assessing the impact of genetic variation on phenotype. The developments of high-throughput sequencing and machine learning technologies make it possible to predict cis-regulatory regions genome wide. Based on rich data resources such as the Encyclopedia of DNA Elements (ENCODE) and the Functional Annotation of the Mammalian Genome (FANTOM) projects, we introduce DECRES based on supervised deep learning approaches for the identification of enhancer and promoter regions in the human genome. Due to their ability to discover patterns in large and complex data, the introduction of deep learning methods enables a significant advance in our knowledge of the genomic locations of cis-regulatory regions. Using models for well-characterized cell lines, we identify key experimental features that contribute to the predictive performance. Applying DECRES, we delineate locations of 300,000 candidate enhancers genome wide (6.8% of the genome, of which 40,000 are supported by bidirectional transcription data), and 26,000 candidate promoters (0.6% of the genome). The predicted annotations of cis-regulatory regions will provide broad utility for genome interpretation from functional genomics to clinical applications. The DECRES model demonstrates the potential of deep learning technologies when combined with high-throughput sequencing data, and inspires the development of other advanced neural network models for further improvement of genome annotations.

  5. Combining protein sequence, structure, and dynamics: A novel approach for functional evolution analysis of PAS domain superfamily.

    Science.gov (United States)

    Dong, Zheng; Zhou, Hongyu; Tao, Peng

    2018-02-01

    PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore functional evolutionary relationship among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from RCSB Protein Data Bank of the PAS domain superfamily belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The result showed that the proteins with same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant difference in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.

  6. Impacts of Neanderthal-Introgressed Sequences on the Landscape of Human Gene Expression.

    Science.gov (United States)

    McCoy, Rajiv C; Wakefield, Jon; Akey, Joshua M

    2017-02-23

    Regulatory variation influencing gene expression is a key contributor to phenotypic diversity, both within and between species. Unfortunately, RNA degrades too rapidly to be recovered from fossil remains, limiting functional genomic insights about our extinct hominin relatives. Many Neanderthal sequences survive in modern humans due to ancient hybridization, providing an opportunity to assess their contributions to transcriptional variation and to test hypotheses about regulatory evolution. We developed a flexible Bayesian statistical approach to quantify allele-specific expression (ASE) in complex RNA-seq datasets. We identified widespread expression differences between Neanderthal and modern human alleles, indicating pervasive cis-regulatory impacts of introgression. Brain regions and testes exhibited significant downregulation of Neanderthal alleles relative to other tissues, consistent with natural selection influencing the tissue-specific regulatory landscape. Our study demonstrates that Neanderthal-inherited sequences are not silent remnants of ancient interbreeding but have measurable impacts on gene expression that contribute to variation in modern human phenotypes. Copyright © 2017 Elsevier Inc. All rights reserved.

  7. Barcoded DNA-tag reporters for multiplex cis-regulatory analysis.

    Directory of Open Access Journals (Sweden)

    Jongmin Nam

    Cis-regulatory DNA sequences causally mediate patterns of gene expression, but efficient experimental analysis of these control systems has remained challenging. Here we develop a new version of "barcoded" DNA-tag reporters, "Nanotags", that permit simultaneous quantitative analysis of up to 130 distinct cis-regulatory modules (CRMs). The activities of these reporters are measured in single experiments by the NanoString RNA counting method and other quantitative procedures. We demonstrate the efficiency of the Nanotag method by simultaneously measuring hourly temporal activities of 126 CRMs from 46 genes in the developing sea urchin embryo, otherwise a virtually impossible task. Nanotags are also used in gene perturbation experiments to reveal cis-regulatory responses of many CRMs at once. Nanotag methodology can be applied to many research areas, ranging from gene regulatory networks to functional and evolutionary genomics.

  8. Conserved gene regulatory module specifies lateral neural borders across bilaterians.

    Science.gov (United States)

    Li, Yongbin; Zhao, Di; Horie, Takeo; Chen, Geng; Bao, Hongcun; Chen, Siyu; Liu, Weihong; Horie, Ryoko; Liang, Tao; Dong, Biyu; Feng, Qianqian; Tao, Qinghua; Liu, Xiao

    2017-08-01

    The lateral neural plate border (NPB), the neural part of the vertebrate neural border, is composed of central nervous system (CNS) progenitors and peripheral nervous system (PNS) progenitors. In invertebrates, PNS progenitors are also juxtaposed to the lateral boundary of the CNS. Whether there are conserved molecular mechanisms determining vertebrate and invertebrate lateral neural borders remains unclear. Using single-cell-resolution gene-expression profiling and genetic analysis, we present evidence that orthologs of the NPB specification module specify the invertebrate lateral neural border, which is composed of CNS and PNS progenitors. First, like in vertebrates, the conserved neuroectoderm lateral border specifier Msx/vab-15 specifies lateral neuroblasts in Caenorhabditis elegans. Second, orthologs of the vertebrate NPB specification module (Msx/vab-15, Pax3/7/pax-3, and Zic/ref-2) are significantly enriched in worm lateral neuroblasts. In addition, like in other bilaterians, the expression domain of Msx/vab-15 is more lateral than those of Pax3/7/pax-3 and Zic/ref-2 in C. elegans. Third, we show that Msx/vab-15 regulates the development of mechanosensory neurons derived from lateral neural progenitors in multiple invertebrate species, including C. elegans, Drosophila melanogaster, and Ciona intestinalis. We also identify a novel lateral neural border specifier, ZNF703/tlp-1, which functions synergistically with Msx/vab-15 in both C. elegans and Xenopus laevis. These data suggest a common origin of the molecular mechanism specifying lateral neural borders across bilaterians.

  9. The spotted gar genome illuminates vertebrate evolution and facilitates human-to-teleost comparisons

    Science.gov (United States)

    Braasch, Ingo; Gehrke, Andrew R.; Smith, Jeramiah J.; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M.; Campbell, Michael S.; Barrell, Daniel; Martin, Kyle J.; Mulley, John F.; Ravi, Vydianathan; Lee, Alison P.; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E. G.; Sun, Yi; Hertel, Jana; Beam, Michael J.; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H.; Litman, Gary W.; Litman, Ronda T.; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F.; Wang, Han; Taylor, John S.; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M. J.; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A.; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T.; Venkatesh, Byrappa; Holland, Peter W. H.; Guiguen, Yann; Bobe, Julien; Shubin, Neil H.; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H.

    2016-01-01

    To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before the teleost genome duplication (TGD). The slowly evolving gar genome conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization, and development (e.g., Hox, ParaHox, and miRNA genes). Numerous conserved non-coding elements (CNEs, often cis-regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles of such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses revealed that the sum of expression domains and levels from duplicated teleost genes often approximate patterns and levels of gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes, and the function of human regulatory sequences. PMID:26950095

  10. The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons.

    Science.gov (United States)

    Braasch, Ingo; Gehrke, Andrew R; Smith, Jeramiah J; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M; Campbell, Michael S; Barrell, Daniel; Martin, Kyle J; Mulley, John F; Ravi, Vydianathan; Lee, Alison P; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E G; Sun, Yi; Hertel, Jana; Beam, Michael J; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H; Litman, Gary W; Litman, Ronda T; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F; Wang, Han; Taylor, John S; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M J; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T; Venkatesh, Byrappa; Holland, Peter W H; Guiguen, Yann; Bobe, Julien; Shubin, Neil H; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H

    2016-04-01

    To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.

  11. DREISS: Using State-Space Models to Infer the Dynamics of Gene Expression Driven by External and Internal Regulatory Networks

    Science.gov (United States)

    Gerstein, Mark

    2016-01-01

    Gene expression is controlled by the combinatorial effects of regulatory factors from different biological subsystems such as general transcription factors (TFs), cellular growth factors and microRNAs. A subsystem’s gene expression may be controlled by its internal regulatory factors, exclusively, or by external subsystems, or by both. It is thus useful to distinguish the degree to which a subsystem is regulated internally or externally–e.g., how non-conserved, species-specific TFs affect the expression of conserved, cross-species genes during evolution. We developed a computational method (DREISS, dreiss.gerteinlab.org) for analyzing the Dynamics of gene expression driven by Regulatory networks, both External and Internal based on State Space models. Given a subsystem, the “state” and “control” in the model refer to its own (internal) and another subsystem’s (external) gene expression levels. The state at a given time is determined by the state and control at a previous time. Because typical time-series data do not have enough samples to fully estimate the model’s parameters, DREISS uses dimensionality reduction, and identifies canonical temporal expression trajectories (e.g., degradation, growth and oscillation) representing the regulatory effects emanating from various subsystems. To demonstrate capabilities of DREISS, we study the regulatory effects of evolutionarily conserved vs. divergent TFs across distant species. In particular, we applied DREISS to the time-series gene expression datasets of C. elegans and D. melanogaster during their embryonic development. We analyzed the expression dynamics of the conserved, orthologous genes (orthologs), seeing the degree to which these can be accounted for by orthologous (internal) versus species-specific (external) TFs. We found that between two species, the orthologs have matched, internally driven expression patterns but very different externally driven ones. This is particularly true for genes with
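
    The sentence "the state at a given time is determined by the state and control at a previous time" can be illustrated with a small simulation. This is a minimal sketch that assumes a linear, time-invariant update x[t+1] = A x[t] + B u[t]; the abstract does not give the exact model form, and the matrices A and B, the function names, and the toy values below are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def simulate(A: Matrix, B: Matrix, x0: Vector, controls: List[Vector]) -> List[Vector]:
    """Iterate x[t+1] = A x[t] + B u[t]; return all states, starting with x0."""
    states = [list(x0)]
    for u in controls:
        internal = mat_vec(A, states[-1])  # contribution of the subsystem's own state
        external = mat_vec(B, u)           # contribution of the external subsystem
        states.append([p + q for p, q in zip(internal, external)])
    return states


# Hypothetical toy system (assumption, not from the paper):
# one internal expression variable and one external control variable.
A = [[0.5]]  # internal effect: the state halves at every step
B = [[1.0]]  # external effect: the control is added directly to the state
trajectory = simulate(A, B, x0=[1.0], controls=[[0.0], [0.0], [2.0]])
```

    In this toy run the state decays while the control is zero and jumps when the control is switched on, which is the kind of internally versus externally driven behavior the abstract distinguishes.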

  12. DREISS: Using State-Space Models to Infer the Dynamics of Gene Expression Driven by External and Internal Regulatory Networks.

    Directory of Open Access Journals (Sweden)

    Daifeng Wang

    2016-10-01

    Gene expression is controlled by the combinatorial effects of regulatory factors from different biological subsystems such as general transcription factors (TFs), cellular growth factors and microRNAs. A subsystem's gene expression may be controlled by its internal regulatory factors, exclusively, or by external subsystems, or by both. It is thus useful to distinguish the degree to which a subsystem is regulated internally or externally, e.g., how non-conserved, species-specific TFs affect the expression of conserved, cross-species genes during evolution. We developed a computational method (DREISS, dreiss.gerteinlab.org) for analyzing the Dynamics of gene expression driven by Regulatory networks, both External and Internal based on State Space models. Given a subsystem, the "state" and "control" in the model refer to its own (internal) and another subsystem's (external) gene expression levels. The state at a given time is determined by the state and control at a previous time. Because typical time-series data do not have enough samples to fully estimate the model's parameters, DREISS uses dimensionality reduction, and identifies canonical temporal expression trajectories (e.g., degradation, growth and oscillation) representing the regulatory effects emanating from various subsystems. To demonstrate capabilities of DREISS, we study the regulatory effects of evolutionarily conserved vs. divergent TFs across distant species. In particular, we applied DREISS to the time-series gene expression datasets of C. elegans and D. melanogaster during their embryonic development. We analyzed the expression dynamics of the conserved, orthologous genes (orthologs), seeing the degree to which these can be accounted for by orthologous (internal) versus species-specific (external) TFs. We found that between two species, the orthologs have matched, internally driven expression patterns but very different externally driven ones. This is particularly true for genes with

  13. Regulatory agencies and regulatory risk

    OpenAIRE

    Knieps, Günter; Weiß, Hans-Jörg

    2008-01-01

    The aim of this paper is to show that regulatory risk is due to the discretionary behaviour of regulatory agencies, caused by a too extensive regulatory mandate provided by the legislator. The normative point of reference and a behavioural model of regulatory agencies based on the positive theory of regulation are presented. Regulatory risk with regard to the future behaviour of regulatory agencies is modelled as the consequence of the ex ante uncertainty about the relative influence of inter...

  14. Network perturbation by recurrent regulatory variants in cancer.

    Directory of Open Access Journals (Sweden)

    Kiwon Jang

    2017-03-01

    Cancer driving genes have been identified as recurrently affected by variants that alter protein-coding sequences. However, a majority of cancer variants arise in noncoding regions, and some of them are thought to play a critical role through transcriptional perturbation. Here we identified putative transcriptional driver genes based on combinatorial variant recurrence in cis-regulatory regions. The identified genes showed high connectivity in the cancer type-specific transcription regulatory network, with high outdegree and many downstream genes, highlighting their causative role during tumorigenesis. In the protein interactome, the identified transcriptional drivers were not as highly connected as coding driver genes but appeared to form a network module centered on the coding drivers. The coding and regulatory variants associated via these interactions between the coding and transcriptional drivers showed exclusive and complementary occurrence patterns across tumor samples. Transcriptional cancer drivers may act through an extensive perturbation of the regulatory network and by altering protein network modules through interactions with coding driver genes.

  15. Identification and Analysis of Red Sea Mangrove (Avicennia marina) microRNAs by High-Throughput Sequencing and Their Association with Stress Responses

    KAUST Repository

    Khraiwesh, Basel; Pugalenthi, Ganesan; Fedoroff, Nina V.

    2013-01-01

    Although RNA silencing has been studied primarily in model plants, advances in high-throughput sequencing technologies have enabled profiling of the small RNA components of many more plant species, providing insights into the ubiquity and conservatism of some miRNA-based regulatory mechanisms. Small RNAs of 20 to 24 nucleotides (nt) are important regulators of gene transcript levels by either transcriptional or by posttranscriptional gene silencing, contributing to genome maintenance and controlling a variety of developmental and physiological processes. Here, we used deep sequencing and molecular methods to create an inventory of the small RNAs in the mangrove species, Avicennia marina. We identified 26 novel mangrove miRNAs and 193 conserved miRNAs belonging to 36 families. We determined that 2 of the novel miRNAs were produced from known miRNA precursors and 4 were likely to be species-specific by the criterion that we found no homologs in other plant species. We used qRT-PCR to analyze the expression of miRNAs and their target genes in different tissue sets and some demonstrated tissue-specific expression. Furthermore, we predicted potential targets of these putative miRNAs based on a sequence homology and experimentally validated through endonucleolytic cleavage assays. Our results suggested that expression profiles of miRNAs and their predicted targets could be useful in exploring the significance of the conservation patterns of plants, particularly in response to abiotic stress. Because of their well-developed abilities in this regard, mangroves and other extremophiles are excellent models for such exploration. © 2013 Khraiwesh et al.

  16. Identification and analysis of red sea mangrove (Avicennia marina) microRNAs by high-throughput sequencing and their association with stress responses.

    Directory of Open Access Journals (Sweden)

    Basel Khraiwesh

    Although RNA silencing has been studied primarily in model plants, advances in high-throughput sequencing technologies have enabled profiling of the small RNA components of many more plant species, providing insights into the ubiquity and conservatism of some miRNA-based regulatory mechanisms. Small RNAs of 20 to 24 nucleotides (nt) are important regulators of gene transcript levels by either transcriptional or by posttranscriptional gene silencing, contributing to genome maintenance and controlling a variety of developmental and physiological processes. Here, we used deep sequencing and molecular methods to create an inventory of the small RNAs in the mangrove species, Avicennia marina. We identified 26 novel mangrove miRNAs and 193 conserved miRNAs belonging to 36 families. We determined that 2 of the novel miRNAs were produced from known miRNA precursors and 4 were likely to be species-specific by the criterion that we found no homologs in other plant species. We used qRT-PCR to analyze the expression of miRNAs and their target genes in different tissue sets and some demonstrated tissue-specific expression. Furthermore, we predicted potential targets of these putative miRNAs based on a sequence homology and experimentally validated through endonucleolytic cleavage assays. Our results suggested that expression profiles of miRNAs and their predicted targets could be useful in exploring the significance of the conservation patterns of plants, particularly in response to abiotic stress. Because of their well-developed abilities in this regard, mangroves and other extremophiles are excellent models for such exploration.

  17. Identification and Analysis of Red Sea Mangrove (Avicennia marina) microRNAs by High-Throughput Sequencing and Their Association with Stress Responses

    KAUST Repository

    Khraiwesh, Basel

    2013-04-08

    Although RNA silencing has been studied primarily in model plants, advances in high-throughput sequencing technologies have enabled profiling of the small RNA components of many more plant species, providing insights into the ubiquity and conservatism of some miRNA-based regulatory mechanisms. Small RNAs of 20 to 24 nucleotides (nt) are important regulators of gene transcript levels by either transcriptional or by posttranscriptional gene silencing, contributing to genome maintenance and controlling a variety of developmental and physiological processes. Here, we used deep sequencing and molecular methods to create an inventory of the small RNAs in the mangrove species, Avicennia marina. We identified 26 novel mangrove miRNAs and 193 conserved miRNAs belonging to 36 families. We determined that 2 of the novel miRNAs were produced from known miRNA precursors and 4 were likely to be species-specific by the criterion that we found no homologs in other plant species. We used qRT-PCR to analyze the expression of miRNAs and their target genes in different tissue sets and some demonstrated tissue-specific expression. Furthermore, we predicted potential targets of these putative miRNAs based on a sequence homology and experimentally validated through endonucleolytic cleavage assays. Our results suggested that expression profiles of miRNAs and their predicted targets could be useful in exploring the significance of the conservation patterns of plants, particularly in response to abiotic stress. Because of their well-developed abilities in this regard, mangroves and other extremophiles are excellent models for such exploration. © 2013 Khraiwesh et al.

  18. Conserved upstream open reading frames in higher plants

    Directory of Open Access Journals (Sweden)

    Schultz Carolyn J

    2008-07-01

    Background: Upstream open reading frames (uORFs) can down-regulate the translation of the main open reading frame (mORF) through two broad mechanisms: ribosomal stalling and reducing reinitiation efficiency. In distantly related plants, such as rice and Arabidopsis, it has been found that conserved uORFs are rare in these transcriptomes with approximately 100 loci. It is unclear how prevalent conserved uORFs are in closely related plants. Results: We used a homology-based approach to identify conserved uORFs in five cereals (monocots) that could potentially regulate translation. Our approach used a modified reciprocal best hit method to identify putative orthologous sequences that were then analysed by a comparative R-nomics program called uORFSCAN to find conserved uORFs. Conclusion: This research identified new genes that may be controlled at the level of translation by conserved uORFs. We report that conserved uORFs are rare (
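
    The reciprocal best hit idea mentioned in this abstract can be sketched as follows. This is the plain, unmodified form of the method, not the authors' modified version, whose details the abstract does not give. The score dictionaries, sequence identifiers, and function names are hypothetical; in practice the scores would come from a similarity search run in both directions.

```python
from typing import Dict, List, Optional, Tuple

# query id -> {subject id: alignment score}
Hits = Dict[str, Dict[str, float]]


def best_hit(hits: Dict[str, float]) -> Optional[str]:
    """Return the subject with the highest score, or None if there are no hits."""
    if not hits:
        return None
    return max(hits, key=lambda subject: hits[subject])


def reciprocal_best_hits(a_vs_b: Hits, b_vs_a: Hits) -> List[Tuple[str, str]]:
    """Pairs (a, b) where b is the best hit of a AND a is the best hit of b."""
    pairs = []
    for a_id, hits in a_vs_b.items():
        b_id = best_hit(hits)
        if b_id is None:
            continue
        if best_hit(b_vs_a.get(b_id, {})) == a_id:
            pairs.append((a_id, b_id))
    return sorted(pairs)


# Hypothetical scores for two species A and B (illustrative values only).
a_vs_b = {"a1": {"b1": 200, "b2": 50}, "a2": {"b2": 90, "b1": 80}, "a3": {"b2": 70}}
b_vs_a = {"b1": {"a1": 210, "a2": 75}, "b2": {"a2": 95, "a3": 60}}
orthologs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

    Here a3 finds b2 as its best hit, but b2 prefers a2, so a3 is left without a putative ortholog. Ties are resolved by dictionary order in this sketch, which a real pipeline would need to handle explicitly.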

  19. ASAP: Amplification, sequencing & annotation of plastomes

    Directory of Open Access Journals (Sweden)

    Folta Kevin M

    2005-12-01

    Background: Availability of DNA sequence information is vital for pursuing structural, functional and comparative genomics studies in plastids. Traditionally, the first step in mining the valuable information within a chloroplast genome requires sequencing a chloroplast plasmid library or BAC clones. These activities involve complicated preparatory procedures like chloroplast DNA isolation or identification of the appropriate BAC clones to be sequenced. Rolling circle amplification (RCA) is being used currently to amplify the chloroplast genome from purified chloroplast DNA and the resulting products are sheared and cloned prior to sequencing. Herein we present a universal high-throughput, rapid PCR-based technique to amplify, sequence and assemble plastid genome sequence from diverse species in a short time and at reasonable cost from total plant DNA, using the large inverted repeat region from strawberry and peach as proof of concept. The method exploits the highly conserved coding regions or intergenic regions of plastid genes. Using an informatics approach, chloroplast DNA sequence information from 5 available eudicot plastomes was aligned to identify the most conserved regions. Cognate primer pairs were then designed to generate ~1 – 1.2 kb overlapping amplicons from the inverted repeat region in 14 diverse genera. Results: 100% coverage of the inverted repeat region was obtained from Arabidopsis, tobacco, orange, strawberry, peach, lettuce, tomato and Amaranthus. Over 80% coverage was obtained from distant species, including Ginkgo, loblolly pine and Equisetum. Sequence from the inverted repeat region of strawberry and peach plastome was obtained, annotated and analyzed. Additionally, a polymorphic region identified from gel electrophoresis was sequenced from tomato and Amaranthus. Sequence analysis revealed large deletions in these species relative to tobacco plastome thus exhibiting the utility of this method for structural and

  20. Regulatory RNAs in Bacillus subtilis: a Gram-Positive Perspective on Bacterial RNA-Mediated Regulation of Gene Expression

    Science.gov (United States)

    Mars, Ruben A. T.; Nicolas, Pierre; Denham, Emma L.

    2016-01-01

    Bacteria can employ widely diverse RNA molecules to regulate their gene expression. Such molecules include trans-acting small regulatory RNAs, antisense RNAs, and a variety of transcriptional attenuation mechanisms in the 5′ untranslated region. Thus far, most regulatory RNA research has focused on Gram-negative bacteria, such as Escherichia coli and Salmonella. Hence, there is uncertainty about whether the resulting insights can be extrapolated directly to other bacteria, such as the Gram-positive soil bacterium Bacillus subtilis. A recent study identified 1,583 putative regulatory RNAs in B. subtilis, whose expression was assessed across 104 conditions. Here, we review the current understanding of RNA-based regulation in B. subtilis, and we categorize the newly identified putative regulatory RNAs on the basis of their conservation in other bacilli and the stability of their predicted secondary structures. Our present evaluation of the publicly available data indicates that RNA-mediated gene regulation in B. subtilis mostly involves elements at the 5′ ends of mRNA molecules. These can include 5′ secondary structure elements and metabolite-, tRNA-, or protein-binding sites. Importantly, sense-independent segments are identified as the most conserved and structured potential regulatory RNAs in B. subtilis. Altogether, the present survey provides many leads for the identification of new regulatory RNA functions in B. subtilis. PMID:27784798

  1. Regulatory RNAs in Bacillus subtilis: a Gram-Positive Perspective on Bacterial RNA-Mediated Regulation of Gene Expression.

    Science.gov (United States)

    Mars, Ruben A T; Nicolas, Pierre; Denham, Emma L; van Dijl, Jan Maarten

    2016-12-01

    Bacteria can employ widely diverse RNA molecules to regulate their gene expression. Such molecules include trans-acting small regulatory RNAs, antisense RNAs, and a variety of transcriptional attenuation mechanisms in the 5' untranslated region. Thus far, most regulatory RNA research has focused on Gram-negative bacteria, such as Escherichia coli and Salmonella. Hence, there is uncertainty about whether the resulting insights can be extrapolated directly to other bacteria, such as the Gram-positive soil bacterium Bacillus subtilis. A recent study identified 1,583 putative regulatory RNAs in B. subtilis, whose expression was assessed across 104 conditions. Here, we review the current understanding of RNA-based regulation in B. subtilis, and we categorize the newly identified putative regulatory RNAs on the basis of their conservation in other bacilli and the stability of their predicted secondary structures. Our present evaluation of the publicly available data indicates that RNA-mediated gene regulation in B. subtilis mostly involves elements at the 5' ends of mRNA molecules. These can include 5' secondary structure elements and metabolite-, tRNA-, or protein-binding sites. Importantly, sense-independent segments are identified as the most conserved and structured potential regulatory RNAs in B. subtilis. Altogether, the present survey provides many leads for the identification of new regulatory RNA functions in B. subtilis. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  2. Structure-sequence based analysis for identification of conserved regions in proteins

    Science.gov (United States)

    Zemla, Adam T; Zhou, Carol E; Lam, Marisa W; Smith, Jason R; Pardes, Elizabeth

    2013-05-28

    Disclosed are computational methods, and associated hardware and software products for scoring conservation in a protein structure based on a computationally identified family or cluster of protein structures. A method of computationally identifying a family or cluster of protein structures is also disclosed herein.

  3. Differential sequence diversity at merozoite surface protein-1 locus of Plasmodium knowlesi from humans and macaques in Thailand.

    Science.gov (United States)

    Putaporntip, Chaturong; Thongaree, Siriporn; Jongwutiwes, Somchai

    2013-08-01

    To determine the genetic diversity and potential transmission routes of Plasmodium knowlesi, we analyzed the complete nucleotide sequence of the gene encoding the merozoite surface protein-1 of this simian malaria (Pkmsp-1), an asexual blood-stage vaccine candidate, from naturally infected humans and macaques in Thailand. Analysis of Pkmsp-1 sequences from humans (n=12) and monkeys (n=12) reveals five conserved and four variable domains. Most nucleotide substitutions in conserved domains were dimorphic whereas three of four variable domains contained complex repeats with extensive sequence and size variation. Besides purifying selection in conserved domains, evidence of intragenic recombination scattered across Pkmsp-1 was detected. The number of haplotypes, haplotype diversity, nucleotide diversity and recombination sites of human-derived sequences exceeded those of monkey-derived sequences. Phylogenetic networks based on concatenated conserved sequences of Pkmsp-1 displayed a character pattern that could have arisen from the sampling process or the presence of two independent routes of P. knowlesi transmission, i.e. from macaques to humans and from human to human in Thailand. Copyright © 2013 Elsevier B.V. All rights reserved.
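
    Two of the summary statistics named in this abstract, the number of haplotypes and nucleotide diversity, can be computed from aligned sequences as in the sketch below. It assumes nucleotide diversity is taken as the mean number of pairwise differences per site, which is a common definition; the abstract does not state which estimator or software the authors used, and the toy sequences and function names are hypothetical.

```python
from itertools import combinations
from typing import List


def count_haplotypes(seqs: List[str]) -> int:
    """Number of distinct sequences in the sample."""
    return len(set(seqs))


def nucleotide_diversity(seqs: List[str]) -> float:
    """Mean number of differing sites per site, averaged over all sequence pairs."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same non-zero length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / (len(pairs) * length)


# Hypothetical aligned sample of four sequences (not data from the study).
sample = ["ACGT", "ACGT", "ACGA", "TCGA"]
n_haplotypes = count_haplotypes(sample)
pi = nucleotide_diversity(sample)
```

    Comparing these values between two groups of sequences, such as human-derived and monkey-derived samples, is the kind of contrast the abstract reports.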

  4. Preliminary guidelines for electricity distributor conservation and demand management activities : a guide for conservation and demand management investment

    International Nuclear Information System (INIS)

    2004-01-01

    In May 2004, electricity distributors in Ontario were asked to submit deferral accounts to the Ontario Energy Board to track expenditures on conservation and demand management initiatives. The deferral accounts must be established before the distributor could recover the costs through the next installment of the allowable return on equity in March 2004. The Board will determine the appropriateness of the actual expenditures. These guidelines offer short-term assistance to distributors in establishing conservation and demand management plans and initiatives. The following specific measures may be supported by the Board: energy efficiency; operational changes to smart control systems; load management measures which facilitate interruptible and dispatchable loads, dual fuel applications, thermal storage and demand response; fuel switching measures; programs targeted to low income and hard to reach consumers; and, distributed energy options such as tri-generation, cogeneration, ground source heat pumps, wind and biomass systems. These guidelines describe the regulatory treatment of conservation and demand management investments along with cost effectiveness, allocation of costs, monitoring, evaluation, and implementation. 1 appendix

  5. In Silico Characterization of Pectate Lyase Protein Sequences from Different Source Organisms

    Directory of Open Access Journals (Sweden)

    Amit Kumar Dubey

    2010-01-01

    A total of 121 protein sequences of pectate lyases were subjected to homology search, multiple sequence alignment, phylogenetic tree construction, and motif analysis. The phylogenetic tree constructed revealed different clusters based on different source organisms representing bacterial, fungal, plant, and nematode pectate lyases. The multiple accessions of bacterial, fungal, nematode, and plant pectate lyase protein sequences were placed closely revealing a sequence level similarity. The multiple sequence alignment of these pectate lyase protein sequences from different source organisms showed conserved regions at different stretches with maximum homology from amino acid residues 439–467, 715–816, and 829–910 which could be used for designing degenerate primers or probes specific for pectate lyases. The motif analysis revealed a conserved Pec_Lyase_C domain uniformly observed in all pectate lyases irrespective of variable sources suggesting its possible role in structural and enzymatic functions.
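
    Locating conserved stretches in a multiple sequence alignment, as described above, can be sketched by scoring each alignment column and collecting runs of high-scoring columns. This is a minimal illustration under stated assumptions: per-column identity is used as the conservation score, gaps count as mismatches, and the thresholds, function names, and toy alignment are hypothetical. The abstract does not say how the authors delimited their regions.

```python
from collections import Counter
from typing import List, Tuple


def column_identity(column: str) -> float:
    """Fraction of sequences sharing the most common residue; gaps count as mismatches."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    return Counter(residues).most_common(1)[0][1] / len(column)


def conserved_runs(alignment: List[str], min_identity: float = 1.0,
                   min_len: int = 3) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) column ranges of conserved stretches."""
    length = len(alignment[0])
    if any(len(s) != length for s in alignment):
        raise ValueError("aligned sequences must have equal length")
    runs, start = [], None
    # Iterate one position past the end so a run touching the last column is closed.
    for i in range(length + 1):
        ok = i < length and column_identity("".join(s[i] for s in alignment)) >= min_identity
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_len:
                runs.append((start + 1, i))
            start = None
    return runs


# Hypothetical toy alignment of three protein fragments.
toy_alignment = ["MKTAYLLG", "MKTCYLLG", "MKT-YLLG"]
regions = conserved_runs(toy_alignment, min_identity=1.0, min_len=3)
```

    Ranges reported this way are alignment columns, which differ from residue numbers in any single ungapped sequence, so they would need to be mapped back before designing primers or probes.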

  6. Suppressor mutations identify amino acids in PAA-1/PR65 that facilitate regulatory RSA-1/B″ subunit targeting of PP2A to centrosomes in C. elegans.

    Science.gov (United States)

    Lange, Karen I; Heinrichs, Jeffrey; Cheung, Karen; Srayko, Martin

    2013-01-15

    Protein phosphorylation and dephosphorylation is a key mechanism for the spatial and temporal regulation of many essential developmental processes and is especially prominent during mitosis. The multi-subunit protein phosphatase 2A (PP2A) enzyme plays an important, yet poorly characterized role in dephosphorylating proteins during mitosis. PP2As are heterotrimeric complexes comprising a catalytic, structural, and regulatory subunit. Regulatory subunits are mutually exclusive and determine subcellular localization and substrate specificity of PP2A. At least 3 different classes of regulatory subunits exist (termed B, B', B″) but there is no obvious similarity in primary sequence between these classes. Therefore, it is not known how these diverse regulatory subunits interact with the same holoenzyme to facilitate specific PP2A functions in vivo. The B″ family of regulatory subunits is the least understood because these proteins lack conserved structural domains. RSA-1 (regulator of spindle assembly) is a regulatory B″ subunit required for mitotic spindle assembly in Caenorhabditis elegans. In order to address how B″ subunits interact with the PP2A core enzyme, we focused on a conditional allele, rsa-1(or598ts), and determined that this mutation specifically disrupts the protein interaction between RSA-1 and the PP2A structural subunit, PAA-1. Through genetic screening, we identified a putative interface on the PAA-1 structural subunit that interacts with a defined region of RSA-1/B″. In the context of previously published results, these data propose a mechanism of how different PP2A B-regulatory subunit families can bind the same holoenzyme in a mutually exclusive manner, to perform specific tasks in vivo.

  7. Suppressor mutations identify amino acids in PAA-1/PR65 that facilitate regulatory RSA-1/B″ subunit targeting of PP2A to centrosomes in C. elegans

    Directory of Open Access Journals (Sweden)

    Karen I. Lange

    2012-11-01

    Protein phosphorylation and dephosphorylation is a key mechanism for the spatial and temporal regulation of many essential developmental processes and is especially prominent during mitosis. The multi-subunit protein phosphatase 2A (PP2A) enzyme plays an important, yet poorly characterized role in dephosphorylating proteins during mitosis. PP2As are heterotrimeric complexes comprising a catalytic, structural, and regulatory subunit. Regulatory subunits are mutually exclusive and determine subcellular localization and substrate specificity of PP2A. At least 3 different classes of regulatory subunits exist (termed B, B′, B″) but there is no obvious similarity in primary sequence between these classes. Therefore, it is not known how these diverse regulatory subunits interact with the same holoenzyme to facilitate specific PP2A functions in vivo. The B″ family of regulatory subunits is the least understood because these proteins lack conserved structural domains. RSA-1 (regulator of spindle assembly) is a regulatory B″ subunit required for mitotic spindle assembly in Caenorhabditis elegans. In order to address how B″ subunits interact with the PP2A core enzyme, we focused on a conditional allele, rsa-1(or598ts), and determined that this mutation specifically disrupts the protein interaction between RSA-1 and the PP2A structural subunit, PAA-1. Through genetic screening, we identified a putative interface on the PAA-1 structural subunit that interacts with a defined region of RSA-1/B″. In the context of previously published results, these data propose a mechanism of how different PP2A B-regulatory subunit families can bind the same holoenzyme in a mutually exclusive manner, to perform specific tasks in vivo.

  8. The migratory bird treaty and a century of waterfowl conservation

    Science.gov (United States)

    Anderson, Michael G.; Alisauskas, Ray T.; Batt, Bruce D. J.; Blohm, Robert J.; Higgins, Kenneth F.; Perry, Matthew; Ringelman, James K.; Sedinger, James S.; Serie, Jerome R.; Sharp, David E.; Trauger, David L.; Williams, Christopher K.

    2018-01-01

    In the final decades of the nineteenth century, concern was building about the status of migratory bird populations in North America. In this literature review, we describe how that concern led to a landmark conservation agreement in 1916, between the United States and Great Britain (on behalf of Canada) to conserve migratory birds shared by Canada and the United States. Drawing on published literature and our personal experience, we describe how subsequent enabling acts in both countries gave rise to efforts to better estimate population sizes and distributions, assess harvest rates and demographic impacts, design and fund landscape-level habitat conservation initiatives, and organize necessary political and regulatory processes. Executing these steps required large-scale thinking, unprecedented regional and international cooperation, ingenuity, and a commitment to scientific rigor and adaptive management. We applaud the conservation efforts begun 100 years ago with the Migratory Bird Treaty Convention. The agreement helped build the field of wildlife ecology and conservation in the twentieth century but only partially prepares us for the ecological and social challenges ahead. 

  9. Deep sequencing of Brachypodium small RNAs at the global genome level identifies microRNAs involved in cold stress response

    Directory of Open Access Journals (Sweden)

    Chong Kang

    2009-09-01

    Background: MicroRNAs (miRNAs) are endogenous small RNAs having large-scale regulatory effects on plant development and stress responses. Extensive studies of miRNAs have only been performed in a few model plants. Although miRNAs are proved to be involved in plant cold stress responses, little is known for winter-habit monocots. Brachypodium distachyon, with close evolutionary relationship to cool-season cereals, has recently emerged as a novel model plant. There are few reports of Brachypodium miRNAs. Results: High-throughput sequencing and whole-genome-wide data mining led to the identification of 27 conserved miRNAs, as well as 129 predicted miRNAs in Brachypodium. For multiple-member conserved miRNA families, their sizes in Brachypodium were much smaller than those in rice and Populus. The genome organization of miR395 family in Brachypodium was quite different from that in rice. The expression of 3 conserved miRNAs and 25 predicted miRNAs showed significant changes in response to cold stress. Among these miRNAs, some were cold-induced and some were cold-suppressed, but all the conserved miRNAs were up-regulated under cold stress condition. Conclusion: Our results suggest that Brachypodium miRNAs are composed of a set of conserved miRNAs and a large proportion of non-conserved miRNAs with low expression levels. Both kinds of miRNAs were involved in cold stress response, but all the conserved miRNAs were up-regulated, implying an important role for cold-induced miRNAs. The different size and genome organization of miRNA families in Brachypodium and rice suggest that the frequency of duplication events or the selection pressure on duplicated miRNAs are different between these two closely related plant species.

  10. Core genome conservation of Staphylococcus haemolyticus limits sequence based population structure analysis.

    Science.gov (United States)

    Cavanagh, Jorunn Pauline; Klingenberg, Claus; Hanssen, Anne-Merethe; Fredheim, Elizabeth Aarag; Francois, Patrice; Schrenzel, Jacques; Flægstad, Trond; Sollid, Johanna Ericson

    2012-06-01

    The notoriously multi-resistant Staphylococcus haemolyticus is an emerging pathogen causing serious infections in immunocompromised patients. Defining the population structure is important to detect outbreaks and spread of antimicrobial resistant clones. Currently, the standard typing technique is pulsed-field gel electrophoresis (PFGE). In this study we describe novel molecular typing schemes for S. haemolyticus using multi locus sequence typing (MLST) and multi locus variable number of tandem repeats (VNTR) analysis. Seven housekeeping genes (MLST) and five VNTR loci (MLVF) were selected for the novel typing schemes. A panel of 45 human and veterinary S. haemolyticus isolates was investigated. The collection had diverse PFGE patterns (38 PFGE types) and was sampled over a 20 year-period from eight countries. MLST resolved 17 sequence types (Simpson's index of diversity [SID]=0.877) and MLVF resolved 14 repeat types (SID=0.831). We found a low sequence diversity. Phylogenetic analysis clustered the isolates in three (MLST) and one (MLVF) clonal complexes, respectively. Taken together, neither the MLST nor the MLVF scheme was suitable to resolve the population structure of this S. haemolyticus collection. Future MLVF and MLST schemes will benefit from addition of more variable core genome sequences identified by comparing different fully sequenced S. haemolyticus genomes. Copyright © 2012 Elsevier B.V. All rights reserved.
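
    The index of diversity quoted for each typing scheme can be computed from the type assigned to each isolate. The sketch below assumes the form commonly used in typing studies, 1 - sum(n_j (n_j - 1)) / (N (N - 1)), where n_j is the number of isolates of type j and N is the total; the abstract does not spell out the formula, and the labels and function name are hypothetical. The per-type counts behind the reported values are not given in the abstract, so they are not reproduced here.

```python
from collections import Counter
from typing import Iterable


def simpson_index(type_labels: Iterable[str]) -> float:
    """Probability that two isolates drawn without replacement have different types."""
    counts = Counter(type_labels).values()
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two isolates")
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical panel of four isolates falling into two types.
toy_sid = simpson_index(["ST1", "ST1", "ST2", "ST2"])
```

    A value near 1 means almost every isolate has its own type, while a value near 0 means the scheme places nearly all isolates in one type.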

  11. A Network Integration Approach to Predict Conserved Regulators Related to Pathogenicity of Influenza and SARS-CoV Respiratory Viruses

    Energy Technology Data Exchange (ETDEWEB)

    Mitchell, Hugh D.; Eisfeld, Amie J.; Sims, Amy; McDermott, Jason E.; Matzke, Melissa M.; Webb-Robertson, Bobbie-Jo M.; Tilton, Susan C.; Tchitchek, Nicholas; Josset, Laurence; Li, Chengjun; Ellis, Amy L.; Chang, Jean H.; Heegel, Robert A.; Luna, Maria L.; Schepmoes, Athena A.; Shukla, Anil K.; Metz, Thomas O.; Neumann, Gabriele; Benecke, Arndt; Smith, Richard D.; Baric, Ralph; Kawaoka, Yoshihiro; Katze, Michael G.; Waters, Katrina M.

    2013-07-25

    Respiratory infections stemming from influenza viruses and the Severe Acute Respiratory Syndrome corona virus (SARS-CoV) represent a serious public health threat as emerging pandemics. Despite efforts to identify the critical interactions of these viruses with host machinery, the key regulatory events that lead to disease pathology remain poorly targeted with therapeutics. Here we implement an integrated network interrogation approach, in which proteome and transcriptome datasets from infection of both viruses in human lung epithelial cells are utilized to predict regulatory genes involved in the host response. We take advantage of a novel “crowd-based” approach to identify and combine ranking metrics that isolate genes/proteins likely related to the pathogenicity of SARS-CoV and influenza virus. Subsequently, a multivariate regression model is used to compare predicted lung epithelial regulatory influences with data derived from other respiratory virus infection models. We predicted a small set of regulatory factors with conserved behavior for consideration as important components of viral pathogenesis that might also serve as therapeutic targets for intervention. Our results demonstrate the utility of integrating diverse ‘omic datasets to predict and prioritize regulatory features conserved across multiple pathogen infection models.

  12. In silico analysis of cis-acting regulatory elements in 5' regulatory regions of sucrose transporter gene families in rice (Oryza sativa Japonica) and Arabidopsis thaliana.

    Science.gov (United States)

    Ibraheem, Omodele; Botha, Christiaan E J; Bradley, Graeme

    2010-12-01

    The regulation of gene expression involves a multifarious regulatory system. Each gene contains a unique combination of cis-acting regulatory sequence elements in the 5' regulatory region that determines its temporal and spatial expression. Cis-acting regulatory elements are essential transcriptional gene regulatory units; they control many biological processes and stress responses. Thus a full understanding of the transcriptional gene regulation system will depend on successful functional analyses of cis-acting elements. Cis-acting regulatory elements present within the 5' regulatory region of the sucrose transporter gene families in rice (Oryza sativa Japonica cultivar-group) and Arabidopsis thaliana were identified using a bioinformatics approach. The possible cis-acting regulatory elements were predicted by scanning 1.5 kbp of the 5' regulatory regions upstream of the sucrose transporter genes' translational start sites, using the PlantCARE, PLACE and Genomatix MatInspector Professional databases. Several cis-acting regulatory elements that are associated with plant development, plant hormonal regulation and stress response were identified, and were present in varying frequencies within the 1.5 kbp of 5' regulatory region, including A-box, RY, CAT, Pyrimidine-box, Sucrose-box, ABRE, ARF, ERE, GARE, Me-JA, ARE, DRE, GA-motif, GATA, GT-1, MYC, MYB, W-box, and I-box. This result reveals the probable cis-acting regulatory elements that are possibly involved in the expression and regulation of sucrose transporter gene families in rice and Arabidopsis thaliana during cellular development or environmental stress conditions. Copyright © 2010 Elsevier Ltd. All rights reserved.
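
    The kind of upstream-region scan described above can be sketched as a simple pattern search on both strands. This is a minimal illustration, not the PlantCARE, PLACE or MatInspector procedure: the motif table holds simplified core patterns chosen for the example (an assumption), and the function and variable names are hypothetical.

```python
import re

# Simplified core patterns in IUPAC notation (illustrative assumptions,
# not the curated database entries used in the study).
EXAMPLE_MOTIFS = {"W-box": "TTGAC", "ABRE": "ACGTG", "GATA": "GATA", "MYC": "CANNTG"}

IUPAC_TO_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "W": "[AT]",
    "S": "[CG]", "K": "[GT]", "M": "[AC]",
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan_upstream_region(upstream, motifs=EXAMPLE_MOTIFS):
    """List (motif name, strand, 0-based start) for every match.

    Minus-strand positions are given in reverse-complement coordinates.
    A lookahead is used so that overlapping matches are all reported.
    """
    upstream = upstream.upper()
    strands = {"+": upstream, "-": reverse_complement(upstream)}
    hits = []
    for name, motif in motifs.items():
        body = "".join(IUPAC_TO_REGEX[base] for base in motif)
        pattern = re.compile("(?=(" + body + "))")
        for strand, seq in strands.items():
            for match in pattern.finditer(seq):
                hits.append((name, strand, match.start()))
    return hits


# Hypothetical 23-bp fragment standing in for a 1.5 kbp upstream region.
demo_hits = scan_upstream_region("AAATTGACCGGATAACACGTGTT")
for hit in demo_hits:
    print(hit)
```

    Counting hits per motif name across the genes of a family would give the "varying frequencies" the abstract refers to.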

  13. Evolutionarily conserved substrate substructures for automated annotation of enzyme superfamilies.

    Directory of Open Access Journals (Sweden)

    Ranyee A Chiang

    2008-08-01

    Full Text Available The evolution of enzymes affects how well a species can adapt to new environmental conditions. During enzyme evolution, certain aspects of molecular function are conserved while other aspects can vary. Aspects of function that are more difficult to change or that need to be reused in multiple contexts are often conserved, while those that vary may indicate functions that are more easily changed or that are no longer required. In analogy to the study of conservation patterns in enzyme sequences and structures, we have examined the patterns of conservation and variation in enzyme function by analyzing graph isomorphisms among enzyme substrates of a large number of enzyme superfamilies. This systematic analysis of substrate substructures establishes the conservation patterns that typify individual superfamilies. Specifically, we determined the chemical substructures that are conserved among all known substrates of a superfamily and the substructures that are reacting in these substrates and then examined the relationship between the two. Across the 42 superfamilies that were analyzed, substantial variation was found in how much of the conserved substructure is reacting, suggesting that superfamilies may not be easily grouped into discrete and separable categories. Instead, our results suggest that many superfamilies may need to be treated individually for analyses of evolution, function prediction, and guiding enzyme engineering strategies. Annotating superfamilies with these conserved and reacting substructure patterns provides information that is orthogonal to information provided by studies of conservation in superfamily sequences and structures, thereby improving the precision with which we can predict the functions of enzymes of unknown function and direct studies in enzyme engineering. Because the method is automated, it is suitable for large-scale characterization and comparison of fundamental functional capabilities of both characterized

  14. Evolutionarily conserved substrate substructures for automated annotation of enzyme superfamilies.

    Science.gov (United States)

    Chiang, Ranyee A; Sali, Andrej; Babbitt, Patricia C

    2008-08-01

    The evolution of enzymes affects how well a species can adapt to new environmental conditions. During enzyme evolution, certain aspects of molecular function are conserved while other aspects can vary. Aspects of function that are more difficult to change or that need to be reused in multiple contexts are often conserved, while those that vary may indicate functions that are more easily changed or that are no longer required. In analogy to the study of conservation patterns in enzyme sequences and structures, we have examined the patterns of conservation and variation in enzyme function by analyzing graph isomorphisms among enzyme substrates of a large number of enzyme superfamilies. This systematic analysis of substrate substructures establishes the conservation patterns that typify individual superfamilies. Specifically, we determined the chemical substructures that are conserved among all known substrates of a superfamily and the substructures that are reacting in these substrates and then examined the relationship between the two. Across the 42 superfamilies that were analyzed, substantial variation was found in how much of the conserved substructure is reacting, suggesting that superfamilies may not be easily grouped into discrete and separable categories. Instead, our results suggest that many superfamilies may need to be treated individually for analyses of evolution, function prediction, and guiding enzyme engineering strategies. Annotating superfamilies with these conserved and reacting substructure patterns provides information that is orthogonal to information provided by studies of conservation in superfamily sequences and structures, thereby improving the precision with which we can predict the functions of enzymes of unknown function and direct studies in enzyme engineering. Because the method is automated, it is suitable for large-scale characterization and comparison of fundamental functional capabilities of both characterized and uncharacterized

  15. Conservation of the Type IV secretion system throughout Wolbachia evolution

    DEFF Research Database (Denmark)

    Pichon, Samuel; Bouchon, Didier; Cordaux, Richard

    2009-01-01

    , encoding a T4SS were previously identified and characterized at two separate genomic loci. Using the largest data set of Wolbachia strains studied so far, we show that vir gene sequence and organization are strictly conserved among 37 Wolbachia strains inducing various phenotypes such as cytoplasmic...... incompatibility, feminization, or oogenesis in their arthropod hosts. In sharp contrast, extensive variation of genomic sequences flanking the virB8-D4 operon suggested its distinct location among Wolbachia genomes. Long term conservation of the T4SS may imply maintenance of a functional effector translocation...... system in Wolbachia, thereby suggesting the importance for the T4SS in Wolbachia biology and survival inside host cells....

  16. Ultraconservation identifies a small subset of extremely constrained developmental enhancers

    Energy Technology Data Exchange (ETDEWEB)

    Pennacchio, Len A.; Visel, Axel; Prabhakar, Shyam; Akiyama, Jennifer A.; Shoukry, Malak; Lewis, Keith D.; Holt, Amy; Plajzer-Frick, Ingrid; Afzal, Veena; Rubin, Edward M.; Pennacchio, Len A.

    2007-10-01

    While experimental studies have suggested that non-coding ultraconserved DNA elements are central nodes in the regulatory circuitry that specifies mammalian embryonic development, the possible functional relevance of their >200 bp of perfect sequence conservation between human-mouse-rat remains obscure. Here we have compared the in vivo enhancer activity of a genome-wide set of 231 non-exonic sequences with ultraconserved cores to that of 206 sequences that are under equivalently severe human-rodent constraint (ultra-like), but lack perfect sequence conservation. In transgenic mouse assays, 50% of the ultraconserved and 50% of the ultra-like conserved elements reproducibly functioned as tissue-specific enhancers at embryonic day 11.5. In this in vivo assay, we observed that ultraconserved enhancers and constrained non-ultraconserved enhancers targeted expression to a similar spectrum of tissues with a particular enrichment in the developing central nervous system. A human genome-wide comparative screen uncovered ~2,600 non-coding elements that evolved under ultra-like human-rodent constraint and are similarly enriched near transcriptional regulators and developmental genes as the much smaller number of ultraconserved elements. These data indicate that ultraconserved elements possessing absolute human-rodent sequence conservation are not distinct from other non-coding elements that are under comparable purifying selection in mammals and suggest they are principal constituents of the cis-regulatory framework of mammalian development.

  17. High throughput deep degradome sequencing reveals microRNAs and their targets in response to drought stress in mulberry (Morus alba).

    Science.gov (United States)

    Li, Ruixue; Chen, Dandan; Wang, Taichu; Wan, Yizhen; Li, Rongfang; Fang, Rongjun; Wang, Yuting; Hu, Fei; Zhou, Hong; Li, Long; Zhao, Weiguo

    2017-01-01

    MicroRNAs (miRNAs) play important regulatory roles by targeting mRNAs for cleavage or translational repression. Identification of miRNA targets is essential to better understanding the roles of miRNAs. miRNA targets have not been well characterized in mulberry (Morus alba). To anatomize miRNA guided gene regulation under drought stress, transcriptome-wide high throughput degradome sequencing was used in this study to directly detect drought stress responsive miRNA targets in mulberry. A drought library (DL) and a contrast library (CL) were constructed to capture the cleaved mRNAs for sequencing. In CL, 409 target genes of 30 conserved miRNA families and 990 target genes of 199 novel miRNAs were identified. In DL, 373 target genes of 30 conserved miRNA families and 950 target genes of 195 novel miRNAs were identified. Of the conserved miRNA families in DL, mno-miR156, mno-miR172, and mno-miR396 had the highest number of targets with 54, 52 and 41 transcripts, respectively, indicating that these three miRNA families and their target genes might play important functions in response to drought stress in mulberry. Additionally, we found that many of the target genes were transcription factors. By analyzing the miRNA-target molecular network, we found that the DL independent networks consisted of 838 miRNA-mRNA pairs (63.34%). The expression patterns of 11 target genes and 12 correspondent miRNAs were detected using qRT-PCR. Six miRNA targets were further verified by RNA ligase-mediated 5' rapid amplification of cDNA ends (RLM-5' RACE). Gene Ontology (GO) annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that these target transcripts were implicated in a broad range of biological processes and various metabolic pathways. This is the first study to comprehensively characterize target genes and their associated miRNAs in response to drought stress by degradome sequencing in mulberry. This study provides a framework for understanding

  18. A conserved two-component signal transduction system controls the response to phosphate starvation in Bifidobacterium breve UCC2003.

    Science.gov (United States)

    Alvarez-Martin, Pablo; Fernández, Matilde; O'Connell-Motherway, Mary; O'Connell, Kerry Joan; Sauvageot, Nicolas; Fitzgerald, Gerald F; MacSharry, John; Zomer, Aldert; van Sinderen, Douwe

    2012-08-01

    This work reports on the identification and molecular characterization of the two-component regulatory system (2CRS) PhoRP, which controls the response to inorganic phosphate (P(i)) starvation in Bifidobacterium breve UCC2003. The response regulator PhoP was shown to bind to the promoter region of pstSCAB, specifying a predicted P(i) transporter system, as well as that of phoU, which encodes a putative P(i)-responsive regulatory protein. This interaction is assumed to cause transcriptional modulation under conditions of P(i) limitation. Our data suggest that the phoRP genes are subject to positive autoregulation and, together with pstSCAB and presumably phoU, represent the complete regulon controlled by the phoRP-encoded 2CRS in B. breve UCC2003. Determination of the minimal PhoP binding region combined with bioinformatic analysis revealed the probable recognition sequence of PhoP, designated here as the PHO box, which together with phoRP is conserved among many high-GC-content Gram-positive bacteria. The importance of the phoRP 2CRS in the response of B. breve to P(i) starvation conditions was confirmed by analysis of a B. breve phoP insertion mutant which exhibited decreased growth under phosphate-limiting conditions compared to its parent strain UCC2003.

  19. Proteome-wide analysis of lysine acetylation suggests its broad regulatory scope in Saccharomyces cerevisiae

    DEFF Research Database (Denmark)

    Henriksen, Peter; Wagner, Sebastian Alexander; Weinert, Brian Tate

    2012-01-01

    Post-translational modification of proteins by lysine acetylation plays important regulatory roles in living cells. The budding yeast Saccharomyces cerevisiae is a widely used unicellular eukaryotic model organism in biomedical research. S. cerevisiae contains several evolutionary conserved lysine...... acetyltransferases and deacetylases. However, only a few dozen acetylation sites in S. cerevisiae are known, presenting a major obstacle for further understanding the regulatory roles of acetylation in this organism. Here we use high resolution mass spectrometry to identify about 4000 lysine acetylation sites in S....... cerevisiae. Acetylated proteins are implicated in the regulation of diverse cytoplasmic and nuclear processes including chromatin organization, mitochondrial metabolism, and protein synthesis. Bioinformatic analysis of yeast acetylation sites shows that acetylated lysines are significantly more conserved...

  20. RegTransBase - A Database Of Regulatory Sequences and Interactions in a Wide Range of Prokaryotic Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Kazakov, Alexei E.; Cipriano, Michael J.; Novichkov, Pavel S.; Minovitsky, Simon; Vinogradov, Dmitry V.; Arkin, Adam; Mironov, AndreyA.; Gelfand, Mikhail S.; Dubchak, Inna

    2006-07-01

    RegTransBase, a manually curated database of regulatory interactions in prokaryotes, captures the knowledge in published scientific literature using a controlled vocabulary. Although a number of databases describing interactions between regulatory proteins and their binding sites are currently being maintained, they focus mostly on the model organisms Escherichia coli and Bacillus subtilis, or are entirely computationally derived. RegTransBase describes a large number of regulatory interactions reported in many organisms and contains various types of experimental data, in particular: the activation or repression of transcription by an identified direct regulator; determining the transcriptional regulatory function of a protein (or RNA) directly binding to DNA (RNA); mapping or prediction of binding site for a regulatory protein; characterization of regulatory mutations. Currently, the RegTransBase content is derived from about 3000 relevant articles describing over 7000 experiments in relation to 128 microbes. It contains data on the regulation of about 7500 genes and evidence for 6500 interactions with 650 regulators. RegTransBase also contains manually created position weight matrices (PWM) that can be used to identify candidate regulatory sites in over 60 species. RegTransBase is available at http://regtransbase.lbl.gov.
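
    How a position weight matrix can be used to flag candidate regulatory sites is sketched below. This is a generic log-odds PWM under stated assumptions (uniform background, a pseudocount of 1, uppercase A/C/G/T input only); it is not RegTransBase's own matrix format or scoring scheme, and the aligned sites are invented for the example.

```python
import math


def build_pwm(aligned_sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds matrix from equal-length aligned binding sites."""
    width = len(aligned_sites[0])
    total = len(aligned_sites) + 4 * pseudocount
    pwm = []
    for i in range(width):
        column = [site[i] for site in aligned_sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pwm, window))


def scan_sequence(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at least threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(pwm, sequence[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical aligned sites and target sequence (illustration only).
example_sites = ["TTGACA", "TTGACT", "TTGATA", "TTGACA"]
example_pwm = build_pwm(example_sites)
example_hits = scan_sequence(example_pwm, "GGGTTGACAGGG", threshold=0.0)
best_start, best_score = max(example_hits, key=lambda hit: hit[1])
print(best_start, round(best_score, 2))
```

    In practice the threshold would be calibrated against known sites; the value 0.0 here only separates windows that look more like the motif than like background.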

  1. Tissue-specific expression of aryl hydrocarbon receptor and putative developmental regulatory modules in Baltic salmon yolk-sac fry

    Energy Technology Data Exchange (ETDEWEB)

    Vuori, Kristiina A. [Centre of Excellence in Evolutionary Genetics and Physiology, Department of Biology, University of Turku, FI-20014 Turku (Finland)], E-mail: kristiina.vuori@utu.fi; Nordlund, Eija [Department of Information Technology, University of Turku, and Turku Centre for Computer Science (TUCS), FI-20014 Turku (Finland); Kallio, Jenny [Centre of Excellence in Evolutionary Genetics and Physiology, Department of Biology, University of Turku, FI-20014 Turku (Finland); Salakoski, Tapio [Department of Information Technology, University of Turku, and Turku Centre for Computer Science (TUCS), FI-20014 Turku (Finland); Nikinmaa, Mikko [Centre of Excellence in Evolutionary Genetics and Physiology, Department of Biology, University of Turku, FI-20014 Turku (Finland)

    2008-04-08

    The aryl hydrocarbon receptor (AhR) is an ancient protein that is conserved in vertebrates and invertebrates, indicating its important function throughout evolution. AhR has been studied largely because of its role in toxicology: gene expression via AhR is induced by many aromatic hydrocarbons in mammals. Recently, however, it has become clear that AhR is involved in various aspects of development such as cell proliferation and differentiation, and cell motility and migration. The mechanisms by which AhR regulates these various functions remain poorly understood. Cross-species comparative studies of AhR in invertebrates, non-mammalian vertebrates and mammals may help to reveal the multiple functions of AhR. Here, we have studied AhR during larval development of Baltic salmon (Salmo salar). Our results indicate that AhR protein is expressed in nervous system, liver and muscle tissues. We also present putative regulatory modules and module-matching genes, produced by chromatin immunoprecipitation (ChIP) cloning and in silico analysis, which may be associated with evolutionarily conserved functions of AhR during development. For example, the module NFKB-AHRR-CREB found in salmon ChIP sequences is present in human ULK3 (regulating formation of granule cell axons in mouse and axon outgrowth in Caenorhabditis elegans) and SRGAP1 (GTPase-activating protein involved in the Slit/Robo pathway) promoters. We suggest that AhR may have an evolutionarily conserved role in neuronal development and nerve cell targeting, and in the Wnt signaling pathway.

  2. Tissue-specific expression of aryl hydrocarbon receptor and putative developmental regulatory modules in Baltic salmon yolk-sac fry

    International Nuclear Information System (INIS)

    Vuori, Kristiina A.; Nordlund, Eija; Kallio, Jenny; Salakoski, Tapio; Nikinmaa, Mikko

    2008-01-01

    The aryl hydrocarbon receptor (AhR) is an ancient protein that is conserved in vertebrates and invertebrates, indicating its important function throughout evolution. AhR has been studied largely because of its role in toxicology: gene expression via AhR is induced by many aromatic hydrocarbons in mammals. Recently, however, it has become clear that AhR is involved in various aspects of development such as cell proliferation and differentiation, and cell motility and migration. The mechanisms by which AhR regulates these various functions remain poorly understood. Cross-species comparative studies of AhR in invertebrates, non-mammalian vertebrates and mammals may help to reveal the multiple functions of AhR. Here, we have studied AhR during larval development of Baltic salmon (Salmo salar). Our results indicate that AhR protein is expressed in nervous system, liver and muscle tissues. We also present putative regulatory modules and module-matching genes, produced by chromatin immunoprecipitation (ChIP) cloning and in silico analysis, which may be associated with evolutionarily conserved functions of AhR during development. For example, the module NFKB-AHRR-CREB found in salmon ChIP sequences is present in human ULK3 (regulating formation of granule cell axons in mouse and axon outgrowth in Caenorhabditis elegans) and SRGAP1 (GTPase-activating protein involved in the Slit/Robo pathway) promoters. We suggest that AhR may have an evolutionarily conserved role in neuronal development and nerve cell targeting, and in the Wnt signaling pathway.

  3. Strategies for measuring evolutionary conservation of RNA secondary structures

    Directory of Open Access Journals (Sweden)

    Hofacker Ivo L

    2008-02-01

    Full Text Available Abstract Background Evolutionary conservation of RNA secondary structure is a typical feature of many functional non-coding RNAs. Since almost all of the available methods used for prediction and annotation of non-coding RNA genes rely on this evolutionary signature, accurate measures for structural conservation are essential. Results We systematically assessed the ability of various measures to detect conserved RNA structures in multiple sequence alignments. We tested three existing and eight novel strategies that are based on metrics of folding energies, metrics of single optimal structure predictions, and metrics of structure ensembles. We find that the folding energy based SCI score used in the RNAz program and a simple base-pair distance metric are by far the most accurate. The use of more complex metrics like for example tree editing does not improve performance. A variant of the SCI performed particularly well on highly conserved alignments and is thus a viable alternative when only little evolutionary information is available. Surprisingly, ensemble based methods that, in principle, could benefit from the additional information contained in sub-optimal structures, perform particularly poorly. As a general trend, we observed that methods that include a consensus structure prediction outperformed equivalent methods that only consider pairwise comparisons. Conclusion Structural conservation can be measured accurately with relatively simple and intuitive metrics. They have the potential to form the basis of future RNA gene finders, that face new challenges like finding lineage specific structures or detecting mis-aligned sequences.
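
    Two of the measures named above lend themselves to a short sketch: the base-pair distance between predicted structures and the folding-energy-based structure conservation index (SCI). The sketch assumes structures are given in dot-bracket notation over the same alignment columns, and that the folding energies have already been obtained from an external folding program; the energy values and structures in the example are hypothetical.

```python
from itertools import combinations


def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for position, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(position)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.add((stack.pop(), position))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def base_pair_distance(structure_a, structure_b):
    """Number of base pairs present in exactly one of the two structures."""
    if len(structure_a) != len(structure_b):
        raise ValueError("structures must cover the same alignment columns")
    return len(base_pairs(structure_a) ^ base_pairs(structure_b))


def mean_pairwise_distance(structures):
    """Average base-pair distance over all pairs of structures."""
    distances = [base_pair_distance(a, b) for a, b in combinations(structures, 2)]
    return sum(distances) / len(distances)


def structure_conservation_index(consensus_energy, single_energies):
    """SCI: consensus folding energy divided by the mean single-sequence energy."""
    mean_single = sum(single_energies) / len(single_energies)
    return consensus_energy / mean_single


# Hypothetical structures for three aligned sequences and made-up energies (kcal/mol).
example_structures = ["((..))", "((..))", "(....)"]
example_mean_distance = mean_pairwise_distance(example_structures)
example_sci = structure_conservation_index(-9.0, [-10.0, -10.0, -10.0])
print(example_mean_distance, example_sci)
```

    A low mean distance and an SCI close to or above 1 both point toward a conserved fold, whereas an SCI near 0 indicates that no common structure is supported.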

  4. A waste package strategy for regulatory compliance

    International Nuclear Information System (INIS)

    Stahl, D.; Cloninger, M.O.

    1990-01-01

    This paper summarizes the strategy given in the Site Characterization Plan for demonstrating compliance with the post closure performance objectives for the waste package and the Engineered Barrier System contained in the Code of Federal Regulations. The strategy consists of the development of a conservative waste package design that will meet the regulatory requirements with sufficient margin for uncertainty using a multi-barrier approach that takes advantage of the unsaturated nature of the Yucca Mountain site. 7 refs., 1 fig

  5. Comparison of Five Major Trichome Regulatory Genes in Brassica villosa with Orthologues within the Brassicaceae

    Science.gov (United States)

    Nayidu, Naghabushana K.; Kagale, Sateesh; Taheri, Ali; Withana-Gamage, Thushan S.; Parkin, Isobel A. P.; Sharpe, Andrew G.; Gruber, Margaret Y.

    2014-01-01

    Coding sequences for major trichome regulatory genes, including the positive regulators GLABRA 1 (GL1), GLABRA 2 (GL2), ENHANCER OF GLABRA 3 (EGL3), and TRANSPARENT TESTA GLABRA 1 (TTG1) and the negative regulator TRIPTYCHON (TRY), were cloned from wild Brassica villosa, which is characterized by dense trichome coverage over most of the plant. Transcript (FPKM) levels from RNA sequencing indicated much higher expression of the GL2 and TTG1 regulatory genes in B. villosa leaves compared with expression levels of GL1 and EGL3 genes in either B. villosa or the reference genome species, glabrous B. oleracea; however, cotyledon TTG1 expression was high in both species. RNA sequencing and Q-PCR also revealed an unusual expression pattern for the negative regulators TRY and CPC, which were much more highly expressed in trichome-rich B. villosa leaves than in glabrous B. oleracea leaves and in glabrous cotyledons from both species. The B. villosa TRY expression pattern also contrasted with TRY expression patterns in two diploid Brassica species, and with the Arabidopsis model for expression of negative regulators of trichome development. Further unique sequence polymorphisms, protein characteristics, and gene evolution studies highlighted specific amino acids in GL1 and GL2 coding sequences that distinguished glabrous species from hairy species and several variants that were specific for each B. villosa gene. Positive selection was observed for GL1 between hairy and non-hairy plants, and as expected the origin of the four expressed positive trichome regulatory genes in B. villosa was predicted to be from B. oleracea. In particular the unpredicted expression patterns for TRY and CPC in B. villosa suggest additional characterization is needed to determine the function of the expanded families of trichome regulatory genes in more complex polyploid species within the Brassicaceae. PMID:24755905

  6. Evolutionary conservation of vertebrate notochord genes in the ascidian Ciona intestinalis.

    Science.gov (United States)

    Kugler, Jamie E; Passamaneck, Yale J; Feldman, Taya G; Beh, Jeni; Regnier, Todd W; Di Gregorio, Anna

    2008-11-01

    To reconstruct a minimum complement of notochord genes evolutionarily conserved across chordates, we scanned the Ciona intestinalis genome using the sequences of 182 genes reported to be expressed in the notochord of different vertebrates and identified 139 candidate notochord genes. For 66 of these Ciona genes expression data were already available, hence we analyzed the expression of the remaining 73 genes and found notochord expression for 20. The predicted products of the newly identified notochord genes range from the transcription factors Ci-XBPa and Ci-miER1 to extracellular matrix proteins. We examined the expression of the newly identified notochord genes in embryos ectopically expressing Ciona Brachyury (Ci-Bra) and in embryos expressing a repressor form of this transcription factor in the notochord, and we found that while a subset of the genes examined are clearly responsive to Ci-Bra, other genes are not affected by alterations in its levels. We provide a first description of notochord genes that are not evidently influenced by the ectopic expression of Ci-Bra and we propose alternative regulatory mechanisms that might control their transcription. Copyright 2008 Wiley-Liss, Inc.

  7. Regulatory Roles for Long ncRNA and mRNA

    International Nuclear Information System (INIS)

    Karapetyan, Armen R.; Buiting, Coen; Kuiper, Renske A.; Coolen, Marcel W.

    2013-01-01

    Recent advances in high-throughput sequencing technology have identified the transcription of a much larger portion of the genome than previously anticipated. Especially in the context of cancer it has become clear that aberrant transcription of both protein-coding and long non-coding RNAs (lncRNAs) are frequent events. The current dogma of RNA function describes mRNA to be responsible for the synthesis of proteins, whereas non-coding RNA can have regulatory or epigenetic functions. However, this distinction between protein coding and regulatory ability of transcripts may not be that strict. Here, we review the increasing body of evidence for the existence of multifunctional RNAs that have both protein-coding and trans-regulatory roles. Moreover, we demonstrate that coding transcripts bind to components of the Polycomb Repressor Complex 2 (PRC2) with similar affinities as non-coding transcripts, revealing potential epigenetic regulation by mRNAs. We hypothesize that studies on the regulatory ability of disease-associated mRNAs will form an important new field of research

  8. Regulatory Roles for Long ncRNA and mRNA

    Energy Technology Data Exchange (ETDEWEB)

    Karapetyan, Armen R.; Buiting, Coen; Kuiper, Renske A.; Coolen, Marcel W., E-mail: M.Coolen@gen.umcn.nl [Department of Human Genetics, Nijmegen Centre for Molecular Life Sciences (NCMLS), Radboud University Nijmegen Medical Centre, P.O. Box 9101, Nijmegen 6500 HB (Netherlands)

    2013-04-26

    Recent advances in high-throughput sequencing technology have identified the transcription of a much larger portion of the genome than previously anticipated. Especially in the context of cancer it has become clear that aberrant transcription of both protein-coding and long non-coding RNAs (lncRNAs) are frequent events. The current dogma of RNA function describes mRNA to be responsible for the synthesis of proteins, whereas non-coding RNA can have regulatory or epigenetic functions. However, this distinction between protein coding and regulatory ability of transcripts may not be that strict. Here, we review the increasing body of evidence for the existence of multifunctional RNAs that have both protein-coding and trans-regulatory roles. Moreover, we demonstrate that coding transcripts bind to components of the Polycomb Repressor Complex 2 (PRC2) with similar affinities as non-coding transcripts, revealing potential epigenetic regulation by mRNAs. We hypothesize that studies on the regulatory ability of disease-associated mRNAs will form an important new field of research.

  9. A Comparative Transcriptomic Analysis Reveals Conserved Features of Stem Cell Pluripotency in Planarians and Mammals

    Science.gov (United States)

    Labbé, Roselyne M.; Irimia, Manuel; Currie, Ko W.; Lin, Alexander; Zhu, Shu Jun; Brown, David D.R.; Ross, Eric J.; Voisin, Veronique; Bader, Gary D.; Blencowe, Benjamin J.; Pearson, Bret J.

    2014-01-01

    Many long-lived species of animals require the function of adult stem cells throughout their lives. However, the transcriptomes of stem cells in invertebrates and vertebrates have not been compared, and consequently, ancestral regulatory circuits that control stem cell populations remain poorly defined. In this study, we have used data from high-throughput RNA sequencing to compare the transcriptomes of pluripotent adult stem cells from planarians with the transcriptomes of human and mouse pluripotent embryonic stem cells. From a stringently defined set of 4,432 orthologs shared between planarians, mice and humans, we identified 123 conserved genes that are ≥5-fold differentially expressed in stem cells from all three species. Guided by this gene set, we used RNAi screening in adult planarians to discover novel stem cell regulators, which we found to affect the stem cell-associated functions of tissue homeostasis, regeneration, and stem cell maintenance. Examples of genes that disrupted these processes included the orthologs of TBL3, PSD12, TTC27, and RACK1. From these analyses, we concluded that by comparing stem cell transcriptomes from diverse species, it is possible to uncover conserved factors that function in stem cell biology. These results provide insights into which genes comprised the ancestral circuitry underlying the control of stem cell self-renewal and pluripotency. PMID:22696458
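
    The selection of orthologs that are at least 5-fold differentially expressed in all three species can be sketched as a simple filter. The abstract does not say how direction of change was handled, so this sketch assumes that a ratio of at least 5 in either direction counts; the data layout, function name and values are hypothetical.

```python
def conserved_differential_genes(fold_changes, min_fold=5.0):
    """Return ortholog IDs whose expression ratio passes in every species.

    fold_changes maps ortholog ID -> {species: stem / non-stem expression ratio}.
    A ratio r passes when max(r, 1 / r) >= min_fold (assumption: either direction).
    """
    selected = []
    for ortholog, ratios in fold_changes.items():
        if all(r > 0 and max(r, 1.0 / r) >= min_fold for r in ratios.values()):
            selected.append(ortholog)
    return sorted(selected)


# Hypothetical ratios for three ortholog groups (illustration only).
example_ratios = {
    "ortholog_A": {"planarian": 12.0, "mouse": 7.5, "human": 6.1},
    "ortholog_B": {"planarian": 9.0, "mouse": 2.0, "human": 8.0},
    "ortholog_C": {"planarian": 0.1, "mouse": 0.05, "human": 0.1},
}
example_selected = conserved_differential_genes(example_ratios)
print(example_selected)
```

    Requiring the same direction of change in all species would be a stricter variant; it only needs the `max(r, 1 / r)` test replaced by a one-sided comparison.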

  10. Analysis of Schizosaccharomyces pombe mediator reveals a set of essential subunits conserved between yeast and metazoan cells

    DEFF Research Database (Denmark)

    Spåhr, H; Samuelsen, C O; Baraznenok, V

    2001-01-01

    . cerevisiae share an essential protein module, which associates with nonessential species-specific subunits. In support of this view, sequence analysis of the conserved yeast Mediator components Med4 and Med8 reveals sequence homology to the metazoan Mediator components Trap36 and Arc32. Therefore, 8 of 10...... essential genes conserved between S. pombe and S. cerevisiae also have a metazoan homolog, indicating that an evolutionarily conserved Mediator core is present in all eukaryotic cells. Our data suggest a closer functional relationship between yeast and metazoan Mediator than previously anticipated....

  11. Subfamily logos: visualization of sequence deviations at alignment positions with high information content

    Directory of Open Access Journals (Sweden)

    Beitz Eric

    2006-06-01

    Background: Recognition of relevant sequence deviations can be valuable for elucidating functional differences between protein subfamilies. Interesting residues at highly conserved positions can then be mutated and experimentally analyzed. However, identification of such sites is tedious because automated approaches are scarce. Results: Subfamily logos visualize subfamily-specific sequence deviations. The display is similar to classical sequence logos but extends into the negative range. Positive, upright characters correspond to residues which are characteristic for the subfamily, negative, upside-down characters to residues typical for the remaining sequences. The symbol height is adjusted to the information content of the alignment position. Residues which are conserved throughout do not appear. Conclusion: Subfamily logos provide an intuitive display of relevant sequence deviations. The method has proven to be valid using a set of 135 aligned aquaporin sequences in which established subfamily-specific positions were readily identified by the algorithm.
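
    The abstract above scales symbol height by the information content of an alignment position but does not give the formula. As a rough illustration only, the sketch below computes the classical sequence-logo measure (maximum entropy minus observed Shannon entropy, in bits) for one alignment column. The function name `column_information`, the 20-letter alphabet default, and the omission of any small-sample or gap correction are assumptions, not details taken from the record.

```python
import math
from collections import Counter

def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column.

    Assumption: classical logo measure, log2(alphabet size) minus the
    observed Shannon entropy, with no small-sample or gap correction.
    """
    counts = Counter(column)
    total = sum(counts.values())
    # Shannon entropy of the observed residue frequencies.
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy

# A fully conserved column carries the maximum information;
# a column split evenly between two residues loses exactly one bit.
conserved = column_information("AAAA")
split = column_information("AACC")
```

    A subfamily logo would additionally compare the residue frequencies inside a subfamily with those of the remaining sequences; that step is not sketched here because the abstract does not specify how the difference is computed.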

  12. G-quadruplex DNA sequences are evolutionarily conserved and associated with distinct genomic features in Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    John A Capra

    2010-07-01

    G-quadruplex DNA is a four-stranded DNA structure formed by non-Watson-Crick base pairing between stacked sets of four guanines. Many possible functions have been proposed for this structure, but its in vivo role in the cell is still largely unresolved. We carried out a genome-wide survey of the evolutionary conservation of regions with the potential to form G-quadruplex DNA structures (G4 DNA motifs) across seven yeast species. We found that G4 DNA motifs were significantly more conserved than expected by chance, and the nucleotide-level conservation patterns suggested that the motif conservation was the result of the formation of G4 DNA structures. We characterized the association of conserved and non-conserved G4 DNA motifs in Saccharomyces cerevisiae with more than 40 known genome features and gene classes. Our comprehensive, integrated evolutionary and functional analysis confirmed the previously observed associations of G4 DNA motifs with promoter regions and the rDNA, and it identified several previously unrecognized associations of G4 DNA motifs with genomic features, such as mitotic and meiotic double-strand break sites (DSBs). Conserved G4 DNA motifs maintained strong associations with promoters and the rDNA, but not with DSBs. We also performed the first analysis of G4 DNA motifs in the mitochondria, and surprisingly found a tenfold higher concentration of the motifs in the AT-rich yeast mitochondrial DNA than in nuclear DNA. The evolutionary conservation of the G4 DNA motif and its association with specific genome features supports the hypothesis that G4 DNA has in vivo functions that are under evolutionary constraint.
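
    The abstract does not state how a G4 DNA motif was defined. A common working definition, assumed here purely for illustration, is four runs of at least three guanines separated by short loops. The sketch below scans one strand for that pattern; the function name `find_g4_motifs` and the defaults `min_tract=3` and `max_loop=7` are hypothetical choices, and the study itself may have used different loop lengths.

```python
import re

def find_g4_motifs(seq, min_tract=3, max_loop=7):
    """Return (start, end) spans of putative G4 motifs on the given strand.

    Assumed pattern: four G-tracts of at least `min_tract` guanines,
    separated by loops of 1..`max_loop` nucleotides. Matches are
    non-overlapping, as produced by re.finditer.
    """
    pattern = re.compile(
        r"(?:G{%d,}[ACGT]{1,%d}){3}G{%d,}" % (min_tract, max_loop, min_tract)
    )
    return [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]

# Four GGG tracts separated by single-base loops form one candidate motif.
hits = find_g4_motifs("TTGGGAGGGAGGGAGGGTT")
```

    Motifs on the complementary strand appear as C-tracts in this representation, so a genome-wide scan would also need to search the reverse complement.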

  13. Evaluation of selected near-term energy-conservation options for the Midwest

    Energy Technology Data Exchange (ETDEWEB)

    Evans, A.R.; Colsher, C.S.; Hamilton, R.W.; Buehring, W.A.

    1978-11-01

    This report evaluates the potential for implementation of near-term energy-conservation practices for the residential, commercial, agricultural, industrial, transportation, and utility sectors of the economy in twelve states: Illinois, Indiana, Iowa, Kansas, Michigan, Minnesota, Missouri, Nebraska, North Dakota, Ohio, South Dakota, and Wisconsin. The information used to evaluate the magnitude of achievable energy savings includes regional energy use, the regulatory/legislative climate relating to energy conservation, technical characteristics of the measures, and their feasibility of implementation. This work is intended to provide baseline information for an ongoing regional assessment of energy and environmental impacts in the Midwest. 80 references.

  14. Identification of a cis-regulatory element by transient analysis of co-ordinately regulated genes

    Directory of Open Access Journals (Sweden)

    Allan Andrew C

    2008-07-01

    Background: Transcription factors (TFs) co-ordinately regulate target genes that are dispersed throughout the genome. This co-ordinate regulation is achieved, in part, through the interaction of transcription factors with conserved cis-regulatory motifs that are in close proximity to the target genes. While much is known about the families of transcription factors that regulate gene expression in plants, there are few well characterised cis-regulatory motifs. In Arabidopsis, over-expression of the MYB transcription factor PAP1 (PRODUCTION OF ANTHOCYANIN PIGMENT 1) leads to transgenic plants with elevated anthocyanin levels due to the co-ordinated up-regulation of genes in the anthocyanin biosynthetic pathway. In addition to the anthocyanin biosynthetic genes, there are a number of un-associated genes that also change in expression level. This may be a direct or indirect consequence of the over-expression of PAP1. Results: Oligo array analysis of PAP1 over-expression Arabidopsis plants identified genes co-ordinately up-regulated in response to the elevated expression of this transcription factor. Transient assays on the promoter regions of 33 of these up-regulated genes identified eight promoter fragments that were transactivated by PAP1. Bioinformatic analysis on these promoters revealed a common cis-regulatory motif that we showed is required for PAP1 dependent transactivation. Conclusion: Co-ordinated gene regulation by individual transcription factors is a complex collection of both direct and indirect effects. Transient transactivation assays provide a rapid method to distinguish direct target genes from indirect target genes. Bioinformatic analysis of the promoters of these direct target genes is able to locate motifs that are common to this sub-set of promoters, which are impossible to identify with the larger set of direct and indirect target genes. While this type of analysis does not prove a direct interaction between protein and DNA

  15. The genome sequence of Geobacter metallireducens: features of metabolism, physiology and regulation common and dissimilar to Geobacter sulfurreducens

    Energy Technology Data Exchange (ETDEWEB)

    Aklujkar, Muktak; Krushkal, Julia; DiBartolo, Genevieve; Lapidus, Alla; Land, Miriam L.; Lovley, Derek R.

    2008-12-01

    Background: The genome sequence of Geobacter metallireducens is the second to be completed from the metal-respiring genus Geobacter, and is compared in this report to that of Geobacter sulfurreducens in order to understand their metabolic, physiological and regulatory similarities and differences. Results: The experimentally observed greater metabolic versatility of G. metallireducens versus G. sulfurreducens is borne out by the presence of more numerous genes for metabolism of organic acids including acetate, propionate, and pyruvate. Although G. metallireducens lacks a dicarboxylic acid transporter, it has acquired a second succinate dehydrogenase/fumarate reductase complex, suggesting that respiration of fumarate was important until recently in its evolutionary history. Vestiges of the molybdate (ModE) regulon of G. sulfurreducens can be detected in G. metallireducens, which has lost the global regulatory protein ModE but retained some putative ModE-binding sites and multiplied certain genes of molybdenum cofactor biosynthesis. Several enzymes of amino acid metabolism are of different origin in the two species, but significant patterns of gene organization are conserved. Whereas most Geobacteraceae are predicted to obtain biosynthetic reducing equivalents from electron transfer pathways via a ferredoxin oxidoreductase, G. metallireducens can derive them from the oxidative pentose phosphate pathway. In addition to the evidence of greater metabolic versatility, the G. metallireducens genome is also remarkable for the abundance of multicopy nucleotide sequences found in intergenic regions and even within genes. Conclusion: The genomic evidence suggests that metabolism, physiology and regulation of gene expression in G. metallireducens may be dramatically different from other Geobacteraceae.

  16. H-NS Facilitates Sequence Diversification of Horizontally Transferred DNAs during Their Integration in Host Chromosomes.

    Directory of Open Access Journals (Sweden)

    Koichi Higashi

    2016-01-01

    Bacteria can acquire new traits through horizontal gene transfer. Inappropriate expression of transferred genes, however, can disrupt the physiology of the host bacteria. To reduce this risk, Escherichia coli expresses the nucleoid-associated protein, H-NS, which preferentially binds to horizontally transferred genes to control their expression. Once expression is optimized, the horizontally transferred genes may actually contribute to E. coli survival in new habitats. Therefore, we investigated whether and how H-NS contributes to this optimization process. A comparison of H-NS binding profiles on common chromosomal segments of three E. coli strains belonging to different phylogenetic groups indicated that the positions of H-NS-bound regions have been conserved in E. coli strains. The sequences of the H-NS-bound regions appear to have diverged more so than H-NS-unbound regions only when H-NS-bound regions are located upstream or in coding regions of genes. Because these regions generally contain regulatory elements for gene expression, sequence divergence in these regions may be associated with alteration of gene expression. Indeed, nucleotide substitutions in H-NS-bound regions of the ybdO promoter and coding regions have diversified the potential for H-NS-independent negative regulation among E. coli strains. The ybdO expression in these strains was still negatively regulated by H-NS, which reduced the effect of H-NS-independent regulation under normal growth conditions. Hence, we propose that, during E. coli evolution, the conservation of H-NS binding sites resulted in the diversification of the regulation of horizontally transferred genes, which may have facilitated E. coli adaptation to new ecological niches.

  17. H-NS Facilitates Sequence Diversification of Horizontally Transferred DNAs during Their Integration in Host Chromosomes.

    Science.gov (United States)

    Higashi, Koichi; Tobe, Toru; Kanai, Akinori; Uyar, Ebru; Ishikawa, Shu; Suzuki, Yutaka; Ogasawara, Naotake; Kurokawa, Ken; Oshima, Taku

    2016-01-01

    Bacteria can acquire new traits through horizontal gene transfer. Inappropriate expression of transferred genes, however, can disrupt the physiology of the host bacteria. To reduce this risk, Escherichia coli expresses the nucleoid-associated protein, H-NS, which preferentially binds to horizontally transferred genes to control their expression. Once expression is optimized, the horizontally transferred genes may actually contribute to E. coli survival in new habitats. Therefore, we investigated whether and how H-NS contributes to this optimization process. A comparison of H-NS binding profiles on common chromosomal segments of three E. coli strains belonging to different phylogenetic groups indicated that the positions of H-NS-bound regions have been conserved in E. coli strains. The sequences of the H-NS-bound regions appear to have diverged more so than H-NS-unbound regions only when H-NS-bound regions are located upstream or in coding regions of genes. Because these regions generally contain regulatory elements for gene expression, sequence divergence in these regions may be associated with alteration of gene expression. Indeed, nucleotide substitutions in H-NS-bound regions of the ybdO promoter and coding regions have diversified the potential for H-NS-independent negative regulation among E. coli strains. The ybdO expression in these strains was still negatively regulated by H-NS, which reduced the effect of H-NS-independent regulation under normal growth conditions. Hence, we propose that, during E. coli evolution, the conservation of H-NS binding sites resulted in the diversification of the regulation of horizontally transferred genes, which may have facilitated E. coli adaptation to new ecological niches.

  18. Detecting the limits of regulatory element conservation and divergence estimation using pairwise and multiple alignments

    Energy Technology Data Exchange (ETDEWEB)

    Pollard, Daniel A.; Moses, Alan M.; Iyer, Venky N.; Eisen, Michael B.

    2006-08-14

    Background: Molecular evolutionary studies of noncoding sequences rely on multiple alignments. Yet how multiple alignment accuracy varies across sequence types, tree topologies, divergences and tools, and further how this variation impacts specific inferences, remains unclear. Results: Here we develop a molecular evolution simulation platform, CisEvolver, with models of background noncoding and transcription factor binding site evolution, and use simulated alignments to systematically examine multiple alignment accuracy and its impact on two key molecular evolutionary inferences: transcription factor binding site conservation and divergence estimation. We find that the accuracy of multiple alignments is determined almost exclusively by the pairwise divergence distance of the two most diverged species and that additional species have a negligible influence on alignment accuracy. Conserved transcription factor binding sites align better than surrounding noncoding DNA yet are often found to be misaligned at relatively short divergence distances, such that studies of binding site gain and loss could easily be confounded by alignment error. Divergence estimates from multiple alignments tend to be overestimated at short divergence distances but reach a tool-specific divergence at which they cease to increase, leading to underestimation at long divergences. Our most striking finding was that overall alignment accuracy, binding site alignment accuracy and divergence estimation accuracy vary greatly across branches in a tree and are most accurate for terminal branches connecting sister taxa and least accurate for internal branches connecting sub-alignments. Conclusions: Our results suggest that variation in alignment accuracy can lead to errors in molecular evolutionary inferences that could be construed as biological variation. These findings have implications for which species to choose for analyses, what kind of errors would be expected for a given set of species and how multiple alignment tools and

  19. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach.

    Science.gov (United States)

    Guttikonda, Satish K; Marri, Pradeep; Mammadov, Jafar; Ye, Liang; Soe, Khaing; Richey, Kimberly; Cruse, James; Zhuang, Meibao; Gao, Zhifang; Evans, Clive; Rounsley, Steve; Kumpatla, Siva P

    2016-01-01

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterize the transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive and cost- and labor-effective alternative for molecular characterization compared to traditional Southern blot analysis. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions.

  20. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach.

    Directory of Open Access Journals (Sweden)

    Satish K Guttikonda

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterize the transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive and cost- and labor-effective alternative for molecular characterization compared to traditional Southern blot analysis. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions.

  1. Prevalence of transcription promoters within archaeal operons and coding sequences.

    Science.gov (United States)

    Koide, Tie; Reiss, David J; Bare, J Christopher; Pang, Wyming Lee; Facciotti, Marc T; Schmid, Amy K; Pan, Min; Marzolf, Bruz; Van, Phu T; Lo, Fang-Yin; Pratap, Abhishek; Deutsch, Eric W; Peterson, Amelia; Martin, Dan; Baliga, Nitin S

    2009-01-01

    Despite the knowledge of complex prokaryotic-transcription mechanisms, generalized rules, such as the simplified organization of genes into operons with well-defined promoters and terminators, have had a significant role in systems analysis of regulatory logic in both bacteria and archaea. Here, we have investigated the prevalence of alternate regulatory mechanisms through genome-wide characterization of transcript structures of approximately 64% of all genes, including putative non-coding RNAs in Halobacterium salinarum NRC-1. Our integrative analysis of transcriptome dynamics and protein-DNA interaction data sets showed widespread environment-dependent modulation of operon architectures, transcription initiation and termination inside coding sequences, and extensive overlap in 3' ends of transcripts for many convergently transcribed genes. A significant fraction of these alternate transcriptional events correlate to binding locations of 11 transcription factors and regulators (TFs) inside operons and annotated genes, events usually considered spurious or non-functional. Using experimental validation, we illustrate the prevalence of overlapping genomic signals in archaeal transcription, casting doubt on the general perception of rigid boundaries between coding sequences and regulatory elements.

  2. Cloning of the cDNA for murine von Willebrand factor and identification of orthologous genes reveals the extent of conservation among diverse species.

    Science.gov (United States)

    Chitta, Mohan S; Duhé, Roy J; Kermode, John C

    2007-05-01

    Interaction of von Willebrand factor (VWF) with circulating platelets promotes hemostasis when a blood vessel is injured. The A1 domain of VWF is responsible for the initial interaction with platelets and is well conserved among species. Knowledge of the cDNA and genomic DNA sequences for human VWF allowed us to predict the cDNA sequence for murine VWF in silico and amplify its entire coding region by RT-PCR. The murine VWF cDNA has an open reading frame of 8,442 bp, encoding a protein of 2,813 amino acid residues with 83% identity to human pre-pro-VWF. The same strategy was used to predict in silico the cDNA sequence for the ortholog of VWF in a further six species. Many of these predictions diverged substantially from the putative Reference Sequences derived by ab initio methods. Our predicted sequences indicated that the VWF gene has a conserved structure of 52 exons in all seven mammalian species examined, as well as in the chicken. There is a minor structural variation in the pufferfish Takifugu rubripes insofar as the VWF gene in this species has 53 exons. Comparison of the translated amino acid sequences also revealed a high degree of conservation. In particular, the cysteine residues are conserved precisely throughout both the pro-peptide and the mature VWF sequence in all species, with a minor exception in the pufferfish VWF ortholog where two adjacent cysteine residues are omitted. The marked conservation of cysteine residues emphasizes the importance of the intricate pattern of disulfide bonds in governing the structure of pro-VWF and regulating the function of the mature VWF protein. It should also be emphasized that many of the conserved features of the VWF gene and protein were obscured when the comparison among species was based on the putative Reference Sequences instead of our predicted cDNA sequences.
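
    The two figures quoted above for murine VWF (an open reading frame of 8,442 bp and a protein of 2,813 residues) agree if the open reading frame is counted with its terminal stop codon. The small check below makes that arithmetic explicit; the function name `orf_protein_length` and the assumption that the stop codon is included in the reported length are illustrative and not stated in the abstract.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Number of amino acid residues encoded by an ORF of `orf_bp` nucleotides.

    Assumption: when `includes_stop` is True, the final codon is a stop
    codon and contributes no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons

# 8,442 bp = 2,814 codons; minus the stop codon leaves 2,813 residues.
murine_vwf_residues = orf_protein_length(8442)
```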

  3. Sequence of cDNAs for mammalian H2A.Z, an evolutionarily diverged but highly conserved basal histone H2A isoprotein species

    Energy Technology Data Exchange (ETDEWEB)

    Hatch, C L; Bonner, W M

    1988-02-11

    The nucleotide sequences of cDNAs for the evolutionarily diverged but highly conserved basal H2A isoprotein, H2A.Z, have been determined for the rat, cow, and human. As a basal histone, H2A.Z is synthesized throughout the cell cycle at a constant rate, unlinked to DNA replication, and at a much lower rate in quiescent cells. Each of the cDNA isolates encodes the entire H2A.Z polypeptide. The human isolate is about 1.0 kilobases long. It contains a coding region of 387 nucleotides flanked by 106 nucleotides of 5'UTR and 376 nucleotides of 3'UTR, which contains a polyadenylation signal followed by a poly A tail. The bovine and rat cDNAs have 97 and 94% nucleotide positional identity to the human cDNA in the coding region and 98% in the proximal 376 nucleotides of the 3'UTR which includes the polyadenylation signal. A potential stem-forming sequence imbedded in a direct repeat is found centered at 261 nucleotides into the 3'UTR. Each of the cDNA clones could be transcribed and translated in vitro to yield H2A.Z protein. The mammalian H2A.Z cDNA coding sequences are approximately 80% similar to those in chicken and 75% to those in sea urchin.

  4. Pathogenic adaptation of intracellular bacteria by rewiring a cis-regulatory input function.

    Science.gov (United States)

    Osborne, Suzanne E; Walthers, Don; Tomljenovic, Ana M; Mulder, David T; Silphaduang, Uma; Duong, Nancy; Lowden, Michael J; Wickham, Mark E; Waller, Ross F; Kenney, Linda J; Coombes, Brian K

    2009-03-10

    The acquisition of DNA by horizontal gene transfer enables bacteria to adapt to previously unexploited ecological niches. Although horizontal gene transfer and mutation of protein-coding sequences are well-recognized forms of pathogen evolution, the evolutionary significance of cis-regulatory mutations in creating phenotypic diversity through altered transcriptional outputs is not known. We show the significance of regulatory mutation for pathogen evolution by mapping and then rewiring a cis-regulatory module controlling a gene required for murine typhoid. Acquisition of a binding site for the Salmonella pathogenicity island-2 regulator, SsrB, enabled the srfN gene, ancestral to the Salmonella genus, to play a role in pathoadaptation of S. typhimurium to a host animal. We identified the evolved cis-regulatory module and quantified the fitness gain that this regulatory output accrues for the bacterium using competitive infections of host animals. Our findings highlight a mechanism of pathogen evolution involving regulatory mutation that is selected because of the fitness advantage the new regulatory output provides the incipient clones.

  5. Evolutionary Conservation of the Components in the TOR Signaling Pathways.

    Science.gov (United States)

    Tatebe, Hisashi; Shiozaki, Kazuhiro

    2017-11-01

    Target of rapamycin (TOR) is an evolutionarily conserved protein kinase that controls multiple cellular processes upon various intracellular and extracellular stimuli. Since its first discovery, extensive studies have been conducted both in yeast and animal species including humans. Those studies have revealed that TOR forms two structurally and physiologically distinct protein complexes; TOR complex 1 (TORC1) is ubiquitous among eukaryotes including animals, yeast, protozoa, and plants, while TOR complex 2 (TORC2) is conserved in diverse eukaryotic species other than plants. The studies have also identified two crucial regulators of mammalian TORC1 (mTORC1), Ras homolog enriched in brain (RHEB) and RAG GTPases. Of these, RAG regulates TORC1 in yeast as well and is conserved among eukaryotes with the green algae and land plants as apparent exceptions. RHEB is present in various eukaryotes but sporadically missing in multiple taxa. RHEB, in the budding yeast Saccharomyces cerevisiae, appears to be extremely divergent with concomitant loss of its function as a TORC1 regulator. In this review, we summarize the evolutionarily conserved functions of the key regulatory subunits of TORC1 and TORC2, namely RAPTOR, RICTOR, and SIN1. We also delve into the evolutionary conservation of RHEB and RAG and discuss the conserved roles of these GTPases in regulating TORC1.

  6. Effects of using coding potential, sequence conservation and mRNA structure conservation for predicting pyrrolysine containing genes

    DEFF Research Database (Denmark)

    Have, Christian Theil; Zambach, Sine; Christiansen, Henning

    2013-01-01

    for prediction of pyrrolysine incorporating genes in genomes of bacteria and archaea leading to insights about the factors driving pyrrolysine translation and identification of new gene candidates. The method predicts known conserved genes with high recall and predicts several other promising candidates...... for experimental verification. The method is implemented as a computational pipeline which is available on request....

  7. smRNAome profiling to identify conserved and novel microRNAs in Stevia rebaudiana Bertoni

    Science.gov (United States)

    2012-01-01

    Background: MicroRNAs (miRNAs) constitute a family of small RNAs (sRNAs) that regulate gene expression and play an important role in plant development, metabolism, signal transduction and stress response. Extensive studies on miRNAs have been performed in different plants such as Arabidopsis thaliana and Oryza sativa, and the volume of the miRNA database, miRBase, has been increasing on a day-to-day basis. Stevia rebaudiana Bertoni is an important perennial herb which accumulates high concentrations of diterpene steviol glycosides, which contribute to its high indexed sweetening property with no calorific value. Several studies have been carried out to understand the molecular mechanism involved in biosynthesis of these glycosides; however, information about miRNAs has been lacking in S. rebaudiana. Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs irrespective of availability of genome sequence data. Results: To identify miRNAs in S. rebaudiana, an sRNA library was constructed and sequenced using the Illumina Genome Analyzer II. A total of 30,472,534 reads representing 2,509,190 distinct sequences were obtained from the sRNA library. Based on sequence similarity, we identified 100 miRNAs belonging to 34 highly conserved families. Also, we identified 12 novel miRNAs whose precursors were potentially generated from stevia EST and nucleotide sequences. None of the novel sequences has been described previously in other plant species. Putative target genes were predicted for most conserved and novel miRNAs. The predicted targets are mainly mRNAs encoding enzymes regulating essential plant metabolic and signaling pathways. Conclusions: This study led to the identification of 34 highly conserved miRNA families and 12 novel potential miRNAs, indicating that specific miRNAs exist in stevia species. Our results provided information on stevia miRNAs and their targets building a foundation for future studies to

  8. The most conserved genome segments for life detection on Earth and other planets.

    Science.gov (United States)

    Isenbarger, Thomas A; Carr, Christopher E; Johnson, Sarah Stewart; Finney, Michael; Church, George M; Gilbert, Walter; Zuber, Maria T; Ruvkun, Gary

    2008-12-01

    On Earth, very simple but powerful methods to detect and classify broad taxa of life by the polymerase chain reaction (PCR) are now standard practice. Using DNA primers corresponding to the 16S ribosomal RNA gene, one can survey a sample from any environment for its microbial inhabitants. Due to massive meteoritic exchange between Earth and Mars (as well as other planets), a reasonable case can be made for life on Mars or other planets to be related to life on Earth. In this case, the supremely sensitive technologies used to study life on Earth, including in extreme environments, can be applied to the search for life on other planets. Though the 16S gene has become the standard for life detection on Earth, no genome comparisons have established that the ribosomal genes are, in fact, the most conserved DNA segments across the kingdoms of life. We present here a computational comparison of full genomes from 13 diverse organisms from the Archaea, Bacteria, and Eucarya to identify genetic sequences conserved across the widest divisions of life. Our results identify the 16S and 23S ribosomal RNA genes as well as other universally conserved nucleotide sequences in genes encoding particular classes of transfer RNAs and within the nucleotide binding domains of ABC transporters as the most conserved DNA sequence segments across phylogeny. This set of sequences defines a core set of DNA regions that have changed the least over billions of years of evolution and provides a means to identify and classify divergent life, including ancestrally related life on other planets.

  9. Sequence analysis of serum albumins reveals the molecular evolution of ligand recognition properties.

    Science.gov (United States)

    Fanali, Gabriella; Ascenzi, Paolo; Bernardi, Giorgio; Fasano, Mauro

    2012-01-01

    Serum albumin (SA) is a circulating protein providing a depot and carrier for many endogenous and exogenous compounds. At least seven major binding sites have been identified by structural and functional investigations mainly in human SA. SA is conserved in vertebrates, with at least 49 entries in protein sequence databases. The multiple sequence analysis of this set of entries leads to the definition of a cladistic tree for the molecular evolution of SA orthologs in vertebrates, thus showing the clustering of the considered species, with lamprey SAs (Lethenteron japonicum and Petromyzon marinus) in a separate outgroup. Sequence analysis aimed at searching conserved domains revealed that most SA sequences are made up of three repeated domains (about 600 residues), as extensively characterized for human SA. On the contrary, lamprey SAs are giant proteins (about 1400 residues) comprising seven repeated domains. The phylogenetic analysis of the SA family reveals a stringent correlation with the taxonomic classification of the species available in sequence databases. A focused inspection of the sequences of ligand binding sites in SA revealed that in all sites most residues involved in ligand binding are conserved, although the versatility towards different ligands could be peculiar to higher organisms. Moreover, the analysis of molecular links between the different sites suggests that allosteric modulation mechanisms could be restricted to higher vertebrates.

  10. NFAT5 regulates HIV-1 in primary monocytes via a highly conserved long terminal repeat site.

    Directory of Open Access Journals (Sweden)

    Shahin Ranjbar

    2006-12-01

    To replicate, HIV-1 capitalizes on endogenous cellular activation pathways resulting in recruitment of key host transcription factors to its viral enhancer. RNA interference has been a powerful tool for blocking key checkpoints in HIV-1 entry into cells. Here we apply RNA interference to HIV-1 transcription in primary macrophages, a major reservoir of the virus, and specifically target the transcription factor NFAT5 (nuclear factor of activated T cells 5), which is the most evolutionarily divergent NFAT protein. By molecularly cloning and sequencing isolates from multiple viral subtypes, and performing DNase I footprinting, electrophoretic mobility shift, and promoter mutagenesis transfection assays, we demonstrate that NFAT5 functionally interacts with a specific enhancer binding site conserved in HIV-1, HIV-2, and multiple simian immunodeficiency viruses. Using small interfering RNA to ablate expression of endogenous NFAT5 protein, we show that the replication of three major HIV-1 viral subtypes (B, C, and E) is dependent upon NFAT5 in human primary differentiated macrophages. Our results define a novel host factor-viral enhancer interaction that reveals a new regulatory role for NFAT5 and defines a functional DNA motif conserved across HIV-1 subtypes and representative simian immunodeficiency viruses. Inhibition of the NFAT5-LTR interaction may thus present a novel therapeutic target to suppress HIV-1 replication and progression of AIDS.

  11. Conservation and co-option in developmental programmes: the importance of homology relationships

    Directory of Open Access Journals (Sweden)

    Becker May-Britt

    2005-10-01

    One of the surprising insights gained from research in evolutionary developmental biology (evo-devo) is that increasing diversity in body plans and morphology in organisms across animal phyla is not reflected in similarly dramatic changes at the level of gene composition of their genomes. For instance, simplicity at the tissue level of organization often contrasts with a high degree of genetic complexity. Also intriguing is the observation that the coding regions of several genes of invertebrates show high sequence similarity to those in humans. This lack of change (conservation) indicates that evolutionary novelties may arise more frequently through combinatorial processes, such as changes in gene regulation and the recruitment of novel genes into existing regulatory gene networks (co-option), and less often through adaptive evolutionary processes in the coding portions of a gene. As a consequence, it is of great interest to examine whether the widespread conservation of the genetic machinery implies the same developmental function in a last common ancestor, or whether homologous genes acquired new developmental roles in structures of independent phylogenetic origin. To distinguish between these two possibilities one must refer to current concepts of phylogeny reconstruction and carefully investigate homology relationships. Particularly problematic in terms of homology decisions is the use of gene expression patterns of a given structure. In the future, research on organisms other than the typical model systems will be required since these can provide insights that are not easily obtained from comparisons among only a few distantly related model species.

  12. Techniques of analyzing the impacts of certain electric-utility ratemaking and regulatory-policy concepts. Bibliography

    Energy Technology Data Exchange (ETDEWEB)

    None

    1980-08-01

    This bibliography provides documentation for use by state public utility commissions and major nonregulated utilities in evaluating the applicability of a wide range of electric utility rate design and regulatory concepts in light of certain regulatory objectives. Part I, Utility Regulatory Objectives, contains 2084 citations on conservation of energy and capital; efficient use of facilities and resources; and equitable rates to electricity consumers. Part II, Rate Design Concepts, contains 1238 citations on time-of-day rates; seasonally-varying rates; cost-of-service rates; interruptible rates (including the accompanying use of load management techniques); declining block rates; and lifeline rates. Part III, Regulatory Concepts, contains 1282 references on restrictions on master metering; procedures for review of automatic adjustment clauses; prohibitions of rate or regulatory discrimination against solar, wind, or other small energy systems; treatment of advertising expenses; and procedures to protect ratepayers from abrupt termination of service.

  13. Chromosome-wide mapping of DNA methylation patterns in normal and malignant prostate cells reveals pervasive methylation of gene-associated and conserved intergenic sequences

    Directory of Open Access Journals (Sweden)

    De Marzo Angelo M

    2011-06-01

    Background: DNA methylation has been linked to genome regulation and dysregulation in health and disease respectively, and methods for characterizing genomic DNA methylation patterns are rapidly emerging. We have developed/refined methods for enrichment of methylated genomic fragments using the methyl-binding domain of the human MBD2 protein (MBD2-MBD) followed by analysis with high-density tiling microarrays. This MBD-chip approach was used to characterize DNA methylation patterns across all non-repetitive sequences of human chromosomes 21 and 22 at high-resolution in normal and malignant prostate cells. Results: Examining these data using computational methods that were designed specifically for DNA methylation tiling array data revealed widespread methylation of both gene promoter and non-promoter regions in cancer and normal cells. In addition to identifying several novel cancer hypermethylated 5' gene upstream regions that mediated epigenetic gene silencing, we also found several hypermethylated 3' gene downstream, intragenic and intergenic regions. The hypermethylated intragenic regions were highly enriched for overlap with intron-exon boundaries, suggesting a possible role in regulation of alternative transcriptional start sites, exon usage and/or splicing. The hypermethylated intergenic regions showed significant enrichment for conservation across vertebrate species. A sampling of these newly identified promoter (ADAMTS1 and SCARF2 genes) and non-promoter (downstream or within DSCR9, C21orf57 and HLCS genes) hypermethylated regions were effective in distinguishing malignant from normal prostate tissues and/or cell lines. Conclusions: Comparison of chromosome-wide DNA methylation patterns in normal and malignant prostate cells revealed significant methylation of gene-proximal and conserved intergenic sequences. Such analyses can be easily extended for genome-wide methylation analysis in health and disease.

  14. The role of heterologous chloroplast sequence elements in transgene integration and expression.

    Science.gov (United States)

    Ruhlman, Tracey; Verma, Dheeraj; Samson, Nalapalli; Daniell, Henry

    2010-04-01

    Heterologous regulatory elements and flanking sequences have been used in chloroplast transformation of several crop species, but their roles and mechanisms have not yet been investigated. Nucleotide sequence identity in the photosystem II protein D1 (psbA) upstream region is 59% across all taxa; similar variation was consistent across all genes and taxa examined. Secondary structure and predicted Gibbs free energy values of the psbA 5' untranslated region (UTR) among different families reflected this variation. Therefore, chloroplast transformation vectors were made for tobacco (Nicotiana tabacum) and lettuce (Lactuca sativa), with endogenous (Nt-Nt, Ls-Ls) or heterologous (Nt-Ls, Ls-Nt) psbA promoter, 5' UTR and 3' UTR, regulating expression of the anthrax protective antigen (PA) or human proinsulin (Pins) fused with the cholera toxin B-subunit (CTB). Unique lettuce flanking sequences were completely eliminated during homologous recombination in the transplastomic tobacco genomes but not unique tobacco sequences. Nt-Ls or Ls-Nt transplastomic lines showed reduction of 80% PA and 97% CTB-Pins expression when compared with endogenous psbA regulatory elements, which accumulated up to 29.6% total soluble protein PA and 72.0% total leaf protein CTB-Pins, 2-fold higher than Rubisco. Transgene transcripts were reduced by 84% in Ls-Nt-CTB-Pins and by 72% in Nt-Ls-PA lines. Transcripts containing endogenous 5' UTR were stabilized in nonpolysomal fractions. Stromal RNA-binding proteins were preferentially associated with endogenous psbA 5' UTR. A rapid and reproducible regeneration system was developed for lettuce commercial cultivars by optimizing plant growth regulators. These findings underscore the need for sequencing complete crop chloroplast genomes, utilization of endogenous regulatory elements and flanking sequences, as well as optimization of plant growth regulators for efficient chloroplast transformation.

  15. Conserved domains and SINE diversity during animal evolution.

    Science.gov (United States)

    Luchetti, Andrea; Mantovani, Barbara

    2013-10-01

    Eukaryotic genomes harbour a number of mobile genetic elements (MGEs); moving from one genomic location to another, they are known to impact on the host genome. Short interspersed elements (SINEs) are well-represented, non-autonomous retroelements and they are likely the most diversified MGEs. In some instances, sequence domains conserved across unrelated SINEs have been identified; remarkably, one of these, called Nin, has been conserved since the Radiata-Bilateria splitting. Here we report on two new domains: Inv, derived from Nin, identified in insects and in deuterostomes, and Pln, restricted to polyneopteran insects. The identification of Inv and Pln sequences allowed us to retrieve new SINEs, two in insects and one in a hemichordate. The diverse structural combination of the different domains in different SINE families, during metazoan evolution, offers a clearer view of SINE diversity and their frequent de novo emergence through module exchange, possibly underlying the high evolutionary success of SINEs.

  16. Deep sequencing of Salmonella RNA associated with heterologous Hfq proteins in vivo reveals small RNAs as a major target class and identifies RNA processing phenotypes.

    Science.gov (United States)

    Sittka, Alexandra; Sharma, Cynthia M; Rolle, Katarzyna; Vogel, Jörg

    2009-01-01

    The bacterial Sm-like protein, Hfq, is a key factor for the stability and function of small non-coding RNAs (sRNAs) in Escherichia coli. Homologues of this protein have been predicted in many distantly related organisms yet their functional conservation as sRNA-binding proteins has not entirely been clear. To address this, we expressed in Salmonella the Hfq proteins of two eubacteria (Neisseria meningitidis, Aquifex aeolicus) and an archaeon (Methanocaldococcus jannaschii), and analyzed the associated RNA by deep sequencing. This in vivo approach identified endogenous Salmonella sRNAs as a major target of the foreign Hfq proteins. New Salmonella sRNA species were also identified, and some of these accumulated specifically in the presence of a foreign Hfq protein. In addition, we observed specific RNA processing defects, e.g., suppression of precursor processing of SraH sRNA by Methanocaldococcus Hfq, or aberrant accumulation of extracytoplasmic target mRNAs of the Salmonella GcvB, MicA or RybB sRNAs. Taken together, our study provides evidence of a conserved inherent sRNA-binding property of Hfq, which may facilitate the lateral transmission of regulatory sRNAs among distantly related species. It also suggests that the expression of heterologous RNA-binding proteins combined with deep sequencing analysis of RNA ligands can be used as a molecular tool to dissect individual steps of RNA metabolism in vivo.

  17. Multi-species comparative analysis of the equine ACE gene identifies a highly conserved potential transcription factor binding site in intron 16.

    Directory of Open Access Journals (Sweden)

    Natasha A Hamilton

    Angiotensin converting enzyme (ACE) is essential for control of blood pressure. The human ACE gene contains an intronic Alu indel (I/D) polymorphism that has been associated with variation in serum enzyme levels, although the functional mechanism has not been identified. The polymorphism has also been associated with cardiovascular disease, type II diabetes, renal disease and elite athleticism. We have characterized the ACE gene in horses of breeds selected for differing physical abilities. The equine gene has a similar structure to that of all known mammalian ACE genes. Nine common single nucleotide polymorphisms (SNPs) discovered in pooled DNA were found to be inherited in nine haplotypes. Three of these SNPs were located in intron 16, homologous to that containing the Alu polymorphism in the human. A highly conserved 18 bp sequence, also within that intron, was identified as being a potential binding site for the transcription factors Oct-1, HFH-1 and HNF-3β, and lies within a larger area of higher than normal homology. This putative regulatory element may contribute to regulation of the documented inter-individual variation in human circulating enzyme levels, for which a functional mechanism is yet to be defined. Two equine SNPs occurred within the conserved area in intron 16, although neither of them disrupted the putative binding site. We propose a possible regulatory mechanism of the ACE gene in mammalian species which was previously unknown. This advance will allow further analysis leading to a better understanding of the mechanisms underpinning the associations seen between the human Alu polymorphism and enzyme levels, cardiovascular disease states and elite athleticism.

  18. Multi-species comparative analysis of the equine ACE gene identifies a highly conserved potential transcription factor binding site in intron 16.

    Science.gov (United States)

    Hamilton, Natasha A; Tammen, Imke; Raadsma, Herman W

    2013-01-01

    Angiotensin converting enzyme (ACE) is essential for control of blood pressure. The human ACE gene contains an intronic Alu indel (I/D) polymorphism that has been associated with variation in serum enzyme levels, although the functional mechanism has not been identified. The polymorphism has also been associated with cardiovascular disease, type II diabetes, renal disease and elite athleticism. We have characterized the ACE gene in horses of breeds selected for differing physical abilities. The equine gene has a similar structure to that of all known mammalian ACE genes. Nine common single nucleotide polymorphisms (SNPs) discovered in pooled DNA were found to be inherited in nine haplotypes. Three of these SNPs were located in intron 16, homologous to that containing the Alu polymorphism in the human. A highly conserved 18 bp sequence, also within that intron, was identified as being a potential binding site for the transcription factors Oct-1, HFH-1 and HNF-3β, and lies within a larger area of higher than normal homology. This putative regulatory element may contribute to regulation of the documented inter-individual variation in human circulating enzyme levels, for which a functional mechanism is yet to be defined. Two equine SNPs occurred within the conserved area in intron 16, although neither of them disrupted the putative binding site. We propose a possible regulatory mechanism of the ACE gene in mammalian species which was previously unknown. This advance will allow further analysis leading to a better understanding of the mechanisms underpinning the associations seen between the human Alu polymorphism and enzyme levels, cardiovascular disease states and elite athleticism.

  19. Disrupted auto-regulation of the spliceosomal gene SNRPB causes cerebro-costo-mandibular syndrome.

    Science.gov (United States)

    Lynch, Danielle C; Revil, Timothée; Schwartzentruber, Jeremy; Bhoj, Elizabeth J; Innes, A Micheil; Lamont, Ryan E; Lemire, Edmond G; Chodirker, Bernard N; Taylor, Juliet P; Zackai, Elaine H; McLeod, D Ross; Kirk, Edwin P; Hoover-Fong, Julie; Fleming, Leah; Savarirayan, Ravi; Majewski, Jacek; Jerome-Majewska, Loydie A; Parboosingh, Jillian S; Bernier, Francois P

    2014-07-22

    Elucidating the function of highly conserved regulatory sequences is a significant challenge in genomics today. Certain intragenic highly conserved elements have been associated with regulating levels of core components of the spliceosome and alternative splicing of downstream genes. Here we identify mutations in one such element, a regulatory alternative exon of SNRPB as the cause of cerebro-costo-mandibular syndrome. This exon contains a premature termination codon that triggers nonsense-mediated mRNA decay when included in the transcript. These mutations cause increased inclusion of the alternative exon and decreased overall expression of SNRPB. We provide evidence for the functional importance of this conserved intragenic element in the regulation of alternative splicing and development, and suggest that the evolution of such a regulatory mechanism has contributed to the complexity of mammalian development.

  20. Disrupted auto-regulation of the spliceosomal gene SNRPB causes cerebro–costo–mandibular syndrome

    Science.gov (United States)

    Lynch, Danielle C.; Revil, Timothée; Schwartzentruber, Jeremy; Bhoj, Elizabeth J.; Innes, A. Micheil; Lamont, Ryan E.; Lemire, Edmond G.; Chodirker, Bernard N.; Taylor, Juliet P.; Zackai, Elaine H.; McLeod, D. Ross; Kirk, Edwin P.; Hoover-Fong, Julie; Fleming, Leah; Savarirayan, Ravi; Boycott, Kym; MacKenzie, Alex; Brudno, Michael; Bulman, Dennis; Dyment, David; Majewski, Jacek; Jerome-Majewska, Loydie A.; Parboosingh, Jillian S.; Bernier, Francois P.

    2014-01-01

    Elucidating the function of highly conserved regulatory sequences is a significant challenge in genomics today. Certain intragenic highly conserved elements have been associated with regulating levels of core components of the spliceosome and alternative splicing of downstream genes. Here we identify mutations in one such element, a regulatory alternative exon of SNRPB as the cause of cerebro–costo–mandibular syndrome. This exon contains a premature termination codon that triggers nonsense-mediated mRNA decay when included in the transcript. These mutations cause increased inclusion of the alternative exon and decreased overall expression of SNRPB. We provide evidence for the functional importance of this conserved intragenic element in the regulation of alternative splicing and development, and suggest that the evolution of such a regulatory mechanism has contributed to the complexity of mammalian development. PMID:25047197

  1. Genome-wide identification of conserved microRNA and their response to drought stress in Dongxiang wild rice (Oryza rufipogon Griff.).

    Science.gov (United States)

    Zhang, Fantao; Luo, Xiangdong; Zhou, Yi; Xie, Jiankun

    2016-04-01

    To identify drought stress-responsive conserved microRNA (miRNA) from Dongxiang wild rice (Oryza rufipogon Griff., DXWR) on a genome-wide scale, high-throughput sequencing technology was used to sequence libraries of DXWR samples, treated with and without drought stress. In total, 505 conserved miRNAs corresponding to 215 families were identified. Of these, 17 were significantly down-regulated and 16 were up-regulated under drought stress. Stem-loop qRT-PCR revealed the same expression patterns as high-throughput sequencing, suggesting the accuracy of the sequencing result was high. Potential target genes of the drought-responsive miRNA were predicted to be involved in diverse biological processes. Furthermore, 16 miRNA families were identified for the first time as being involved in the drought stress response in plants. These results present a comprehensive view of the conserved miRNA and their expression patterns under drought stress for DXWR, which will provide valuable information and sequence resources for future basic studies.

  2. The sequence, structure and evolutionary features of HOTAIR in mammals

    Science.gov (United States)

    2011-01-01

    Background: An increasing number of long noncoding RNAs (lncRNAs) have been identified recently. Different from all the others that function in cis to regulate local gene expression, the newly identified HOTAIR is located between HoxC11 and HoxC12 in the human genome and regulates HoxD expression in multiple tissues. Like the well-characterised lncRNA Xist, HOTAIR binds to polycomb proteins to methylate histones at multiple HoxD loci, but unlike Xist, many details of its structure and function, as well as the trans regulation, remain unclear. Moreover, HOTAIR is involved in the aberrant regulation of gene expression in cancer. Results: To identify conserved domains in HOTAIR and study the phylogenetic distribution of this lncRNA, we searched the genomes of 10 mammalian and 3 non-mammalian vertebrates for matches to its 6 exons and the two conserved domains within the 1800 bp exon6 using Infernal. There was just one high-scoring hit for each mammal, but many low-scoring hits were found in both mammals and non-mammalian vertebrates. These hits and their flanking genes in four placental mammals and platypus were examined to determine whether HOTAIR contained elements shared by other lncRNAs. Several of the hits were within unknown transcripts or ncRNAs, many were within introns of, or antisense to, protein-coding genes, and conservation of the flanking genes was observed only between human and chimpanzee. Phylogenetic analysis revealed discrete evolutionary dynamics for orthologous sequences of HOTAIR exons. Exon1 at the 5' end and a domain in exon6 near the 3' end, which contain domains that bind to multiple proteins, have evolved faster in primates than in other mammals. Structures were predicted for exon1, two domains of exon6 and the full HOTAIR sequence. The sequence and structure of two fragments, in exon1 and the domain B of exon6 respectively, were identified to robustly occur in predicted structures of exon1, domain B of exon6 and the full HOTAIR in mammals

  3. PlantPAN: Plant promoter analysis navigator, for identifying combinatorial cis-regulatory elements with distance constraint in plant gene groups

    Directory of Open Access Journals (Sweden)

    Huang Hsien-Da

    2008-11-01

    Background: The elucidation of transcriptional regulation in plant genes is an important area of research for plant scientists, following the mapping of various plant genomes, such as A. thaliana, O. sativa and Z. mays. A variety of bioinformatic servers or databases of plant promoters have been established, although most have been focused only on annotating transcription factor binding sites in a single gene and have neglected some important regulatory elements (tandem repeats and CpG/CpNpG islands) in promoter regions. Additionally, the combinatorial interaction of transcription factors (TFs) is important in regulating the gene group that is associated with the same expression pattern. Therefore, a tool for detecting the co-regulation of transcription factors in a group of gene promoters is required. Results: This study develops a database-assisted system, PlantPAN (Plant Promoter Analysis Navigator), for recognizing combinatorial cis-regulatory elements with a distance constraint in sets of plant genes. The system collects the plant transcription factor binding profiles from PLACE, TRANSFAC (public release 7.0), AGRIS, and JASPAR databases and allows users to input a group of gene IDs or promoter sequences, enabling the co-occurrence of combinatorial transcription factor binding sites (TFBSs) within a defined distance (20 bp to 200 bp) to be identified. Furthermore, the new resource enables other regulatory features in a plant promoter, such as CpG/CpNpG islands and tandem repeats, to be displayed. The regulatory elements in the conserved regions of the promoters across homologous genes are detected and presented. Conclusion: In addition to providing a user-friendly input/output interface, PlantPAN has numerous advantages in the analysis of a plant promoter. Several case studies have established the effectiveness of PlantPAN. This novel analytical resource is now freely available at http://PlantPAN.mbc.nctu.edu.tw.
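
    The distance-constrained co-occurrence search described in this abstract can be illustrated with a minimal sketch. This is not PlantPAN's implementation: the function names, the input layout (a dictionary of binding-site positions per gene) and the example data are all hypothetical, and only the 20 bp to 200 bp window is taken from the abstract.

```python
from itertools import product


def cooccurring_sites(sites_a, sites_b, min_dist=20, max_dist=200):
    """Return (pos_a, pos_b) pairs whose separation lies in [min_dist, max_dist] bp."""
    pairs = []
    for a, b in product(sites_a, sites_b):
        if min_dist <= abs(a - b) <= max_dist:
            pairs.append((a, b))
    return pairs


def promoters_with_pair(promoters, tf_a, tf_b, min_dist=20, max_dist=200):
    """Find genes whose promoter holds both TF sites within the distance window.

    promoters: hypothetical layout {gene_id: {tf_name: [site positions in bp]}}
    """
    hits = {}
    for gene, sites in promoters.items():
        pairs = cooccurring_sites(
            sites.get(tf_a, []), sites.get(tf_b, []), min_dist, max_dist
        )
        if pairs:
            hits[gene] = pairs
    return hits


# Made-up positions, for illustration only.
example = {
    "geneA": {"TF1": [100, 400], "TF2": [150, 900]},
    "geneB": {"TF1": [50], "TF2": [60]},
    "geneC": {"TF1": [10]},
}
result = promoters_with_pair(example, "TF1", "TF2")
print(result)
```

    In this toy input only geneA qualifies: its TF1 site at 100 and TF2 site at 150 are 50 bp apart, whereas the sites in geneB are only 10 bp apart and geneC has no TF2 site.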

  4. Characteristics of the Lotus japonicus gene repertoire deduced from large-scale expressed sequence tag (EST) analysis.

    Science.gov (United States)

    Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi

    2004-02-01

    To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.
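
    The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable names are illustrative, and the share of legume-specific contigs is a derived figure that the abstract itself does not report.

```python
# Counts as stated in the abstract.
contigs = 8503
singletons = 11954
non_redundant_reported = 20457

genes_hit = 1093
genes_predicted = 1889

legume_specific_contigs = 471

# Contigs plus singletons should equal the non-redundant sequence total.
non_redundant = contigs + singletons

# Share of predicted protein-encoding genes hit by at least one EST.
coverage_pct = round(100 * genes_hit / genes_predicted, 1)

# Derived here (not reported in the abstract): share of contigs that are
# conserved only in leguminous species.
legume_specific_pct = round(100 * legume_specific_contigs / contigs, 1)

print(non_redundant, coverage_pct, legume_specific_pct)
```

    The sum reproduces the reported 20457 non-redundant sequences and the ratio reproduces the reported 57.9% gene coverage.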

  5. Regulatory Roles for Long ncRNA and mRNA

    Directory of Open Access Journals (Sweden)

    Marcel W. Coolen

    2013-04-01

    Recent advances in high-throughput sequencing technology have identified the transcription of a much larger portion of the genome than previously anticipated. Especially in the context of cancer it has become clear that aberrant transcription of both protein-coding and long non-coding RNAs (lncRNAs) is a frequent event. The current dogma of RNA function describes mRNA to be responsible for the synthesis of proteins, whereas non-coding RNA can have regulatory or epigenetic functions. However, this distinction between protein coding and regulatory ability of transcripts may not be that strict. Here, we review the increasing body of evidence for the existence of multifunctional RNAs that have both protein-coding and trans-regulatory roles. Moreover, we demonstrate that coding transcripts bind to components of the Polycomb Repressor Complex 2 (PRC2) with similar affinities as non-coding transcripts, revealing potential epigenetic regulation by mRNAs. We hypothesize that studies on the regulatory ability of disease-associated mRNAs will form an important new field of research.

  6. Biodiversity maintenance in food webs with regulatory environmental feedbacks.

    Science.gov (United States)

    Bagdassarian, Carey K; Dunham, Amy E; Brown, Christopher G; Rauscher, Daniel

    2007-04-21

    Although the food web is one of the most fundamental and oldest concepts in ecology, elucidating the strategies and structures by which natural communities of species persist remains a challenge to empirical and theoretical ecologists. We show that simple regulatory feedbacks between autotrophs and their environment when embedded within complex and realistic food-web models enhance biodiversity. The food webs are generated through the niche-model algorithm and coupled with predator-prey dynamics, with and without environmental feedbacks at the autotroph level. With high probability and especially at lower, more realistic connectance levels, regulatory environmental feedbacks result in fewer species extinctions, that is, in increased species persistence. These same feedback couplings, however, also sensitize food webs to environmental stresses leading to abrupt collapses in biodiversity with increased forcing. Feedback interactions between species and their material environments anchor food-web persistence, adding another dimension to biodiversity conservation. We suggest that the regulatory features of two natural systems, deep-sea tubeworms with their microbial consortia and a soil ecosystem manifesting adaptive homeostatic changes, can be embedded within niche-model food-web dynamics.
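
    The abstract mentions that food webs were generated with the niche-model algorithm but gives no parameters. The sketch below follows the niche model as it is commonly described in the food-web literature (uniform niche values, beta-distributed feeding ranges); the function name, the species count and the connectance value are assumptions for illustration, and the predator-prey dynamics and environmental feedbacks of the study are not modelled.

```python
import random


def niche_model(n_species, connectance, seed=0):
    """Generate a food web with the niche model (assumes connectance < 0.5).

    Returns (niche_values, links), where links[i][j] is True if species i
    eats species j.
    """
    rng = random.Random(seed)
    # Beta(1, beta) feeding ranges give an expected connectance near the target.
    beta = 1.0 / (2.0 * connectance) - 1.0
    niche = sorted(rng.random() for _ in range(n_species))
    links = [[False] * n_species for _ in range(n_species)]
    for i, n_i in enumerate(niche):
        if i == 0:
            continue  # the lowest niche value is kept as a basal species
        r_i = n_i * rng.betavariate(1.0, beta)  # feeding range width
        c_i = rng.uniform(r_i / 2.0, n_i)       # feeding range centre
        lo, hi = c_i - r_i / 2.0, c_i + r_i / 2.0
        for j, n_j in enumerate(niche):
            links[i][j] = lo <= n_j <= hi
    return niche, links


niche, links = niche_model(n_species=30, connectance=0.15, seed=1)
realized = sum(map(sum, links)) / (30 * 30)
print(f"realized connectance: {realized:.3f}")
```

    The realized connectance of a single web varies around the target value, which is why studies of this kind typically average over many generated webs.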

  7. Conservation of Tcrg-V5 and limited allelic sequence polymorphism of the other Tcrg-V genes used by mouse tissue-specific gd-T lymphocytes

    Energy Technology Data Exchange (ETDEWEB)

    Roger, T.; Morisset, J.; Seman, M. [Universite Denis Diderot, Paris (France)]

    1996-12-31

    The mouse Tcrg locus comprises seven Tcrg-V, four Tcrg-J, and four Tcrg-C segments which generate only six major types of functional g chains, Vg7-, Vg4-, Vg6-, or Vg5-Jg1-Cg1, Vg2-Jg2-Cg2, and Vg1-Jg4-Cg4. A complete analysis of restriction fragment length polymorphism (RFLP) of the Tcrg locus in wild and inbred mice suggested its relative conservation compared to other loci of the immunoglobulin (Ig) gene family. Three haplotypes have been characterized in laboratory mice: gA, gB, and gC, represented by BALB/c, DBA/2, and AKR prototypes. Tcr-gA and -gC haplotypes are highly related. By contrast, Tcr-gB, likely inherited from Asian mouse subspecies, appeared very different by RFLP analysis. Yet only partial sequence data have been reported on gA and gB Tcrg-V genes. Here, the complete sequence of all Tcrg-V genes of the two haplotypes is described. 16 refs., 1 fig.

  8. Pain acceptance, psychological functioning, and self-regulatory fatigue in temporomandibular disorder.

    Science.gov (United States)

    Eisenlohr-Moul, Tory A; Burris, Jessica L; Evans, Daniel R

    2013-12-01

    A growing body of evidence suggests that chronic pain patients suffer from chronic self-regulatory fatigue: difficulty controlling thoughts, emotions, and behavior. Pain acceptance, which involves responding to pain and related experiences without attempts to control or avoid them (pain willingness) and pursuing valued life activities regardless of pain (activity engagement), has been associated with various favorable outcomes in chronic pain patients, including better psychological functioning. The study presented here tested the hypotheses that pain acceptance is associated with less psychological distress, higher psychological well-being, and reduced self-regulatory fatigue in temporomandibular disorder (TMD) patients, particularly for those with longer pain duration. Cross-sectional data were provided by 135 TMD patients during an initial evaluation at a university-based tertiary orofacial pain clinic. Results of hierarchical linear regression models indicated that, controlling for pain severity, pain willingness is associated with less psychological distress and lower self-regulatory fatigue, and activity engagement is associated with greater psychological well-being. Furthermore, the effect of pain willingness on psychological distress was moderated by pain duration such that pain willingness was more strongly associated with less psychological distress in patients with longer pain duration; this moderating effect was fully mediated by self-regulatory fatigue. These findings suggest pain willingness may buffer against self-regulatory fatigue in those with longer pain duration, and such conservation of self-regulatory resources may protect against psychological symptoms.

  9. Elucidating MicroRNA Regulatory Networks Using Transcriptional, Post-transcriptional, and Histone Modification Measurements

    Directory of Open Access Journals (Sweden)

    Sara J.C. Gosline

    2016-01-01

    MicroRNAs (miRNAs) regulate diverse biological processes by repressing mRNAs, but their modest effects on direct targets, together with their participation in larger regulatory networks, make it challenging to delineate miRNA-mediated effects. Here, we describe an approach to characterizing miRNA-regulatory networks by systematically profiling transcriptional, post-transcriptional and epigenetic activity in a pair of isogenic murine fibroblast cell lines with and without Dicer expression. By RNA sequencing (RNA-seq) and CLIP (crosslinking followed by immunoprecipitation) sequencing (CLIP-seq), we found that most of the changes induced by global miRNA loss occur at the level of transcription. We then introduced a network modeling approach that integrated these data with epigenetic data to identify specific miRNA-regulated transcription factors that explain the impact of miRNA perturbation on gene expression. In total, we demonstrate that combining multiple genome-wide datasets spanning diverse regulatory modes enables accurate delineation of the downstream miRNA-regulated transcriptional network and establishes a model for studying similar networks in other systems.

  10. 77 FR 38795 - Dolores Water Conservancy District; Notice of Competing Preliminary Permit Application Accepted...

    Science.gov (United States)

    2012-06-29

    ... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Project No. 14426-000] Dolores Water... Comments and Motions To Intervene On May 10, 2012, Dolores Water Conservancy District, Colorado, filed an... the Plateau Creek Pumped Storage Project to be located on Plateau Creek, near the town of Dolores...

  11. 77 FR 35377 - Dolores Water Conservancy District; Notice of Completing Preliminary Permit Application Accepted...

    Science.gov (United States)

    2012-06-13

    ... DEPARTMENT OF ENERGY Federal Energy Regulatory Commission [Project No. 14328-000] Dolores Water... Comments and Motions To Intervene On May 10, 2012, Dolores Water Conservancy District, Colorado, filed an... the Plateau Creek Pumped Storage Project to be located on Plateau Creek, near the town of Dolores...

  12. Elements in the transcriptional regulatory region flanking herpes simplex virus type 1 oriS stimulate origin function.

    Science.gov (United States)

    Wong, S W; Schaffer, P A

    1991-05-01

    Like other DNA-containing viruses, the three origins of herpes simplex virus type 1 (HSV-1) DNA replication are flanked by sequences containing transcriptional regulatory elements. In a transient plasmid replication assay, deletion of sequences comprising the transcriptional regulatory elements of ICP4 and ICP22/47, which flank oriS, resulted in a greater than 80-fold decrease in origin function compared with a plasmid, pOS-822, which retains these sequences. In an effort to identify specific cis-acting elements responsible for this effect, we conducted systematic deletion analysis of the flanking region with plasmid pOS-822 and tested the resulting mutant plasmids for origin function. Stimulation by cis-acting elements was shown to be both distance and orientation dependent, as changes in either parameter resulted in a decrease in oriS function. Additional evidence for the stimulatory effect of flanking sequences on origin function was demonstrated by replacement of these sequences with the cytomegalovirus immediate-early promoter, resulting in nearly wild-type levels of oriS function. In competition experiments, cotransfection of cells with the test plasmid, pOS-822, and increasing molar concentrations of a competitor plasmid which contained the ICP4 and ICP22/47 transcriptional regulatory regions but lacked core origin sequences resulted in a significant reduction in the replication efficiency of pOS-822, demonstrating that factors which bind specifically to the oriS-flanking sequences are likely involved as auxiliary proteins in oriS function. Together, these studies demonstrate that trans-acting factors and the sites to which they bind play a critical role in the efficiency of HSV-1 DNA replication from oriS in transient-replication assays.

  13. Feather development genes and associated regulatory innovation predate the origin of Dinosauria.

    Science.gov (United States)

    Lowe, Craig B; Clarke, Julia A; Baker, Allan J; Haussler, David; Edwards, Scott V

    2015-01-01

    The evolution of avian feathers has recently been illuminated by fossils and the identification of genes involved in feather patterning and morphogenesis. However, molecular studies have focused mainly on protein-coding genes. Using comparative genomics and more than 600,000 conserved regulatory elements, we show that patterns of genome evolution in the vicinity of feather genes are consistent with a major role for regulatory innovation in the evolution of feathers. Rates of innovation at feather regulatory elements exhibit an extended period of innovation with peaks in the ancestors of amniotes and archosaurs. We estimate that 86% of such regulatory elements and 100% of the nonkeratin feather gene set were present prior to the origin of Dinosauria. On the branch leading to modern birds, we detect a strong signal of regulatory innovation near insulin-like growth factor binding protein (IGFBP) 2 and IGFBP5, which have roles in body size reduction, and may represent a genomic signature for the miniaturization of dinosaurian body size preceding the origin of flight. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  14. CLONING AND SEQUENCING OF PGIP FROM ‘JIN SERIES’ ALMOND (PRUNUS DULCIS)

    Directory of Open Access Journals (Sweden)

    Yuhu Han

    2015-12-01

    Specific primers synthesized according to conserved regions of the polygalacturonase inhibiting protein (PGIP) gene were used to amplify Prunus dulcis genomic DNA by polymerase chain reaction (PCR). Six bands (pgip1, pgip2, pgip3, pgip4, pgip5 and pgip6) of genes were obtained and cloned into PBS-T vector. According to the length of bands, 717bp, 864bp, 796bp were A1 (pgip1, pgip2, pgip3), A2 (pgip4), A4 (pgip5, pgip6), respectively. DNA sequences showed that the fragments taken together were the gene encoding PGIP. A2 and A3 contained two exons interrupted by one intron, which has GT-AG sequence. Its DNA and amino acid sequences were highly homologous to those from Prunus persica, Prunus salicina, Prunus americana and Prunus mume, respectively. A conserved lencinerial fragment exists in the derived protein sequence.

  15. Conservation and divergence of ADAM family proteins in the Xenopus genome

    Directory of Open Access Journals (Sweden)

    Shah Anoop

    2010-07-01

    Background: Members of the disintegrin metalloproteinase (ADAM) family play important roles in cellular and developmental processes through their functions as proteases and/or binding partners for other proteins. The amphibian Xenopus has long been used as a model for early vertebrate development, but genome-wide analyses for large gene families were not possible until the recent completion of the X. tropicalis genome sequence and the availability of large scale expression sequence tag (EST) databases. In this study we carried out a systematic analysis of the X. tropicalis genome and uncovered several interesting features of ADAM genes in this species. Results: Based on the X. tropicalis genome sequence and EST databases, we identified Xenopus orthologues of mammalian ADAMs and obtained full-length cDNA clones for these genes. The deduced protein sequences, synteny and exon-intron boundaries are conserved between most human and X. tropicalis orthologues. The alternative splicing patterns of certain Xenopus ADAM genes, such as adams 22 and 28, are similar to those of their mammalian orthologues. However, we were unable to identify an orthologue for ADAM7 or 8. The Xenopus orthologue of ADAM15, an active metalloproteinase in mammals, does not contain the conserved zinc-binding motif and is hence considered proteolytically inactive. We also found evidence for gain of ADAM genes in Xenopus as compared to other species. There is a homologue of ADAM10 in Xenopus that is missing in most mammals. Furthermore, a single scaffold of X. tropicalis genome contains four genes encoding ADAM28 homologues, suggesting genome duplication in this region. Conclusions: Our genome-wide analysis of ADAM genes in X. tropicalis revealed both conservation and evolutionary divergence of these genes in this amphibian species. On the one hand, all ADAMs implicated in normal development and health in other species are conserved in X. tropicalis. On the other hand, some

  16. Using genomic information to conserve genetic diversity in livestock

    NARCIS (Netherlands)

    Eynard, Sonia E.

    2018-01-01

    Concern about the status of livestock breeds and their conservation has increased as selection and small population sizes caused loss of genetic diversity. Meanwhile, dense SNP chips and whole genome sequences (WGS) became available, providing opportunities to accurately quantify the impact of

  17. Predicting tissue specific cis-regulatory modules in the human genome using pairs of co-occurring motifs

    Directory of Open Access Journals (Sweden)

    Girgis Hani Z

    2012-02-01

    Background: Researchers seeking to unlock the genetic basis of human physiology and diseases have been studying gene transcription regulation. The temporal and spatial patterns of gene expression are controlled by mainly non-coding elements known as cis-regulatory modules (CRMs) and epigenetic factors. CRMs modulating related genes share the regulatory signature which consists of transcription factor (TF) binding sites (TFBSs). Identifying such CRMs is a challenging problem due to the prohibitive number of sequence sets that need to be analyzed. Results: We formulated the challenge as a supervised classification problem even though experimentally validated CRMs were not required. Our efforts resulted in a software system named CrmMiner. The system mines for CRMs in the vicinity of related genes. CrmMiner requires two sets of sequences: a mixed set and a control set. Sequences in the vicinity of the related genes comprise the mixed set, whereas the control set includes random genomic sequences. CrmMiner assumes that a large percentage of the mixed set is made of background sequences that do not include CRMs. The system identifies pairs of closely located motifs representing vertebrate TFBSs that are enriched in the training mixed set consisting of 50% of the gene loci. In addition, CrmMiner selects a group of the enriched pairs to represent the tissue-specific regulatory signature. The mixed and the control sets are searched for candidate sequences that include any of the selected pairs. Next, an optimal Bayesian classifier is used to distinguish candidates found in the mixed set from their control counterparts. Our study proposes 62 tissue-specific regulatory signatures and putative CRMs for different human tissues and cell types. These signatures consist of assortments of ubiquitously expressed TFs and tissue-specific TFs. Under controlled settings, CrmMiner identified known CRMs in noisy sets up to 1:25 signal-to-noise ratio. CrmMiner was
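
    The pair-enrichment step described in this abstract can be illustrated with a minimal sketch. It is not CrmMiner itself: it only counts how often two distinct motifs occur close together in a mixed set versus a control set, and it omits the Bayesian classification step. The function names, the exact-string motif matching, and the `max_gap` threshold are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def motif_hits(seq, motifs):
    """Return sorted (position, motif) tuples for exact, possibly overlapping matches."""
    hits = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((start, motif))
            start = seq.find(motif, start + 1)
    return sorted(hits)


def pairs_in_sequence(seq, motifs, max_gap=50):
    """Unordered pairs of distinct motifs whose start positions lie within max_gap bases."""
    found = set()
    for (p1, m1), (p2, m2) in combinations(motif_hits(seq, motifs), 2):
        if m1 != m2 and abs(p2 - p1) <= max_gap:
            found.add(tuple(sorted((m1, m2))))
    return found


def pair_enrichment(mixed, control, motifs, max_gap=50):
    """Map each pair seen in the mixed set to (fraction of mixed, fraction of control)."""
    def fractions(seqs):
        counts = Counter()
        for seq in seqs:
            counts.update(pairs_in_sequence(seq, motifs, max_gap))
        return {pair: n / len(seqs) for pair, n in counts.items()}

    in_mixed, in_control = fractions(mixed), fractions(control)
    return {pair: (f, in_control.get(pair, 0.0)) for pair, f in in_mixed.items()}
```

    A pair whose mixed-set fraction clearly exceeds its control-set fraction would be a candidate for a regulatory signature; a real analysis would use position weight matrices and a statistical test rather than exact matches and raw fractions.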

  18. Characterizing leader sequences of CRISPR loci

    DEFF Research Database (Denmark)

    Alkhnbashi, Omer; Shah, Shiraz Ali; Garrett, Roger Antony

    2016-01-01

    The CRISPR-Cas system is an adaptive immune system in many archaea and bacteria, which provides resistance against invading genetic elements. The first phase of CRISPR-Cas immunity is called adaptation, in which small DNA fragments are excised from genetic elements and are inserted into a CRISPR...... array generally adjacent to its so called leader sequence at one end of the array. It has been shown that transcription initiation and adaptation signals of the CRISPR array are located within the leader. However, apart from promoters, there is very little knowledge of sequence or structural motifs...... sequences by focusing on the consensus repeat of the adjacent CRISPR array and weak upstream conservation signals. We applied our tool to the analysis of a comprehensive genomic database and identified several characteristic properties of leader sequences specific to archaea and bacteria, ranging from...

  19. Cross-species conservation of endocrine pathways provides a basis for reevaluation of EDSP tiered testing paradigm

    Science.gov (United States)

    Many structural and functional aspects of the vertebrate hypothalamic-pituitary-gonadal (HPG) axis are known to be highly conserved, but the relative significance of this from a regulatory toxicology perspective has received comparatively little attention. High-quality data gene...

  20. Identification and characterization of microRNAs from peanut (Arachis hypogaea L.) by high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Xiaoyuan Chi

    BACKGROUND: MicroRNAs (miRNAs) are noncoding RNAs of approximately 21 nt that regulate gene expression in plants post-transcriptionally by endonucleolytic cleavage or translational inhibition. miRNAs play essential roles in numerous developmental and physiological processes and many of them are conserved across species. Extensive studies of miRNAs have been done in a few model plants; however, less is known about the diversity of these regulatory RNAs in peanut (Arachis hypogaea L.), one of the most important oilseed crops cultivated worldwide. RESULTS: A library of small RNA from peanut was constructed for deep sequencing. In addition to 126 known miRNAs from 33 families, 25 novel peanut miRNAs were identified. The miRNA* sequences of four novel miRNAs were discovered, providing additional evidence for the existence of miRNAs. Twenty of the novel miRNAs were considered to be species-specific because no homolog has been found for other plant species. qRT-PCR was used to analyze the expression of seven miRNAs in different tissues and in seed at different developmental stages and some showed tissue- and/or growth stage-specific expression. Furthermore, potential targets of these putative miRNAs were predicted on the basis of the sequence homology search. CONCLUSIONS: We have identified large numbers of miRNAs and their related target genes through deep sequencing of a small RNA library. This study of the identification and characterization of miRNAs in peanut can initiate further study on peanut miRNA regulation mechanisms, and help toward a greater understanding of the important roles of miRNAs in peanut.

  1. Temporary wetlands: Challenges and solutions to conserving a ‘disappearing’ ecosystem

    Science.gov (United States)

    Calhoun, Aram J.K.; Mushet, David M.; Bell, Kathleen P.; Boix, Dani; Fitzsimons, James A.; Isselin-Nondedeu, Francis

    2017-01-01

    Frequent drying of ponded water, and support of unique, highly specialized assemblages of often rare species, characterize temporary wetlands, such as vernal pools, gilgais, and prairie potholes. As small aquatic features embedded in a terrestrial landscape, temporary wetlands enhance biodiversity and provide aesthetic, biogeochemical, and hydrologic functions. Challenges to conserving temporary wetlands include the need to: (1) integrate freshwater and terrestrial biodiversity priorities; (2) conserve entire ‘pondscapes’ defined by connections to other aquatic and terrestrial systems; (3) maintain natural heterogeneity in environmental gradients across and within wetlands, especially gradients in hydroperiod; (4) address economic impact on landowners and developers; (5) act without complete inventories of these wetlands; and (6) work within limited or non-existent regulatory protections. Because temporary wetlands function as integral landscape components, not singly as isolated entities, their cumulative loss is ecologically detrimental yet not currently part of the conservation calculus. We highlight approaches that use strategies for conserving temporary wetlands in increasingly human-dominated landscapes that integrate top-down management and bottom-up collaborative approaches. Diverse conservation activities (including education, inventory, protection, sustainable management, and restoration) that reduce landowner and manager costs while achieving desired ecological objectives will have the greatest probability of success in meeting conservation goals.

  2. Identification of putative regulatory motifs in the upstream regions of co-expressed functional groups of genes in Plasmodium falciparum

    Directory of Open Access Journals (Sweden)

    Joshi NV

    2009-01-01

    Background: Regulation of gene expression in Plasmodium falciparum (Pf) remains poorly understood. While over half the genes are estimated to be regulated at the transcriptional level, few regulatory motifs and transcription regulators have been found. Results: The study seeks to identify putative regulatory motifs in the upstream regions of 13 functional groups of genes expressed in the intraerythrocytic developmental cycle of Pf. Three motif-discovery programs were used for the purpose, and motifs were searched for only on the gene coding strand. Four motifs – the 'G-rich', the 'C-rich', the 'TGTG' and the 'CACA' motifs – were identified, and zero to all four of these occur in the 13 sets of upstream regions. The 'CACA motif' was absent in functional groups expressed during the ring to early trophozoite transition. For functional groups expressed in each transition, the motifs tended to be similar. Upstream motifs in some functional groups showed 'positional conservation' by occurring at similar positions relative to the translational start site (TLS); this increases their significance as regulatory motifs. In the ribonucleotide synthesis, mitochondrial, proteasome and organellar translation machinery genes, G-rich, C-rich, CACA and TGTG motifs, respectively, occur with striking positional conservation. In the organellar translation machinery group, G-rich motifs occur close to the TLS. The same motifs were sometimes identified for multiple functional groups; differences in location and abundance of the motifs appear to ensure different modes of action. Conclusion: The identification of positionally conserved over-represented upstream motifs throws light on putative regulatory elements for transcription in Pf.

  3. Predicting effects of noncoding variants with deep learning-based sequence model.

    Science.gov (United States)

    Zhou, Jian; Troyanskaya, Olga G

    2015-10-01

    Identifying functional effects of noncoding variants is a major challenge in human genetics. To predict the noncoding-variant effects de novo from sequence, we developed a deep learning-based algorithmic framework, DeepSEA (http://deepsea.princeton.edu/), that directly learns a regulatory sequence code from large-scale chromatin-profiling data, enabling prediction of chromatin effects of sequence alterations with single-nucleotide sensitivity. We further used this capability to improve prioritization of functional variants including expression quantitative trait loci (eQTLs) and disease-associated variants.

  4. Pro-protein convertases control the maturation and processing of the iron-regulatory protein, RGMc/hemojuvelin

    Directory of Open Access Journals (Sweden)

    Rotwein Peter

    2008-04-01

    Background: Repulsive guidance molecule c (RGMc or hemojuvelin), a glycosylphosphatidylinositol-linked glycoprotein expressed in liver and striated muscle, plays a central role in systemic iron balance. Inactivating mutations in the RGMc gene cause juvenile hemochromatosis (JH), a rapidly progressing iron storage disorder with severe systemic manifestations. RGMc undergoes complex biosynthetic steps leading to membrane-bound and soluble forms of the protein, including both 50 and 40 kDa single-chain species. Results: We now show that pro-protein convertases (PC) are responsible for conversion of 50 kDa RGMc to a 40 kDa protein with a truncated COOH-terminus. Unlike related molecules RGMa and RGMb, RGMc encodes a conserved PC recognition and cleavage site, and JH-associated RGMc frame-shift mutants undergo COOH-terminal cleavage only if this site is present. A cell-impermeable peptide PC inhibitor blocks the appearance of 40 kDa RGMc in extra-cellular fluid, as does an engineered mutation in the conserved PC recognition sequence, while the PC furin cleaves 50 kDa RGMc in vitro into a 40 kDa molecule with an intact NH2-terminus. Iron loading reduces release of RGMc from the cell membrane, and diminishes accumulation of the 40 kDa species in cell culture medium. Conclusion: Our results define a role for PCs in the maturation of RGMc that may have implications for the physiological actions of this critical iron-regulatory protein.

  5. Repetitive Elements in Mycoplasma hyopneumoniae Transcriptional Regulation.

    Directory of Open Access Journals (Sweden)

    Amanda Malvessi Cattani

    Transcriptional regulation, a multiple-step process, is still poorly understood in the important pig pathogen Mycoplasma hyopneumoniae. Basic motifs like promoters and terminators have already been described, but no other cis-regulatory elements have been found. DNA repeat sequences have been shown to be an interesting potential source of cis-regulatory elements. In this work, a genome-wide search for tandem and palindromic repetitive elements was performed in the intergenic regions of all coding sequences from M. hyopneumoniae strain 7448. Computational analysis demonstrated the presence of 144 tandem repeats and 1,171 palindromic elements. The DNA repeat sequences were distributed within the 5' upstream regions of 86% of transcriptional units of M. hyopneumoniae strain 7448. Comparative analysis between distinct repetitive sequences found in related mycoplasma genomes demonstrated different percentages of conservation among pathogenic and nonpathogenic strains. qPCR assays revealed differential expression among genes showing variable numbers of repetitive elements. In addition, repeats found in 206 genes already described to be differentially regulated under different culture conditions of M. hyopneumoniae strain 232 showed almost 80% conservation in relation to M. hyopneumoniae strain 7448 repeats. Altogether, these findings suggest a potential regulatory role of tandem and palindromic DNA repeats in the M. hyopneumoniae transcriptional profile.

  6. Repetitive Elements in Mycoplasma hyopneumoniae Transcriptional Regulation.

    Science.gov (United States)

    Cattani, Amanda Malvessi; Siqueira, Franciele Maboni; Guedes, Rafael Lucas Muniz; Schrank, Irene Silveira

    2016-01-01

    Transcriptional regulation, a multiple-step process, is still poorly understood in the important pig pathogen Mycoplasma hyopneumoniae. Basic motifs like promoters and terminators have already been described, but no other cis-regulatory elements have been found. DNA repeat sequences have been shown to be an interesting potential source of cis-regulatory elements. In this work, a genome-wide search for tandem and palindromic repetitive elements was performed in the intergenic regions of all coding sequences from M. hyopneumoniae strain 7448. Computational analysis demonstrated the presence of 144 tandem repeats and 1,171 palindromic elements. The DNA repeat sequences were distributed within the 5' upstream regions of 86% of transcriptional units of M. hyopneumoniae strain 7448. Comparative analysis between distinct repetitive sequences found in related mycoplasma genomes demonstrated different percentages of conservation among pathogenic and nonpathogenic strains. qPCR assays revealed differential expression among genes showing variable numbers of repetitive elements. In addition, repeats found in 206 genes already described to be differentially regulated under different culture conditions of M. hyopneumoniae strain 232 showed almost 80% conservation in relation to M. hyopneumoniae strain 7448 repeats. Altogether, these findings suggest a potential regulatory role of tandem and palindromic DNA repeats in the M. hyopneumoniae transcriptional profile.
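
    As a rough illustration of what a search for tandem and palindromic elements involves, the sketch below scans a DNA string for substrings that equal their own reverse complement and for consecutive copies of a short unit. It is not the pipeline used in the study: the function names, the length limits, and the fixed `unit_len` are assumptions, and the input is assumed to contain only uppercase A, C, G and T.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_palindromes(seq, min_len=6, max_len=12):
    """(start, subsequence) for even-length substrings equal to their reverse complement."""
    found = []
    for length in range(min_len, max_len + 1, 2):
        for i in range(len(seq) - length + 1):
            sub = seq[i:i + length]
            if sub == reverse_complement(sub):
                found.append((i, sub))
    return found


def find_tandem_repeats(seq, unit_len=3, min_copies=3):
    """(start, unit, copies) for runs of at least min_copies adjacent copies of a unit."""
    found = []
    i = 0
    while i + unit_len <= len(seq):
        unit = seq[i:i + unit_len]
        copies = 1
        # Extend the run while the next block of unit_len bases repeats the unit.
        while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
            copies += 1
        if copies >= min_copies:
            found.append((i, unit, copies))
            i += copies * unit_len  # skip past the run just reported
        else:
            i += 1
    return found
```

    For example, in the toy sequence TTGAATTCAACAGCAGCAGTT the palindrome GAATTC starts at index 2 and the unit CAG is repeated three times from index 10. Restricting the scan to intergenic regions, as the abstract describes, would additionally require gene coordinates.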

  7. Evidence for multiple major histocompatibility class II X-box binding proteins.

    OpenAIRE

    Celada, A; Maki, R

    1989-01-01

    The X box is a loosely conserved DNA sequence that is located upstream of all major histocompatibility class II genes and is one of the cis-acting regulatory elements. Despite the similarity between all X-box sequences, each promoter-proximal X box in the mouse appears to bind a separate nuclear factor.

  8. Sequence and Genetic Characterization of etrA, an fnr Analog that Regulates Anaerobic Respiration in Shewanella putrefaciens MR-1

    Science.gov (United States)

    Saffarini, Daad A.; Nelson, Kenneth H.

    1993-01-01

    An electron transport regulatory gene, etrA, has been isolated and characterized from the obligate respiratory bacterium Shewanella putrefaciens MR-1. The deduced amino acid sequence of etrA (EtrA) shows a high degree of identity to both the Fnr of Escherichia coli (73.6%) and the analogous protein (ANR) of Pseudomonas aeruginosa (50.8%). The four active cysteine residues of Fnr are conserved in EtrA, and the amino acid sequences of the DNA-binding domains of the two proteins are identical. Further, S. putrefaciens etrA is able to complement an fnr mutant of E. coli. In contrast to fnr, there is no recognizable Fnr box upstream of the etrA sequence. Gene replacement etrA mutants of MR-1 were deficient in growth on nitrite, thiosulfate, sulfite, trimethylamine-N-oxide, dimethyl sulfoxide, Fe(III), and fumarate, suggesting that EtrA is involved in the regulation of the corresponding reductase genes. However, the mutants were all positive for reduction of and growth on nitrate and Mn(IV), indicating that EtrA is not involved in the regulation of these two systems. Southern blots of S. putrefaciens DNA with use of etrA as a probe revealed the expected etrA bands and a second set of hybridization signals whose genetic and functional properties remain to be determined.

  9. Mutation of miRNA target sequences during human evolution

    DEFF Research Database (Denmark)

    Gardner, Paul P; Vinther, Jeppe

    2008-01-01

    It has long been hypothesized that changes in non-protein-coding genes and the regulatory sequences controlling expression could undergo positive selection. Here we identify 402 putative microRNA (miRNA) target sequences that have been mutated specifically in the human lineage and show that genes...... containing such deletions are more highly expressed than their mouse orthologs. Our findings indicate that some miRNA target mutations are fixed by positive selection and might have been involved in the evolution of human-specific traits....

  10. Genomic sequencing of Pleistocene cave bears

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of approximately 1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  11. The cis-regulatory element CCACGTGG is involved in ABA and water-stress responses of the maize gene rab28.

    Science.gov (United States)

    Pla, M; Vilardell, J; Guiltinan, M J; Marcotte, W R; Niogret, M F; Quatrano, R S; Pagès, M

    1993-01-01

    The maize gene rab28 has been identified as ABA-inducible in embryos and vegetative tissues. It is also induced by water stress in young leaves. The proximal promoter region contains the conserved cis-acting element CCACGTGG (ABRE) reported for ABA induction in other plant genes. Transient expression assays in rice protoplasts indicate that a 134 bp fragment (-194 to -60 containing the ABRE) fused to a truncated cauliflower mosaic virus promoter (35S) is sufficient to confer ABA-responsiveness upon the GUS reporter gene. Gel retardation experiments indicate that nuclear proteins from tissues in which the rab28 gene is expressed can interact specifically with this 134 bp DNA fragment. Nuclear protein extracts from embryo and water-stressed leaves generate specific complexes of different electrophoretic mobility which are stable in the presence of detergent and high salt. However, by DMS footprinting the same guanine-specific contacts with the ABRE in both the embryo and leaf binding activities were detected. These results indicate that the rab28 promoter sequence CCACGTGG is a functional ABA-responsive element, and suggest that distinct regulatory factors with apparent similar affinity for the ABRE sequence may be involved in the hormone action during embryo development and in vegetative tissues subjected to osmotic stress.
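
    To make the promoter coordinates in this abstract concrete, the sketch below locates the ABRE sequence CCACGTGG in an upstream sequence and checks whether a hit falls inside the -194 to -60 fragment. The coordinate convention (the last base of the supplied upstream sequence is position -1), the function names, and the toy sequence in the usage note are assumptions; the actual rab28 promoter sequence is not given here.

```python
ABRE = "CCACGTGG"  # the ABA-responsive element named in the abstract


def upstream_positions(upstream_seq, motif=ABRE):
    """Start positions of motif, counted so that the last base of upstream_seq is -1."""
    n = len(upstream_seq)
    positions = []
    start = upstream_seq.find(motif)
    while start != -1:
        positions.append(start - n)
        start = upstream_seq.find(motif, start + 1)
    return positions


def in_window(pos, motif_len=len(ABRE), left=-194, right=-60):
    """True if a motif starting at pos lies entirely within [left, right]."""
    return left <= pos and pos + motif_len - 1 <= right
```

    With a made-up 200 bp sequence such as "A" * 50 + ABRE + "T" * 142, the motif starts at -150 and lies inside the window. Because CCACGTGG is its own reverse complement, scanning one strand is enough for this particular element.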

  12. "Transcriptomics": molecular diagnosis of inborn errors of metabolism via RNA-sequencing.

    Science.gov (United States)

    Kremer, Laura S; Wortmann, Saskia B; Prokisch, Holger

    2018-01-25

    Exome-wide sequencing techniques have revolutionized molecular diagnostics in patients with suspected inborn errors of metabolism or neuromuscular disorders. However, the diagnostic yield of 25-60% still leaves a large fraction of individuals without a diagnosis. This indicates a causative role for non-exonic regulatory variants not covered by whole exome sequencing. Here we review how systematic RNA-sequencing analysis (RNA-seq, "transcriptomics") led to a molecular diagnosis in 10-35% of patients in whom whole exome sequencing failed to do so. Importantly, RNA-sequencing based discoveries can not only guide molecular diagnosis but might also reveal therapeutic intervention points such as antisense oligonucleotide treatment for splicing defects, as recently reported for spinal muscular atrophy.

  13. Novel nonphosphorylated peptides with conserved sequences selectively bind to Grb7 SH2 domain with affinity comparable to its phosphorylated ligand.

    Directory of Open Access Journals (Sweden)

    Dan Zhang

    The Grb7 (growth factor receptor-bound 7) protein, a member of the Grb7 protein family, is highly expressed in metastatic tumors such as breast cancer, esophageal cancer and liver cancer. The src-homology 2 (SH2) domain in the C-terminus is reported to be mainly involved in Grb7 signaling pathways. Using a random peptide library, we identified a series of Grb7 SH2 domain-binding nonphosphorylated peptides in the yeast two-hybrid system. These peptides have a conserved GIPT/K/N sequence at the N-terminus and G/WD/IP at the C-terminus, and the region between the N- and C-termini contains fifteen amino acids enriched in serines, threonines and prolines. The association between the nonphosphorylated peptides and the Grb7 SH2 domain occurred in vitro and ex vivo. When competing for binding to the Grb7 SH2 domain in a complex, one synthesized nonphosphorylated ligand, containing the twenty-two amino acid motif sequence, showed at least comparable affinity to the phosphorylated ligand of ErbB3 in vitro, and its overexpression inhibited the proliferation of SK-BR-3 cells. Such nonphosphorylated peptides may be useful for rational design of drugs targeted against cancers that express high levels of Grb7 protein.

  14. Final Regulatory Determination for Special Wastes From Mineral Processing (Mining Waste Exclusion) - Federal Register Notice, June 13, 1991

    Science.gov (United States)

    This action presents the Agency's final regulatory determination required by section 3001(b)(3)(C) of the Resource Conservation and Recovery Act (RCRA) for 20 special wastes from the processing of ores and minerals.

  15. Genomic sequence around butterfly wing development genes: annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Inês C Conceição

    BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL FINDINGS: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). CONCLUSIONS: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: (1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and (2) the high

  16. GntR family of regulators in Mycobacterium smegmatis: a sequence and structure based characterization

    Directory of Open Access Journals (Sweden)

    Ranjan Akash

    2007-08-01

    Background: Mycobacterium smegmatis is a fast-growing, non-pathogenic mycobacterium. This organism has been widely used as a model to study the biology of virulent and extremely slow-growing species such as Mycobacterium tuberculosis. Based on the homology of the N-terminal DNA binding domain, the recently sequenced genome of M. smegmatis has been shown to possess several putative GntR regulators. A striking characteristic of this family of regulators is that they possess a conserved N-terminal DNA binding domain and a diverse C-terminal domain involved in effector binding and/or oligomerization. Since the physiological role of these regulators is critically dependent upon effector binding and operator sites, we have analysed and classified these regulators into their specific subfamilies and identified their potential binding sites. Results: The sequence analysis of M. smegmatis putative GntRs has revealed that FadR, HutC, MocR and the YtrA-like regulators are encoded by 45, 8, 8 and 1 genes respectively. Further, out of 45 FadR-like regulators, 19 were classified into the FadR group and 26 into the VanR group. All these proteins showed similar secondary structural elements specific to their respective subfamilies except MSMEG_3959, which showed additional secondary structural elements. Using reciprocal BLAST searches, we further identified the orthologs of these regulators in Bacillus subtilis and other mycobacteria. Since the expression of many regulators is auto-regulatory, we have identified potential operator sites for a number of these GntR regulators by analyzing the upstream sequences. Conclusion: This study helps in extending the annotation of M. smegmatis GntR proteins. It identifies the GntR regulators of M. smegmatis that could serve as a model for studying orthologous regulators from virulent as well as other saprophytic mycobacteria. This study also sheds some light on the nucleotide preferences in the

  17. Evolutionary conservation of essential and highly expressed genes in Pseudomonas aeruginosa

    Directory of Open Access Journals (Sweden)

    Scharfe Maren

    2010-04-01

    Background: The constant increase in development and spread of bacterial resistance to antibiotics poses a serious threat to human health. New sequencing technologies are now on the horizon that will yield massive increases in our capacity for DNA sequencing and will revolutionize the drug discovery process. Since essential genes are promising novel antibiotic targets, the prediction of gene essentiality based on genomic information has become a major focus. Results: In this study we demonstrate that pooled sequencing is applicable for the analysis of sequence variations of strain collections with more than 10 individual isolates. Pooled sequencing of 36 clinical Pseudomonas aeruginosa isolates revealed that essential and highly expressed proteins evolve at lower rates, whereas extracellular proteins evolve at higher rates. We furthermore refined the list of experimentally essential P. aeruginosa genes, and identified 980 genes that show no sequence variation at all. Among the conserved nonessential genes we found several that are involved in regulation, motility and virulence, indicating that they represent factors of evolutionary importance for the lifestyle of a successful environmental bacterium and opportunistic pathogen. Conclusion: The detailed analysis of a comprehensive set of P. aeruginosa genomes in this study provided detailed information on the genomic makeup and revealed a large set of highly conserved genes that play an important role in the lifestyle of this microorganism. Sequencing strain collections enables detailed and extensive identification of sequence variations as potential bacterial adaptation processes, e.g., during the development of antibiotic resistance in the clinical setting, and thus may be the basis for uncovering putative targets for novel treatment strategies.

  18. Whole-Genome de novo Sequencing Of Quail And Grey Partridge

    DEFF Research Database (Denmark)

    Holm, Lars-Erik; Panitz, Frank; Burt, Dave

    2011-01-01

    Developments in sequencing methods have made it possible to perform whole-genome de novo sequencing of species without large commercial interests. Within the EU-financed QUANTOMICS project (KBBE-2A-222664), we have performed de novo sequencing of quail (Coturnix coturnix) and grey partridge...... (Perdix perdix) on a Genome Analyzer GAII (Illumina) using paired-end sequencing. The generated sequence amounts to 8 to 9 Gb for each species. The analysis and assembly of the generated sequences is ongoing. Access to the whole genome sequence from these two species will enable enhanced...... comparative studies towards the chicken genome and will aid in identifying evolutionarily conserved sequences within the Galliformes. The obtained sequences from quail and partridge represent a first step towards generating the whole genome sequence for these species. The continuation of establishing the genome...

  19. Whole-Genome Sequencing and Variant Analysis of Human Papillomavirus 16 Infections.

    Science.gov (United States)

    van der Weele, Pascal; Meijer, Chris J L M; King, Audrey J

    2017-10-01

    Human papillomavirus (HPV) is a strongly conserved DNA virus, high-risk types of which can cause cervical cancer in persistent infections. The most common type found in HPV-attributable cancer is HPV16, which can be subdivided into four lineages (A to D) with different carcinogenic properties. Studies have shown HPV16 sequence diversity in different geographical areas, but only limited information is available regarding HPV16 diversity within a population, especially at the whole-genome level. We analyzed HPV16 major variant diversity and conservation in persistent infections and performed a single nucleotide polymorphism (SNP) comparison between persistent and clearing infections. Materials were obtained in the Netherlands from a cohort study with longitudinal follow-up for up to 3 years. Our analysis shows a remarkably large variant diversity in the population. Whole-genome sequences were obtained for 57 persistent and 59 clearing HPV16 infections, resulting in 109 unique variants. Interestingly, persistent infections were completely conserved through time. One reinfection event was identified where the initial and follow-up samples clustered differently. Non-A1/A2 variants seemed to clear preferentially (P = 0.02). Our analysis shows that population-wide HPV16 sequence diversity is very large. In persistent infections, the HPV16 sequence was fully conserved. Sequencing can identify HPV16 reinfections, although occurrence is rare. SNP comparison identified no strongly acting effect of the viral genome affecting HPV16 infection clearance or persistence in up to 3 years of follow-up. These findings suggest the progression of an early HPV16 infection could be host related. IMPORTANCE: Human papillomavirus 16 (HPV16) is the predominant type found in cervical cancer. Progression of initial infection to cervical cancer has been linked to sequence properties; however, knowledge of variants circulating in European populations, especially with longitudinal follow-up, is

  20. Comparative analysis of the full genome sequence of European bat lyssavirus type 1 and type 2 with other lyssaviruses and evidence for a conserved transcription termination and polyadenylation motif in the G-L 3' non-translated region.

    Science.gov (United States)

    Marston, D A; McElhinney, L M; Johnson, N; Müller, T; Conzelmann, K K; Tordo, N; Fooks, A R

    2007-04-01

    We report the first full-length genomic sequences for European bat lyssavirus type-1 (EBLV-1) and type-2 (EBLV-2). The EBLV-1 genomic sequence was derived from a virus isolated from a serotine bat in Hamburg, Germany, in 1968 and the EBLV-2 sequence was derived from a virus isolate from a human case of rabies that occurred in Scotland in 2002. A long-distance PCR strategy was used to amplify the open reading frames (ORFs), followed by standard and modified RACE (rapid amplification of cDNA ends) techniques to amplify the 3' and 5' ends. The lengths of each complete viral genome for EBLV-1 and EBLV-2 were 11 966 and 11 930 base pairs, respectively, and follow the standard rhabdovirus genome organization of five viral proteins. Comparison with other lyssavirus sequences demonstrates variation in degrees of homology, with the genomic termini showing a high degree of complementarity. The nucleoprotein was the most conserved, both intra- and intergenotypically, followed by the polymerase (L), matrix proteins and glycoproteins, with the phosphoprotein being the most variable. In addition, we have shown that the two EBLVs utilize a conserved transcription termination and polyadenylation (TTP) motif, approximately 50 nt upstream of the L gene start codon. All available lyssavirus sequences to date, with the exception of Pasteur virus (PV) and PV-derived isolates, use the second TTP site. This observation may explain differences in pathogenicity between lyssavirus strains, dependent on the length of the untranslated region, which might affect transcriptional activity and RNA stability.