WorldWideScience

Sample records for ancient gene network

  1. The PNarec method for detection of ancient recombinations through phylogenetic network analysis.

    Science.gov (United States)

    Saitou, Naruya; Kitano, Takashi

    2013-02-01

    Recombinations are known to disrupt the bifurcating tree structure of gene genealogies. Although recent recombinations are easily detected by conventional methods, recombinations may have occurred at any time. We devised a new method for detecting ancient recombinations through phylogenetic network analysis, and detected five ancient recombinations in gibbon ABO blood group genes [Kitano et al., 2009. Mol. Phylogenet. Evol., 51, 465-471]. We present applications of this method, now named "PNarec", to various virus sequences as well as HLA genes.

  2. Fire usage and ancient hominin detoxification genes

    NARCIS (Netherlands)

    Aarts, Jac M.M.J.G.; Alink, Gerrit M.; Scherjon, Fulco; MacDonald, Katharine; Smith, Alison C.; Nijveen, Harm; Roebroeks, Wil

    2016-01-01

    Studies of the defence capacity of ancient hominins against toxic substances may contribute importantly to the reconstruction of their niche, including their diets and use of fire. Fire usage implies frequent exposure to hazardous compounds from smoke and heated food, known to affect general heal

  3. Examining Ancient Inter-domain Horizontal Gene Transfer

    Directory of Open Access Journals (Sweden)

    Francisca C. Almeida

    2008-01-01

    Details of the genomic changes that occurred in the ancestors of Eukarya, Archaea and Bacteria are elusive. Ancient interdomain horizontal gene transfer (IDHGT) amongst the ancestors of these three domains has been difficult to detect and analyze because of the extreme degree of divergence of genes in these three domains and because most evidence for such events is poorly supported. In addition, many researchers have suggested that the prevalence of IDHGT events early in the evolution of life would most likely obscure the patterns of divergence of major groups of organisms, let alone allow the tracking of horizontal transfer at this level. In order to approach this problem, we mined the E. coli genome for genes with distinct paralogs. Using the 1,268 E. coli K-12 genes with 40% or higher similarity level to a paralog elsewhere in the E. coli genome, we detected 95 genes found exclusively in Bacteria and Archaea and 86 genes found in Bacteria and Eukarya. These genes form the basis for our analysis of IDHGT. We also applied a newly developed statistical test (the node height test) to examine the robustness of these inferences and to corroborate the phylogenetically identified cases of ancient IDHGT. Our results suggest that ancient interdomain HGT is restricted to special cases, mostly involving symbiosis in eukaryotes and specific adaptations in prokaryotes. Only three genes in the Bacteria + Eukarya class (deoxyxylulose-5-phosphate synthase (DXPS), fructose 1,6-phosphate aldolase class II protein and glucosamine-6-phosphate deaminase) and three genes in the Bacteria + Archaea class (ABC-type Fe3+-siderophore transport system, ferrous iron transport protein B, and dipeptide transport protein) showed evidence of ancient IDHGT. However, we conclude that robust estimates of IDHGT will be very difficult to obtain due to the methodological limitations and the extreme sequence saturation of the genes suspected of being involved in IDHGT.

  4. Characterization of an ancient lepidopteran lateral gene transfer.

    Directory of Open Access Journals (Sweden)

    David Wheeler

    Bacteria-to-eukaryote lateral gene transfers (LGT) are an important potential source of material for the evolution of novel genetic traits. The explosion in the number of newly sequenced genomes provides opportunities to identify and characterize examples of these lateral gene transfer events, and to assess their role in the evolution of new genes. In this paper, we describe an ancient lepidopteran LGT of a glycosyl hydrolase family 31 gene (GH31) from an Enterococcus bacterium. PCR amplification between the LGT and a flanking insect gene confirmed that the GH31 was integrated into the Bombyx mori genome and was not a result of an assembly error. Database searches in combination with degenerate PCR on a panel of 7 lepidopteran families confirmed that the GH31 LGT event occurred deep within the Order approximately 65-145 million years ago. The most basal species in which the LGT was found is Plutella xylostella (superfamily: Yponomeutoidea). Array data from Bombyx mori show that GH31 is expressed, and low dN/dS ratios indicate that the LGT coding sequence is under strong stabilizing selection. These findings provide further support for the proposition that bacterial LGTs are relatively common in insects and likely to be an underappreciated source of adaptive genetic material.

  5. Introduction: Cancer Gene Networks.

    Science.gov (United States)

    Clarke, Robert

    2017-01-01

    Constructing, evaluating, and interpreting gene networks generally sits within the broader field of systems biology, which continues to emerge rapidly, particularly with respect to its application to understanding the complexity of signaling in the context of cancer biology. For the purposes of this volume, we take a broad definition of systems biology. Considering an organism or disease within an organism as a system, systems biology is the study of the integrated and coordinated interactions of the network(s) of genes, their variants both natural and mutated (e.g., polymorphisms, rearrangements, alternate splicing, mutations), their proteins and isoforms, and the organic and inorganic molecules with which they interact, to execute the biochemical reactions (e.g., as enzymes, substrates, products) that reflect the function of that system. Central to systems biology, and perhaps the only approach that can effectively manage the complexity of such systems, is the building of quantitative multiscale predictive models. The predictions of the models can vary substantially depending on the nature of the model and its input-output relationships. For example, a model may predict the outcome of a specific molecular reaction(s), a cellular phenotype (e.g., alive, dead, growth arrest, proliferation, and motility), a change in the respective prevalence of cell or subpopulations, or a patient or patient subgroup outcome(s). Such models necessarily require computers. Computational modeling can be thought of as using machine learning and related tools to integrate the very high dimensional data generated from modern, high throughput omics technologies including genomics (next generation sequencing), transcriptomics (gene expression microarrays; RNAseq), metabolomics and proteomics (ultra high performance liquid chromatography, mass spectrometry), and "subomic" technologies to study the kinome, methylome, and others. Mathematical modeling can be thought of as the use of ordinary

  6. Ancient expansion of the hox cluster in lepidoptera generated four homeobox genes implicated in extra-embryonic tissue formation.

    Science.gov (United States)

    Ferguson, Laura; Marlétaz, Ferdinand; Carter, Jean-Michel; Taylor, William R; Gibbs, Melanie; Breuker, Casper J; Holland, Peter W H

    2014-10-01

    Gene duplications within the conserved Hox cluster are rare in animal evolution, but in Lepidoptera an array of divergent Hox-related genes (Shx genes) has been reported between pb and zen. Here, we use genome sequencing of five lepidopteran species (Polygonia c-album, Pararge aegeria, Callimorpha dominula, Cameraria ohridella, Hepialus sylvina) plus a caddisfly outgroup (Glyphotaelius pellucidus) to trace the evolution of the lepidopteran Shx genes. We demonstrate that Shx genes originated by tandem duplication of zen early in the evolution of the large clade Ditrysia; Shx genes are not found in a caddisfly or in a member of the basally diverging Hepialidae (swift moths). Four distinct Shx genes were generated early in ditrysian evolution, and were stably retained in all descendent Lepidoptera except the silkmoth, which has additional duplications. Despite extensive sequence divergence, molecular modelling indicates that all four Shx genes have the potential to encode stable homeodomains. The four Shx genes have distinct spatiotemporal expression patterns in early development of the Speckled Wood butterfly (Pararge aegeria), with ShxC demarcating the future sites of extraembryonic tissue formation via strikingly localised maternal RNA in the oocyte. All four genes are also expressed in presumptive serosal cells, prior to the onset of zen expression. Lepidopteran Shx genes represent an unusual example of Hox cluster expansion and integration of novel genes into ancient developmental regulatory networks.

  7. Intrinsic challenges in ancient microbiome reconstruction using 16S rRNA gene amplification.

    Science.gov (United States)

    Ziesemer, Kirsten A; Mann, Allison E; Sankaranarayanan, Krithivasan; Schroeder, Hannes; Ozga, Andrew T; Brandt, Bernd W; Zaura, Egija; Waters-Rist, Andrea; Hoogland, Menno; Salazar-García, Domingo C; Aldenderfer, Mark; Speller, Camilla; Hendy, Jessica; Weston, Darlene A; MacDonald, Sandy J; Thomas, Gavin H; Collins, Matthew J; Lewis, Cecil M; Hofman, Corinne; Warinner, Christina

    2015-11-13

    To date, characterization of ancient oral (dental calculus) and gut (coprolite) microbiota has been primarily accomplished through a metataxonomic approach involving targeted amplification of one or more variable regions in the 16S rRNA gene. Specifically, the V3 region (E. coli 341-534) of this gene has been suggested as an excellent candidate for ancient DNA amplification and microbial community reconstruction. However, in practice this metataxonomic approach often produces highly skewed taxonomic frequency data. In this study, we use non-targeted (shotgun metagenomics) sequencing methods to better understand skewed microbial profiles observed in four ancient dental calculus specimens previously analyzed by amplicon sequencing. Through comparisons of microbial taxonomic counts from paired amplicon (V3 U341F/534R) and shotgun sequencing datasets, we demonstrate that extensive length polymorphisms in the V3 region are a consistent and major cause of differential amplification leading to taxonomic bias in ancient microbiome reconstructions based on amplicon sequencing. We conclude that systematic amplification bias confounds attempts to accurately reconstruct microbiome taxonomic profiles from 16S rRNA V3 amplicon data generated using universal primers. Because in silico analysis indicates that alternative 16S rRNA hypervariable regions will present similar challenges, we advocate for the use of a shotgun metagenomics approach in ancient microbiome reconstructions.

  8. Intrinsic challenges in ancient microbiome reconstruction using 16S rRNA gene amplification

    NARCIS (Netherlands)

    Ziesemer, K.A.; Mann, A.E.; Sankaranarayanan, K.; Schroeder, H.; Ozga, A.T.; Brandt, B.W.; Zaura, E.; Waters-Rist, A.; Hoogland, M.; Salazar-García, D.C.; Aldenderfer, M.; Speller, C.; Hendy, J.; Weston, D.A.; MacDonald, S.J.; Thomas, G.H.; Collins, M.J.; Lewis, C.M.; Hofman, C.; Warinner, C.

    2015-01-01

    To date, characterization of ancient oral (dental calculus) and gut (coprolite) microbiota has been primarily accomplished through a metataxonomic approach involving targeted amplification of one or more variable regions in the 16S rRNA gene. Specifically, the V3 region (E. coli 341-534) of this gen

  9. Maths Meets Myths: Network Investigations of Ancient Narratives

    Science.gov (United States)

    Kenna, Ralph; Mac Carron, Pádraig

    2016-02-01

    Three years ago, we initiated a programme of research in which ideas and tools from statistical physics and network theory were applied to the field of comparative mythology. The eclecticism of the work, together with the perspectives it delivered, led to widespread media coverage and academic discussion. Here we review some aspects of the project, contextualised with a brief history of the long relationship between science and the humanities. We focus in particular on an Irish epic, summarising some of the outcomes of our quantitative investigation. We also describe the emergence of a new sub-discipline and our hopes for its future.

  10. Maths Meets Myths: Network Investigations of Ancient Narratives

    CERN Document Server

    Kenna, R

    2015-01-01

    Three years ago, we initiated a programme of research in which ideas and tools from statistical physics and network theory were applied to the field of comparative mythology. The eclecticism of the work, together with the perspectives it delivered, led to widespread media coverage and academic discussion. Here we review some aspects of the project, contextualised with a brief history of the long relationship between science and the humanities. We focus in particular on an Irish epic, summarising some of the outcomes of our quantitative investigation. We also describe the emergence of a new sub-discipline and our hopes for its future.

  11. Logistic growth for the Nuzi cuneiform tablets: Analyzing family networks in ancient Mesopotamia

    Science.gov (United States)

    Ueda, Sumie; Makino, Kumi; Itoh, Yoshiaki; Tsuchiya, Takashi

    2015-03-01

    We reconstruct the year in which each cuneiform tablet of the Nuzi society in ancient Mesopotamia was issued. The tablets record land transactions, marriages, loans, slavery contracts, etc. The number of tablets seems to increase by logistic growth. This may show the dynamics of the concentration of land and other property into a few powerful families over a period of about sixty years, with most of the tablets falling within about thirty years. We reconstruct family trees and social networks of Nuzi and estimate the years of the cuneiform tablets consistently with the trees and networks, formulating least squares problems with linear inequality constraints.
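
    The logistic growth mentioned in this abstract can be illustrated with a minimal sketch. The function name and the parameter values below (carrying capacity, growth rate, midpoint) are hypothetical placeholders chosen for illustration; they are not taken from the paper, and the paper's constrained least-squares estimation is not reproduced here.

```python
from math import exp

def logistic(t, capacity, rate, midpoint):
    """Standard logistic curve: cumulative count at time t.

    capacity -- upper limit the count approaches (hypothetical value below)
    rate     -- growth rate per year (hypothetical value below)
    midpoint -- year at which half of the capacity is reached
    """
    return capacity / (1.0 + exp(-rate * (t - midpoint)))

# Illustrative parameters only; none of these numbers come from the source.
CAPACITY, RATE, MIDPOINT = 1000.0, 0.25, 30.0

# Cumulative counts over a sixty-year window, sampled every ten years.
curve = [(year, logistic(year, CAPACITY, RATE, MIDPOINT)) for year in range(0, 61, 10)]
for year, count in curve:
    print(f"year {year:2d}: {count:7.1f}")
```

    With a curve of this shape, most of the increase is concentrated in a window around the midpoint, which is one way to read the abstract's remark that most tablets fall within about thirty of the sixty years.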

  12. Connecting Harbours. A comparison of traffic networks across ancient and medieval Europe

    CERN Document Server

    Preiser-Kapeller, Johannes

    2016-01-01

    Ancient and medieval harbours connected via navigable and terrestrial routes could be interpreted as elements of complex traffic networks. Based on evidence from three projects in Priority Programme 1630 (Fossa Carolina, Inland harbours in Central Europe and Byzantine harbours on the Balkan coasts) we present a pioneering study that applies concepts and tools of network theory to archaeological and written evidence and integrates these data into different network models. Our diachronic approach allows for an analysis of the temporal and spatial dynamics of webs of connectivity with a focus on the 1st millennium AD. The combination of case studies on various spatial scales as well as from regions of inland and maritime navigation (Central Europe and the seas around the Balkans, respectively) allows for the identification of structural similarities and differences between pre-modern traffic systems across Europe. The contribution is a first step towards further adaptations of tools of network analysis as a...

  13. Establishing the validity of domestication genes using DNA from ancient chickens.

    Science.gov (United States)

    Girdland Flink, Linus; Allen, Richard; Barnett, Ross; Malmström, Helena; Peters, Joris; Eriksson, Jonas; Andersson, Leif; Dobney, Keith; Larson, Greger

    2014-04-29

    Modern domestic plants and animals are subject to human-driven selection for desired phenotypic traits and behavior. Large-scale genetic studies of modern domestic populations and their wild relatives have revealed not only the genetic mechanisms underlying specific phenotypic traits, but also allowed for the identification of candidate domestication genes. Our understanding of the importance of these genes during the initial stages of the domestication process traditionally rests on the assumption that robust inferences about the past can be made on the basis of modern genetic datasets. A growing body of evidence from ancient DNA studies, however, has revealed that ancient and even historic populations often bear little resemblance to their modern counterparts. Here, we test the temporal context of selection on specific genetic loci known to differentiate modern domestic chickens from their extant wild ancestors. We extracted DNA from 80 ancient chickens excavated from 12 European archaeological sites, dated from ∼ 280 B.C. to the 18th century A.D. We targeted three unlinked genetic loci: the mitochondrial control region, a gene associated with yellow skin color (β-carotene dioxygenase 2), and a putative domestication gene thought to be linked to photoperiod and reproduction (thyroid-stimulating hormone receptor, TSHR). Our results reveal significant variability in both nuclear genes, suggesting that the commonality of yellow skin in Western breeds and the near fixation of TSHR in all modern chickens took place only in the past 500 y. In addition, mitochondrial variation has increased as a result of recent admixture with exotic breeds. We conclude by emphasizing the perils of inferring the past from modern genetic data alone.

  14. Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution.

    Science.gov (United States)

    Clarke, Thomas H; Garb, Jessica E; Hayashi, Cheryl Y; Arensburger, Peter; Ayoub, Nadia A

    2015-06-08

    The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae).

  15. Current approaches to gene regulatory network modelling

    Directory of Open Access Journals (Sweden)

    Brazma Alvis

    2007-09-01

    Many different approaches have been developed to model and simulate gene regulatory networks. We proposed the following categories for gene regulatory network models: network parts lists, network topology models, network control logic models, and dynamic models. Here we will describe some examples for each of these categories. We will study the topology of gene regulatory networks in yeast in more detail, comparing a direct network derived from transcription factor binding data and an indirect network derived from genome-wide expression data in mutants. Regarding the network dynamics we briefly describe discrete and continuous approaches to network modelling, then describe a hybrid model called Finite State Linear Model and demonstrate that some simple network dynamics can be simulated in this model.

  16. Ancient horizontal gene transfer from bacteria enhances biosynthetic capabilities of fungi.

    Directory of Open Access Journals (Sweden)

    Imke Schmitt

    BACKGROUND: Polyketides are natural products with a wide range of biological functions and pharmaceutical applications. Discovery and utilization of polyketides can be facilitated by understanding the evolutionary processes that gave rise to the biosynthetic machinery and the natural product potential of extant organisms. Gene duplication and subfunctionalization, as well as horizontal gene transfer, are proposed mechanisms in the evolution of biosynthetic gene clusters. To explain the amount of homology in some polyketide synthases in unrelated organisms such as bacteria and fungi, interkingdom horizontal gene transfer has been invoked as the most likely evolutionary scenario. However, the origin of the genes and the direction of the transfer remained elusive. METHODOLOGY/PRINCIPAL FINDINGS: We used comparative phylogenetics to infer the ancestor of a group of polyketide synthase genes involved in antibiotic and mycotoxin production. We aligned keto synthase domain sequences of all available fungal 6-methylsalicylic acid (6-MSA)-type PKSs and their closest bacterial relatives. To assess the role of symbiotic fungi in the evolution of this gene we generated 24 6-MSA synthase sequence tags from lichen-forming fungi. Our results support an ancient horizontal gene transfer event from an actinobacterial source into ascomycete fungi, followed by gene duplication. CONCLUSIONS/SIGNIFICANCE: Given that actinobacteria are unrivaled producers of biologically active compounds, such as antibiotics, it appears particularly promising to study biosynthetic genes of actinobacterial origin in fungi. The large number of 6-MSA-type PKS sequences found in lichen-forming fungi leads us to hypothesize that the evolution of typical lichen compounds, such as orsellinic acid derivatives, was facilitated by the gain of this bacterial polyketide synthase.

  17. Evolution of evolvability in gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Anton Crombach

    Gene regulatory networks are perhaps the most important organizational level in the cell where signals from the cell state and the outside environment are integrated in terms of activation and inhibition of genes. For the last decade, the study of such networks has been fueled by large-scale experiments and renewed attention from the theoretical field. Different models have been proposed to, for instance, investigate expression dynamics, explain the network topology we observe in bacteria and yeast, and analyze the evolvability and robustness of such networks. Yet how these gene regulatory networks evolve and become evolvable remains an open question. An individual-oriented evolutionary model is used to shed light on this matter. Each individual has a genome from which its gene regulatory network is derived. Mutations, such as gene duplications and deletions, alter the genome, while the resulting network determines the gene expression pattern and hence fitness. With this protocol we let a population of individuals evolve under Darwinian selection in an environment that changes through time. Our work demonstrates that long-term evolution of complex gene regulatory networks in a changing environment can lead to a striking increase in the efficiency of generating beneficial mutations. We show that the population evolves towards genotype-phenotype mappings that allow for an orchestrated network-wide change in the gene expression pattern, requiring only a few specific gene indels. The genes involved are hubs of the networks, or directly influence the hubs. Moreover, throughout the evolutionary trajectory the networks maintain their mutational robustness. In other words, evolution in an alternating environment leads to a network that is sensitive to a small class of beneficial mutations, while the majority of mutations remain neutral: an example of evolution of evolvability.

  18. Inferring latent gene regulatory network kinetics

    NARCIS (Netherlands)

    González, Javier; Vujačić, Ivan; Wit, Ernst

    2013-01-01

    Regulatory networks consist of genes encoding transcription factors (TFs) and the genes they activate or repress. Various types of systems of ordinary differential equations (ODE) have been proposed to model these networks, ranging from linear to Michaelis-Menten approaches. In practice, a serious d

  19. The Drosophila melanogaster methuselah gene: a novel gene with ancient functions.

    Directory of Open Access Journals (Sweden)

    Ana Rita Araújo

    The Drosophila melanogaster G protein-coupled receptor gene, methuselah (mth), has been described as a novel gene that is less than 10 million years old. Nevertheless, it shows a highly specific expression pattern in embryos, larvae, and adults, and has been implicated in larval development, stress resistance, and in the setting of adult lifespan, among others. Although mth belongs to a gene subfamily with 16 members in D. melanogaster, there is no evidence for functional redundancy in this subfamily. Therefore, it is surprising that a novel gene influences so many traits. Here, we explore the alternative hypothesis that mth is an old gene. Under this hypothesis, in species distantly related to D. melanogaster, there should be a gene with features similar to those of mth. By performing detailed phylogenetic, synteny, protein structure, and gene expression analyses we show that the D. virilis GJ12490 gene is the ortholog of mth in species distantly related to D. melanogaster. We also show that, in D. americana (a species of the virilis group of Drosophila), a common amino acid polymorphism at the GJ12490 orthologous gene is significantly associated with developmental time, size, and lifespan differences. Our results imply that GJ12490 orthologous genes are candidates for developmental time and lifespan differences in Drosophila in general.

  20. Artificial Neural Network Model for Discrimination of Stability of Ancient Landslide in Impounding Area of Three Gorges Project, China

    Institute of Scientific and Technical Information of China (English)

    Zhou Pinggen

    2003-01-01

    Geomorphology, geological setting, the effect of groundwater and dynamic environmental factors (e.g., rainfall and artificial water recharge) should be integrated when discriminating the stability of an ancient landslide. After the criterion of landslide stability had been studied, an artificial neural network model was applied to discriminate the stability of the ancient landslide in the impounding area of the Three Gorges project on the Yangtze River, China. The model can adaptively identify and integrate complex qualitative and quantitative factors. The results of the artificial neural network model agree well with those obtained by classical limit equilibrium analysis (the Bishop method and Janbu method) and by other comprehensive discrimination methods.

  1. Modeling of hysteresis in gene regulatory networks.

    Science.gov (United States)

    Hu, J; Qin, K R; Xiang, C; Lee, T H

    2012-08-01

    Hysteresis, observed in many gene regulatory networks, has a pivotal impact on biological systems, enhancing the robustness of cell functions. In this paper, a general model is proposed to describe the hysteretic gene regulatory network by combining the hysteresis component and the transient dynamics. The Bouc-Wen hysteresis model is modified to describe the hysteresis component in mammalian gene regulatory networks. Rigorous mathematical analysis of the dynamical properties of the model is presented to ensure bounded-input-bounded-output (BIBO) stability, and demonstrates that the original Bouc-Wen model can only generate a clockwise hysteresis loop while the modified model can describe both clockwise and counterclockwise hysteresis loops. Simulation studies have shown that the hysteresis loops from our model are consistent with the experimental observations in three mammalian gene regulatory networks and two E. coli gene regulatory networks, which demonstrates the ability and accuracy of the mathematical model to emulate natural gene expression behavior with hysteresis. A comparison study has also been conducted to show that this model fits the experimental data significantly better than previous ones in the literature. The successful modeling of the hysteresis in all five hysteretic gene regulatory networks suggests that the new model has the potential to be a unified framework for modeling hysteresis in gene regulatory networks and to provide a better understanding of the general mechanism that drives the hysteretic function.

  2. Inference of Gene Regulatory Network Based on Local Bayesian Networks.

    Science.gov (United States)

    Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Wei, Ze-Gang; Chen, Luonan

    2016-08-01

    The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in E. coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduce
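
    The conditional mutual information (CMI) step named in this abstract can be sketched for discretized data. This is a plug-in estimate from observed frequencies, offered only as an illustration of the quantity I(X;Y|Z); the abstract does not say which CMI estimator LBN uses, so the function below (a hypothetical name) should not be read as the authors' implementation.

```python
from collections import Counter
from math import log

def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in nats from three equal-length
    sequences of discrete observations (assumption: data are discretized)."""
    n = len(x)
    count_xyz = Counter(zip(x, y, z))
    count_xz = Counter(zip(x, z))
    count_yz = Counter(zip(y, z))
    count_z = Counter(z)
    total = 0.0
    for (xi, yi, zi), c in count_xyz.items():
        # p(x,y,z) * log[ p(x,y,z) p(z) / (p(x,z) p(y,z)) ], with the 1/n factors cancelled
        total += (c / n) * log(c * count_z[zi] / (count_xz[(xi, zi)] * count_yz[(yi, zi)]))
    return total

# X and Y move together regardless of Z: the dependence survives conditioning.
linked = conditional_mutual_information([0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1])

# Y is fully determined by Z: conditioning on Z leaves no X-Y dependence.
explained = conditional_mutual_information([0, 1, 0, 1], [0, 0, 1, 1], [0, 0, 1, 1])
```

    A value near zero, as in the second call, is the kind of evidence an information-theoretic method would use to treat an apparent X-Y link as redundant once Z is taken into account.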

  3. Filtering Genes for Cluster and Network Analysis

    Directory of Open Access Journals (Sweden)

    Parkhomenko Elena

    2009-06-01

    Background: Prior to cluster analysis or genetic network analysis it is customary to filter, or remove, genes considered to be irrelevant from the set of genes to be analyzed. Often genes whose variation across samples is less than an arbitrary threshold value are deleted. This can improve interpretability and reduce bias. Results: This paper introduces modular models for representing network structure in order to study the relative effects of different filtering methods. We show that cluster analysis and principal components are strongly affected by filtering. Filtering methods intended specifically for cluster and network analysis are introduced and compared by simulating modular networks with known statistical properties. To study more realistic situations, we analyze simulated "real" data based on well-characterized E. coli and S. cerevisiae regulatory networks. Conclusion: The methods introduced apply very generally, to any similarity matrix describing gene expression. One of the proposed methods, SUMCOV, performed well for all models simulated.
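
    The conventional filter described in the Background of this abstract (dropping genes whose variation across samples falls below a threshold) can be sketched as follows. The gene names, expression values and threshold are made up for illustration, and this is the simple variance filter only, not the SUMCOV method, whose definition is not given in the abstract.

```python
from statistics import pvariance

def filter_by_variance(expression, threshold):
    """Keep genes whose population variance across samples is at least
    `threshold`; genes below the threshold are removed."""
    return {gene: values
            for gene, values in expression.items()
            if pvariance(values) >= threshold}

# Hypothetical expression matrix: gene -> expression level in each sample.
expression = {
    "geneA": [1.0, 1.1, 0.9, 1.0],   # nearly flat across samples
    "geneB": [0.2, 2.5, 4.1, 0.3],   # varies strongly across samples
    "geneC": [3.0, 3.0, 3.0, 3.0],   # constant
}

kept = filter_by_variance(expression, threshold=0.5)
print(sorted(kept))
```

    The threshold here is arbitrary, which is the point the abstract raises: the choice of filter can strongly affect downstream cluster and principal component analyses.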

  4. An ancient repeat sequence in the ATP synthase beta-subunit gene of forcipulate sea stars.

    Science.gov (United States)

    Foltz, David W

    2007-11-01

    A novel repeat sequence with a conserved secondary structure is described from two nonadjacent introns of the ATP synthase beta-subunit gene in sea stars of the order Forcipulatida (Echinodermata: Asteroidea). The repeat is present in both introns of all forcipulate sea stars examined, which suggests that it is an ancient feature of this gene (with an approximate age of 200 Mya). Both stem and loop regions show high levels of sequence constraint when compared to flanking nonrepetitive intronic regions. The repeat was also detected in (1) the family Pterasteridae, order Velatida and (2) the family Korethrasteridae, order Velatida. The repeat was not detected in (1) the family Echinasteridae, order Spinulosida, (2) the family Astropectinidae, order Paxillosida, (3) the family Solasteridae, order Velatida, or (4) the family Goniasteridae, order Valvatida. The repeat lacks similarity to published sequences in unrestricted GenBank searches, and there are no significant open reading frames in the repeat or in the flanking intron sequences. Comparison via parametric bootstrapping to a published phylogeny based on 4.2 kb of nuclear and mitochondrial sequence for a subset of these species allowed the null hypothesis of a congruent phylogeny to be rejected for each repeat, when compared separately to the published phylogeny. In contrast, the flanking nonrepetitive sequences in each intron yielded separate phylogenies that were each congruent with the published phylogeny. In four species, the repeat in one or both introns has apparently experienced gene conversion. The two introns also show a correlated pattern of nucleotide substitutions, even after excluding the putative cases of gene conversion.

  5. Inferring gene networks from discrete expression data

    KAUST Repository

    Zhang, L.

    2013-07-18

    The modeling of gene networks from transcriptional expression data is an important tool in biomedical research to reveal signaling pathways and to identify treatment targets. Current gene network modeling is primarily based on the use of Gaussian graphical models applied to continuous data, which give a closed-form marginal likelihood. In this paper, we extend network modeling to discrete data, specifically data from serial analysis of gene expression, and RNA-sequencing experiments, both of which generate counts of mRNA transcripts in cell samples. We propose a generalized linear model to fit the discrete gene expression data and assume that the log ratios of the mean expression levels follow a Gaussian distribution. We restrict the gene network structures to decomposable graphs and derive the graphs by selecting the covariance matrix of the Gaussian distribution with the hyper-inverse Wishart priors. Furthermore, we incorporate prior network models based on gene ontology information, which avails existing biological information on the genes of interest. We conduct simulation studies to examine the performance of our discrete graphical model and apply the method to two real datasets for gene network inference. © The Author 2013. Published by Oxford University Press. All rights reserved.

  6. Evolution of AGL6-like MADS box genes in grasses (Poaceae): ovule expression is ancient and palea expression is new.

    Science.gov (United States)

    Reinheimer, Renata; Kellogg, Elizabeth A

    2009-09-01

    AGAMOUS-like6 (AGL6) genes encode MIKC-type MADS box transcription factors and are closely related to SEPALLATA and AP1/FUL-like genes. Here, we focus on the molecular evolution and expression of the AGL6-like genes in grasses. We have found that AGL6-like genes are expressed in ovules, lodicules (second whorl floral organs), paleas (putative first whorl floral organs), and floral meristems. Each of these expression domains was acquired at a different time in evolution, indicating that each represents a distinct function of the gene product and that the AGL6-like genes are pleiotropic. Expression in the inner integument of the ovule appears to be an ancient expression pattern corresponding to the expression of the gene in the megasporangium and integument in gymnosperms. Expression in floral meristems appears to have been acquired in the angiosperms and expression in second whorl organs in monocots. Early in grass evolution, AGL6-like orthologs acquired a new expression domain in the palea. Stamen expression is variable. Most grasses have a single AGL6-like gene (orthologous to the rice [Oryza sativa] gene MADS6). However, rice and other species of Oryza have a second copy (orthologous to rice MADS17) that appears to be the result of an ancient duplication.

  7. Assessing the fidelity of ancient DNA sequences amplified from nuclear genes

    DEFF Research Database (Denmark)

    Binladen, Jonas; Wiuf, Carsten Henrik; Gilbert, M. Thomas P.

    2006-01-01

    To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved in ph...

  8. Network of tRNA Gene Sequences

    Institute of Scientific and Technical Information of China (English)

    WEI Fang-ping; LI Sheng; MA Hong-ru

    2008-01-01

    A network of 3719 tRNA gene sequences was constructed using simplest alignment. Its topology, degree distribution and clustering coefficient were studied. The behaviors of the network shift from fluctuated distribution to scale-free distribution when the similarity degree of the tRNA gene sequences increases. The tRNA gene sequences with the same anticodon identity are more self-organized than those with different anticodon identities and form local clusters in the network. Some vertices of the local cluster have a high connection with other local clusters, and the probable reason was given. Moreover, a network constructed by the same number of random tRNA sequences was used to make comparisons. The relationships between the properties of the tRNA similarity network and the characters of tRNA evolutionary history were discussed.
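    The network statistics named in this abstract (degree distribution and clustering coefficient) can be sketched on toy data. The sketch assumes that similarity is the fraction of identical positions between equal-length, ungapped sequences; the abstract only says "simplest alignment", so this measure, the sequences and the threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations

def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def build_network(seqs, threshold):
    """Connect two sequences when their identity is at least `threshold`."""
    adj = {name: set() for name in seqs}
    for u, v in combinations(seqs, 2):
        if identity(seqs[u], seqs[v]) >= threshold:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def clustering_coefficient(adj, node):
    """Share of a node's neighbor pairs that are themselves connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))

def degree_distribution(adj):
    """Map each degree to the number of nodes having that degree."""
    return Counter(len(neighbors) for neighbors in adj.values())

# Hypothetical 8-nt toy sequences, not real tRNA genes.
seqs = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "ACGTACCA", "s4": "TTTTGGGG"}
network = build_network(seqs, threshold=0.75)
```

    Raising or lowering the threshold changes which edges exist, which is how the degree distribution of such a network can shift as the required similarity increases.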

  9. Genes2FANs: connecting genes through functional association networks

    Directory of Open Access Journals (Sweden)

    Dannenfelser Ruth

    2012-07-01

    Full Text Available Abstract Background Protein-protein, cell signaling, metabolic, and transcriptional interaction networks are useful for identifying connections between lists of experimentally identified genes/proteins. However, besides physical or co-expression interactions there are many ways in which pairs of genes, or their protein products, can be associated. By systematically incorporating knowledge on shared properties of genes from diverse sources to build functional association networks (FANs), researchers may be able to identify additional functional interactions between groups of genes that are not readily apparent. Results Genes2FANs is a web based tool and a database that utilizes 14 carefully constructed FANs and a large-scale protein-protein interaction (PPI) network to build subnetworks that connect lists of human and mouse genes. The FANs are created from mammalian gene set libraries where mouse genes are converted to their human orthologs. The tool takes as input a list of human or mouse Entrez gene symbols to produce a subnetwork and a ranked list of intermediate genes that are used to connect the query input list. In addition, users can enter any PubMed search term and then the system automatically converts the returned results to gene lists using GeneRIF. This gene list is then used as input to generate a subnetwork from the user’s PubMed query. As a case study, we applied Genes2FANs to connect disease genes from 90 well-studied disorders. We find an inverse correlation between the counts of links connecting disease genes through PPI and links connecting disease genes through FANs, separating diseases into two categories. Conclusions Genes2FANs is a useful tool for interpreting the relationships between gene/protein lists in the context of their various functions and networks. Combining functional association interactions with physical PPIs can be useful for revealing new biology and help form hypotheses for further experimentation. Our

  10. Genes, Genomes, and Assemblages of Modern Anoxygenic Photosynthetic Cyanobacteria as Proxies for Ancient Cyanobacteria

    Science.gov (United States)

    Grim, S. L.; Dick, G.

    2015-12-01

    Oxygenic photosynthetic (OP) cyanobacteria were responsible for the production of O2 during the Proterozoic. However, the extent and degree of oxygenation of the atmosphere and oceans varied for over 2 Ga after OP cyanobacteria first appeared in the geologic record. Cyanobacteria capable of anoxygenic photosynthesis (AP) may have altered the trajectory of oxygenation, yet the scope of their role in the Proterozoic is not well known. Modern cyanobacterial populations from Middle Island Sinkhole (MIS), Michigan and a handful of cultured cyanobacterial strains, are capable of OP and AP. With their metabolic versatility, these microbes may approximate ancient cyanobacterial assemblages that mediated Earth's oxygenation. To better characterize the taxonomic and genetic signatures of these modern AP/OP cyanobacteria, we sequenced 16S rRNA genes and conducted 'omics analyses on cultured strains, lab mesocosms, and MIS cyanobacterial mat samples collected over multiple years from May to September. Diversity in the MIS cyanobacterial mat is low, with one member of Oscillatoriales dominating at all times. However, Planktothrix members are more abundant in the cyanobacterial community in late summer and fall. The shift in cyanobacterial community composition may be linked to seasonally changing light intensity. In lab mesocosms of MIS microbial mat, we observed a shift in dominant cyanobacterial groups as well as the emergence of Chlorobium, bacteria that specialize in AP. These shifts in microbial community composition and metabolism are likely in response to changing environmental parameters such as the availability of light and sulfide. Further research is needed to understand the impacts of the changing photosynthetic community on oxygen production and the entire microbial consortium. 
Our study connects genes and genomes of AP cyanobacteria to their environment, and improves understanding of cyanobacterial metabolic strategies that may have shaped Earth's redox evolution.

  11. Network Completion for Static Gene Expression Data

    Directory of Open Access Journals (Sweden)

    Natsu Nakajima

    2014-01-01

    Full Text Available We tackle the problem of completing and inferring genetic networks under stationary conditions from static data, where network completion is to make the minimum amount of modifications to an initial network so that the completed network is most consistent with the expression data in which addition of edges and deletion of edges are basic modification operations. For this problem, we present a new method for network completion using dynamic programming and least-squares fitting. This method can find an optimal solution in polynomial time if the maximum indegree of the network is bounded by a constant. We evaluate the effectiveness of our method through computational experiments using synthetic data. Furthermore, we demonstrate that our proposed method can distinguish the differences between two types of genetic networks under stationary conditions from lung cancer and normal gene expression data.

  12. Modeling gene regulatory networks: A network simplification algorithm

    Science.gov (United States)

    Ferreira, Luiz Henrique O.; de Castro, Maria Clicia S.; da Silva, Fabricio A. B.

    2016-12-01

    Boolean networks have been used for some time to model Gene Regulatory Networks (GRNs), which describe cell functions. Those models can help biologists make predictions and prognoses, and even design specialized treatments, when a disturbance in the GRN leads to a disease condition. However, the amount of information related to a GRN can be huge, making the task of inferring its Boolean network representation quite a challenge. The method shown here takes into account information about the interactome to build a network in which each node represents a protein, and uses the entropy of each node as the key to reducing the size of the network, allowing the subsequent inference process to focus only on the main protein hubs, the ones with the most potential to interfere with overall network behavior.

  13. Gene regulatory networks governing pancreas development.

    Science.gov (United States)

    Arda, H Efsun; Benitez, Cecil M; Kim, Seung K

    2013-04-15

    Elucidation of cellular and gene regulatory networks (GRNs) governing organ development will accelerate progress toward tissue replacement. Here, we have compiled reference GRNs underlying pancreas development from data mining that integrates multiple approaches, including mutant analysis, lineage tracing, cell purification, gene expression and enhancer analysis, and biochemical studies of gene regulation. Using established computational tools, we integrated and represented these networks in frameworks that should enhance understanding of the surging output of genomic-scale genetic and epigenetic studies of pancreas development and diseases such as diabetes and pancreatic cancer. We envision similar approaches would be useful for understanding the development of other organs.

  14. Research of Gene Regulatory Network with Multi-Time Delay Based on Bayesian Network

    Institute of Scientific and Technical Information of China (English)

    LIU Bei; MENG Fanjiang; LI Yong; LIU Liyan

    2008-01-01

    The gene regulatory network was reconstructed from time-series microarray data, obtained by hybridizing gene chips at different time points, in order to analyze the coordination and restriction between genes. An algorithm for controlling the gene expression regulatory network of the whole cell was designed using a Bayesian network, which provides an effective aid to the analysis of gene regulatory networks.

  15. Inferring gene regression networks with model trees

    Directory of Open Access Journals (Sweden)

    Aguilar-Ruiz Jesus S

    2010-10-01

    Full Text Available Abstract Background Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful in order to determine whether two genes have a strong global similarity but do not detect local similarities. Results We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph from all the relationships among output and input genes is built taking into account whether the pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: the Saccharomyces cerevisiae and E. coli data sets. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as a control to compare the results to those of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods. Conclusions REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques to estimate the numerical values for the target genes by linear regression functions. 
They are very often more precise than linear regression models because they can add just different linear
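    The false-discovery-rate step mentioned in this abstract can be illustrated with the Benjamini-Hochberg procedure. The abstract does not name the procedure REGNET applies, so Benjamini-Hochberg is an assumption here, and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Benjamini-Hochberg: sort the p-values, find the largest rank k with
    p_(k) <= alpha * k / m, and reject the k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            cutoff_rank = rank
    rejected = set(order[:cutoff_rank])
    return [i in rejected for i in range(m)]

# Hypothetical p-values for five candidate gene-gene relationships.
pvalues = [0.001, 0.008, 0.039, 0.041, 0.6]
significant = benjamini_hochberg(pvalues, alpha=0.05)
```

    Only the relationships flagged as significant would then be kept as edges when the final graph is assembled.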

  16. Dissecting the gene network of dietary restriction to identify evolutionarily conserved pathways and new functional genes.

    Science.gov (United States)

    Wuttke, Daniel; Connor, Richard; Vora, Chintan; Craig, Thomas; Li, Yang; Wood, Shona; Vasieva, Olga; Shmookler Reis, Robert; Tang, Fusheng; de Magalhães, João Pedro

    2012-01-01

    Dietary restriction (DR), limiting nutrient intake from diet without causing malnutrition, delays the aging process and extends lifespan in multiple organisms. The conserved life-extending effect of DR suggests the involvement of fundamental mechanisms, although these remain a subject of debate. To help decipher the life-extending mechanisms of DR, we first compiled a list of genes that if genetically altered disrupt or prevent the life-extending effects of DR. We called these DR-essential genes and identified more than 100 in model organisms such as yeast, worms, flies, and mice. In order for other researchers to benefit from this first curated list of genes essential for DR, we established an online database called GenDR (http://genomics.senescence.info/diet/). To dissect the interactions of DR-essential genes and discover the underlying lifespan-extending mechanisms, we then used a variety of network and systems biology approaches to analyze the gene network of DR. We show that DR-essential genes are more conserved at the molecular level and have more molecular interactions than expected by chance. Furthermore, we employed a guilt-by-association method to predict novel DR-essential genes. In budding yeast, we predicted nine genes related to vacuolar functions; we show experimentally that mutations deleting eight of those genes prevent the life-extending effects of DR. Three of these mutants (OPT2, FRE6, and RCR2) had extended lifespan under ad libitum, indicating that the lack of further longevity under DR is not caused by a general compromise of fitness. These results demonstrate how network analyses of DR using GenDR can be used to make phenotypically relevant predictions. Moreover, gene-regulatory circuits reveal that the DR-induced transcriptional signature in yeast involves nutrient-sensing, stress responses and meiotic transcription factors. 
Finally, comparing the influence of gene expression changes during DR on the interactomes of multiple organisms led

  17. Dissecting the gene network of dietary restriction to identify evolutionarily conserved pathways and new functional genes.

    Directory of Open Access Journals (Sweden)

    Daniel Wuttke

    Full Text Available Dietary restriction (DR), limiting nutrient intake from diet without causing malnutrition, delays the aging process and extends lifespan in multiple organisms. The conserved life-extending effect of DR suggests the involvement of fundamental mechanisms, although these remain a subject of debate. To help decipher the life-extending mechanisms of DR, we first compiled a list of genes that if genetically altered disrupt or prevent the life-extending effects of DR. We called these DR-essential genes and identified more than 100 in model organisms such as yeast, worms, flies, and mice. In order for other researchers to benefit from this first curated list of genes essential for DR, we established an online database called GenDR (http://genomics.senescence.info/diet/). To dissect the interactions of DR-essential genes and discover the underlying lifespan-extending mechanisms, we then used a variety of network and systems biology approaches to analyze the gene network of DR. We show that DR-essential genes are more conserved at the molecular level and have more molecular interactions than expected by chance. Furthermore, we employed a guilt-by-association method to predict novel DR-essential genes. In budding yeast, we predicted nine genes related to vacuolar functions; we show experimentally that mutations deleting eight of those genes prevent the life-extending effects of DR. Three of these mutants (OPT2, FRE6, and RCR2) had extended lifespan under ad libitum, indicating that the lack of further longevity under DR is not caused by a general compromise of fitness. These results demonstrate how network analyses of DR using GenDR can be used to make phenotypically relevant predictions. Moreover, gene-regulatory circuits reveal that the DR-induced transcriptional signature in yeast involves nutrient-sensing, stress responses and meiotic transcription factors. Finally, comparing the influence of gene expression changes during DR on the interactomes of

  18. Mutational robustness of gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Aalt D J van Dijk

    Full Text Available Mutational robustness of gene regulatory networks refers to their ability to generate constant biological output upon mutations that change network structure. Such networks contain regulatory interactions (transcription factor-target gene interactions) but often also protein-protein interactions between transcription factors. Using computational modeling, we study factors that influence robustness and we infer several network properties governing it. These include the type of mutation, i.e. whether a regulatory interaction or a protein-protein interaction is mutated, and in the case of mutation of a regulatory interaction, the sign of the interaction (activating vs. repressive). In addition, we analyze the effect of combinations of mutations and we compare networks containing monomeric with those containing dimeric transcription factors. Our results are consistent with available data on biological networks, for example based on evolutionary conservation of network features. As a novel and remarkable property, we predict that networks are more robust against mutations in monomer than in dimer transcription factors, a prediction for which analysis of conservation of DNA binding residues in monomeric vs. dimeric transcription factors provides indirect evidence.

  19. Inferring Phylogenetic Networks from Gene Order Data

    Directory of Open Access Journals (Sweden)

    Alexey Anatolievich Morozov

    2013-01-01

    Full Text Available Existing algorithms allow us to infer phylogenetic networks from sequences (DNA, protein or binary), sets of trees, and distance matrices, but there are no methods to build them using the gene order data as an input. Here we describe several methods to build split networks from the gene order data, perform simulation studies, and use our methods for analyzing and interpreting different real gene order datasets. All proposed methods are based on intermediate data, which can be generated from genome structures under study and used as an input for network construction algorithms. Three intermediates are used: set of jackknife trees, distance matrix, and binary encoding. According to simulations and case studies, the best intermediates are jackknife trees and distance matrix (when used with the Neighbor-Net algorithm). Binary encoding can also be useful, but only when the methods mentioned above cannot be used.
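    One way a distance matrix can be derived from gene order data is sketched below using the unsigned breakpoint distance. The abstract does not say which distance the authors use, so this choice, the function names and the toy genomes are assumptions made for illustration.

```python
def adjacencies(order):
    """Unordered pairs of neighboring genes in a linear gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}

def breakpoint_distance(a, b):
    """Number of adjacencies in gene order `a` that are absent from `b`."""
    return len(adjacencies(a) - adjacencies(b))

def distance_matrix(genomes):
    """Pairwise breakpoint distances, keyed by (name, name)."""
    return {(u, v): breakpoint_distance(genomes[u], genomes[v])
            for u in genomes for v in genomes}

# Hypothetical genomes over the same five genes.
genomes = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [1, 2, 4, 3, 5],
    "g3": [5, 4, 3, 2, 1],  # full reversal keeps every unsigned adjacency
}
matrix = distance_matrix(genomes)
```

    A matrix of this form is the kind of intermediate that a distance-based network construction algorithm such as Neighbor-Net takes as input.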

  20. Transcriptional delay stabilizes bistable gene networks

    Science.gov (United States)

    Gupta, Chinmaya; López, José Manuel; Ott, William; Josić, Krešimir; Bennett, Matthew R.

    2014-01-01

    Transcriptional delay can significantly impact the dynamics of gene networks. Here we examine how such delay affects bistable systems. We investigate several stochastic models of bistable gene networks and find that increasing delay dramatically increases the mean residence times near stable states. To explain this, we introduce a non-Markovian, analytically tractable reduced model. The model shows that stabilization is the consequence of an increased number of failed transitions between stable states. Each of the bistable systems that we simulate behaves in this manner. PMID:23952450

  1. Ancient DNA

    DEFF Research Database (Denmark)

    Willerslev, Eske; Cooper, Alan

    2004-01-01

    ancient DNA, palaeontology, palaeoecology, archaeology, population genetics, DNA damage and repair...

  2. Compressed Adjacency Matrices: Untangling Gene Regulatory Networks.

    Science.gov (United States)

    Dinkla, K; Westenberg, M A; van Wijk, J J

    2012-12-01

    We present a novel technique, Compressed Adjacency Matrices, for visualizing gene regulatory networks. These directed networks have strong structural characteristics: out-degrees with a scale-free distribution, in-degrees bounded by a low maximum, and few and small cycles. Standard visualization techniques, such as node-link diagrams and adjacency matrices, are impeded by these network characteristics. The scale-free distribution of out-degrees causes a high number of intersecting edges in node-link diagrams. Adjacency matrices become space-inefficient due to the low in-degrees and the resulting sparse network. Compressed adjacency matrices, however, exploit these structural characteristics. By cutting open and rearranging an adjacency matrix, we achieve a compact and neatly-arranged visualization. Compressed adjacency matrices allow for easy detection of subnetworks with a specific structure, so-called motifs, which provide important knowledge about gene regulatory networks to domain experts. We summarize motifs commonly referred to in the literature, and relate them to network analysis tasks common to the visualization domain. We show that a user can easily find the important motifs in compressed adjacency matrices, and that this is hard in standard adjacency matrix and node-link diagrams. We also demonstrate that interaction techniques for standard adjacency matrices can be used for our compressed variant. These techniques include rearrangement clustering, highlighting, and filtering.

  3. Glucocorticoid receptor-dependent gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Phillip Phuc Le

    2005-08-01

    Full Text Available While the molecular mechanisms of glucocorticoid regulation of transcription have been studied in detail, the global networks regulated by the glucocorticoid receptor (GR) remain unknown. To address this question, we performed an orthogonal analysis to identify direct targets of the GR. First, we analyzed the expression profile of mouse livers in the presence or absence of exogenous glucocorticoid, resulting in over 1,300 differentially expressed genes. We then executed genome-wide location analysis on chromatin from the same livers, identifying more than 300 promoters that are bound by the GR. Intersecting the two lists yielded 53 genes whose expression is functionally dependent upon the ligand-bound GR. Further network and sequence analysis of the functional targets enabled us to suggest interactions between the GR and other transcription factors at specific target genes. Together, our results further our understanding of the GR and its targets, and provide the basis for more targeted glucocorticoid therapies.
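    The list intersection described in this abstract amounts to a set operation, sketched below. The gene identifiers are placeholders, not genes reported in the study.

```python
def functional_targets(differentially_expressed, promoter_bound):
    """Genes that are both differentially expressed and promoter-bound."""
    return sorted(set(differentially_expressed) & set(promoter_bound))

# Placeholder identifiers standing in for the two experimental lists.
differentially_expressed = ["gene_a", "gene_b", "gene_c", "gene_d"]
promoter_bound = ["gene_c", "gene_e", "gene_a"]

targets = functional_targets(differentially_expressed, promoter_bound)
```

    Requiring membership in both lists is what makes the analysis orthogonal: expression change alone could be indirect, and binding alone could be non-functional.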

  4. Mutated Genes in Schizophrenia Map to Brain Networks

    Science.gov (United States)

    ... Matters NIH Research Matters August 12, 2013 Mutated Genes in Schizophrenia Map to Brain Networks Schizophrenia networks in the ... in People with Serious Mental Illness Clues for Schizophrenia in Rare Gene Glitch Recognizing Schizophrenia: Seeking Clues to a Difficult ...

  5. Biological Consequences of Ancient Gene Acquisition and Duplication in the Large Genome of Candidatus Solibacter usitatus Ellin6076

    Energy Technology Data Exchange (ETDEWEB)

    Challacombe, Jean F [ORNL; Eichorst, Stephanie A [Los Alamos National Laboratory (LANL); Hauser, Loren John [ORNL; Land, Miriam L [ORNL; Xie, Gary [Los Alamos National Laboratory (LANL); Kuske, Cheryl R [Los Alamos National Laboratory (LANL)

    2011-01-01

    Members of the bacterial phylum Acidobacteria are widespread in soils and sediments worldwide, and are abundant in many soils. Acidobacteria are challenging to culture in vitro, and many basic features of their biology and functional roles in the soil have not been determined. Candidatus Solibacter usitatus strain Ellin6076 has a 9.9 Mb genome that is approximately 2-5 times as large as the other sequenced Acidobacteria genomes. Bacterial genome sizes typically range from 0.5 to 10 Mb and are influenced by gene duplication, horizontal gene transfer, gene loss and other evolutionary processes. Our comparative genome analyses indicate that the Ellin6076 large genome has arisen by horizontal gene transfer via ancient bacteriophage and/or plasmid-mediated transduction, and widespread small-scale gene duplications, resulting in an increased number of paralogs. Low amino acid sequence identities among functional group members, and lack of conserved gene order and orientation in regions containing similar groups of paralogs, suggest that most of the paralogs are not the result of recent duplication events. The genome sizes of additional cultured Acidobacteria strains were estimated using pulsed-field gel electrophoresis to determine the prevalence of the large genome trait within the phylum. Members of subdivision 3 had larger genomes than those of subdivision 1, but none were as large as the Ellin6076 genome. The large genome of Ellin6076 may not be typical of the phylum, and encodes traits that could provide a selective metabolic, defensive and regulatory advantage in the soil environment.

  6. Biological consequences of ancient gene acquisition and duplication in the large genome soil bacterium, "Solibacter usitatus" strain Ellin6076

    Energy Technology Data Exchange (ETDEWEB)

    Challacombe, Jean F [Los Alamos National Laboratory; Eichorst, Stephanie A [Los Alamos National Laboratory; Xie, Gary [Los Alamos National Laboratory; Kuske, Cheryl R [Los Alamos National Laboratory; Hauser, Loren [ORNL; Land, Miriam [ORNL

    2009-01-01

    Bacterial genome sizes range from ca. 0.5 to 10 Mb and are influenced by gene duplication, horizontal gene transfer, gene loss and other evolutionary processes. Sequenced genomes of strains in the phylum Acidobacteria revealed that 'Solibacter usitatus' strain Ellin6076 harbors a 9.9 Mb genome. This large genome appears to have arisen by horizontal gene transfer via ancient bacteriophage and plasmid-mediated transduction, as well as widespread small-scale gene duplications. This has resulted in an increased number of paralogs that are potentially ecologically important (ecoparalogs). Low amino acid sequence identities among functional group members and lack of conserved gene order and orientation in the regions containing similar groups of paralogs suggest that most of the paralogs were not the result of recent duplication events. The genome sizes of cultured subdivision 1 and 3 strains in the phylum Acidobacteria were estimated using pulsed-field gel electrophoresis to determine the prevalence of the large genome trait within the phylum. Members of subdivision 1 were estimated to have smaller genome sizes ranging from ca. 2.0 to 4.8 Mb, whereas members of subdivision 3 had slightly larger genomes, from ca. 5.8 to 9.9 Mb. It is hypothesized that the large genome of strain Ellin6076 encodes traits that provide a selective metabolic, defensive and regulatory advantage in the variable soil environment.

  7. Engineering stability in gene networks by autoregulation

    Science.gov (United States)

    Becskei, Attila; Serrano, Luis

    2000-06-01

    The genetic and biochemical networks which underlie such things as homeostasis in metabolism and the developmental programs of living cells, must withstand considerable variations and random perturbations of biochemical parameters. These occur as transient changes in, for example, transcription, translation, and RNA and protein degradation. The intensity and duration of these perturbations differ between cells in a population. The unique state of cells, and thus the diversity in a population, is owing to the different environmental stimuli the individual cells experience and the inherent stochastic nature of biochemical processes (for example, refs 5 and 6). It has been proposed, but not demonstrated, that autoregulatory, negative feedback loops in gene circuits provide stability, thereby limiting the range over which the concentrations of network components fluctuate. Here we have designed and constructed simple gene circuits consisting of a regulator and transcriptional repressor modules in Escherichia coli and we show the gain of stability produced by negative feedback.
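
The stabilizing effect of negative autoregulation described in this abstract can be illustrated with a small deterministic sketch. It is not the authors' circuit or model: the rate equation, the Hill-type repression term, and every parameter value (`production`, `degradation`, `K`, `n`) are assumptions chosen only to show how self-repression can damp the response of a steady state to a change in production rate.

```python
def steady_state(production, degradation=1.0, K=None, n=2.0):
    """Steady state of dx/dt = production * f(x) - degradation * x.

    With K=None there is no feedback (f = 1). Otherwise the gene represses
    itself through an assumed Hill function f(x) = 1 / (1 + (x / K)**n).
    """
    if K is None:
        return production / degradation
    # The net rate is strictly decreasing in x, so bisection finds the
    # unique root between 0 and the unregulated steady state.
    lo, hi = 0.0, production / degradation
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        net = production / (1.0 + (mid / K) ** n) - degradation * mid
        if net > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def relative_shift(base, perturbed, **kwargs):
    """Fractional change in steady state when production goes base -> perturbed."""
    x0 = steady_state(base, **kwargs)
    x1 = steady_state(perturbed, **kwargs)
    return (x1 - x0) / x0


# A 20% increase in production rate, without and with self-repression.
open_loop = relative_shift(10.0, 12.0)
closed_loop = relative_shift(10.0, 12.0, K=1.0, n=2.0)
print(f"unregulated shift:   {open_loop:.3f}")
print(f"autoregulated shift: {closed_loop:.3f}")
```

Under these assumed parameters the autoregulated steady state moves by a smaller fraction than the unregulated one, which is the qualitative behavior the abstract attributes to negative feedback. The sketch is deterministic and says nothing about stochastic noise.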

  8. Discovering Study-Specific Gene Regulatory Networks

    OpenAIRE

    2014-01-01

    This article has been made available through the Brunel Open Access Publishing Fund. Microarrays are commonly used in biology because of their ability to simultaneously measure thousands of genes under different conditions. Due to their structure, typically containing a high amount of variables but far fewer samples, scalable network analysis techniques are often employed. In particular, consensus appro...

  9. Disease Gene Prioritization Using Network and Feature

    Science.gov (United States)

    Agam, Gady; Balasubramanian, Sandhya; Xu, Jinbo; Gilliam, T. Conrad; Maltsev, Natalia; Börnigen, Daniela

    2015-01-01

    Identifying high-confidence candidate genes that are causative for disease phenotypes, from the large lists of variations produced by high-throughput genomics, can be both time-consuming and costly. The development of novel computational approaches, utilizing existing biological knowledge for the prioritization of such candidate genes, can improve the efficiency and accuracy of the biomedical data analysis. It can also reduce the cost of such studies by avoiding experimental validations of irrelevant candidates. In this study, we address this challenge by proposing a novel gene prioritization approach that ranks promising candidate genes that are likely to be involved in a disease or phenotype under study. This algorithm is based on the modified conditional random field (CRF) model that simultaneously makes use of both gene annotations and gene interactions, while preserving their original representation. We validated our approach on two independent disease benchmark studies by ranking candidate genes using network and feature information. Our results showed both high area under the curve (AUC) value (0.86), and more importantly high partial AUC (pAUC) value (0.1296), and revealed higher accuracy and precision at the top predictions as compared with other well-performing gene prioritization tools, such as Endeavour (AUC-0.82, pAUC-0.083) and PINTA (AUC-0.76, pAUC-0.066). We were able to detect more target genes (9/18/19/27) on top positions (1/5/10/20) compared to Endeavour (3/11/14/23) and PINTA (6/10/13/18). To demonstrate its usability, we applied our method to a case study for the prediction of molecular mechanisms contributing to intellectual disability and autism. Our approach was able to correctly recover genes related to both disorders and provide suggestions for possible additional candidates based on their rankings and functional annotations. PMID:25844670
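
The two kinds of evaluation figures quoted in this abstract, AUC and the number of target genes recovered in the top k positions, can be computed from any ranked list. The sketch below is a generic illustration and not the authors' evaluation code: the function names, the toy scores, and the toy labels are hypothetical, and the partial AUC (pAUC) the abstract also reports is not implemented.

```python
def auc(scores, labels):
    """Rank-based AUC: chance that a random true gene outscores a random decoy.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hits_in_top(scores, labels, k):
    """Number of true target genes among the k highest-scoring candidates."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(1 for _, y in ranked[:k] if y)


# Hypothetical prioritization scores; label 1 marks a known disease gene.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 0, 0, 1]
print(f"AUC = {auc(scores, labels):.3f}")
print("hits in top 1 / top 3:", hits_in_top(scores, labels, 1), hits_in_top(scores, labels, 3))
```

The pairwise loop is quadratic in the number of candidates, which is fine for a toy list; a sort-based rank sum would be the usual choice for large gene lists.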

  10. Ancient DNA reveals prehistoric gene-flow from Siberia in the complex human population history of North East Europe.

    Science.gov (United States)

    Der Sarkissian, Clio; Balanovsky, Oleg; Brandt, Guido; Khartanovich, Valery; Buzhilova, Alexandra; Koshel, Sergey; Zaporozhchenko, Valery; Gronenborn, Detlef; Moiseyev, Vyacheslav; Kolpakov, Eugen; Shumkin, Vladimir; Alt, Kurt W; Balanovska, Elena; Cooper, Alan; Haak, Wolfgang

    2013-01-01

    North East Europe harbors a high diversity of cultures and languages, suggesting a complex genetic history. Archaeological, anthropological, and genetic research has revealed a series of influences from Western and Eastern Eurasia in the past. While genetic data from modern-day populations is commonly used to make inferences about their origins and past migrations, ancient DNA provides a powerful test of such hypotheses by giving a snapshot of the past genetic diversity. In order to better understand the dynamics that have shaped the gene pool of North East Europeans, we generated and analyzed 34 mitochondrial genotypes from the skeletal remains of three archaeological sites in northwest Russia. These sites were dated to the Mesolithic and the Early Metal Age (7,500 and 3,500 uncalibrated years Before Present). We applied a suite of population genetic analyses (principal component analysis, genetic distance mapping, haplotype sharing analyses) and compared past demographic models through coalescent simulations using Bayesian Serial SimCoal and Approximate Bayesian Computation. Comparisons of genetic data from ancient and modern-day populations revealed significant changes in the mitochondrial makeup of North East Europeans through time. Mesolithic foragers showed high frequencies and diversity of haplogroups U (U2e, U4, U5a), a pattern observed previously in European hunter-gatherers from Iberia to Scandinavia. In contrast, the presence of mitochondrial DNA haplogroups C, D, and Z in Early Metal Age individuals suggested discontinuity with Mesolithic hunter-gatherers and genetic influx from central/eastern Siberia. We identified remarkable genetic dissimilarities between prehistoric and modern-day North East Europeans/Saami, which suggests an important role of post-Mesolithic migrations from Western Europe and subsequent population replacement/extinctions. This work demonstrates how ancient DNA can improve our understanding of human population movements across

  11. Ancient DNA reveals prehistoric gene-flow from Siberia in the complex human population history of North East Europe.

    Directory of Open Access Journals (Sweden)

    Clio Der Sarkissian

    Full Text Available North East Europe harbors a high diversity of cultures and languages, suggesting a complex genetic history. Archaeological, anthropological, and genetic research has revealed a series of influences from Western and Eastern Eurasia in the past. While genetic data from modern-day populations is commonly used to make inferences about their origins and past migrations, ancient DNA provides a powerful test of such hypotheses by giving a snapshot of the past genetic diversity. In order to better understand the dynamics that have shaped the gene pool of North East Europeans, we generated and analyzed 34 mitochondrial genotypes from the skeletal remains of three archaeological sites in northwest Russia. These sites were dated to the Mesolithic and the Early Metal Age (7,500 and 3,500 uncalibrated years Before Present). We applied a suite of population genetic analyses (principal component analysis, genetic distance mapping, haplotype sharing analyses) and compared past demographic models through coalescent simulations using Bayesian Serial SimCoal and Approximate Bayesian Computation. Comparisons of genetic data from ancient and modern-day populations revealed significant changes in the mitochondrial makeup of North East Europeans through time. Mesolithic foragers showed high frequencies and diversity of haplogroups U (U2e, U4, U5a), a pattern observed previously in European hunter-gatherers from Iberia to Scandinavia. In contrast, the presence of mitochondrial DNA haplogroups C, D, and Z in Early Metal Age individuals suggested discontinuity with Mesolithic hunter-gatherers and genetic influx from central/eastern Siberia. We identified remarkable genetic dissimilarities between prehistoric and modern-day North East Europeans/Saami, which suggests an important role of post-Mesolithic migrations from Western Europe and subsequent population replacement/extinctions. This work demonstrates how ancient DNA can improve our understanding of human population

  12. Population Structure of UK Biobank and Ancient Eurasians Reveals Adaptation at Genes Influencing Blood Pressure.

    Science.gov (United States)

    Galinsky, Kevin J; Loh, Po-Ru; Mallick, Swapan; Patterson, Nick J; Price, Alkes L

    2016-11-03

    Analyzing genetic differences between closely related populations can be a powerful way to detect recent adaptation. The very large sample size of the UK Biobank is ideal for using population differentiation to detect selection and enables an analysis of the UK population structure at fine resolution. In this study, analyses of 113,851 UK Biobank samples showed that population structure in the UK is dominated by five principal components (PCs) spanning six clusters: Northern Ireland, Scotland, northern England, southern England, and two Welsh clusters. Analyses of ancient Eurasians revealed that populations in the northern UK have higher levels of Steppe ancestry and that UK population structure cannot be explained as a simple mixture of Celts and Saxons. A scan for unusual population differentiation along the top PCs identified a genome-wide-significant signal of selection at the coding variant rs601338 in FUT2 (p = 9.16 × 10^-9). In addition, by combining evidence of unusual differentiation within the UK with evidence from ancient Eurasians, we identified genome-wide-significant (p = 5 × 10^-8) signals of recent selection at two additional loci: CYP1A2-CSK and F12. We detected strong associations between diastolic blood pressure in the UK Biobank and both the variants with selection signals at CYP1A2-CSK (p = 1.10 × 10^-19) and the variants with ancient Eurasian selection signals at the ATXN2-SH2B3 locus (p = 8.00 × 10^-33), implicating recent adaptation related to blood pressure.

  13. Improving gene regulatory network inference using network topology information.

    Science.gov (United States)

    Nair, Ajay; Chetty, Madhu; Wangikar, Pramod P

    2015-09-01

    Inferring the gene regulatory network (GRN) structure from data is an important problem in computational biology. However, it is a computationally complex problem and approximate methods such as heuristic search techniques, restriction of the maximum-number-of-parents (maxP) for a gene, or an optimal search under special conditions are required. The limitations of a heuristic search are well known but literature on the detailed analysis of the widely used maxP technique is lacking. The optimal search methods require large computational time. We report the theoretical analysis and experimental results of the strengths and limitations of the maxP technique. Further, using an optimal search method, we combine the strengths of the maxP technique and the known GRN topology to propose two novel algorithms. These algorithms are implemented in a Bayesian network framework and tested on biological, realistic, and in silico networks of different sizes and topologies. They overcome the limitations of the maxP technique and show superior computational speed when compared to the current optimal search algorithms.
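
A short worked count shows why the maxP restriction mentioned in this abstract is attractive. The sketch is illustrative only and is not taken from the paper: it assumes a network of `n_genes` genes in which any other gene may be a parent, and counts candidate parent sets for a single gene with and without the cap. The function name and the example sizes are hypothetical.

```python
from math import comb


def candidate_parent_sets(n_genes, max_parents=None):
    """Count the parent sets a single gene can have among the other genes.

    With max_parents=None every subset of the other n_genes - 1 genes is
    allowed; otherwise only subsets of size 0..max_parents are counted.
    """
    others = n_genes - 1
    if max_parents is None:
        return 2 ** others
    limit = min(max_parents, others)
    return sum(comb(others, k) for k in range(limit + 1))


# Assumed example: 20 genes, at most 3 parents per gene.
restricted = candidate_parent_sets(20, max_parents=3)
unrestricted = candidate_parent_sets(20)
print(restricted, unrestricted)  # 1160 524288
```

The cap cuts the per-gene search space from exponential to polynomial in the number of genes, at the cost of excluding any true regulator set larger than the cap, which is the kind of limitation the abstract says it analyzes.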

  14. A network view on Schizophrenia related genes

    Directory of Open Access Journals (Sweden)

    Sreedevi Chandrasekaran

    2012-03-01

    Full Text Available This study is a part of a project investigating the molecular determinants of neurological diseases. To account for the systemic nature of these diseases we proceeded from a well established list of 38 schizophrenia-related genes (Allen et al., 2008; Ross et al., 2006) and investigated their closest network environment. The created networks were compared to a recently proposed list of 173 schizophrenia related genes (Sun et al., 2009). 115 genes were predicted as potentially related to schizophrenia and subjected to GSEA. The enriched groups of proteins included neuromodulators, neurotransmitters and lipid transport. Over 100 signaling pathways were found significantly involved, signal transduction emerging as the most highly significant biological process. Next, we analyzed two microarray expression datasets derived from olfactory mucosa biopsies of schizophrenic patients and postmortem brain tissue samples from SMRIDB. The systems biology analysis resulted in a number of other genes predicted to be potentially related to schizophrenia, as well as in additional information of interest for elucidating molecular mechanisms of schizophrenia.

  15. Random matrix analysis for gene interaction networks in cancer cells

    CERN Document Server

    Kikkawa, Ayumi

    2016-01-01

    Motivation: The investigation of topological modifications of the gene interaction networks in cancer cells is essential for understanding the disease. We study gene interaction networks in various human cancer cells with the random matrix theory. This study is based on the Cancer Network Galaxy (TCNG) database which is the repository of huge gene interactions inferred by Bayesian network algorithms from 256 microarray experimental data downloaded from NCBI GEO. The original GEO data are provided by the high-throughput microarray expression experiments on various human cancer cells. We apply the random matrix theory to the computationally inferred gene interaction networks in TCNG in order to detect the universality in the topology of the gene interaction networks in cancer cells. Results: We found the universal behavior in almost one half of the 256 gene interaction networks in TCNG. The distribution of nearest neighbor level spacing of the gene interaction matrix becomes the Wigner distribution when the net...

  16. Toward a new history and geography of human genes informed by ancient DNA.

    Science.gov (United States)

    Pickrell, Joseph K; Reich, David

    2014-09-01

    Genetic information contains a record of the history of our species, and technological advances have transformed our ability to access this record. Many studies have used genome-wide data from populations today to learn about the peopling of the globe and subsequent adaptation to local conditions. Implicit in this research is the assumption that the geographic locations of people today are informative about the geographic locations of their ancestors in the distant past. However, it is now clear that long-range migration, admixture, and population replacement subsequent to the initial out-of-Africa expansion have altered the genetic structure of most of the world's human populations. In light of this we argue that it is time to critically reevaluate current models of the peopling of the globe, as well as the importance of natural selection in determining the geographic distribution of phenotypes. We specifically highlight the transformative potential of ancient DNA. By accessing the genetic make-up of populations living at archaeologically known times and places, ancient DNA makes it possible to directly track migrations and responses to natural selection.

  17. Chaotic motifs in gene regulatory networks.

    Science.gov (United States)

    Zhang, Zhaoyang; Ye, Weiming; Qian, Yu; Zheng, Zhigang; Huang, Xuhui; Hu, Gang

    2012-01-01

    Chaos should occur often in gene regulatory networks (GRNs) which have been widely described by nonlinear coupled ordinary differential equations, if their dimensions are no less than 3. It is therefore puzzling that chaos has never been reported in GRNs in nature and is also extremely rare in models of GRNs. On the other hand, the topic of motifs has attracted great attention in studying biological networks, and network motifs are suggested to be elementary building blocks that carry out some key functions in the network. In this paper, chaotic motifs (subnetworks with chaos) in GRNs are systematically investigated. The conclusion is that: (i) chaos can only appear through competitions between different oscillatory modes with rivaling intensities. Conditions required for chaotic GRNs are found to be very strict, which make chaotic GRNs extremely rare. (ii) Chaotic motifs are explored as the simplest few-node structures capable of producing chaos, and serve as the intrinsic source of chaos of random few-node GRNs. Several optimal motifs causing chaos with atypically high probability are figured out. (iii) Moreover, we discovered that a number of special oscillators can never produce chaos. These structures bring some advantages on rhythmic functions and may help us understand the robustness of diverse biological rhythms. (iv) The methods of dominant phase-advanced driving (DPAD) and DPAD time fraction are proposed to quantitatively identify chaotic motifs and to explain the origin of chaotic behaviors in GRNs.

  18. Inferring slowly-changing dynamic gene-regulatory networks

    NARCIS (Netherlands)

    Wit, Ernst C.; Abbruzzo, Antonino

    2015-01-01

    Dynamic gene-regulatory networks are complex since the interaction patterns between their components mean that it is impossible to study parts of the network in separation. This holistic character of gene-regulatory networks poses a real challenge to any type of modelling. Graphical models are a cla

  19. A contribution to the study of plant development evolution based on gene co-expression networks

    Directory of Open Access Journals (Sweden)

    Francisco J. Romero-Campero

    2013-08-01

    Full Text Available Phototrophic eukaryotes are among the most successful organisms on Earth due to their unparalleled efficiency at capturing light energy and fixing carbon dioxide to produce organic molecules. A conserved and efficient network of light-dependent regulatory modules could be at the basis of this success. This regulatory system conferred early advantages to phototrophic eukaryotes that allowed for specialization, complex developmental processes and modern plant characteristics. We have studied light-dependent gene regulatory modules from algae to plants employing integrative-omics approaches based on gene co-expression networks. Our study reveals some remarkably conserved ways in which eukaryotic phototrophs deal with day length and light signaling. Here we describe how a family of Arabidopsis transcription factors involved in photoperiod response has evolved from a single algal gene according to the innovation, amplification and divergence theory of gene evolution by duplication. These modifications of the gene co-expression networks from the ancient unicellular green algae Chlamydomonas reinhardtii to the modern brassica Arabidopsis thaliana may hint at the evolution and specialization of plants and other organisms.

  20. Modular composition of gene transcription networks.

    Directory of Open Access Journals (Sweden)

    Andras Gyorgy

    2014-03-01

    Full Text Available Predicting the dynamic behavior of a large network from that of the composing modules is a central problem in systems and synthetic biology. Yet, this predictive ability is still largely missing because modules display context-dependent behavior. One cause of context-dependence is retroactivity, a phenomenon similar to loading that influences in non-trivial ways the dynamic performance of a module upon connection to other modules. Here, we establish an analysis framework for gene transcription networks that explicitly accounts for retroactivity. Specifically, a module's key properties are encoded by three retroactivity matrices: internal, scaling, and mixing retroactivity. All of them have a physical interpretation and can be computed from macroscopic parameters (dissociation constants and promoter concentrations) and from the modules' topology. The internal retroactivity quantifies the effect of intramodular connections on an isolated module's dynamics. The scaling and mixing retroactivity establish how intermodular connections change the dynamics of connected modules. Based on these matrices and on the dynamics of modules in isolation, we can accurately predict how loading will affect the behavior of an arbitrary interconnection of modules. We illustrate implications of internal, scaling, and mixing retroactivity on the performance of recurrent network motifs, including negative autoregulation, combinatorial regulation, two-gene clocks, the toggle switch, and the single-input motif. We further provide a quantitative metric that determines how robust the dynamic behavior of a module is to interconnection with other modules. This metric can be employed both to evaluate the extent of modularity of natural networks and to establish concrete design guidelines to minimize retroactivity between modules in synthetic systems.

  1. Modular composition of gene transcription networks.

    Science.gov (United States)

    Gyorgy, Andras; Del Vecchio, Domitilla

    2014-03-01

    Predicting the dynamic behavior of a large network from that of the composing modules is a central problem in systems and synthetic biology. Yet, this predictive ability is still largely missing because modules display context-dependent behavior. One cause of context-dependence is retroactivity, a phenomenon similar to loading that influences in non-trivial ways the dynamic performance of a module upon connection to other modules. Here, we establish an analysis framework for gene transcription networks that explicitly accounts for retroactivity. Specifically, a module's key properties are encoded by three retroactivity matrices: internal, scaling, and mixing retroactivity. All of them have a physical interpretation and can be computed from macroscopic parameters (dissociation constants and promoter concentrations) and from the modules' topology. The internal retroactivity quantifies the effect of intramodular connections on an isolated module's dynamics. The scaling and mixing retroactivity establish how intermodular connections change the dynamics of connected modules. Based on these matrices and on the dynamics of modules in isolation, we can accurately predict how loading will affect the behavior of an arbitrary interconnection of modules. We illustrate implications of internal, scaling, and mixing retroactivity on the performance of recurrent network motifs, including negative autoregulation, combinatorial regulation, two-gene clocks, the toggle switch, and the single-input motif. We further provide a quantitative metric that determines how robust the dynamic behavior of a module is to interconnection with other modules. This metric can be employed both to evaluate the extent of modularity of natural networks and to establish concrete design guidelines to minimize retroactivity between modules in synthetic systems.

  2. Phylogenomic analyses resolve an ancient trichotomy at the base of Ischyropsalidoidea (Arachnida, Opiliones) despite high levels of gene tree conflict and unequal minority resolution frequencies.

    Science.gov (United States)

    Richart, Casey H; Hayashi, Cheryl Y; Hedin, Marshal

    2016-02-01

    Phylogenetic resolution of ancient rapid radiations has remained problematic despite major advances in statistical approaches and DNA sequencing technologies. Here we report on a combined phylogenetic approach utilizing transcriptome data in conjunction with Sanger sequence data to investigate a tandem of ancient divergences in the harvestmen superfamily Ischyropsalidoidea (Arachnida, Opiliones, Dyspnoi). We rely on Sanger sequences to resolve nodes within and between closely related genera, and use RNA-seq data from a subset of taxa to resolve a short and ancient internal branch. We use several analytical approaches to explore this succession of ancient diversification events, including concatenated and coalescent-based analyses and maximum likelihood gene trees for each locus. We evaluate the robustness of phylogenetic inferences using a randomized locus sub-sampling approach, and find congruence across these methods despite considerable incongruence across gene trees. Incongruent gene trees are not recovered in frequencies expected from a simple multispecies coalescent model, and we reject incomplete lineage sorting as the sole contributor to gene tree conflict. Using these approaches we attain robust support for higher-level phylogenetic relationships within Ischyropsalidoidea.

  3. Evolution of red algal plastid genomes: ancient architectures, introns, horizontal gene transfer, and taxonomic utility of plastid markers.

    Directory of Open Access Journals (Sweden)

    Jan Janouškovec

    Full Text Available Red algae have the most gene-rich plastid genomes known, but despite their evolutionary importance these genomes remain poorly sampled. Here we characterize three complete and one partial plastid genome from a diverse range of florideophytes. By unifying annotations across all available red algal plastid genomes we show they all share a highly compact and slowly-evolving architecture and uniquely rich gene complements. Both chromosome structure and gene content have changed very little during red algal diversification, and suggest that plastid-to-nucleus gene transfers have been rare. Despite their ancient character, however, the red algal plastids also contain several unprecedented features, including a group II intron in a tRNA-Met gene that encodes the first example of red algal plastid intron maturase - a feature uniquely shared among florideophytes. We also identify a rare case of a horizontally-acquired proteobacterial operon, and propose this operon may have been recruited for plastid function and potentially replaced a nucleus-encoded plastid-targeted paralogue. Plastid genome phylogenies yield a fully resolved tree and suggest that plastid DNA is a useful tool for resolving red algal relationships. Lastly, we estimate the evolutionary rates among more than 200 plastid genes, and assess their usefulness for species and subspecies taxonomy by comparison to well-established barcoding markers such as cox1 and rbcL. Overall, these data demonstrate that red algal plastid genomes are easily obtainable using high-throughput sequencing of total genomic DNA, interesting from evolutionary perspectives, and promising in resolving red algal relationships at evolutionarily-deep and species/subspecies levels.

  4. Synthetic gene networks in plant systems.

    Science.gov (United States)

    Junker, Astrid; Junker, Björn H

    2012-01-01

    Synthetic biology methods are routinely applied in the plant field as in other eukaryotic model systems. Several synthetic components have been developed in plants and an increasing number of studies report on the assembly into functional synthetic genetic circuits. This chapter gives an overview of the existing plant genetic networks and describes in detail the application of two systems for inducible gene expression. The ethanol-inducible system relies on the ethanol-responsive interaction of the AlcA transcriptional activator and the AlcR receptor resulting in the transcription of the gene of interest (GOI). In comparison, the translational fusion of GOI and the glucocorticoid receptor (GR) domain leads to the dexamethasone-dependent nuclear translocation of the GOI::GR protein. This chapter contains detailed protocols for the application of both systems in the model plants potato and Arabidopsis, respectively.

  5. Cell cycle-dependent gene networks relevant to cancer

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    The analysis of sophisticated interplays between cell cycle-dependent genes in a disease condition is one of the largely unexplored areas in modern tumor biology research. Many cell cycle-dependent genes are either oncogenes or suppressor genes, or are closely associated with the transition of a cell cycle. However, it is unclear how complicated the relationships between these cell cycle-dependent genes are, especially in cancers. Here, we sought to identify significant expression relationships between cell cycle-dependent genes by analyzing a HeLa microarray dataset using a local alignment algorithm and constructed a gene transcriptional network specific to the cancer by assembling these newly identified gene-gene relationships. We further characterized this global network by partitioning the whole network into several cell cycle phase-specific sub-networks. All generated networks exhibited the power-law node-degree distribution, and the average clustering coefficients of these networks were remarkably higher than those of pure scale-free networks, indicating a property of hierarchical modularity. Based on the known protein-protein interactions and Gene Ontology annotation data, the proteins encoded by cell cycle-dependent interacting genes tended to share the same biological functions or to be involved in the same biological processes, rather than interacting by physical means. Finally, we identified the hub genes related to cancer based on the topological importance that maintain the basic structure of cell cycle-dependent gene networks.
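
The two topological summaries this abstract relies on, the node-degree distribution and the average clustering coefficient, are simple to compute for any undirected gene network. The sketch below is a generic illustration, not the authors' pipeline: the helper names and the five-gene toy edge list (genes "A" to "E") are hypothetical, and no power-law fit is attempted.

```python
from collections import Counter
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from (gene, gene) pairs; self-loops are ignored."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


def degree_distribution(adj):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(len(nbrs) for nbrs in adj.values())


# Hypothetical toy network: one triangle (A, B, C) plus a tail A - D - E.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("D", "E")]
adj = build_adjacency(edges)
print(dict(degree_distribution(adj)))
print(round(average_clustering(adj), 4))
```

Comparing the average clustering of an inferred network against that of a degree-matched random or scale-free reference is the kind of contrast the abstract describes; the reference model itself is outside this sketch.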

  6. Ancient signals: comparative genomics of plant MAPK and MAPKK gene families

    DEFF Research Database (Denmark)

    Hamel, Louis-Philippe; Nicole, Marie-Claude; Sritubtim, Somrudee;

    2006-01-01

    MAPK signal transduction modules play crucial roles in regulating many biological processes in plants, and their components are encoded by highly conserved genes. The recent availability of genome sequences for rice and poplar now makes it possible to examine how well the previously described...... Arabidopsis MAPK and MAPKK gene family structures represent the broader evolutionary situation in plants, and analysis of gene expression data for MPK and MKK genes in all three species allows further refinement of those families, based on functionality. The Arabidopsis MAPK nomenclature appears sufficiently...... robust to allow it to be usefully extended to other well-characterized plant systems....

  7. Gene Networks Underlying Chronic Sleep Deprivation in Drosophila

    Science.gov (United States)

    2014-06-15

    Studies of the gene network affected by sleep deprivation and stress in the fruit fly Drosophila have revealed the... Dates given in the record: 15-Apr-2009, 14-Apr-2013. Approved for Public Release; Distribution Unlimited.

  8. A molecular phylogeny of bivalve mollusks: ancient radiations and divergences as revealed by mitochondrial genes.

    Directory of Open Access Journals (Sweden)

    Federico Plazzi

    Full Text Available BACKGROUND: Bivalves are very ancient and successful conchiferan mollusks (both in terms of species number and geographical distribution). Despite their importance in marine biota, their deep phylogenetic relationships were scarcely investigated from a molecular perspective, whereas much valuable work has been done on taxonomy, as well as phylogeny, of lower taxa. METHODOLOGY/PRINCIPAL FINDINGS: Here we present a class-level bivalve phylogeny with a broad sample of 122 ingroup taxa, using four mitochondrial markers (MT-RNR1, MT-RNR2, MT-CO1, MT-CYB). Rigorous techniques have been exploited to set up the dataset, analyze phylogenetic signal, and infer a single final tree. In this study, we show the basal position of Opponobranchia to all Autobranchia, as well as of Palaeoheterodonta to the remaining Autobranchia, which we here propose to call Amarsipobranchia. Anomalodesmata were retrieved as monophyletic and basal to (Heterodonta + Pteriomorphia). CONCLUSIONS/SIGNIFICANCE: Bivalve morphological characters were traced onto the phylogenetic trees obtained from the molecular analysis; our analysis suggests that eulamellibranch gills and heterodont hinge are ancestral characters for all Autobranchia. This conclusion would entail a re-evaluation of bivalve symplesiomorphies.

  9. Mosaic genome of endobacteria in arbuscular mycorrhizal fungi: Transkingdom gene transfer in an ancient mycoplasma-fungus association.

    Science.gov (United States)

    Torres-Cortés, Gloria; Ghignone, Stefano; Bonfante, Paola; Schüßler, Arthur

    2015-06-23

    For more than 450 million years, arbuscular mycorrhizal fungi (AMF) have formed intimate, mutualistic symbioses with the vast majority of land plants and are major drivers in almost all terrestrial ecosystems. The obligate plant-symbiotic AMF host additional symbionts, so-called Mollicutes-related endobacteria (MRE). To uncover putative functional roles of these widespread but yet enigmatic MRE, we sequenced the genome of DhMRE living in the AMF Dentiscutata heterogama. Multilocus phylogenetic analyses showed that MRE form a previously unidentified lineage sister to the hominis group of Mycoplasma species. DhMRE possesses a strongly reduced metabolic capacity with 55% of the proteins having unknown function, which reflects unique adaptations to an intracellular lifestyle. We found evidence for transkingdom gene transfer between MRE and their AMF host. At least 27 annotated DhMRE proteins show similarities to nuclear-encoded proteins of the AMF Rhizophagus irregularis, which itself lacks MRE. Nuclear-encoded homologs could moreover be identified for another AMF, Gigaspora margarita, and surprisingly, also the non-AMF Mortierella verticillata. Our data indicate a possible origin of the MRE-fungus association in ancestors of the Glomeromycota and Mucoromycotina. The DhMRE genome encodes an arsenal of putative regulatory proteins with eukaryotic-like domains, some of them encoded in putative genomic islands. MRE are highly interesting candidates to study the evolution and interactions between an ancient, obligate endosymbiotic prokaryote with its obligate plant-symbiotic fungal host. Our data moreover may be used for further targeted searches for ancient effector-like proteins that may be key components in the regulation of the arbuscular mycorrhiza symbiosis.

  10. Cancer classification based on gene expression using neural networks.

    Science.gov (United States)

    Hu, H P; Niu, Z J; Bai, Y P; Tan, X H

    2015-12-21

    Based on gene expression, we have classified 53 colon cancer patients with UICC II into two groups: relapse and no relapse. Samples were taken from each patient, and gene information was extracted. Of the 53 samples examined, 500 genes were considered proper through analyses by S-Kohonen, BP, and SVM neural networks. Classification accuracy obtained by S-Kohonen neural network reaches 91%, which was more accurate than classification by BP and SVM neural networks. The results show that S-Kohonen neural network is more plausible for classification and has a certain feasibility and validity as compared with BP and SVM neural networks.

  11. Overview of methods of reverse engineering of gene regulatory networks: Boolean and Bayesian networks

    OpenAIRE

    Frolova A. O.

    2012-01-01

    Reverse engineering of gene regulatory networks is an intensively studied topic in Systems Biology as it reconstructs regulatory interactions between all genes in the genome in the most complete form. The extreme computational complexity of this problem and lack of thorough reviews on reconstruction methods of gene regulatory network is a significant obstacle to further development of this area. In this article the two most common methods for modeling gene regulatory networks are surveyed: Bo...
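
    As a rough illustration of the Boolean network model class named in this abstract, the sketch below simulates a toy three-gene network with synchronous updates. The gene names, the rules, and the functions step and trajectory are hypothetical and chosen only for illustration; they are not taken from the article, which concerns inferring such rules from data rather than simulating them.

```python
# Hypothetical three-gene Boolean network (illustrative rules, not from the article).
RULES = {
    "A": lambda s: not s["C"],         # C represses A
    "B": lambda s: s["A"],             # A activates B
    "C": lambda s: s["A"] and s["B"],  # A and B jointly activate C
}


def step(state):
    """Synchronously update every gene from the current state."""
    return {gene: bool(rule(state)) for gene, rule in RULES.items()}


def trajectory(state, n_steps):
    """Return the list of states visited, starting from `state`."""
    states = [state]
    for _ in range(n_steps):
        state = step(state)
        states.append(state)
    return states


if __name__ == "__main__":
    start = {"A": True, "B": False, "C": False}
    for t, s in enumerate(trajectory(start, 5)):
        print(t, s)
```

    Reverse engineering would run in the opposite direction: given observed state sequences, search for rule sets whose simulated trajectories reproduce them.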

  12. The tricarboxylic acid cycle, an ancient metabolic network with a novel twist.

    Directory of Open Access Journals (Sweden)

    Ryan J Mailloux

    Full Text Available The tricarboxylic acid (TCA) cycle is an essential metabolic network in all oxidative organisms and provides precursors for anabolic processes and reducing factors (NADH and FADH2) that drive the generation of energy. Here, we show that this metabolic network is also an integral part of the oxidative defence machinery in living organisms and alpha-ketoglutarate (KG) is a key participant in the detoxification of reactive oxygen species (ROS). Its utilization as an anti-oxidant can effectively diminish ROS and curtail the formation of NADH, a situation that further impedes the release of ROS via oxidative phosphorylation. Thus, the increased production of KG mediated by NADP-dependent isocitrate dehydrogenase (NADP-ICDH) and its decreased utilization via the TCA cycle confer a unique strategy to modulate the cellular redox environment. Activities of alpha-ketoglutarate dehydrogenase (KGDH), NAD-dependent isocitrate dehydrogenase (NAD-ICDH), and succinate dehydrogenase (SDH) were sharply diminished in the cellular systems exposed to conditions conducive to oxidative stress. These findings uncover an intricate link between TCA cycle and ROS homeostasis and may help explain the ineffective TCA cycle that characterizes various pathological conditions and ageing.

  13. Immune- and wound-dependent differential gene expression in an ancient insect.

    Science.gov (United States)

    Johnston, Paul R; Rolff, Jens

    2013-01-01

    Two of the main functions of the immune system are to control infections and to contribute to wound closure. Here we present the results of an RNAseq study of immune- and wound-response gene expression in the damselfly Coenagrion puella, a representative of the odonates, the oldest taxon of winged insects. De novo assembly of RNAseq data revealed a rich repertoire of canonical immune pathways, as known from model insects, including recognition, transduction and effector gene expression. A shared set of immune and wound repair genes were differentially expressed in both wounded and immune-challenged larvae. Moreover 3-fold more immune genes were induced only in the immune-challenged treatment. This is consistent with the notion that the immune-system reads a balance of signals related to wounding and infection and that the response is tailored accordingly.

  14. Phylogenomic analysis demonstrates a pattern of rare and ancient horizontal gene transfer between plants and fungi.

    Science.gov (United States)

    Richards, Thomas A; Soanes, Darren M; Foster, Peter G; Leonard, Guy; Thornton, Christopher R; Talbot, Nicholas J

    2009-07-01

    Horizontal gene transfer (HGT) describes the transmission of genetic material across species boundaries and is an important evolutionary phenomenon in the ancestry of many microbes. The role of HGT in plant evolutionary history is, however, largely unexplored. Here, we compare the genomes of six plant species with those of 159 prokaryotic and eukaryotic species and identify 1689 genes that show the highest similarity to corresponding genes from fungi. We constructed a phylogeny for all 1689 genes identified and all homolog groups available from the rice (Oryza sativa) genome (3177 gene families) and used these to define 14 candidate plant-fungi HGT events. Comprehensive phylogenetic analyses of these 14 data sets, using methods that account for site rate heterogeneity, demonstrated support for nine HGT events, demonstrating an infrequent pattern of HGT between plants and fungi. Five HGTs were fungi-to-plant transfers and four were plant-to-fungi HGTs. None of the fungal-to-plant HGTs involved angiosperm recipients. These results alter the current view of organismal barriers to HGT, suggesting that phagotrophy, the consumption of a whole cell by another, is not necessarily a prerequisite for HGT between eukaryotes. Putative functional annotation of the HGT candidate genes suggests that two fungi-to-plant transfers have added phenotypes important for life in a soil environment. Our study suggests that genetic exchange between plants and fungi is exceedingly rare, particularly among the angiosperms, but has occurred during their evolutionary history and added important metabolic traits to plant lineages.

  15. Research on the tourism resource development from the perspective of network capability-Taking Wuxi Huishan Ancient Town as an example

    Science.gov (United States)

    Bao, Yanli; Hua, Hefeng

    2017-03-01

    Network capability is the enterprise's capability to set up, manage, maintain and use a variety of relations between enterprises, and to obtain resources for improving competitiveness. Tourism in China is in a transformation period from sightseeing to leisure and vacation. Scenic spots as well as tourist enterprises can learn from other enterprises in the process of resource development, and build up their own network relations in order to get resources for their survival and development. Through the effective management of network relations, the performance of resource development will be improved. By analyzing literature on network capability and the case of Wuxi Huishan Ancient Town, the role of network capability in tourism resource development is explored and a resource development path is built from the perspective of network capability. Finally, a tourism resource development process model based on network capability is proposed. This model mainly includes setting up network vision, resource identification, resource acquisition, resource utilization and tourism project development. In these steps, network construction, network management and improving network center status are key points.

  16. In silico network topology-based prediction of gene essentiality

    CERN Document Server

    da Silva, Joao Paulo Muller; Mombach, Jose Carlos Merino; Vieira, Renata; da Silva, Jose Guliherme Camargo; Lemke, Ney; Sinigaglia, Marialva

    2007-01-01

    The identification of genes essential for survival is important for the understanding of the minimal requirements for cellular life and for drug design. As experimental studies with the purpose of building a catalog of essential genes for a given organism are time-consuming and laborious, a computational approach which could predict gene essentiality with high accuracy would be of great value. We present here a novel computational approach, called NTPGE (Network Topology-based Prediction of Gene Essentiality), that relies on network topology features of a gene to estimate its essentiality. The first step of NTPGE is to construct the integrated molecular network for a given organism comprising protein physical, metabolic and transcriptional regulation interactions. The second step consists in training a decision tree-based machine learning algorithm on known essential and non-essential genes of the organism of interest, considering as learning attributes the network topology information for each of these genes...
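
    The abstract describes using network topology features of each gene as learning attributes. The sketch below is a minimal, hedged illustration of that first feature-extraction idea only: it computes degree and local clustering coefficient from an undirected edge list. The function name topology_features, the choice of these two features, and the toy gene names are assumptions for illustration; the abstract does not specify which topology features NTPGE uses, and the decision tree training step is not shown.

```python
from collections import defaultdict


def topology_features(edges):
    """Return {gene: (degree, clustering_coefficient)} for an undirected edge list."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)

    features = {}
    for gene, nbrs in neighbors.items():
        k = len(nbrs)
        # Count edges that connect two neighbours of this gene (each pair once).
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in neighbors[a])
        possible = k * (k - 1) / 2
        features[gene] = (k, links / possible if possible else 0.0)
    return features


if __name__ == "__main__":
    # Hypothetical interaction network with four genes.
    toy_edges = [("g1", "g2"), ("g2", "g3"), ("g1", "g3"), ("g3", "g4")]
    for gene, feats in sorted(topology_features(toy_edges).items()):
        print(gene, feats)
```

    Feature vectors of this kind, paired with known essential and non-essential labels, are what a decision tree learner would be trained on.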

  17. Inference of gene pathways using mixture Bayesian networks

    Directory of Open Access Journals (Sweden)

    Ko Younhee

    2009-05-01

    Full Text Available Abstract. Background: Inference of gene networks typically relies on measurements across a wide range of conditions or treatments. Although one network structure is predicted, the relationship between genes could vary across conditions. A comprehensive approach to infer general and condition-dependent gene networks was evaluated. This approach integrated Bayesian network and Gaussian mixture models to describe continuous microarray gene expression measurements, and three gene networks were predicted. Results: The first reconstructions of a circadian rhythm pathway in honey bees and an adherens junction pathway in mouse embryos were obtained. In addition, general and condition-specific gene relationships, some unexpected, were detected in these two pathways and in a yeast cell-cycle pathway. The mixture Bayesian network approach identified all (honey bee circadian rhythm and mouse adherens junction pathways) or the vast majority (yeast cell-cycle pathway) of the gene relationships reported in empirical studies. Findings across the three pathways and data sets indicate that the mixture Bayesian network approach is well-suited to infer gene pathways based on microarray data. Furthermore, the interpretation of model estimates provided a broader understanding of the relationships between genes. The mixture models offered a comprehensive description of the relationships among genes in complex biological processes or across a wide range of conditions. The mixture parameter estimates and corresponding odds that the gene network inferred for a sample pertained to each mixture component allowed the uncovering of both general and condition-dependent gene relationships and patterns of expression. Conclusion: This study demonstrated the two main benefits of learning gene pathways using mixture Bayesian networks. First, the identification of the optimal number of mixture components supported by the data offered a robust approach to infer gene relationships and

  18. Ancient origin and gene mosaicism of the progenitor of Mycobacterium tuberculosis.

    Directory of Open Access Journals (Sweden)

    M Cristina Gutierrez

    2005-09-01

    Full Text Available The highly successful human pathogen Mycobacterium tuberculosis has an extremely low level of genetic variation, which suggests that the entire population resulted from clonal expansion following an evolutionary bottleneck around 35,000 y ago. Here, we show that this population constitutes just the visible tip of a much broader progenitor species, whose extant representatives are human isolates of tubercle bacilli from East Africa. In these isolates, we detected incongruence among gene phylogenies as well as mosaic gene sequences, whose individual elements are retrieved in classical M. tuberculosis. Therefore, despite its apparent homogeneity, the M. tuberculosis genome appears to be a composite assembly resulting from horizontal gene transfer events predating clonal expansion. The amount of synonymous nucleotide variation in housekeeping genes suggests that tubercle bacilli were contemporaneous with early hominids in East Africa, and have thus been coevolving with their human host much longer than previously thought. These results open novel perspectives for unraveling the molecular bases of M. tuberculosis evolutionary success.

  19. γ-Crystallins of the chicken lens: remnants of an ancient vertebrate gene family in birds.

    Science.gov (United States)

    Chen, Yingwei; Sagar, Vatsala; Len, Hoay-Shuen; Peterson, Katherine; Fan, Jianguo; Mishra, Sanghamitra; McMurtry, John; Wilmarth, Phillip A; David, Larry L; Wistow, Graeme

    2016-04-01

    γ-Crystallins, abundant proteins of vertebrate lenses, were thought to be absent from birds. However, bird genomes contain well-conserved genes for γS- and γN-crystallins. Although expressed sequence tag analysis of chicken eye found no transcripts for these genes, RT-PCR detected spliced transcripts for both genes in chicken lens, with lower levels in cornea and retina/retinal pigment epithelium. The level of mRNA for γS in chicken lens was relatively very low even though the chicken crygs gene promoter had lens-preferred activity similar to that of mouse. Chicken γS was detected by a peptide antibody in lens, but not in other ocular tissues. Low levels of γS and γN proteins were detected in chicken lens by shotgun mass spectroscopy. Water-soluble and water-insoluble lens fractions were analyzed and 1934 proteins (chicken lens proteome 30-fold. Although chicken γS is well conserved in protein sequence, it has one notable difference in leucine 16, replacing a surface glutamine conserved in other γ-crystallins, possibly affecting solubility. However, L16 and engineered Q16 versions were both highly soluble and had indistinguishable circular dichroism, tryptophan fluorescence and heat stability (melting temperature Tm ~ 65 °C) profiles. L16 has been present in birds for over 100 million years and may have been adopted for a specific protein interaction in the bird lens. However, evolution has clearly reduced or eliminated expression of ancestral γ-crystallins in bird lenses. The conservation of genes for γS- and γN-crystallins suggests they may have been preserved for reasons unrelated to the bulk properties of the lens.

  20. Oxytocin Pathway Genes: Evolutionary Ancient System Impacting on Human Affiliation, Sociality, and Psychopathology.

    Science.gov (United States)

    Feldman, Ruth; Monakhov, Mikhail; Pratt, Maayan; Ebstein, Richard P

    2016-02-01

    Oxytocin (OT), a nonapeptide signaling molecule originating from an ancestral peptide, appears in different variants across all vertebrate and several invertebrate species. Throughout animal evolution, neuropeptidergic signaling has been adapted by organisms for regulating response to rapidly changing environments. The family of OT-like molecules affects both peripheral tissues implicated in reproduction, homeostasis, and energy balance, as well as neuromodulation of social behavior, stress regulation, and associative learning in species ranging from nematodes to humans. After describing the OT-signaling pathway, we review research on the three genes most extensively studied in humans: the OT receptor (OXTR), the structural gene for OT (OXT/neurophysin-I), and CD38. Consistent with the notion that sociality should be studied from the perspective of social life at the species level, we address human social functions in relation to OT-pathway genes, including parenting, empathy, and using social relationships to manage stress. We then describe associations of OT-pathway genes with psychopathologies involving social dysfunctions such as autism, depression, or schizophrenia. Human research particularly underscored the involvement of two OXTR single nucleotide polymorphisms (rs53576, rs2254298) with fewer studies focusing on other OXTR (rs7632287, rs1042778, rs2268494, rs2268490), OXT (rs2740210, rs4813627, rs4813625), and CD38 (rs3796863, rs6449197) single nucleotide polymorphisms. Overall, studies provide evidence for the involvement of OT-pathway genes in human social functions but also suggest that factors such as gender, culture, and early environment often confound attempts to replicate first findings. We conclude by discussing epigenetics, conceptual implications within an evolutionary perspective, and future directions, especially the need to refine phenotypes, carefully characterize early environments, and integrate observations of social behavior across

  1. Ancient Egypt.

    Science.gov (United States)

    Evers, Virginia

    This four-week fourth grade social studies unit dealing with religious dimensions in ancient Egyptian culture was developed by the Public Education Religion Studies Center at Wright State University. It seeks to help students understand ancient Egypt by looking at the people, the culture, and the people's world view. The unit begins with outlines…

  2. An ancient dental gene set governs development and continuous regeneration of teeth in sharks.

    Science.gov (United States)

    Rasch, Liam J; Martin, Kyle J; Cooper, Rory L; Metscher, Brian D; Underwood, Charlie J; Fraser, Gareth J

    2016-07-15

    The evolution of oral teeth is considered a major contributor to the overall success of jawed vertebrates. This is especially apparent in cartilaginous fishes including sharks and rays, which develop elaborate arrays of highly specialized teeth, organized in rows and retain the capacity for life-long regeneration. Perpetual regeneration of oral teeth has been either lost or highly reduced in many other lineages including important developmental model species, so cartilaginous fishes are uniquely suited for deep comparative analyses of tooth development and regeneration. Additionally, sharks and rays can offer crucial insights into the characters of the dentition in the ancestor of all jawed vertebrates. Despite this, tooth development and regeneration in chondrichthyans is poorly understood and remains virtually uncharacterized from a developmental genetic standpoint. Using the emerging chondrichthyan model, the catshark (Scyliorhinus spp.), we characterized the expression of genes homologous to those known to be expressed during stages of early dental competence, tooth initiation, morphogenesis, and regeneration in bony vertebrates. We have found that expression patterns of several genes from Hh, Wnt/β-catenin, Bmp and Fgf signalling pathways indicate deep conservation over ~450 million years of tooth development and regeneration. We describe how these genes participate in the initial emergence of the shark dentition and how they are redeployed during regeneration of successive tooth generations. We suggest that at the dawn of the vertebrate lineage, teeth (i) were most likely continuously regenerative structures, and (ii) utilised a core set of genes from members of key developmental signalling pathways that were instrumental in creating a dental legacy redeployed throughout vertebrate evolution. These data lay the foundation for further experimental investigations utilizing the unique regenerative capacity of chondrichthyan models to answer evolutionary

  3. Exhaustive Search for Fuzzy Gene Networks from Microarray Data

    Energy Technology Data Exchange (ETDEWEB)

    Sokhansanj, B A; Fitch, J P; Quong, J N; Quong, A A

    2003-07-07

    Recent technological advances in high-throughput data collection allow for the study of increasingly complex systems on the scale of the whole cellular genome and proteome. Gene network models are required to interpret large and complex data sets. Rationally designed system perturbations (e.g. gene knock-outs, metabolite removal, etc) can be used to iteratively refine hypothetical models, leading to a modeling-experiment cycle for high-throughput biological system analysis. We use fuzzy logic gene network models because they have greater resolution than Boolean logic models and do not require the precise parameter measurement needed for chemical kinetics-based modeling. The fuzzy gene network approach is tested by exhaustive search for network models describing cyclin gene interactions in yeast cell cycle microarray data, with preliminary success in recovering interactions predicted by previous biological knowledge and other analysis techniques. Our goal is to further develop this method in combination with experiments we are performing on bacterial regulatory networks.
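
    To make the idea of exhaustively searching fuzzy gene network models more concrete, the sketch below is a deliberately simplified, hypothetical version: each target gene is explained by a single regulator, expression is assumed to be normalised to [0, 1], three triangular fuzzy sets (low, medium, high) are used, and every one of the 27 possible rule tables is scored by squared error. The names fuzzify, predict and best_models, and all of these modelling choices, are assumptions for illustration and are not the authors' implementation.

```python
from itertools import permutations, product

LEVELS = (0.0, 0.5, 1.0)  # centres of the fuzzy sets low, medium, high


def fuzzify(x):
    """Triangular memberships of x (clipped to [0, 1]) in (low, medium, high)."""
    x = min(max(x, 0.0), 1.0)
    return (max(0.0, 1 - 2 * x), 1 - abs(2 * x - 1), max(0.0, 2 * x - 1))


def predict(x, rule):
    """Apply a rule table (input set index -> output set index) and defuzzify.

    The memberships sum to 1, so the weighted sum of output centres
    is already a weighted average.
    """
    return sum(m * LEVELS[rule[i]] for i, m in enumerate(fuzzify(x)))


def best_models(data):
    """Exhaustively score every (regulator, target, rule) combination.

    data: {gene: [normalised expression per sample]}
    Returns {target: (squared_error, regulator, rule)} with the lowest error.
    """
    best = {}
    for reg, tgt in permutations(data, 2):
        for rule in product(range(3), repeat=3):
            err = sum((predict(x, rule) - y) ** 2
                      for x, y in zip(data[reg], data[tgt]))
            if tgt not in best or err < best[tgt][0]:
                best[tgt] = (err, reg, rule)
    return best


if __name__ == "__main__":
    # Hypothetical data in which gene "b" mirrors gene "a" (repression-like pattern).
    toy = {"a": [0.0, 0.5, 1.0], "b": [1.0, 0.5, 0.0]}
    print(best_models(toy))
```

    The search space grows quickly with the number of regulators per target and the number of fuzzy sets, which is why an exhaustive search of this kind is only practical for small gene sets.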

  4. An ancient spliceosomal intron in the ribosomal protein L7a gene (Rpl7a) of Giardia lamblia

    Directory of Open Access Journals (Sweden)

    Gray Michael W

    2005-08-01

    Full Text Available Abstract. Background: Only one spliceosomal-type intron has previously been identified in the unicellular eukaryotic parasite, Giardia lamblia (a diplomonad). This intron is only 35 nucleotides in length and is unusual in possessing a non-canonical 5' intron boundary sequence, CT, instead of GT. Results: We have identified a second spliceosomal-type intron in G. lamblia, in the ribosomal protein L7a gene (Rpl7a), that possesses a canonical GT 5' intron boundary sequence. A comparison of the two known Giardia intron sequences revealed extensive nucleotide identity at both the 5' and 3' intron boundaries, similar to the conserved sequence motifs recently identified at the boundaries of spliceosomal-type introns in Trichomonas vaginalis (a parabasalid). Based on these observations, we searched the partial G. lamblia genome sequence for these conserved features and identified a third spliceosomal intron, in an unassigned open reading frame. Our comprehensive analysis of the Rpl7a intron in other eukaryotic taxa demonstrates that it is evolutionarily conserved and is an ancient eukaryotic intron. Conclusion: An analysis of the phylogenetic distribution and properties of the Rpl7a intron suggests its utility as a phylogenetic marker to evaluate particular eukaryotic groupings. Additionally, analysis of the G. lamblia introns has provided further insight into some of the conserved and unique features possessed by the recently identified spliceosomal introns in related organisms such as T. vaginalis and Carpediemonas membranifera.

  5. Novel primate miRNAs coevolved with ancient target genes in germinal zone-specific expression patterns.

    Science.gov (United States)

    Arcila, Mary L; Betizeau, Marion; Cambronne, Xiaolu A; Guzman, Elmer; Doerflinger, Nathalie; Bouhallier, Frantz; Zhou, Hongjun; Wu, Bian; Rani, Neha; Bassett, Danielle S; Borello, Ugo; Huissoud, Cyril; Goodman, Richard H; Dehay, Colette; Kosik, Kenneth S

    2014-03-19

    Major nonprimate-primate differences in cortico-genesis include the dimensions, precursor lineages, and developmental timing of the germinal zones (GZs). microRNAs (miRNAs) of laser-dissected GZ compartments and cortical plate (CP) from embryonic E80 macaque visual cortex were deep sequenced. The CP and the GZ including ventricular zone (VZ) and outer and inner subcompartments of the outer subventricular zone (OSVZ) in area 17 displayed unique miRNA profiles. miRNAs present in primate, but absent in rodent, contributed disproportionately to the differential expression between GZ subregions. Prominent among the validated targets of these miRNAs were cell-cycle and neurogenesis regulators. Coevolution between the emergent miRNAs and their targets suggested that novel miRNAs became integrated into ancient gene circuitry to exert additional control over proliferation. We conclude that multiple cell-cycle regulatory events contribute to the emergence of primate-specific cortical features, including the OSVZ, generated enlarged supragranular layers, largely responsible for the increased primate cortex computational abilities.

  6. Pathogenic Network Analysis Predicts Candidate Genes for Cervical Cancer

    Directory of Open Access Journals (Sweden)

    Yun-Xia Zhang

    2016-01-01

    Full Text Available Purpose. The objective of our study was to predict candidate genes in cervical cancer (CC) using a network-based strategy and to understand the pathogenic process of CC. Methods. A pathogenic network of CC was extracted based on known pathogenic genes (seed genes) and differentially expressed genes (DEGs) between CC and normal controls. Subsequently, cluster analysis was performed to identify the subnetworks in the pathogenic network using ClusterONE. Each gene in the pathogenic network was assigned a weight value, and then candidate genes were obtained based on the weight distribution. Eventually, pathway enrichment analysis for candidate genes was performed. Results. In this work, a total of 330 DEGs were identified between CC and normal controls. From the pathogenic network, 2 intensely connected clusters were extracted, and a total of 52 candidate genes with weight values greater than 0.10 were detected. Among these candidate genes, VIM had the highest weight value. Moreover, candidate genes MMP1, CDC45, and CAT were, respectively, enriched in pathways in cancer, cell cycle, and methane metabolism. Conclusion. Candidate pathogenic genes including MMP1, CDC45, CAT, and VIM might be involved in the pathogenesis of CC. We believe that our results can provide theoretical guidelines for future clinical application.

  7. Ancient exaptation of a CORE-SINE retroposon into a highly conserved mammalian neuronal enhancer of the proopiomelanocortin gene.

    Directory of Open Access Journals (Sweden)

    Andrea M Santangelo

    2007-10-01

    Full Text Available The proopiomelanocortin gene (POMC) is expressed in the pituitary gland and the ventral hypothalamus of all jawed vertebrates, producing several bioactive peptides that function as peripheral hormones or central neuropeptides, respectively. We have recently determined that mouse and human POMC expression in the hypothalamus is conferred by the action of two 5' distal and unrelated enhancers, nPE1 and nPE2. To investigate the evolutionary origin of the neuronal enhancer nPE2, we searched available vertebrate genome databases and determined that nPE2 is a highly conserved element in placentals, marsupials, and monotremes, whereas it is absent in nonmammalian vertebrates. Following an in silico paleogenomic strategy based on genome-wide searches for paralog sequences, we discovered that opossum and wallaby nPE2 sequences are highly similar to members of the superfamily of CORE-short interspersed nucleotide element (SINE) retroposons, in particular to MAR1 retroposons that are widely present in marsupial genomes. Thus, the neuronal enhancer nPE2 originated from the exaptation of a CORE-SINE retroposon in the lineage leading to mammals and remained under purifying selection in all mammalian orders for the last 170 million years. Expression studies performed in transgenic mice showed that two nonadjacent nPE2 subregions are essential to drive reporter gene expression into POMC hypothalamic neurons, providing the first functional example of an exapted enhancer derived from an ancient CORE-SINE retroposon. In addition, we found that this CORE-SINE family of retroposons is likely to still be active in American and Australian marsupial genomes and that several highly conserved exonic, intronic and intergenic sequences in the human genome originated from the exaptation of CORE-SINE retroposons. Together, our results provide clear evidence of the functional novelties that transposed elements contributed to their host genomes throughout evolution.

  8. Noise reduction facilitated by dosage compensation in gene networks

    Science.gov (United States)

    Peng, Weilin; Song, Ruijie; Acar, Murat

    2016-01-01

    Genetic noise together with genome duplication and volume changes during cell cycle are significant contributors to cell-to-cell heterogeneity. How can cells buffer the effects of these unavoidable epigenetic and genetic variations on phenotypes that are sensitive to such variations? Here we show that a simple network motif that is essential for network-dosage compensation can reduce the effects of extrinsic noise on the network output. Using natural and synthetic gene networks with and without the network motif, we measure gene network activity in single yeast cells and find that the activity of the compensated network is significantly lower in noise compared with the non-compensated network. A mathematical analysis provides intuitive insights into these results and a novel stochastic model tracking cell-volume and cell-cycle predicts the experimental results. Our work implies that noise is a selectable trait tunable by evolution. PMID:27694830

  9. Evolutionary responses to a constructed niche: ancient Mesoamericans as a model of gene-culture coevolution.

    Directory of Open Access Journals (Sweden)

    Tábita Hünemeier

    Full Text Available Culture and genetics rely on two distinct but not isolated transmission systems. Cultural processes may change the human selective environment and thereby affect which individuals survive and reproduce. Here, we evaluated whether the modes of subsistence in Native American populations and the frequencies of the ABCA1*Arg230Cys polymorphism were correlated. Further, we examined whether the evolutionary consequences of the agriculturally constructed niche in Mesoamerica could be considered as a gene-culture coevolution model. For this purpose, we genotyped 229 individuals affiliated with 19 Native American populations and added data for 41 other Native American groups (n = 1905) to the analysis. In combination with the SNP cluster of a neutral region, this dataset was then used to unravel the scenario involved in 230Cys evolutionary history. The estimated age of 230Cys is compatible with its origin occurring in the American continent. The correlation of its frequencies with the archeological data on Zea pollen in Mesoamerica/Central America, the neutral coalescent simulations, and the FST-based natural selection analysis suggest that maize domestication was the driving force in the increase in the frequencies of 230Cys in this region. These results may represent the first example of a gene-culture coevolution involving an autochthonous American allele.

  10. Evolutionary Responses to a Constructed Niche: Ancient Mesoamericans as a Model of Gene-Culture Coevolution

    Science.gov (United States)

    Hünemeier, Tábita; Amorim, Carlos Eduardo Guerra; Azevedo, Soledad; Contini, Veronica; Acuña-Alonzo, Víctor; Rothhammer, Francisco; Dugoujon, Jean-Michel; Mazières, Stephane; Barrantes, Ramiro; Villarreal-Molina, María Teresa; Paixão-Côrtes, Vanessa Rodrigues; Salzano, Francisco M.; Canizales-Quinteros, Samuel; Ruiz-Linares, Andres; Bortolini, Maria Cátira

    2012-01-01

    Culture and genetics rely on two distinct but not isolated transmission systems. Cultural processes may change the human selective environment and thereby affect which individuals survive and reproduce. Here, we evaluated whether the modes of subsistence in Native American populations and the frequencies of the ABCA1*Arg230Cys polymorphism were correlated. Further, we examined whether the evolutionary consequences of the agriculturally constructed niche in Mesoamerica could be considered as a gene-culture coevolution model. For this purpose, we genotyped 229 individuals affiliated with 19 Native American populations and added data for 41 other Native American groups (n = 1905) to the analysis. In combination with the SNP cluster of a neutral region, this dataset was then used to unravel the scenario involved in 230Cys evolutionary history. The estimated age of 230Cys is compatible with its origin occurring in the American continent. The correlation of its frequencies with the archeological data on Zea pollen in Mesoamerica/Central America, the neutral coalescent simulations, and the FST-based natural selection analysis suggest that maize domestication was the driving force in the increase in the frequencies of 230Cys in this region. These results may represent the first example of a gene-culture coevolution involving an autochthonous American allele. PMID:22768049

  11. Sequencing of Sylvilagus VDJ genes reveals a new VHa allelic lineage and shows that ancient VH lineages were retained differently in leporids.

    Science.gov (United States)

    Pinheiro, Ana; Melo-Ferreira, José; Abrantes, Joana; Martinelli, Nicola; Lavazza, Antonio; Alves, Paulo C; Gortázar, Christian; Esteves, Pedro J

    2014-12-01

    Antigen recognition by immunoglobulins depends upon initial rearrangements of heavy chain V, D, and J genes. In leporids, a unique system exists for VH gene usage, with highly divergent lineages: the VHa allotypes, the Lepus sL lineage and the VHn genes. For the European rabbit (Oryctolagus cuniculus), four VHa lineages have been described: a1, a2, a3 and a4. For hares (Lepus sp.), one VHa lineage was described, the a2L, as well as a more ancient sL lineage. Both genera use the VHn genes in a low frequency of their VDJ rearrangements. To address the hypothesis that the VH specificities could be associated with different environments, we sequenced VDJ genes from a third leporid genus, Sylvilagus. We found a fifth and equally divergent VHa lineage, the a5, and an ancient lineage, the sS, related to the hares' sL, but failed to obtain VHn genes. These results show that the studied leporids employ different VH lineages in the generation of the antibody repertoire, suggesting that the leporid VH genes are subject to strong selective pressure likely imposed by specific pathogens.

  12. Using effective subnetworks to predict selected properties of gene networks.

    Directory of Open Access Journals (Sweden)

    Gemunu H Gunaratne

    Full Text Available BACKGROUND: Difficulties associated with implementing gene therapy are caused by the complexity of the underlying regulatory networks. The forms of interactions between the hundreds of genes, proteins, and metabolites in these networks are not known very accurately. An alternative approach is to limit consideration to genes on the network. Steady state measurements of these influence networks can be obtained from DNA microarray experiments. However, since they contain a large number of nodes, the computation of influence networks requires a prohibitively large set of microarray experiments. Furthermore, error estimates of the network make verifiable predictions impossible. METHODOLOGY/PRINCIPAL FINDINGS: Here, we propose an alternative approach. Rather than attempting to derive an accurate model of the network, we ask what questions can be addressed using lower dimensional, highly simplified models. More importantly, is it possible to use such robust features in applications? We first identify a small group of genes that can be used to affect changes in other nodes of the network. The reduced effective empirical subnetwork (EES) can be computed using steady state measurements on a small number of genetically perturbed systems. We show that the EES can be used to make predictions on expression profiles of other mutants, and to compute how to implement pre-specified changes in the steady state of the underlying biological process. These assertions are verified in a synthetic influence network. We also use previously published experimental data to compute the EES associated with an oxygen deprivation network of E. coli, and use it to predict gene expression levels on a double mutant. The predictions are significantly different from the experimental results for less than of genes. CONCLUSIONS/SIGNIFICANCE: The constraints imposed by gene expression levels of mutants can be used to address a selected set of questions about a gene network.

  13. Interactive Naive Bayesian network: A new approach of constructing gene-gene interaction network for cancer classification.

    Science.gov (United States)

    Tian, Xue W; Lim, Joon S

    2015-01-01

    The naive Bayesian (NB) network classifier is a simple and well-known type of classifier, which can be easily induced from a DNA microarray data set. However, the strong conditional independence assumption of the NB network can sometimes lead to weak classification performance. In this paper, we propose a new approach, the interactive naive Bayesian (INB) network, to weaken the conditional independence of the NB network and classify cancers using DNA microarray data sets. We selected the differentially expressed genes (DEGs) to reduce the dimension of the microarray data set. Then, for each DEG, we search for an interactive parent, the DEG that has the biggest influence on it. We then calculate a weight to represent the interactive relationship between a DEG and its parent. Finally, the gene-gene interaction network is constructed. We experimentally test the INB network in terms of classification accuracy using leukemia and colon DNA microarray data sets, and compare it with the NB network. The INB network achieves higher classification accuracies than the NB network, and it can show the gene-gene interactions visually.

  14. Evolvability and hierarchy in rewired bacterial gene networks

    Science.gov (United States)

    Isalan, Mark; Lemerle, Caroline; Michalodimitrakis, Konstantinos; Beltrao, Pedro; Horn, Carsten; Raineri, Emanuele; Garriga-Canut, Mireia; Serrano, Luis

    2009-01-01

    Sequencing DNA from several organisms has revealed that duplication and drift of existing genes have primarily molded the contents of a given genome. Though the effect of knocking out or over-expressing a particular gene has been studied in many organisms, no study has systematically explored the effect of adding new links in a biological network. To explore network evolvability, we constructed 598 recombinations of promoters (including regulatory regions) with different transcription or σ-factor genes in Escherichia coli, added over a wild-type genetic background. Here we show that ~95% of new networks are tolerated by the bacteria, that very few alter growth, and that expression level correlates with factor position in the wild-type network hierarchy. Most importantly, we find that certain networks consistently survive over the wild-type under various selection pressures. Therefore new links in the network are rarely a barrier for evolution and can even confer a fitness advantage. PMID:18421347

  15. Inferring slowly-changing dynamic gene-regulatory networks.

    Science.gov (United States)

    Wit, Ernst C; Abbruzzo, Antonino

    2015-01-01

    Dynamic gene-regulatory networks are complex since the interaction patterns between their components mean that it is impossible to study parts of the network in separation. This holistic character of gene-regulatory networks poses a real challenge to any type of modelling. Graphical models are a class of models that connect the network with conditional independence relationships between random variables. By interpreting these random variables as gene activities and the conditional independence relationships as functional non-relatedness, graphical models have been used to describe gene-regulatory networks. Whereas the literature has been focused on static networks, most time-course experiments are designed in order to tease out temporal changes in the underlying network. It is typically reasonable to assume that changes in genomic networks are few, because biological systems tend to be stable. We introduce a new model for estimating slow changes in dynamic gene-regulatory networks, which is suitable for high-dimensional data, e.g. time-course microarray data. Our aim is to estimate a dynamically changing genomic network based on temporal activity measurements of the genes in the network. Our method is based on the penalized likelihood with an l1-norm that penalizes conditional dependencies between genes as well as differences between conditional independence elements across time points. We also present a heuristic search strategy to find optimal tuning parameters. We re-write the penalized maximum likelihood problem into a standard convex optimization problem subject to linear equality constraints. We show that our method performs well in simulation studies. Finally, we apply the proposed model to a time-course T-cell dataset.

  16. Effects of a silenced gene in Boolean network models

    Directory of Open Access Journals (Sweden)

    Emir Haliki

    2017-03-01

    Full Text Available Gene regulation and its regulatory networks are among the most challenging research problems in computational biology and the complexity sciences. Gene regulation arises from indirect interactions between DNA segments, namely protein-coding genes, that set one another's expression levels. Preventing the expression of a gene at the level of transcription or translation constitutes a gene silencing event. The present study examined what effects gene silencing has on the dynamics of Boolean genetic regulatory mechanisms. An analytical study of gene expression variations in Boolean dynamics was performed first; the related numerical analysis was then simulated on real networks from the literature.
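    The idea of silencing a gene in a Boolean network can be illustrated with a minimal sketch. The three-gene network, its update rules, and the function names below are hypothetical and are not taken from the record above; silencing is modelled simply by clamping a gene to "off" at every synchronous update, which is one common convention rather than necessarily the authors' formulation.

```python
from itertools import product

# Hypothetical three-gene network; the update rules are illustrative only.
RULES = {
    "A": lambda s: s["C"],                 # A copies C
    "B": lambda s: s["A"] and not s["C"],  # B needs A on and C off
    "C": lambda s: not s["B"],             # C is repressed by B
}

def step(state, silenced=frozenset()):
    """One synchronous update; silenced genes are clamped to False (off)."""
    return {g: (False if g in silenced else bool(rule(state)))
            for g, rule in RULES.items()}

def attractors(silenced=frozenset()):
    """Collect every attractor (fixed point or cycle) reachable from any start state."""
    genes = sorted(RULES)
    found = set()
    for bits in product([False, True], repeat=len(genes)):
        state = dict(zip(genes, bits))
        for g in silenced:
            state[g] = False
        seen = []
        key = tuple(state[g] for g in genes)
        while key not in seen:          # iterate until a state repeats
            seen.append(key)
            state = step(state, silenced)
            key = tuple(state[g] for g in genes)
        cycle = seen[seen.index(key):]
        i = cycle.index(min(cycle))     # rotate so equal cycles compare equal
        found.add(tuple(cycle[i:] + cycle[:i]))
    return found

wild_type = attractors()
b_silenced = attractors(frozenset({"B"}))
print(len(wild_type), "attractors in the wild type,",
      len(b_silenced), "with B silenced")
```

    In this toy network the unperturbed dynamics have a fixed point and a two-state oscillation, and clamping B removes the oscillation. Exhaustive enumeration of the state space is only practical for small networks.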

  17. Human population-specific gene expression and transcriptional network modification with polymorphic transposable elements.

    Science.gov (United States)

    Wang, Lu; Rishishwar, Lavanya; Mariño-Ramírez, Leonardo; Jordan, I King

    2016-12-19

    Transposable element (TE) derived sequences are known to contribute to the regulation of the human genome. The majority of known TE-derived regulatory sequences correspond to relatively ancient insertions, which are fixed across human populations. The extent to which human genetic variation caused by recent TE activity leads to regulatory polymorphisms among populations has yet to be thoroughly explored. In this study, we searched for associations between polymorphic TE (polyTE) loci and human gene expression levels using an expression quantitative trait loci (eQTL) approach. We compared locus-specific polyTE insertion genotypes to B cell gene expression levels among 445 individuals from 5 human populations. Numerous human polyTE loci correspond to both cis and trans eQTL, and their regulatory effects are directly related to cell type-specific function in the immune system. PolyTE loci are associated with differences in expression between European and African population groups, and a single polyTE locus is indirectly associated with the expression of numerous genes via the regulation of the B cell-specific transcription factor PAX5. The polyTE-gene expression associations we found indicate that human TE genetic variation can have important phenotypic consequences. Our results reveal that TE-eQTL are involved in population-specific gene regulation as well as transcriptional network modification.

  18. Differentially expressed genes in major depression reside on the periphery of resilient gene coexpression networks

    Directory of Open Access Journals (Sweden)

    Chris eGaiteri

    2011-08-01

    Full Text Available The structure of gene coexpression networks reflects the activation and interaction of multiple cellular systems. Since the pathology of neuropsychiatric disorders is influenced by diverse cellular systems and pathways, we investigated gene coexpression networks in major depression, and searched for putative unifying themes in network connectivity across neuropsychiatric disorders. Specifically, based on the prevalence of the lethality-centrality relationship in disease-related networks, we hypothesized that network changes between control and major depression-related networks would be centered around coexpression hubs, and secondly, that differentially expressed (DE) genes would have a characteristic position and connectivity level in those networks. Mathematically, the first hypothesis tests the relationship of differential coexpression to network connectivity, while the second hybrid expression-and-network hypothesis tests the relationship of differential expression to network connectivity. To answer these questions about the potential interaction of coexpression network structure with differential expression, we utilized all available human post-mortem depression-related datasets appropriate for coexpression analysis, which spanned different microarray platforms, cohorts, and brain regions. Similar studies were also performed in an animal model of depression and in schizophrenia and bipolar disorder microarray datasets. We now provide results which consistently support (1) that genes assemble into small-world and scale-free networks in control subjects, (2) that this efficient network topology is largely resilient to changes in depressed subjects, and (3) that DE genes are positioned on the periphery of coexpression networks. Similar results were observed in a mouse model of depression, and in selected bipolar- and schizophrenia-related networks. Finally, we show that baseline expression variability contributes to the propensity of genes to be

  19. Identifying gene regulatory network rewiring using latent differential graphical models.

    Science.gov (United States)

    Tian, Dechao; Gu, Quanquan; Ma, Jian

    2016-09-30

    Gene regulatory networks (GRNs) are highly dynamic among different tissue types. Identifying tissue-specific gene regulation is critically important to understand gene function in a particular cellular context. Graphical models have been used to estimate GRN from gene expression data to distinguish direct interactions from indirect associations. However, most existing methods estimate GRN for a specific cell/tissue type or in a tissue-naive way, or do not specifically focus on network rewiring between different tissues. Here, we describe a new method called Latent Differential Graphical Model (LDGM). The motivation of our method is to estimate the differential network between two tissue types directly without inferring the network for individual tissues, which has the advantage of utilizing much smaller sample size to achieve reliable differential network estimation. Our simulation results demonstrated that LDGM consistently outperforms other Gaussian graphical model based methods. We further evaluated LDGM by applying to the brain and blood gene expression data from the GTEx consortium. We also applied LDGM to identify network rewiring between cancer subtypes using the TCGA breast cancer samples. Our results suggest that LDGM is an effective method to infer differential network using high-throughput gene expression data to identify GRN dynamics among different cellular conditions.

  20. Overview of methods of reverse engineering of gene regulatory networks: Boolean and Bayesian networks

    Directory of Open Access Journals (Sweden)

    Frolova A. O.

    2012-06-01

    Full Text Available Reverse engineering of gene regulatory networks is an intensively studied topic in Systems Biology as it reconstructs regulatory interactions between all genes in the genome in the most complete form. The extreme computational complexity of this problem and lack of thorough reviews on reconstruction methods of gene regulatory network is a significant obstacle to further development of this area. In this article the two most common methods for modeling gene regulatory networks are surveyed: Boolean and Bayesian networks. The mathematical description of each method is given, as well as several algorithmic approaches to modeling gene networks using these methods; the complexity of algorithms and the problems that arise during its implementation are also noted.

  1. Motif Participation by Genes in E. coli Transcriptional Networks

    Directory of Open Access Journals (Sweden)

    Michael eMayo

    2012-09-01

    Full Text Available Motifs are patterns of recurring connections among the genes of genetic networks that occur more frequently than would be expected from randomized networks with the same degree sequence. Although the abundance of certain three-node motifs, such as the feed-forward loop, is positively correlated with a network's ability to tolerate moderate disruptions to gene expression, little is known regarding the connectivity of individual genes participating in multiple motifs. Using the transcriptional network of the bacterium Escherichia coli, we investigate this feature by reconstructing the distribution of genes participating in feed-forward loop motifs from its largest connected network component. We contrast these motif participation distributions with those obtained from model networks built using the preferential attachment mechanism employed by many biological and man-made networks. We report that, although some of these model networks support a motif participation distribution that appears qualitatively similar to that obtained from the bacterium Escherichia coli, the probability for a node to support a feed-forward loop motif may instead be strongly influenced by only a few master transcriptional regulators within the network. From these analyses we conclude that such master regulators may be a crucial ingredient to describe coupling among feed-forward loop motifs in transcriptional regulatory networks.
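    Motif participation, as described above, amounts to counting for each gene how many feed-forward loops (X regulates Y, Y regulates Z, and X also regulates Z) it belongs to. The sketch below does this by brute force on a made-up edge list; the node names and the function `ffl_participation` are assumptions for illustration, not the authors' data or code.

```python
from collections import Counter
from itertools import permutations

def ffl_participation(edges):
    """Count, per node, the feed-forward loops (x->y, y->z, x->z) it takes part in."""
    edge_set = set(edges)
    nodes = {n for edge in edges for n in edge}
    counts = Counter()
    # Brute force over ordered triples: fine for toy networks, O(n^3) in general.
    for x, y, z in permutations(nodes, 3):
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set:
            counts.update((x, y, z))
    return counts

# Hypothetical network: one master regulator "M" drives two feed-forward loops.
edges = [
    ("M", "a"), ("a", "t1"), ("M", "t1"),   # FFL 1
    ("M", "b"), ("b", "t2"), ("M", "t2"),   # FFL 2
    ("b", "t3"),                            # plain edge, no motif
]
counts = ffl_participation(edges)
print(sorted(counts.items()))
```

    Here the single master regulator takes part in both loops while every other gene takes part in at most one, a small-scale picture of how a few regulators can dominate a participation distribution.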

  2. Gene Expression Network Reconstruction by LEP Method Using Microarray Data

    Directory of Open Access Journals (Sweden)

    Na You

    2012-01-01

    Full Text Available Gene expression network reconstruction using microarray data is widely studied aiming to investigate the behavior of a gene cluster simultaneously. Under the Gaussian assumption, the conditional dependence between genes in the network is fully described by the partial correlation coefficient matrix. Due to the high dimensionality and sparsity, we utilize the LEP method to estimate it in this paper. Compared to the existing methods, the LEP reaches the highest PPV with the sensitivity controlled at the satisfactory level. A set of gene expression data from the HapMap project is analyzed for illustration.
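    The link between conditional dependence and partial correlation under the Gaussian assumption can be made concrete: the partial correlation of genes i and j given all others is -p_ij / sqrt(p_ii p_jj), where p is the inverse covariance (precision) matrix. The sketch below applies this textbook identity to a small, well-conditioned covariance matrix using plain matrix inversion. It is not the LEP estimator named in the record, which targets the high-dimensional sparse case, and the function names and example matrix are assumptions.

```python
import math

def invert(matrix):
    """Gauss-Jordan inverse of a small, well-conditioned square matrix."""
    n = len(matrix)
    # Augment with the identity matrix.
    a = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a]

def partial_correlations(cov):
    """Partial correlation matrix from a covariance matrix via its inverse."""
    prec = invert(cov)
    n = len(cov)
    return [[1.0 if i == j
             else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
             for j in range(n)] for i in range(n)]

# Chain g1 - g2 - g3: g1 and g3 are correlated only through g2.
cov = [[1.0, 0.5, 0.25],
       [0.5, 1.0, 0.5],
       [0.25, 0.5, 1.0]]
pc = partial_correlations(cov)
print([[round(v, 4) for v in row] for row in pc])
```

    In the example, g1 and g3 have a marginal correlation of 0.25 but a partial correlation of zero, so a network built from partial correlations would not draw an edge between them.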

  3. Gene transcriptional networks integrate microenvironmental signals in human breast cancer.

    Science.gov (United States)

    Xu, Ren; Mao, Jian-Hua

    2011-04-01

    A significant amount of evidence shows that microenvironmental signals generated from extracellular matrix (ECM) molecules, soluble factors, and cell-cell adhesion complexes cooperate at the extra- and intracellular level. This synergetic action of microenvironmental cues is crucial for normal mammary gland development and breast malignancy. To explore how the microenvironmental genes coordinate in human breast cancer at the genome level, we have performed gene co-expression network analysis in three independent microarray datasets and identified two microenvironment networks in human breast cancer tissues. Network I represents crosstalk and cooperation of ECM microenvironment and soluble factors during breast malignancy. The correlated expression of cytokines, chemokines, and cell adhesion proteins in Network II implicates the coordinated action of these molecules in modulating the immune response in breast cancer tissues. These results suggest that microenvironmental cues are integrated with gene transcriptional networks to promote breast cancer development.

  4. Discovering cancer genes by integrating network and functional properties

    Directory of Open Access Journals (Sweden)

    Davis David P

    2009-09-01

    Full Text Available Background: Identification of novel cancer-causing genes is one of the main goals in cancer research. The rapid accumulation of genome-wide protein-protein interaction (PPI) data in humans has provided a new basis for studying the topological features of cancer genes in cellular networks. It is important to integrate multiple genomic data sources, including PPI networks, protein domains and Gene Ontology (GO) annotations, to facilitate the identification of cancer genes. Methods: Topological features of the PPI network, as well as protein domain compositions, enrichment of gene ontology categories, sequence and evolutionary conservation features were extracted and compared between cancer genes and other genes. The predictive power of various classifiers for identification of cancer genes was evaluated by cross validation. Experimental validation of a subset of the prediction results was conducted using siRNA knockdown and viability assays in human colon cancer cell line DLD-1. Results: Cross validation demonstrated advantageous performance of classifiers based on support vector machines (SVMs) with the inclusion of the topological features from the PPI network, protein domain compositions and GO annotations. We then applied the trained SVM classifier to human genes to prioritize putative cancer genes. siRNA knock-down of several SVM predicted cancer genes displayed greatly reduced cell viability in human colon cancer cell line DLD-1. Conclusion: Topological features of PPI networks, protein domain compositions and GO annotations are good predictors of cancer genes. The SVM classifier integrates multiple features and as such is useful for prioritizing candidate cancer genes for experimental validations.

  5. Gene Regulatory Network Reconstruction Using Conditional Mutual Information

    Directory of Open Access Journals (Sweden)

    Xiaodong Wang

    2008-06-01

    Full Text Available The inference of gene regulatory network from expression data is an important area of research that provides insight to the inner workings of a biological system. The relevance-network-based approaches provide a simple and easily-scalable solution to the understanding of interaction between genes. Up until now, most works based on relevance network focus on the discovery of direct regulation using correlation coefficient or mutual information. However, some of the more complicated interactions such as interactive regulation and coregulation are not easily detected. In this work, we propose a relevance network model for gene regulatory network inference which employs both mutual information and conditional mutual information to determine the interactions between genes. For this purpose, we propose a conditional mutual information estimator based on adaptive partitioning which allows us to condition on both discrete and continuous random variables. We provide experimental results that demonstrate that the proposed regulatory network inference algorithm can provide better performance when the target network contains coregulated and interactively regulated genes.
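    Why conditional mutual information can reveal interactions that plain mutual information misses is easiest to see on discrete toy data. The sketch below uses simple plug-in (frequency-count) estimators, not the adaptive-partitioning estimator proposed in the record, and the XOR-style target is a hypothetical stand-in for an interactively regulated gene.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def conditional_mutual_information(xs, ys, zs):
    """Plug-in estimate of I(X;Y|Z): average of I(X;Y) within each stratum of Z."""
    n = len(xs)
    total = 0.0
    for z, count_z in Counter(zs).items():
        idx = [i for i in range(n) if zs[i] == z]
        total += count_z / n * mutual_information([xs[i] for i in idx],
                                                  [ys[i] for i in idx])
    return total

# Hypothetical target whose state is the XOR of two regulators.
regulator_x = [0, 0, 1, 1]
regulator_z = [0, 1, 0, 1]
target = [x ^ z for x, z in zip(regulator_x, regulator_z)]

mi = mutual_information(regulator_x, target)
cmi = conditional_mutual_information(regulator_x, target, regulator_z)
print(mi, cmi)
```

    On this data the pairwise mutual information between one regulator and the target is zero, while conditioning on the second regulator yields one full bit, so a relevance network based on mutual information alone would miss the edge.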

  6. Screening a novel Na+/H+ antiporter gene from a metagenomic library of halophiles colonizing in the Dagong Ancient Brine Well in China.

    Science.gov (United States)

    Xiang, Wenliang; Zhang, Jie; Li, Lin; Liang, Huazhong; Luo, Hai; Zhao, Jian; Yang, Zhirong; Sun, Qun

    2010-05-01

    Metagenomic DNA libraries constructed from the Dagong Ancient Brine Well were screened for genes with Na(+)/H(+) antiporter activity on the antiporter-deficient Escherichia coli KNabc strain. One clone with a stable Na(+)-resistant phenotype was obtained and its Na(+)/H(+) antiporter gene was sequenced and designated as m-nha. The deduced amino acid sequence of M-Nha protein consists of 523 residues with a calculated molecular weight of 58 147 Da and a pI of 5.50, which is homologous with NhaH from Halobacillus dabanensis D-8(T) (92%) and Halobacillus aidingensis AD-6(T) (86%), and with Nhe2 from Bacillus sp. NRRL B-14911 (64%). It had a hydropathy profile with 10 putative transmembrane domains and a long carboxyl terminal hydrophilic tail of 140 amino acid residues, similar to Nhap from Synechocystis sp. and Aphanothece halophytica, as well as NhaG from Bacillus subtilis. The m-nha gene in the antiporter-negative mutant E. coli KNabc conferred resistance to Na(+) and the ability to grow under alkaline conditions. The difference in amino acid sequence and the putative secondary structure suggested that the m-nha isolated from the Dagong Ancient Brine Well in this study was a novel Na(+)/H(+) antiporter gene.

  7. Antagonistic roles for KNOX1 and KNOX2 genes in patterning the land plant body plan following an ancient gene duplication.

    Science.gov (United States)

    Furumizu, Chihiro; Alvarez, John Paul; Sakakibara, Keiko; Bowman, John L

    2015-02-01

    Neofunctionalization following gene duplication is thought to be one of the key drivers in generating evolutionary novelty. A gene duplication in a common ancestor of land plants produced two classes of KNOTTED-like TALE homeobox genes, class I (KNOX1) and class II (KNOX2). KNOX1 genes are linked to tissue proliferation and maintenance of meristematic potentials of flowering plant and moss sporophytes, and modulation of KNOX1 activity is implicated in contributing to leaf shape diversity of flowering plants. While KNOX2 function has been shown to repress the gametophytic (haploid) developmental program during moss sporophyte (diploid) development, little is known about KNOX2 function in flowering plants, hindering syntheses regarding the relationship between two classes of KNOX genes in the context of land plant evolution. Arabidopsis plants harboring loss-of-function KNOX2 alleles exhibit impaired differentiation of all aerial organs and have highly complex leaves, phenocopying gain-of-function KNOX1 alleles. Conversely, gain-of-function KNOX2 alleles in conjunction with a presumptive heterodimeric BELL TALE homeobox partner suppressed SAM activity in Arabidopsis and reduced leaf complexity in the Arabidopsis relative Cardamine hirsuta, reminiscent of loss-of-function KNOX1 alleles. Little evidence was found indicative of epistasis or mutual repression between KNOX1 and KNOX2 genes. KNOX proteins heterodimerize with BELL TALE homeobox proteins to form functional complexes, and contrary to earlier reports based on in vitro and heterologous expression, we find high selectivity between KNOX and BELL partners in vivo. Thus, KNOX2 genes confer opposing activities rather than redundant roles with KNOX1 genes, and together they act to direct the development of all above-ground organs of the Arabidopsis sporophyte. We infer that following the KNOX1/KNOX2 gene duplication in an ancestor of land plants, neofunctionalization led to evolution of antagonistic biochemical

  8. Ancient genomics

    DEFF Research Database (Denmark)

    Der Sarkissian, Clio; Allentoft, Morten Erik; Avila Arcos, Maria del Carmen

    2015-01-01

    by increasing the number of sequence reads to billions effectively means that contamination issues that have haunted aDNA research for decades, particularly in human studies, can now be efficiently and confidently quantified. At present, whole genomes have been sequenced from ancient anatomically modern humans......, archaic hominins, ancient pathogens and megafaunal species. Those have revealed important functional and phenotypic information, as well as unexpected adaptation, migration and admixture patterns. As such, the field of aDNA has entered the new era of genomics and has provided valuable information when...

  9. Indeterminacy of reverse engineering of Gene Regulatory Networks: the curse of gene elasticity.

    Directory of Open Access Journals (Sweden)

    Arun Krishnan

    Full Text Available BACKGROUND: Gene Regulatory Networks (GRNs) have become a major focus of interest in recent years. A number of reverse engineering approaches have been developed to help uncover the regulatory networks giving rise to the observed gene expression profiles. However, this is an overspecified problem due to the fact that more than one genotype (network wiring) can give rise to the same phenotype. We refer to this phenomenon as "gene elasticity." In this work, we study the effect of this particular problem on the pure, data-driven inference of gene regulatory networks. METHODOLOGY: We simulated a four-gene network in order to produce "data" (protein levels) that we use in lieu of real experimental data. We then optimized the network connections between the four genes with a view to obtain the original network that gave rise to the data. We did this for two different cases: one in which only the network connections were optimized and the other in which both the network connections as well as the kinetic parameters (given as reaction probabilities in our case) were estimated. We observed that multiple genotypes gave rise to very similar protein levels. Statistical experimentation indicates that it is impossible to differentiate between the different networks on the basis of both equilibrium as well as dynamic data. CONCLUSIONS: We show explicitly that reverse engineering of GRNs from pure expression data is an indeterminate problem. Our results suggest the unsuitability of an inferential, purely data-driven approach for the reverse engineering of transcriptional networks in the case of gene regulatory networks displaying a certain level of complexity.

  10. Gene regulation: hacking the network on a sugar high.

    Science.gov (United States)

    Ellis, Tom; Wang, Xiao; Collins, James J

    2008-04-11

    In a recent issue of Molecular Cell, Kaplan et al. (2008) determine the input functions for 19 E. coli sugar-utilization genes by using a two-dimensional high-throughput approach. The resulting input-function map reveals that gene network regulation follows non-Boolean, and often nonmonotonic, logic.

  11. Gene-based and semantic structure of the Gene Ontology as a complex network

    Science.gov (United States)

    Coronnello, Claudia; Tumminello, Michele; Miccichè, Salvatore

    2016-09-01

    The last decade has seen the advent and consolidation of ontology based tools for the identification and biological interpretation of classes of genes, such as the Gene Ontology. The Gene Ontology (GO) is constantly evolving over time. The information accumulated time-by-time and included in the GO is encoded in the definition of terms and in the setting up of semantic relations amongst terms. Here we investigate the Gene Ontology from a complex network perspective. We consider the semantic network of terms naturally associated with the semantic relationships provided by the Gene Ontology consortium. Moreover, the GO is a natural example of bipartite network of terms and genes. Here we are interested in studying the properties of the projected network of terms, i.e. a gene-based weighted network of GO terms, in which a link between any two terms is set if at least one gene is annotated in both terms. One aim of the present paper is to compare the structural properties of the semantic and the gene-based network. The relative importance of terms is very similar in the two networks, but the community structure changes. We show that in some cases GO terms that appear to be distinct from a semantic point of view are instead connected, and appear in the same community when considering their gene content. The identification of such gene-based communities of terms might therefore be the basis of a simple protocol aiming at improving the semantic structure of GO. Information about terms that share large gene content might also be important from a biomedical point of view, as it might reveal how genes over-expressed in a certain term also affect other biological processes, molecular functions and cellular components not directly linked according to GO semantics.
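    The projection of the bipartite term-gene network onto terms can be sketched in a few lines. The term and gene labels below are placeholders rather than real GO identifiers, and using the number of shared genes as the edge weight is an assumption; the record only states that two terms are linked when at least one gene is annotated to both.

```python
from collections import defaultdict
from itertools import combinations

def project_terms(annotations):
    """Project a term -> genes mapping onto a weighted term-term network.

    Returns {(term1, term2): number of shared genes}, with term1 < term2.
    """
    gene_to_terms = defaultdict(set)
    for term, genes in annotations.items():
        for gene in genes:
            gene_to_terms[gene].add(term)
    weights = defaultdict(int)
    for terms in gene_to_terms.values():
        # Every pair of terms annotating this gene gains one unit of weight.
        for t1, t2 in combinations(sorted(terms), 2):
            weights[(t1, t2)] += 1
    return dict(weights)

# Placeholder annotations (hypothetical terms and genes).
annotations = {
    "term_A": {"g1", "g2", "g3"},
    "term_B": {"g2", "g3"},
    "term_C": {"g3", "g4"},
    "term_D": {"g5"},
}
network = project_terms(annotations)
print(sorted(network.items()))
```

    Iterating over genes rather than over all term pairs keeps the cost proportional to the annotations actually present, which matters when most term pairs share no gene.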

  12. Optimal finite horizon control in gene regulatory networks

    Science.gov (United States)

    Liu, Qiuli

    2013-06-01

    As a paradigm for modeling gene regulatory networks, probabilistic Boolean networks (PBNs) form a subclass of Markov genetic regulatory networks. To date, many different stochastic optimal control approaches have been developed to find therapeutic intervention strategies for PBNs. A PBN is essentially a collection of constituent Boolean networks combined via a probability structure. Most existing works assume that the probability structure for Boolean network selection is known. Such an assumption cannot be satisfied in practice, since the presence of noise prevents the probability structure from being accurately determined. In this paper, we treat a case in which we lack the governing probability structure for Boolean network selection. Specifically, in the framework of PBNs, the theory of finite horizon Markov decision processes is employed to find optimal constituent Boolean networks with respect to the defined objective functions. To illustrate the validity of our proposed approach, an example is also presented.

  13. Linear control theory for gene network modeling.

    Science.gov (United States)

    Shin, Yong-Jun; Bleris, Leonidas

    2010-09-16

    Systems biology is an interdisciplinary field that aims at understanding complex interactions in cells. Here we demonstrate that linear control theory can provide valuable insight and practical tools for the characterization of complex biological networks. We provide the foundation for such analyses through the study of several case studies including cascade and parallel forms, feedback and feedforward loops. We reproduce experimental results and provide rational analysis of the observed behavior. We demonstrate that methods such as the transfer function (frequency domain) and linear state-space (time domain) can be used to predict reliably the properties and transient behavior of complex network topologies and point to specific design strategies for synthetic networks.

  14. Validation of Gene Regulatory Network Inference Based on Controllability

    Directory of Open Access Journals (Sweden)

    Edward Dougherty

    2013-12-01

    Full Text Available There are two distinct issues regarding network validation: (1) Does an inferred network provide good predictions relative to experimental data? (2) Does a network inference algorithm applied within a certain network model framework yield networks that are accurate relative to some criterion of goodness? The first issue concerns scientific validation and the second concerns algorithm validation. In this paper we consider inferential validation relative to controllability; that is, if an inference procedure is applied to synthetic data generated from a gene regulatory network and an intervention procedure is designed on the inferred network, how well does it perform on the true network? The reasoning behind such a criterion is that, if our purpose is to use gene regulatory networks to design therapeutic intervention strategies, then we are not concerned with network fidelity, per se, but only with our ability to design effective interventions based on the inferred network. We will consider the problem from the perspective of stationary control, which involves designing a control policy to be applied over time based on the current state of the network, with the decision procedure itself being time independent. The objective of a control policy is to optimally reduce the total steady-state probability mass of the undesirable states (phenotypes), which is equivalent to optimally increasing the total steady-state mass of the desirable states. Based on this criterion we compare several proposed network inference procedures. We will see that inference procedure psi may perform worse than inference procedure xi relative to inferring the full network structure but perform better than xi relative to controllability. Hence, when one is aiming at a specific application, it may be wise to use an objective-based measure of inference validity.

  15. A complex network analysis of hypertension-related genes

    Science.gov (United States)

    Wang, Huan; Xu, Chuan-Yun; Hu, Jing-Bo; Cao, Ke-Fei

    2014-01-01

    In this paper, a network of hypertension-related genes is constructed by analyzing the correlations of gene expression data among the Dahl salt-sensitive rat and two consomic rat strains. The numerical calculations show that this sparse and assortative network has small-world and scale-free properties. Further, 16 key hub genes (Col4a1, Lcn2, Cdk4, etc.) are determined by introducing an integrated centrality and have been confirmed by biological/medical research to play important roles in hypertension.

  16. Tamil merchant in ancient Mesopotamia.

    Directory of Open Access Journals (Sweden)

    Malliya Gounder Palanichamy

    Full Text Available Recent analyses of ancient Mesopotamian mitochondrial genomes have suggested a genetic link between the Indian subcontinent and Mesopotamian civilization. There is no consensus on the origin of the ancient Mesopotamians. They may be descendants of migrants who founded regional Mesopotamian groups like that of Terqa, or they may be merchants who were involved in trans-Mesopotamia trade. To identify the Indian source population showing linkage to the ancient Mesopotamians, we screened a total of 15,751 mitochondrial DNAs (11,432 from the literature and 4,319 from this study) representing all major populations of India. Although our results suggest that south India (Tamil Nadu) and northeast India served as the source of the ancient Mesopotamian mtDNA gene pool, the mtDNA of these ancient Mesopotamians was probably contributed by Tamil merchants who were involved in the Indo-Roman trade.

  17. Tamil merchant in ancient Mesopotamia.

    Science.gov (United States)

    Palanichamy, Malliya Gounder; Mitra, Bikash; Debnath, Monojit; Agrawal, Suraksha; Chaudhuri, Tapas Kumar; Zhang, Ya-Ping

    2014-01-01

    Recent analyses of ancient Mesopotamian mitochondrial genomes have suggested a genetic link between the Indian subcontinent and Mesopotamian civilization. There is no consensus on the origin of the ancient Mesopotamians. They may be descendants of migrants who founded regional Mesopotamian groups like that of Terqa, or they may be merchants who were involved in trans-Mesopotamia trade. To identify the Indian source population showing linkage to the ancient Mesopotamians, we screened a total of 15,751 mitochondrial DNAs (11,432 from the literature and 4,319 from this study) representing all major populations of India. Although our results suggest that south India (Tamil Nadu) and northeast India served as the source of the ancient Mesopotamian mtDNA gene pool, the mtDNA of these ancient Mesopotamians was probably contributed by Tamil merchants who were involved in the Indo-Roman trade.

  18. The incorporation of epigenetics in artificial gene regulatory networks.

    Science.gov (United States)

    Turner, Alexander P; Lones, Michael A; Fuente, Luis A; Stepney, Susan; Caves, Leo S D; Tyrrell, Andy M

    2013-05-01

    Artificial gene regulatory networks are computational models that draw inspiration from biological networks of gene regulation. Since their inception they have been used to infer knowledge about gene regulation and as methods of computation. These computational models have been shown to possess properties typically found in the biological world, such as robustness and self organisation. Recently, it has become apparent that epigenetic mechanisms play an important role in gene regulation. This paper describes a new model, the Artificial Epigenetic Regulatory Network (AERN) which builds upon existing models by adding an epigenetic control layer. Our results demonstrate that AERNs are more adept at controlling multiple opposing trajectories when applied to a chaos control task within a conservative dynamical system, suggesting that AERNs are an interesting area for further investigation.

  19. Ancient genomics

    DEFF Research Database (Denmark)

    Der Sarkissian, Clio; Allentoft, Morten Erik; Avila Arcos, Maria del Carmen;

    2015-01-01

    , archaic hominins, ancient pathogens and megafaunal species. Those have revealed important functional and phenotypic information, as well as unexpected adaptation, migration and admixture patterns. As such, the field of aDNA has entered the new era of genomics and has provided valuable information when...

  20. Solution of the quasispecies model for an arbitrary gene network

    Science.gov (United States)

    Tannenbaum, Emmanuel; Shakhnovich, Eugene I.

    2004-08-01

    In this paper, we study the equilibrium behavior of Eigen’s quasispecies equations for an arbitrary gene network. We consider a genome consisting of $N$ genes, so that the full genome sequence $\sigma$ may be written as $\sigma = \sigma_1 \sigma_2 \cdots \sigma_N$, where the $\sigma_i$ are sequences of individual genes. We assume a single fitness peak model for each gene, so that gene $i$ has some “master” sequence $\sigma_{i,0}$ for which it is functioning. The fitness landscape is then determined by which genes in the genome are functioning and which are not. The equilibrium behavior of this model may be solved in the limit of infinite sequence length. The central result is that, instead of a single error catastrophe, the model exhibits a series of localization to delocalization transitions, which we term an “error cascade.” As the mutation rate is increased, the selective advantage for maintaining functional copies of certain genes in the network disappears, and the population distribution delocalizes over the corresponding sequence spaces. The network goes through a series of such transitions, as more and more genes become inactivated, until eventually delocalization occurs over the entire genome space, resulting in a final error catastrophe. This model provides a criterion for determining the conditions under which certain genes in a genome will lose functionality due to genetic drift. It also provides insight into the response of gene networks to mutagens. In particular, it suggests an approach for determining the relative importance of various genes to the fitness of an organism, in a more accurate manner than the standard “deletion set” method. The results in this paper also have implications for mutational robustness and what C.O. Wilke termed “survival of the flattest.”

  1. A spruce gene map infers ancient plant genome reshuffling and subsequent slow evolution in the gymnosperm lineage leading to extant conifers

    Directory of Open Access Journals (Sweden)

    Pavy Nathalie

    2012-10-01

    Full Text Available Background: Seed plants are composed of angiosperms and gymnosperms, which diverged from each other around 300 million years ago. While much light has been shed on the mechanisms and rate of genome evolution in flowering plants, such knowledge remains conspicuously meagre for the gymnosperms. Conifers are key representatives of gymnosperms and the sheer size of their genomes represents a significant challenge for characterization, sequencing and assembling. Results: To gain insight into the macro-organisation and long-term evolution of the conifer genome, we developed a genetic map involving 1,801 spruce genes. We designed a statistical approach based on kernel density estimation to analyse gene density and identified seven gene-rich isochors. Groups of co-localizing genes were also found that were transcriptionally co-regulated, indicative of functional clusters. Phylogenetic analyses of 157 gene families for which at least two duplicates were mapped on the spruce genome indicated that ancient gene duplicates shared by angiosperms and gymnosperms outnumbered conifer-specific duplicates by a ratio of eight to one. Ancient duplicates were much more translocated within and among spruce chromosomes than conifer-specific duplicates, which were mostly organised in tandem arrays. Both high synteny and collinearity were also observed between the genomes of spruce and pine, two conifers that diverged more than 100 million years ago. Conclusions: Taken together, these results indicate that much genomic evolution has occurred in the seed plant lineage before the split between gymnosperms and angiosperms, and that the pace of evolution of the genome macro-structure has been much slower in the gymnosperm lineage leading to extant conifers than that seen for the same period of time in flowering plants. This trend is largely congruent with the contrasted rates of diversification and morphological evolution observed between these two groups of seed

  2. Reconstruction of Gene Regulatory Networks Based on Two-Stage Bayesian Network Structure Learning Algorithm

    Institute of Scientific and Technical Information of China (English)

    Gui-xia Liu; Wei Feng; Han Wang; Lei Liu; Chun-guang Zhou

    2009-01-01

    In the post-genomic biology era, the reconstruction of gene regulatory networks from microarray gene expression data is very important for understanding the underlying biological system, and it has been a challenging task in bioinformatics. The Bayesian network model has been used in reconstructing gene regulatory networks for its advantages, but how to determine the network structure and parameters still needs to be explored. This paper proposes a two-stage structure learning algorithm which integrates an immune evolution algorithm to build a Bayesian network. The new algorithm is evaluated with the use of both simulated and yeast cell cycle data. The experimental results indicate that the proposed algorithm can find many of the known real regulatory relationships from the literature and predict unknown ones with high validity and accuracy.

  3. Associating genes and protein complexes with disease via network propagation.

    Directory of Open Access Journals (Sweden)

    Oron Vanunu

    2010-01-01

    Full Text Available A fundamental challenge in human health is the identification of disease-causing genes. Recently, several studies have tackled this challenge via a network-based approach, motivated by the observation that genes causing the same or similar diseases tend to lie close to one another in a network of protein-protein or functional interactions. However, most of these approaches use only local network information in the inference process and are restricted to inferring single gene associations. Here, we provide a global, network-based method for prioritizing disease genes and inferring protein complex associations, which we call PRINCE. The method is based on formulating constraints on the prioritization function that relate to its smoothness over the network and usage of prior information. We exploit this function to predict not only genes but also protein complex associations with a disease of interest. We test our method on gene-disease association data, evaluating both the prioritization achieved and the protein complexes inferred. We show that our method outperforms extant approaches in both tasks. Using data on 1,369 diseases from the OMIM knowledgebase, our method is able (in a cross validation setting) to rank the true causal gene first for 34% of the diseases, and infer 139 disease-related complexes that are highly coherent in terms of the function, expression and conservation of their member proteins. Importantly, we apply our method to study three multi-factorial diseases for which some causal genes have been found already: prostate cancer, Alzheimer's disease and type 2 diabetes mellitus. PRINCE's predictions for these diseases highly match the known literature, suggesting several novel causal genes and protein complexes for further investigation.
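
    The idea of a prioritization function that is smooth over the network while respecting prior information can be sketched with a generic propagation iteration. This is an illustrative sketch only: the update rule, the symmetric degree normalization, the value of `alpha` and the toy network are assumptions and are not taken from the abstract.

```python
import math

def propagate(adj, prior, alpha=0.9, iters=200):
    """Iteratively spread prior scores over a weighted undirected network.

    adj:   dict node -> dict neighbor -> edge weight (symmetric).
    prior: dict node -> prior score (e.g. 1.0 for known disease genes).
    alpha: assumed trade-off between network smoothness and the prior.
    """
    deg = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    scores = {u: prior.get(u, 0.0) for u in adj}
    for _ in range(iters):
        new = {}
        for u in adj:
            # Score flowing in from neighbors, normalized by both degrees.
            flow = sum(
                w / math.sqrt(deg[u] * deg[v]) * scores[v]
                for v, w in adj[u].items()
            )
            new[u] = alpha * flow + (1 - alpha) * prior.get(u, 0.0)
        scores = new
    return scores

# Hypothetical path network a - b - c - d with one known disease gene "a".
adj = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
scores = propagate(adj, {"a": 1.0})
ranking = sorted(scores, key=scores.get, reverse=True)
```

    In this toy network the candidates closest to the known gene receive the highest scores, which mirrors the observation that genes causing similar diseases tend to lie close to one another in the interaction network.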

  4. Stability depends on positive autoregulation in Boolean gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Ricardo Pinho

    2014-11-01

    Full Text Available Network motifs have been identified as building blocks of regulatory networks, including gene regulatory networks (GRNs). The most basic motif, autoregulation, has been associated with bistability (when positive) and with homeostasis and robustness to noise (when negative), but its general importance in network behavior is poorly understood. Moreover, how specific autoregulatory motifs are selected during evolution and how this relates to robustness is largely unknown. Here, we used a class of GRN models, Boolean networks, to investigate the relationship between autoregulation and network stability and robustness under various conditions. We ran evolutionary simulation experiments for different models of selection, including mutation and recombination. Each generation simulated the development of a population of organisms modeled by GRNs. We found that stability and robustness positively correlate with autoregulation; in all investigated scenarios, stable networks had mostly positive autoregulation. Assuming biological networks correspond to stable networks, these results suggest that biological networks should often be dominated by positive autoregulatory loops. This seems to be the case for most studied eukaryotic transcription factor networks, including those in yeast, flies and mammals.

  5. Stability Depends on Positive Autoregulation in Boolean Gene Regulatory Networks

    Science.gov (United States)

    Pinho, Ricardo; Garcia, Victor; Irimia, Manuel; Feldman, Marcus W.

    2014-01-01

    Network motifs have been identified as building blocks of regulatory networks, including gene regulatory networks (GRNs). The most basic motif, autoregulation, has been associated with bistability (when positive) and with homeostasis and robustness to noise (when negative), but its general importance in network behavior is poorly understood. Moreover, how specific autoregulatory motifs are selected during evolution and how this relates to robustness is largely unknown. Here, we used a class of GRN models, Boolean networks, to investigate the relationship between autoregulation and network stability and robustness under various conditions. We ran evolutionary simulation experiments for different models of selection, including mutation and recombination. Each generation simulated the development of a population of organisms modeled by GRNs. We found that stability and robustness positively correlate with autoregulation; in all investigated scenarios, stable networks had mostly positive autoregulation. Assuming biological networks correspond to stable networks, these results suggest that biological networks should often be dominated by positive autoregulatory loops. This seems to be the case for most studied eukaryotic transcription factor networks, including those in yeast, flies and mammals. PMID:25375153
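
    How autoregulation can change the attractors of a Boolean network can be seen in a toy example. The sketch below enumerates the attractors of a synchronous two-gene network with mutual repression, with and without positive autoregulation. The update rules are hypothetical and chosen only for illustration; they are not the evolutionary model used in the study.

```python
from itertools import product

def attractors(update, n):
    """Enumerate the attractors of a synchronous Boolean network with n genes."""
    found = set()
    for state in product((0, 1), repeat=n):
        seen = []
        # Follow the trajectory until a state repeats.
        while state not in seen:
            seen.append(state)
            state = update(state)
        cycle = seen[seen.index(state):]
        # Rotate the cycle so that equivalent cycles compare equal.
        k = cycle.index(min(cycle))
        found.add(tuple(cycle[k:] + cycle[:k]))
    return found

def with_autoregulation(s):
    # Each gene stays on only if it is already on and its repressor is off.
    x, y = s
    return (int(x and not y), int(y and not x))

def without_autoregulation(s):
    # Each gene is on whenever its repressor is off.
    x, y = s
    return (int(not y), int(not x))

stable = attractors(with_autoregulation, 2)
oscillating = attractors(without_autoregulation, 2)
```

    In this toy case every attractor of the autoregulated network is a fixed point, whereas the network without autoregulation also has a two-state cycle between (0, 0) and (1, 1).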

  6. Stable Gene Regulatory Network Modeling From Steady-State Data

    Directory of Open Access Journals (Sweden)

    Joy Edward Larvie

    2016-04-01

    Full Text Available Gene regulatory networks represent an abstract mapping of gene regulations in living cells. They aim to capture dependencies among molecular entities such as transcription factors, proteins and metabolites. In most applications, the regulatory network structure is unknown, and has to be reverse engineered from experimental data consisting of expression levels of the genes, usually measured as messenger RNA concentrations in microarray experiments. Steady-state gene expression data are obtained from measurements of the variations in expression activity following the application of small perturbations to equilibrium states in genetic perturbation experiments. In this paper, the least absolute shrinkage and selection operator-vector autoregressive (LASSO-VAR) model, originally proposed for the analysis of economic time series data, is adapted to include a stability constraint for the recovery of a sparse and stable regulatory network that describes data obtained from noisy perturbation experiments. The approach is applied to real experimental data obtained for the SOS pathway in Escherichia coli and the cell cycle pathway in the yeast Saccharomyces cerevisiae. Significant features of this method are the ability to recover networks without inputting prior knowledge of the network topology, and the ability to be efficiently applied to large scale networks due to the convex nature of the method.

  7. Identifying gene networks underlying the neurobiology of ethanol and alcoholism.

    Science.gov (United States)

    Wolen, Aaron R; Miles, Michael F

    2012-01-01

    For complex disorders such as alcoholism, identifying the genes linked to these diseases and their specific roles is difficult. Traditional genetic approaches, such as genetic association studies (including genome-wide association studies) and analyses of quantitative trait loci (QTLs) in both humans and laboratory animals already have helped identify some candidate genes. However, because of technical obstacles, such as the small impact of any individual gene, these approaches only have limited effectiveness in identifying specific genes that contribute to complex diseases. The emerging field of systems biology, which allows for analyses of entire gene networks, may help researchers better elucidate the genetic basis of alcoholism, both in humans and in animal models. Such networks can be identified using approaches such as high-throughput molecular profiling (e.g., through microarray-based gene expression analyses) or strategies referred to as genetical genomics, such as the mapping of expression QTLs (eQTLs). Characterization of gene networks can shed light on the biological pathways underlying complex traits and provide the functional context for identifying those genes that contribute to disease development.

  8. Sequencing of Pax6 loci from the elephant shark reveals a family of Pax6 genes in vertebrate genomes, forged by ancient duplications and divergences.

    Directory of Open Access Journals (Sweden)

    Vydianathan Ravi

    Full Text Available Pax6 is a developmental control gene essential for eye development throughout the animal kingdom. In addition, Pax6 plays key roles in other parts of the CNS, olfactory system, and pancreas. In mammals a single Pax6 gene encoding multiple isoforms delivers these pleiotropic functions. Here we provide evidence that the genomes of many other vertebrate species contain multiple Pax6 loci. We sequenced Pax6-containing BACs from the cartilaginous elephant shark (Callorhinchus milii) and found two distinct Pax6 loci. Pax6.1 is highly similar to mammalian Pax6, while Pax6.2 encodes a paired-less Pax6. Using synteny relationships, we identify homologs of this novel paired-less Pax6.2 gene in lizard and in frog, as well as in zebrafish and in other teleosts. In zebrafish two full-length Pax6 duplicates were known previously, originating from the fish-specific genome duplication (FSGD) and expressed in divergent patterns due to paralog-specific loss of cis-elements. We show that teleosts other than zebrafish also maintain duplicate full-length Pax6 loci, but differences in gene and regulatory domain structure suggest that these Pax6 paralogs originate from a more ancient duplication event and are hence renamed as Pax6.3. Sequence comparisons between mammalian and elephant shark Pax6.1 loci highlight the presence of short- and long-range conserved noncoding elements (CNEs). Functional analysis demonstrates the ancient role of long-range enhancers for Pax6 transcription. We show that the paired-less Pax6.2 ortholog in zebrafish is expressed specifically in the developing retina. Transgenic analysis of elephant shark and zebrafish Pax6.2 CNEs with homology to the mouse NRE/Pα internal promoter revealed highly specific retinal expression. Finally, morpholino depletion of zebrafish Pax6.2 resulted in a "small eye" phenotype, supporting a role in retinal development. In summary, our study reveals that the pleiotropic functions of Pax6 in vertebrates are served by

  9. Cloning of human RTEF-1, a transcriptional enhancer factor-1-related gene preferentially expressed in skeletal muscle: evidence for an ancient multigene family.

    Science.gov (United States)

    Stewart, A F; Richard, C W; Suzow, J; Stephan, D; Weremowicz, S; Morton, C C; Adra, C N

    1996-10-01

    Transcriptional Enhancer Factor-1 (TEF-1) is a transcription factor required for cardiac muscle gene activation. Since ablation of TEF-1 does not abolish cardiac gene expression, we sought to identify a human gene related to TEF-1 (RTEF-1) that might also participate in cardiac gene regulation. A human heart cDNA library was screened to obtain a full-length RTEF-1 cDNA. Fluorescence in situ hybridization assigned the RTEF-1 gene to chromosome 12p13.2-p13.3. In contrast, PCR screening of human/rodent cell hybrid panels identified TEF-1 on chromosome 11p15.2, between D11S1315 and D11S1334, extending a region of known synteny between human chromosomes 11 and 12 and arguing for an ancient divergence between these two closely related genes. Northern blot analysis revealed a striking similarity in the tissue distribution of RTEF-1 and TEF-1 mRNAs; skeletal muscle showed the highest abundance of both mRNAs, with lower levels detected in pancreas, placenta, and heart. Phylogenetic analysis of all known TEF-1-related proteins identified human RTEF-1 as one of four vertebrate members of this multigene family and further suggests that these genes diverged in the earliest metazoan ancestors.

  10. Modeling Gene Networks in Saccharomyces cerevisiae Based on Gene Expression Profiles

    Directory of Open Access Journals (Sweden)

    Yulin Zhang

    2015-01-01

    Full Text Available Detailed and innovative analysis of gene regulatory network structures may reveal novel insights into biological mechanisms. Here we study how the gene regulatory network in Saccharomyces cerevisiae can differ under aerobic and anaerobic conditions. To achieve this, we discretized the gene expression profiles and calculated the self-entropy of down- and upregulation of gene expression as well as the joint entropy. Based on these quantities the uncertainty coefficient was calculated for each gene triplet, following which separate gene logic networks were constructed for the aerobic and anaerobic conditions. Four structural parameters (average degree, average clustering coefficient, average shortest path, and average betweenness) were used to compare the structure of the corresponding aerobic and anaerobic logic networks. Five genes were identified as putative key components of the two energy metabolisms. Furthermore, community analysis using the Newman fast algorithm revealed two significant communities for the aerobic but only one for the anaerobic network. DAVID Gene Functional Classification suggests that, under aerobic conditions, one such community reflects the cell cycle and cell replication, while the other one is linked to the mitochondrial respiratory chain function.
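
    The entropy-based quantities mentioned in this abstract can be illustrated with a small sketch. It computes the standard uncertainty coefficient U(X|Y) = (H(X) + H(Y) - H(X,Y)) / H(X) for two discretized expression profiles. The study works with gene triplets; the pairwise form and the toy profiles here are simplifying assumptions.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())

def uncertainty_coefficient(x, y):
    """Fraction of the entropy of x that is explained by knowing y."""
    hx = entropy(x)
    if hx == 0:
        return 0.0  # x is constant, so there is nothing to explain
    joint = entropy(list(zip(x, y)))
    return (hx + entropy(y) - joint) / hx

# Hypothetical discretized profiles (1 = upregulated, 0 = downregulated).
gene_x = [0, 0, 1, 1]
gene_y = [0, 0, 1, 1]  # identical to gene_x
gene_z = [0, 1, 0, 1]  # independent of gene_x in this sample

u_xy = uncertainty_coefficient(gene_x, gene_y)
u_xz = uncertainty_coefficient(gene_x, gene_z)
```

    A coefficient near 1 means that one profile almost fully determines the other, and a coefficient near 0 means the profiles carry no information about each other; a threshold on this value could then be used to decide which links enter the network.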

  11. Yin and Yang of disease genes and death genes between reciprocally scale-free biological networks.

    Science.gov (United States)

    Han, Hyun Wook; Ohn, Jung Hun; Moon, Jisook; Kim, Ju Han

    2013-11-01

    Biological networks often show a scale-free topology with node degree following a power-law distribution. Lethal genes tend to form functional hubs, whereas non-lethal disease genes are located at the periphery. Uni-dimensional analyses, however, are flawed. We created and investigated two distinct scale-free networks; a protein-protein interaction (PPI) and a perturbation sensitivity network (PSN). The hubs of both networks exhibit a low molecular evolutionary rate (P genes but not with disease genes, whereas PSN hubs are highly enriched with disease genes and drug targets but not with lethal genes. PPI hub genes are enriched with essential cellular processes, but PSN hub genes are enriched with environmental interaction processes, having more TATA boxes and transcription factor binding sites. It is concluded that biological systems may balance internal growth signaling and external stress signaling by unifying the two opposite scale-free networks that are seemingly opposite to each other but work in concert between death and disease.

  12. Antagonistic roles for KNOX1 and KNOX2 genes in patterning the land plant body plan following an ancient gene duplication.

    Directory of Open Access Journals (Sweden)

    Chihiro Furumizu

    2015-02-01

    Full Text Available Neofunctionalization following gene duplication is thought to be one of the key drivers in generating evolutionary novelty. A gene duplication in a common ancestor of land plants produced two classes of KNOTTED-like TALE homeobox genes, class I (KNOX1) and class II (KNOX2). KNOX1 genes are linked to tissue proliferation and maintenance of meristematic potentials of flowering plant and moss sporophytes, and modulation of KNOX1 activity is implicated in contributing to leaf shape diversity of flowering plants. While KNOX2 function has been shown to repress the gametophytic (haploid) developmental program during moss sporophyte (diploid) development, little is known about KNOX2 function in flowering plants, hindering syntheses regarding the relationship between two classes of KNOX genes in the context of land plant evolution. Arabidopsis plants harboring loss-of-function KNOX2 alleles exhibit impaired differentiation of all aerial organs and have highly complex leaves, phenocopying gain-of-function KNOX1 alleles. Conversely, gain-of-function KNOX2 alleles in conjunction with a presumptive heterodimeric BELL TALE homeobox partner suppressed SAM activity in Arabidopsis and reduced leaf complexity in the Arabidopsis relative Cardamine hirsuta, reminiscent of loss-of-function KNOX1 alleles. Little evidence was found indicative of epistasis or mutual repression between KNOX1 and KNOX2 genes. KNOX proteins heterodimerize with BELL TALE homeobox proteins to form functional complexes, and contrary to earlier reports based on in vitro and heterologous expression, we find high selectivity between KNOX and BELL partners in vivo. Thus, KNOX2 genes confer opposing activities rather than redundant roles with KNOX1 genes, and together they act to direct the development of all above-ground organs of the Arabidopsis sporophyte. We infer that following the KNOX1/KNOX2 gene duplication in an ancestor of land plants, neofunctionalization led to evolution of antagonistic

  13. Visualizing Gene - Interactions within the Rice and Maize Network

    Science.gov (United States)

    Sampong, A.; Feltus, A.; Smith, M.

    2014-12-01

    The purpose of this research was to design a simpler visualization tool for comparing or viewing gene interaction graphs in systems biology. This visualization tool makes it easier for a researcher to visualize the biological metadata of a plant and interact with the graph on a webpage. Currently available visualization software like Cytoscape and Walrus is difficult to interact with and does not scale effectively for large data sets, limiting the ability to visualize interactions within a biological system. The visualization tool developed is useful for viewing and interpreting the dataset of a gene interaction network. The graph layout drawn by this visualization tool is an improvement on the previous method of comparing lines of genes in two separate data files: it is now possible to see the layout of the gene networks and how the two systems are related. The graph layout presented by the visualization tool draws a graph of the sample rice and maize gene networks, linking the common genes found in both plants and highlighting the functions served by common genes from each plant. The success of this visualization tool will enable Dr. Feltus to continue his investigations and draw conclusions on the biological evolution of the sorghum plant as well. REU Funded by NSF ACI Award 1359223 Vetria L. Byrd, PI

  14. Transcriptional control in the segmentation gene network of Drosophila.

    Directory of Open Access Journals (Sweden)

    Mark D Schroeder

    2004-09-01

    The segmentation gene network of Drosophila consists of maternal and zygotic factors that generate, by transcriptional (cross-)regulation, expression patterns of increasing complexity along the anterior-posterior axis of the embryo. Using known binding site information for maternal and zygotic gap transcription factors, the computer algorithm Ahab recovers known segmentation control elements (modules) with excellent success and predicts many novel modules within the network and genome-wide. We show that novel module predictions are highly enriched in the network and typically clustered proximal to the promoter, not only upstream, but also in intronic space and downstream. When placed upstream of a reporter gene, they consistently drive patterned blastoderm expression, in most cases faithfully producing one or more pattern elements of the endogenous gene. Moreover, we demonstrate for the entire set of known and newly validated modules that Ahab's prediction of binding sites correlates well with the expression patterns produced by the modules, revealing basic rules governing their composition. Specifically, we show that maternal factors consistently act as activators and that gap factors act as repressors, except for the bimodal factor Hunchback. Our data suggest a simple context-dependent rule for its switch from repressive to activating function. Overall, the composition of modules appears well fitted to the spatiotemporal distribution of their positive and negative input factors. Finally, by comparing Ahab predictions with different categories of transcription factor input, we confirm the global regulatory structure of the segmentation gene network, but find odd skipped behaving like a primary pair-rule gene. The study expands our knowledge of the segmentation gene network by increasing the number of experimentally tested modules by 50%. For the first time, the entire set of validated modules is analyzed for binding site composition under a

  15. Functional analysis of prognostic gene expression network genes in metastatic breast cancer models.

    Directory of Open Access Journals (Sweden)

    Thomas R Geiger

    Identification of conserved co-expression networks is a useful tool for clustering groups of genes enriched for common molecular or cellular functions [1]. The relative importance of genes within networks can frequently be inferred by the degree of connectivity, with those displaying high connectivity being significantly more likely to be associated with specific molecular functions [2]. Previously we utilized cross-species network analysis to identify two network modules that were significantly associated with distant metastasis free survival in breast cancer. Here, we validate one of the highly connected genes as a metastasis associated gene. Tpx2, the most highly connected gene within a proliferation network specifically prognostic for estrogen receptor positive (ER+) breast cancers, enhances metastatic disease, but in a tumor autonomous, proliferation-independent manner. Histologic analysis suggests instead that variation of TPX2 levels within disseminated tumor cells may influence the transition from dormant to actively proliferating cells in the secondary site. These results support the co-expression network approach for identification of new metastasis-associated genes to provide new information regarding the etiology of breast cancer progression and metastatic disease.

  16. Learning Gene Regulatory Networks Computationally from Gene Expression Data Using Weighted Consensus

    KAUST Repository

    Fujii, Chisato

    2015-04-16

    Gene regulatory networks analyze the relationships between genes, allowing us to understand the gene regulatory interactions in systems biology. Gene expression data from microarray experiments is used to obtain the gene regulatory networks. However, the microarray data is discrete, noisy and non-linear, which makes learning the networks a challenging problem, and existing gene network inference methods do not give consistent results. The current state-of-the-art study uses the average-ranking-based consensus method to combine and average the ranked predictions from individual methods. However, each individual method has an equal contribution to the consensus prediction. We have developed a linear programming-based consensus approach which uses learned weights from linear programming among individual methods such that the methods have different weights depending on their performance. Our result reveals that assigning different weights to individual methods rather than giving them equal weights improves the performance of the consensus. The linear programming-based consensus method was evaluated and had the best performance on in silico and Saccharomyces cerevisiae networks, and the second best on the Escherichia coli network, where it was outperformed only by the Inferelator Pipeline method, which gives inconsistent results across a wide range of microarray data sets.

  17. Listening to the noise: random fluctuations reveal gene network parameters

    Energy Technology Data Exchange (ETDEWEB)

    Munsky, Brian [Los Alamos National Laboratory; Khammash, Mustafa [UCSB

    2009-01-01

    The cellular environment is abuzz with noise. The origin of this noise is attributed to the inherent random motion of reacting molecules that take part in gene expression and post expression interactions. In this noisy environment, clonal populations of cells exhibit cell-to-cell variability that frequently manifests as significant phenotypic differences within the cellular population. The stochastic fluctuations in cellular constituents induced by noise can be measured and their statistics quantified. We show that these random fluctuations carry within them valuable information about the underlying genetic network. Far from being a nuisance, the ever-present cellular noise acts as a rich source of excitation that, when processed through a gene network, carries its distinctive fingerprint that encodes a wealth of information about that network. We demonstrate that in some cases the analysis of these random fluctuations enables the full identification of network parameters, including those that may otherwise be difficult to measure. This establishes a potentially powerful approach for the identification of gene networks and offers a new window into the workings of these networks.

  18. Boolean networks using the chi-square test for inferring large-scale gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Lee Jae K

    2007-02-01

    Background: Boolean network (BN) modeling is a commonly used method for constructing gene regulatory networks from time series microarray data. However, its major drawback is that its computation time is very high or often impractical for constructing large-scale gene networks. We propose a variable selection method that not only reduces BN computation times significantly but also obtains optimal network constructions by using chi-square statistics for testing the independence in contingency tables. Results: Both the computation time and accuracy of the network structures estimated by the proposed method are compared with those of the original BN methods on simulated and real yeast cell cycle microarray gene expression data sets. Our results reveal that the proposed chi-square testing (CST)-based BN method significantly improves the computation time, while its ability to identify all the true network mechanisms was effectively the same as that of full-search BN methods. The proposed BN algorithm is approximately 70.8 and 7.6 times faster than the original BN algorithm when the error sizes of the Best-Fit Extension problem are 0 and 1, respectively. Further, the false positive error rate of the proposed CST-based BN algorithm tends to be less than that of the original BN. Conclusion: The CST-based BN method dramatically improves the computation time of the original BN algorithm. Therefore, it can efficiently infer large-scale gene regulatory network mechanisms.
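
    The variable selection idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes binarized expression vectors, a 2x2 contingency table per candidate regulator, and a fixed cutoff of 3.841 (the chi-square critical value for 1 degree of freedom at the 0.05 level). The function names and the gene labels in the usage note are hypothetical.

```python
def chi_square_2x2(x, y):
    """Pearson chi-square statistic for independence of two 0/1 vectors."""
    if len(x) != len(y) or not x:
        raise ValueError("vectors must be non-empty and of equal length")
    n = len(x)
    # Observed counts: obs[a][b] = number of samples with x == a and y == b.
    obs = [[0, 0], [0, 0]]
    for a, b in zip(x, y):
        obs[a][b] += 1
    row = [sum(obs[0]), sum(obs[1])]
    col = [obs[0][0] + obs[1][0], obs[0][1] + obs[1][1]]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected > 0:  # a constant vector contributes nothing
                stat += (obs[i][j] - expected) ** 2 / expected
    return stat


# Assumed cutoff: chi-square critical value, 1 degree of freedom, alpha = 0.05.
CRITICAL_1DF_05 = 3.841


def select_candidate_regulators(states, target, threshold=CRITICAL_1DF_05):
    """Keep genes whose state is not independent of the target's state.

    states: dict mapping gene name -> list of 0/1 values at time t.
    target: list of 0/1 values of the target gene at time t + 1
            (the caller is assumed to have aligned the time shift).
    """
    return [g for g, s in states.items()
            if chi_square_2x2(s, target) > threshold]
```

    For example, with states = {"A": [0, 0, 0, 0, 1, 1, 1, 1], "B": [0, 1, 0, 1, 0, 1, 0, 1]} and target = [0, 0, 0, 0, 1, 1, 1, 1], only "A" passes the filter, so a subsequent Boolean-function search would only need to consider "A" as an input.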

  19. Network Analysis of Human Genes Influencing Susceptibility to Mycobacterial Infections.

    Directory of Open Access Journals (Sweden)

    Ettie M Lipner

    Tuberculosis and nontuberculous mycobacterial infections constitute a high burden of pulmonary disease in humans, resulting in over 1.5 million deaths per year. Building on the premise that genetic factors influence the instance, progression, and defense of infectious disease, we undertook a systems biology approach to investigate relationships among genetic factors that may play a role in increased susceptibility or control of mycobacterial infections. We combined literature and database mining with network analysis and pathway enrichment analysis to examine genes, pathways, and networks involved in the human response to Mycobacterium tuberculosis and nontuberculous mycobacterial infections. This approach allowed us to examine functional relationships among reported genes, and to identify novel genes and enriched pathways that may play a role in mycobacterial susceptibility or control. Our findings suggest that the primary pathways and genes influencing mycobacterial infection control involve an interplay between innate and adaptive immune proteins and pathways. Signaling pathways involved in autoimmune disease were significantly enriched as revealed in our networks. Mycobacterial disease susceptibility networks were also examined within the context of gene-chemical relationships, in order to identify putative drugs and nutrients with potential beneficial immunomodulatory or anti-mycobacterial effects.

  20. Ground rules of the pluripotency gene regulatory network.

    KAUST Repository

    Li, Mo

    2017-01-03

    Pluripotency is a state that exists transiently in the early embryo and, remarkably, can be recapitulated in vitro by deriving embryonic stem cells or by reprogramming somatic cells to become induced pluripotent stem cells. The state of pluripotency, which is stabilized by an interconnected network of pluripotency-associated genes, integrates external signals and exerts control over the decision between self-renewal and differentiation at the transcriptional, post-transcriptional and epigenetic levels. Recent evidence of alternative pluripotency states indicates the regulatory flexibility of this network. Insights into the underlying principles of the pluripotency network may provide unprecedented opportunities for studying development and for regenerative medicine.

  1. Random matrix analysis of localization properties of gene coexpression network.

    Science.gov (United States)

    Jalan, Sarika; Solymosi, Norbert; Vattay, Gábor; Li, Baowen

    2010-04-01

    We analyze a gene coexpression network under the random matrix theory framework. The nearest-neighbor spacing distribution of the adjacency matrix of this network follows Gaussian orthogonal statistics of random matrix theory (RMT). The spectral rigidity test follows the random matrix prediction for a certain range and deviates afterwards. Eigenvector analysis of the network using the inverse participation ratio (IPR) suggests that the statistics of the bulk of the eigenvalues of the network is consistent with those of the real symmetric random matrix, whereas a few eigenvalues are localized. Based on these IPR calculations, we can divide eigenvalues in three sets: (a) The nondegenerate part that follows RMT. (b) The nondegenerate part, at both ends and at intermediate eigenvalues, which deviates from RMT and is expected to contain information about important nodes in the network. (c) The degenerate part with zero eigenvalue, which fluctuates around the RMT-predicted value. We identify nodes corresponding to the dominant modes of the corresponding eigenvectors and analyze their structural properties.
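
    The inverse participation ratio used above has a compact standard definition: for an eigenvector with components v_i, it is the sum of |v_i|^4 divided by the square of the sum of |v_i|^2. The sketch below is only an illustration of that definition, not the authors' code; the function name is hypothetical and the eigenvector is assumed to be supplied as a plain list of numbers.

```python
def inverse_participation_ratio(vec):
    """IPR of a vector: sum(|v_i|^4) / (sum(|v_i|^2))^2.

    The value is 1 for a vector concentrated on a single node and 1/N for
    a vector spread evenly over N nodes, so large values flag localization.
    """
    norm_sq = sum(abs(c) ** 2 for c in vec)
    if norm_sq == 0:
        raise ValueError("IPR is undefined for the zero vector")
    return sum(abs(c) ** 4 for c in vec) / norm_sq ** 2
```

    A vector such as [1, 0, 0, 0] gives an IPR of 1 (fully localized), while [1, 1, 1, 1] gives 0.25, that is, 1/N for N = 4 (fully delocalized). Eigenvectors with unusually high IPR point to a small set of nodes that dominate the corresponding mode.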

  2. The underlying molecular and network level mechanisms in the evolution of robustness in gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Mario Pujato

    Gene regulatory networks show robustness to perturbations. Previous works identified robustness as an emergent property of gene network evolution but the underlying molecular mechanisms are poorly understood. We used a multi-tier modeling approach that integrates molecular sequence and structure information with network architecture and population dynamics. Structural models of transcription factor-DNA complexes are used to estimate relative binding specificities. In this model, mutations in the DNA cause changes on two levels: (a) at the sequence level in individual binding sites (modulating binding specificity), and (b) at the network level (creating and destroying binding sites). We used this model to dissect the underlying mechanisms responsible for the evolution of robustness in gene regulatory networks. Results suggest that in sparse architectures (represented by short promoters), a mixture of local-sequence and network-architecture level changes are exploited. At the local-sequence level, robustness evolves by decreasing the probabilities of both the destruction of existent and generation of new binding sites. Meanwhile, in highly interconnected architectures (represented by long promoters), robustness evolves almost entirely via network level changes, deleting and creating binding sites that modify the network architecture.

  3. Internal signal stochastic resonance of a synthetic gene network

    Institute of Scientific and Technical Information of China (English)

    WANG Zhiwei; HOU Zhonghuai; XIN Houwen

    2005-01-01

    The dynamic behavior of a synthetic gene network controlled by random noise is investigated using a recently proposed model. The phenomena of noise induced oscillation (NIO) of the protein concentrations and internal signal stochastic resonance (SR) are studied by computer simulation. We also find that there exists an optimal noise intensity that most favors the occurrence of effective oscillation (EO). Finally, we discuss the potential constructive roles of SR in gene expression systems.

  4. Propagation of genetic variation in gene regulatory networks.

    Science.gov (United States)

    Plahte, Erik; Gjuvsland, Arne B; Omholt, Stig W

    2013-08-01

    A future quantitative genetics theory should link genetic variation to phenotypic variation in a causally cohesive way based on how genes actually work and interact. We provide a theoretical framework for predicting and understanding the manifestation of genetic variation in haploid and diploid regulatory networks with arbitrary feedback structures and intra-locus and inter-locus functional dependencies. Using results from network and graph theory, we define propagation functions describing how genetic variation in a locus is propagated through the network, and show how their derivatives are related to the network's feedback structure. Similarly, feedback functions describe the effect of genotypic variation of a locus on itself, either directly or mediated by the network. A simple sign rule relates the sign of the derivative of the feedback function of any locus to the feedback loops involving that particular locus. We show that the sign of the phenotypically manifested interaction between alleles at a diploid locus is equal to the sign of the dominant feedback loop involving that particular locus, in accordance with recent results for a single locus system. Our results provide tools by which one can use observable equilibrium concentrations of gene products to disclose structural properties of the network architecture. Our work is a step towards a theory capable of explaining the pleiotropy and epistasis features of genetic variation in complex regulatory networks as functions of regulatory anatomy and functional location of the genetic variation.

  5. MiRTargetLink--miRNAs, Genes and Interaction Networks.

    Science.gov (United States)

    Hamberg, Maarten; Backes, Christina; Fehlmann, Tobias; Hart, Martin; Meder, Benjamin; Meese, Eckart; Keller, Andreas

    2016-04-14

    Information on miRNA targeting genes is growing rapidly. For high-throughput experiments, but also for targeted analyses of few genes or miRNAs, easy analysis with concise representation of results facilitates the work of life scientists. We developed miRTargetLink, a tool for automating respective analysis procedures that are frequently applied. Input to the web-based solution is either a single gene or a single miRNA; sets of genes or miRNAs can also be entered. Validated and predicted targets are extracted from databases and an interaction network is presented. Users can select whether predicted targets, experimentally validated targets with strong or weak evidence, or combinations of those are considered. Central genes or miRNAs are highlighted and users can navigate through the network interactively. To discover the most relevant biochemical processes influenced by the target network, gene set analysis and miRNA set analysis are integrated. As a showcase for miRTargetLink, we analyze targets of five cardiac miRNAs. miRTargetLink is freely available without restrictions at www.ccb.uni-saarland.de/mirtargetlink.

  6. Identifying glioblastoma gene networks based on hypergeometric test analysis.

    Directory of Open Access Journals (Sweden)

    Vasileios Stathias

    Patient specific therapy is emerging as an important possibility for many cancer patients. However, to identify such therapies it is essential to determine the genomic and transcriptional alterations present in one tumor relative to control samples. This presents a challenge since use of a single sample precludes many standard statistical analysis techniques. We reasoned that one means of addressing this issue is by comparing transcriptional changes in one tumor with those observed in a large cohort of patients analyzed by The Cancer Genome Atlas (TCGA). To test this directly, we devised a bioinformatics pipeline to identify differentially expressed genes in tumors resected from patients suffering from the most common malignant adult brain tumor, glioblastoma (GBM). We performed RNA sequencing on tumors from individual GBM patients and filtered the results through the TCGA database in order to identify possible gene networks that are overrepresented in GBM samples relative to controls. Importantly, we demonstrate that hypergeometric-based analysis of gene pairs identifies gene networks that validate experimentally. These studies identify a putative workflow for uncovering differentially expressed patient specific genes and gene networks for GBM and other cancers.
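
    The abstract does not spell out how its hypergeometric analysis of gene pairs is set up, so the following is only a generic sketch of a hypergeometric over-representation test, under the assumption that one asks how surprising it is to see at least k "hits" in a sample of n items drawn from a population of N items that contains K hits in total. The function name and the parameter names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: population size, K: number of 'hit' items in the population,
    n: sample size, k: number of hits observed in the sample.
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("require 0 <= K <= N and 0 <= n <= N")
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible outcomes drop out of the sum automatically.
    favorable = sum(comb(K, i) * comb(N - K, n - i)
                    for i in range(max(k, 0), min(K, n) + 1))
    return favorable / total
```

    As a toy case, hypergeom_upper_tail(5, 10, 5, 5) equals 1/252: drawing all five hits in five draws from ten items is the single most extreme outcome. Small tail probabilities indicate over-representation relative to chance; in practice they would still need correction for multiple testing.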

  7. Lists2Networks: Integrated analysis of gene/protein lists

    Directory of Open Access Journals (Sweden)

    Ma'ayan Avi

    2010-02-01

    Background: Systems biologists are faced with the difficulty of analyzing results from large-scale studies that profile the activity of many genes, RNAs and proteins, applied in different experiments, under different conditions, and reported in different publications. To address this challenge it is desirable to compare the results from different related studies such as mRNA expression microarrays, genome-wide ChIP-X, RNAi screens, proteomics and phosphoproteomics experiments in a coherent global framework. In addition, linking high-content multilayered experimental results with prior biological knowledge can be useful for identifying functional themes and forming novel hypotheses. Results: We present Lists2Networks, a web-based system that allows users to upload lists of mammalian genes/proteins onto a server-based program for integrated analysis. The system includes web-based tools to manipulate lists with different set operations, to expand lists using existing mammalian networks of protein-protein interactions, co-expression correlation, or background knowledge co-annotation correlation, as well as to apply gene-list enrichment analyses against many gene-list libraries of prior biological knowledge such as pathways, gene ontology terms, kinase-substrate, microRNA-mRNA, and protein-protein interactions, metabolites, and protein domains. Such analyses can be applied to several lists at once against many prior knowledge libraries of gene-lists associated with specific annotations. The system also contains features that allow users to export networks and share lists with other users of the system. Conclusions: Lists2Networks is a user friendly web-based software system expected to significantly ease the computational analysis process for experimental systems biologists employing high-throughput experiments at multiple layers of regulation. The system is freely available at http://www.lists2networks.org.

  8. Gene-network analysis identifies susceptibility genes related to glycobiology in autism.

    Directory of Open Access Journals (Sweden)

    Bert van der Zwaag

    The recent identification of copy-number variation in the human genome has opened up new avenues for the discovery of positional candidate genes underlying complex genetic disorders, especially in the field of psychiatric disease. One major challenge that remains is pinpointing the susceptibility genes in the multitude of disease-associated loci. This challenge may be tackled by reconstruction of functional gene-networks from the genes residing in these loci. We applied this approach to autism spectrum disorder (ASD), and identified the copy-number changes in the DNA of 105 ASD patients and 267 healthy individuals with Illumina Humanhap300 Beadchips. Subsequently, we used a human reconstructed gene-network, Prioritizer, to rank candidate genes in the segmental gains and losses in our autism cohort. This analysis highlighted several candidate genes already known to be mutated in cognitive and neuropsychiatric disorders, including RAI1, BRD1, and LARGE. In addition, the LARGE gene was part of a sub-network of seven genes functioning in glycobiology, present in seven copy-number changes specifically identified in autism patients with limited co-morbidity. Three of these seven copy-number changes were de novo in the patients. In autism patients with a complex phenotype and healthy controls no such sub-network was identified. An independent systematic analysis of 13 published autism susceptibility loci supports the involvement of genes related to glycobiology as we also identified the same or similar genes from those loci. Our findings suggest that the occurrence of genomic gains and losses of genes associated with glycobiology are important contributors to the development of ASD.

  9. Annotation of gene function in citrus using gene expression information and co-expression networks

    OpenAIRE

    Wong, Darren CJ; Sweetman, Crystal; Ford, Christopher M.

    2014-01-01

    Background The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world’s most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a “guilt-by-association” principle whereby genes encoding proteins involved in similar and/or related bi...

  10. Graphlet Based Metrics for the Comparison of Gene Regulatory Networks

    Science.gov (United States)

    Martin, Alberto J. M.; Dominguez, Calixto; Contreras-Riquelme, Sebastián; Holmes, David S.; Perez-Acle, Tomas

    2016-01-01

    Understanding the control of gene expression remains one of the main challenges in the post-genomic era. Accordingly, a plethora of methods exists to identify variations in gene expression levels. These variations underlie almost all relevant biological phenomena, including disease and adaptation to environmental conditions. However, computational tools to identify how regulation changes are scarce. Regulation of gene expression is usually depicted in the form of a gene regulatory network (GRN). Structural changes in a GRN over time and conditions represent variations in the regulation of gene expression. Like other biological networks, GRNs are composed of basic building blocks called graphlets. As a consequence, two new metrics based on graphlets are proposed in this work: REConstruction Rate (REC) and REC Graphlet Degree (RGD). REC determines the rate of graphlet similarity between different states of a network and RGD identifies the subset of nodes with the highest topological variation. In other words, RGD discerns how the GRN was rewired. REC and RGD were used to compare the local structure of nodes in condition-specific GRNs obtained from gene expression data of Escherichia coli, forming biofilms and cultured in suspension. According to our results, most of the network local structure remains unaltered in the two compared conditions. Nevertheless, changes reported by RGD necessarily imply that a different cohort of regulators (i.e. transcription factors (TFs)) appear on the scene, shedding light on how the regulation of gene expression occurs when E. coli transits from suspension to biofilm. Consequently, we propose that both metrics REC and RGD should be adopted as a quantitative approach to conduct differential analyses of GRNs. A tool that implements both metrics is available as an on-line web server (http://dlab.cl/loto). PMID:27695050

  11. Combination of Neuro-Fuzzy Network Models with Biological Knowledge for Reconstructing Gene Regulatory Networks

    Institute of Scientific and Technical Information of China (English)

    Guixia Liu; Lei Liu; Chunyu Liu; Ming Zheng; Lanying Su; Chunguang Zhou

    2011-01-01

    Inferring gene regulatory networks from large-scale expression data is an important topic in both cellular systems and computational biology. The inference of regulators might be the core factor for understanding actual regulatory conditions in gene regulatory networks, especially when strong regulators do work significantly. In this paper, we propose a novel approach based on combining neuro-fuzzy network models with biological knowledge to infer strong regulators and interrelated fuzzy rules. The hybrid neuro-fuzzy architecture can not only infer the fuzzy rules, which are suitable for describing the regulatory conditions in regulatory networks, but also explain the meaning of nodes and weight value in the neural network. It can get useful rules automatically without factitious judgments. At the same time, it does not add recursive layers to the model, and the model can also strengthen the relationships among genes and reduce calculation. We use the proposed approach to reconstruct a partial gene regulatory network of yeast. The results show that this approach can work effectively.

  12. GeneNetwork: framework for web-based genetics

    NARCIS (Netherlands)

    Sloan, Zachary; Arends, Danny; Broman, Karl W.; Centeno, Arthur; Furlotte, Nicholas; Nijveen, H.; Yan, Lei; Zhou, Xiang; Williams, Robert W.; Prins, Pjotr

    2016-01-01

    GeneNetwork (GN) is a free and open source (FOSS) framework for web-based genetics that can be deployed anywhere. GN allows biologists to upload high-throughput experimental data, such as expression data from microarrays and RNA-seq, and also `classic' phenotypes, such as disease phenotypes. These p

  13. A gene network engineering platform for lactic acid bacteria.

    Science.gov (United States)

    Kong, Wentao; Kapuganti, Venkata S; Lu, Ting

    2016-02-29

    Recent developments in synthetic biology have positioned lactic acid bacteria (LAB) as a major class of cellular chassis for applications. To achieve the full potential of LAB, one fundamental prerequisite is the capacity for rapid engineering of complex gene networks, such as natural biosynthetic pathways and multicomponent synthetic circuits, into which cellular functions are encoded. Here, we present a synthetic biology platform for rapid construction and optimization of large-scale gene networks in LAB. The platform involves a copy-controlled shuttle for hosting target networks and two associated strategies that enable efficient genetic editing and phenotypic validation. By using a nisin biosynthesis pathway and its variants as examples, we demonstrated multiplex, continuous editing of small DNA parts, such as ribosome-binding sites, as well as efficient manipulation of large building blocks such as genes and operons. To showcase the platform, we applied it to expand the phenotypic diversity of the nisin pathway by quickly generating a library of 63 pathway variants. We further demonstrated its utility by altering the regulatory topology of the nisin pathway for constitutive bacteriocin biosynthesis. This work demonstrates the feasibility of rapid and advanced engineering of gene networks in LAB, fostering their applications in biomedicine and other areas.

  14. Network analysis of genes and their association with diseases.

    Science.gov (United States)

    Kontou, Panagiota I; Pavlopoulou, Athanasia; Dimou, Niki L; Pavlopoulos, Georgios A; Bagos, Pantelis G

    2016-09-15

    A plethora of network-based approaches within the Systems Biology universe have been applied, to date, to investigate the underlying molecular mechanisms of various human diseases. In the present study, we perform a bipartite, topological and clustering graph analysis in order to gain a better understanding of the relationships between human genetic diseases and the relationships between the genes that are implicated in them. For this purpose, disease-disease and gene-gene networks were constructed from combined gene-disease association networks. The latter were created by collecting and integrating data from three diverse resources, each one with different content covering from rare monogenic disorders to common complex diseases. This data pluralism enabled us to uncover important associations between diseases with unrelated phenotypic manifestations but with common genetic origin. For our analysis, the topological attributes and the functional implications of the individual networks were taken into account and are briefly discussed. We believe that some observations of this study could advance our understanding regarding the etiology of a disease with distinct pathological manifestations, and simultaneously provide the springboard for the development of preventive and therapeutic strategies and its underlying genetic mechanisms.
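
    Constructing disease-disease and gene-gene networks from a gene-disease association network is a bipartite projection. The sketch below shows one common way to do this, assuming that two diseases are linked when they share at least one gene (and vice versa) and that the edge weight is the number of shared neighbors. The weighting scheme, the function name and the identifiers in the usage note are assumptions, not details taken from the study.

```python
from collections import defaultdict
from itertools import combinations


def project_bipartite(associations):
    """Project (gene, disease) pairs onto disease-disease and gene-gene networks.

    Returns two dicts mapping a sorted pair (a, b) to the number of
    shared genes (for diseases) or shared diseases (for genes).
    """
    diseases_by_gene = defaultdict(set)
    genes_by_disease = defaultdict(set)
    for gene, disease in associations:
        diseases_by_gene[gene].add(disease)
        genes_by_disease[disease].add(gene)

    def count_pairs(groups):
        # Every pair of items appearing in the same group shares one neighbor.
        weights = defaultdict(int)
        for members in groups.values():
            for a, b in combinations(sorted(members), 2):
                weights[(a, b)] += 1
        return dict(weights)

    disease_net = count_pairs(diseases_by_gene)
    gene_net = count_pairs(genes_by_disease)
    return disease_net, gene_net
```

    With the hypothetical associations [("G1", "D1"), ("G1", "D2"), ("G2", "D2"), ("G2", "D3"), ("G3", "D1"), ("G3", "D2")], diseases D1 and D2 are linked with weight 2 because they share genes G1 and G3, while D2 and D3 are linked with weight 1 through G2.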

  15. Learning gene regulatory networks from gene expression data using weighted consensus

    KAUST Repository

    Fujii, Chisato

    2016-08-25

    An accurate determination of the network structure of gene regulatory systems from high-throughput gene expression data is an essential yet challenging step in studying how the expression of endogenous genes is controlled through a complex interaction of gene products and DNA. While numerous methods have been proposed to infer the structure of gene regulatory networks, none of them seem to work consistently over different data sets with high accuracy. A recent study to compare gene network inference methods showed that an average-ranking-based consensus method consistently performs well under various settings. Here, we propose a linear programming-based consensus method for the inference of gene regulatory networks. Unlike the average-ranking-based one, which treats the contribution of each individual method equally, our new consensus method assigns a weight to each method based on its credibility. As a case study, we applied the proposed consensus method on synthetic and real microarray data sets, and compared its performance to that of the average-ranking-based consensus and individual inference methods. Our results show that our weighted consensus method achieves superior performance over the unweighted one, suggesting that assigning weights to different individual methods rather than giving them equal weights improves the accuracy. © 2016 Elsevier B.V.
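
    The difference between the average-ranking-based consensus and a weighted consensus can be sketched as follows. In the study the weights are learned by linear programming; that step is not reproduced here, and the weights in this sketch are simply assumed to be given. The function name, the edge labels and the example weights are hypothetical.

```python
def consensus_ranking(rankings, weights=None):
    """Combine several edge rankings into one by (weighted) mean rank.

    rankings: list of lists, each ordering the same candidate edges from
              most to least confident according to one inference method.
    weights:  one non-negative weight per method; None means equal weights,
              which reduces to the plain average-ranking consensus.
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    if len(weights) != len(rankings):
        raise ValueError("need exactly one weight per ranking")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = {}
    for ranking, w in zip(rankings, weights):
        for position, edge in enumerate(ranking, start=1):
            score[edge] = score.get(edge, 0.0) + w * position / total
    # Lower mean rank is better; ties are broken by edge name.
    return sorted(score, key=lambda e: (score[e], e))
```

    Given three hypothetical method outputs ["a", "b", "c"], ["b", "a", "c"] and ["b", "c", "a"], equal weights put edge "b" first, whereas weights of [4, 1, 1] favor the first method and move edge "a" to the top. This illustrates how trusting the methods unequally can change the consensus prediction.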

  16. Computational gene network study on antibiotic resistance genes of Acinetobacter baumannii.

    Science.gov (United States)

    Anitha, P; Anbarasu, Anand; Ramaiah, Sudha

    2014-05-01

    Multi Drug Resistance (MDR) in Acinetobacter baumannii is one of the major threats for emerging nosocomial infections in hospital environment. Multidrug-resistance in A. baumannii may be due to the implementation of multi-combination resistance mechanisms such as β-lactamase synthesis, Penicillin-Binding Proteins (PBPs) changes, alteration in porin proteins and in efflux pumps against various existing classes of antibiotics. Multiple antibiotic resistance genes are involved in MDR. These resistance genes are transferred through plasmids, which are responsible for the dissemination of antibiotic resistance among Acinetobacter spp. In addition, these resistance genes may also have a tendency to interact with each other or with their gene products. Therefore, it becomes necessary to understand the impact of these interactions in antibiotic resistance mechanism. Hence, our study focuses on protein and gene network analysis on various resistance genes, to elucidate the role of the interacting proteins and to study their functional contribution towards antibiotic resistance. From the search tool for the retrieval of interacting gene/protein (STRING), a total of 168 functional partners for 15 resistance genes were extracted based on the confidence scoring system. The network study was then followed up with functional clustering of associated partners using molecular complex detection (MCODE). Later, we selected eight efficient clusters based on score. Interestingly, the associated protein we identified from the network possessed greater functional similarity with known resistance genes. This network-based approach on resistance genes of A. baumannii could help in identifying new genes/proteins and provide clues on their association in antibiotic resistance.

  17. Construction of coffee transcriptome networks based on gene annotation semantics.

    Science.gov (United States)

    Castillo, Luis F; Galeano, Narmer; Isaza, Gustavo A; Gaitán, Alvaro

    2012-07-24

    Gene annotation is a process that encompasses multiple approaches on the analysis of nucleic acids or protein sequences in order to assign structural and functional characteristics to gene models. When thousands of gene models are being described in an organism genome, construction and visualization of gene networks impose novel challenges in the understanding of complex expression patterns and the generation of new knowledge in genomics research. In order to take advantage of accumulated text data after conventional gene sequence analysis, this work applied semantics in combination with visualization tools to build transcriptome networks from a set of coffee gene annotations. A set of selected coffee transcriptome sequences, chosen by the quality of the sequence comparison reported by Basic Local Alignment Search Tool (BLAST) and Interproscan, were filtered out by coverage, identity, length of the query, and e-values. Meanwhile, term descriptors for molecular biology and biochemistry were obtained along the Wordnet dictionary in order to construct a Resource Description Framework (RDF) using Ruby scripts and Methontology to find associations between concepts. Relationships between sequence annotations and semantic concepts were graphically represented through a total of 6845 oriented vectors, which were reduced to 745 non-redundant associations. A large gene network connecting transcripts by way of relational concepts was created where detailed connections remain to be validated for biological significance based on current biochemical and genetics frameworks. Besides reusing text information in the generation of gene connections and for data mining purposes, this tool development opens the possibility to visualize complex and abundant transcriptome data, and triggers the formulation of new hypotheses in metabolic pathways analysis.

  18. Ancient Egypt

    Science.gov (United States)

    Swamy, Ashwin Balegar

    This thesis involves the development of an interactive GIS (Geographic Information System) based application that gives information about the ancient history of Egypt. The astonishing architecture, the strange burial rituals and the civilization itself were some of the intriguing subjects that motivated me to develop this application. The application is a historical timeline starting from 3100 BC and leading up to 664 BC, focusing on the evolution of the Egyptian dynasties. The tool holds information about some of the famous monuments constructed during that era and about the civilizations that co-existed. It also provides details about the religions followed by the kings and the languages spoken during those periods. The tool is developed using Java, a programming language, and MOJO (Map Objects Java Objects), a product of ESRI (Environmental Systems Research Institute), to create map objects and provide geographic information. Java Swing is used for designing the user interface. HTML (HyperText Markup Language) pages are created to provide the user with more information related to the historic period. CSS (Cascading Style Sheets) and JavaScript are used with HTML5 to achieve a creative display of content. The tool is kept simple and easy for the user to interact with. The tool also includes pictures and videos to give the user a feel for the historic period. The application is built to motivate people to learn more about one of the prominent ancient civilizations of the Mediterranean world.

  19. A gene regulatory network armature for T-lymphocyte specification

    Energy Technology Data Exchange (ETDEWEB)

    Fung, Elizabeth-sharon [Los Alamos National Laboratory]

    2008-01-01

    Choice of a T-lymphoid fate by hematopoietic progenitor cells depends on sustained Notch-Delta signaling combined with tightly-regulated activities of multiple transcription factors. To dissect the regulatory network connections that mediate this process, we have used high-resolution analysis of regulatory gene expression trajectories from the beginning to the end of specification; tests of the short-term Notch-dependence of these gene expression changes; and perturbation analyses of the effects of overexpression of two essential transcription factors, namely PU.1 and GATA-3. Quantitative expression measurements of >50 transcription factor and marker genes have been used to derive the principal components of regulatory change through which T-cell precursors progress from primitive multipotency to T-lineage commitment. Distinct parts of the path reveal separate contributions of Notch signaling, GATA-3 activity, and downregulation of PU.1. Using BioTapestry, the results have been assembled into a draft gene regulatory network for the specification of T-cell precursors and the choice of T as opposed to myeloid dendritic or mast-cell fates. This network also accommodates effects of E proteins and mutual repression circuits of Gfi1 against Egr-2 and of TCF-1 against PU.1 as proposed elsewhere, but requires additional functions that remain unidentified. Distinctive features of this network structure include the intense dose-dependence of GATA-3 effects; the gene-specific modulation of PU.1 activity based on Notch activity; the lack of direct opposition between PU.1 and GATA-3; and the need for a distinct, late-acting repressive function or functions to extinguish stem and progenitor-derived regulatory gene expression.

  20. Characterization of Genes for Beef Marbling Based on Applying Gene Coexpression Network

    Directory of Open Access Journals (Sweden)

    Dajeong Lim

    2014-01-01

    Marbling is an important trait in characterizing beef quality and a major factor in determining the price of beef in the Korean beef market. In particular, marbling is a complex trait and needs a system-level approach for identifying candidate genes related to the trait. To find the candidate genes associated with marbling, we used a weighted gene coexpression network analysis based on the expression values of bovine genes. Hub genes were identified; they were topologically centered, with large degree and BC values in the global network. We performed gene expression analysis to detect candidate genes in M. longissimus with divergent marbling phenotype (marbling scores 2 to 7) using qRT-PCR. The results demonstrate that transmembrane protein 60 (TMEM60) and dihydropyrimidine dehydrogenase (DPYD) are associated with increasing marbling fat. We suggest that the network-based approach in livestock may be an important method for analyzing the complex effects of candidate genes associated with complex traits like marbling or tenderness.

  1. Automated Identification of Core Regulatory Genes in Human Gene Regulatory Networks.

    Directory of Open Access Journals (Sweden)

    Vipin Narang

    Human gene regulatory networks (GRNs) can be difficult to interpret due to a tangle of edges interconnecting thousands of genes. We constructed a general human GRN from extensive transcription factor and microRNA target data obtained from public databases. In a subnetwork of this GRN that is active during estrogen stimulation of MCF-7 breast cancer cells, we benchmarked automated algorithms for identifying core regulatory genes (transcription factors and microRNAs). Among these algorithms, we identified K-core decomposition, PageRank and betweenness centrality as the most effective for discovering core regulatory genes in the network, evaluated on the basis of previously known roles of these genes in MCF-7 biology as well as their ability to explain the up or down expression status of up to 70% of the remaining genes. Finally, we validated the use of the K-core algorithm for organizing the GRN into an easier-to-interpret layered hierarchy in which more influential regulatory genes percolate towards the inner layers. The integrated human gene and miRNA network and the software used in this study are provided as supplementary materials (S1 Data) accompanying this manuscript.
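
    The layered hierarchy produced by K-core decomposition can be illustrated with a short sketch. It assumes the network is treated as undirected and unweighted, which the abstract does not state, and it uses a simple quadratic-time peeling loop rather than whatever implementation the authors used. The helper names and the toy edges are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map (node -> set of neighbours)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def core_numbers(adj):
    """Return the core number of every node by repeated peeling.

    At each step the remaining node with the smallest degree is removed.
    Its core number is the largest such minimum degree seen so far.
    """
    degree = {node: len(neighbours) for node, neighbours in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=lambda n: (degree[n], n))
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neighbour in adj[node]:
            if neighbour in remaining:
                degree[neighbour] -= 1
    return core


# Hypothetical toy network: a triangle (A, B, C) with a tail (D, E).
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D"), ("D", "E")]
toy_core = core_numbers(build_adjacency(toy_edges))
```

    Nodes with the highest core number form the innermost layer; in the toy network these are the three mutually connected nodes, while the tail nodes sit in the outer layer.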

  2. Gene trees, species trees, and morphology converge on a similar phylogeny of living gars (Actinopterygii: Holostei: Lepisosteidae), an ancient clade of ray-finned fishes.

    Science.gov (United States)

    Wright, Jeremy J; David, Solomon R; Near, Thomas J

    2012-06-01

    Extant gars represent the remaining members of a formerly diverse assemblage of ancient ray-finned fishes and have been the subject of multiple phylogenetic analyses using morphological data. Here, we present the first hypothesis of phylogenetic relationships among living gar species based on molecular data, through the examination of gene tree heterogeneity and coalescent species tree analyses of a portion of one mitochondrial (COI) and seven nuclear (ENC1, myh6, plagl2, S7 ribosomal protein intron 1, sreb2, tbr1, and zic1) genes. Individual gene trees displayed varying degrees of resolution with regards to species-level relationships, and the gene trees inferred from COI and the S7 intron were the only two that were completely resolved. Coalescent species tree analyses of nuclear genes resulted in a well-resolved and strongly supported phylogenetic tree of living gar species, for which Bayesian posterior node support was further improved by the inclusion of the mitochondrial gene. Species-level relationships among gars inferred from our molecular data set were highly congruent with previously published morphological phylogenies, with the exception of the placement of two species, Lepisosteus osseus and L. platostomus. Re-examination of the character coding used by previous authors provided partial resolution of this topological discordance, resulting in broad concordance in the phylogenies inferred from individual genes, the coalescent species tree analysis, and morphology. The completely resolved phylogeny inferred from the molecular data set with strong Bayesian posterior support at all nodes provided insights into the potential for introgressive hybridization and patterns of allopatric speciation in the evolutionary history of living gars, as well as a solid foundation for future examinations of functional diversification and evolutionary stasis in a "living fossil" lineage.

  3. Stochastic Boolean networks: An efficient approach to modeling gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Liang Jinghang

    2012-08-01

    Background: Various computational models have been of interest due to their use in the modelling of gene regulatory networks (GRNs). As a logical model, probabilistic Boolean networks (PBNs) consider molecular and genetic noise, so the study of PBNs provides significant insights into the understanding of the dynamics of GRNs. This will ultimately lead to advances in developing therapeutic methods that intervene in the process of disease development and progression. The applications of PBNs, however, are hindered by the complexities involved in the computation of the state transition matrix and the steady-state distribution of a PBN. For a PBN with n genes and N Boolean networks, the complexity to compute the state transition matrix is $O(nN2^{2n})$, or $O(nN2^{n})$ for a sparse matrix. Results: This paper presents a novel implementation of PBNs based on the notions of stochastic logic and stochastic computation. This stochastic implementation of a PBN is referred to as a stochastic Boolean network (SBN). An SBN provides an accurate and efficient simulation of a PBN without and with random gene perturbation. The state transition matrix is computed in an SBN with a complexity of $O(nL2^{n})$, where L is a factor related to the stochastic sequence length. Since the minimum sequence length required for obtaining an evaluation accuracy approximately increases in a polynomial order with the number of genes, n, and the number of Boolean networks, N, usually increases exponentially with n, L is typically smaller than N, especially in a network with a large number of genes. Hence, the computational efficiency of an SBN is primarily limited by the number of genes, but not directly by the total possible number of Boolean networks. Furthermore, a time-frame expanded SBN enables an efficient analysis of the steady-state distribution of a PBN. These findings are supported by the simulation results of a simplified p53 network, several randomly generated networks and a

  4. Vitamin D and gene networks in human osteoblasts

    Directory of Open Access Journals (Sweden)

    Jeroen van de Peppel

    2014-04-01

    Bone formation is indirectly influenced by 1,25-dihydroxyvitamin D3 (1,25D3) through the stimulation of calcium uptake in the intestine and re-absorption in the kidneys. Direct effects on osteoblasts and bone formation have also been established. The vitamin D receptor (VDR) is expressed in osteoblasts, and 1,25D3 modifies the expression of various genes related to osteoblast differentiation and mineralization, such as alkaline phosphatase (ALPL), osteocalcin (BGLAP) and osteopontin (SPP1). 1,25D3 is known to stimulate mineralization of human osteoblasts in vitro, and recently it was shown that 1,25D3 induces mineralization via effects during the pre-mineralization period. For a full understanding of the action of 1,25D3 in osteoblasts, it is important to get an integrated network view of the 1,25D3-regulated genes during osteoblast differentiation and mineralization. The current data will be presented and discussed, alluding to future studies to fully delineate the action of 1,25D3 in osteoblasts. Describing and understanding the vitamin D regulatory networks and identifying the dominant players in these networks may help develop novel (personalized) vitamin D-based treatments. The following topics will be discussed in this overview: 1) Bone metabolism and osteoblasts, 2) Vitamin D, bone metabolism and osteoblast function, 3) Vitamin D induced transcriptional networks in the context of osteoblast differentiation and bone formation.

  5. How difficult is inference of mammalian causal gene regulatory networks?

    Science.gov (United States)

    Djordjevic, Djordje; Yang, Andrian; Zadoorian, Armella; Rungrugeecharoen, Kevin; Ho, Joshua W K

    2014-01-01

    Gene regulatory networks (GRNs) play a central role in systems biology, especially in the study of mammalian organ development. One key question remains largely unanswered: Is it possible to infer mammalian causal GRNs using observable gene co-expression patterns alone? We assembled two mouse GRN datasets (embryonic tooth and heart) and matching microarray gene expression profiles to systematically investigate the difficulties of mammalian causal GRN inference. The GRNs were assembled based on > 2,000 pieces of experimental genetic perturbation evidence from manually reading > 150 primary research articles. Each piece of perturbation evidence records the qualitative change of the expression of one gene following knock-down or over-expression of another gene. Our data have thorough annotation of tissue types and embryonic stages, as well as the type of regulation (activation, inhibition and no effect), which uniquely allows us to estimate both sensitivity and specificity of the inference of tissue-specific causal GRN edges. Using these unprecedented datasets, we found that gene co-expression does not reliably distinguish true positive from false positive interactions, making inference of GRNs in mammalian development very difficult. Nonetheless, if we have expression profiling data from genetic or molecular perturbation experiments, such as gene knock-out or signalling stimulation, it is possible to use the set of differentially expressed genes to recover causal regulatory relationships with good sensitivity and specificity. Our result supports the importance of using perturbation experimental data in causal network reconstruction. Furthermore, we showed that causal gene regulatory relationships can be highly cell type or developmental stage specific, suggesting the importance of employing expression profiles from homogeneous cell populations. This study provides essential datasets and empirical evidence to guide the development of new GRN inference methods for

  6. Identifying disease feature genes based on cellular localized gene functional modules and regulation networks

    Institute of Scientific and Technical Information of China (English)

    ZHANG Min; ZHU Jing; GUO Zheng; LI Xia; YANG Da; WANG Lei; RAO Shaoqi

    2006-01-01

    Identifying disease-relevant genes and functional modules, based on gene expression profiles and gene functional knowledge, is of high importance for studying disease mechanisms and subtyping disease phenotypes. Using gene categories of biological process and cellular component in Gene Ontology, we propose an approach to selecting functional modules enriched with differentially expressed genes, and identifying the feature functional modules of high disease discriminating abilities. Using the differentially expressed genes in each feature module as the feature genes, we reveal the relevance of the modules to the studied diseases. Using three datasets for prostate cancer, gastric cancer, and leukemia, we have demonstrated that the proposed modular approach is of high power in identifying functionally integrated feature gene subsets that are highly relevant to the disease mechanisms. Our analysis has also shown that the critical disease-relevant genes might be better recognized from the gene regulation network, which is constructed using the characterized functional modules, giving important clues to the concerted mechanisms of the modules responding to complex disease states. In addition, the proposed approach to selecting the disease-relevant genes by jointly considering the gene functional knowledge suggests a new way for precisely classifying disease samples with clear biological interpretations, which is critical for the clinical diagnosis and the elucidation of the pathogenic basis of complex diseases.

  7. Prioritization of Susceptibility Genes for Ectopic Pregnancy by Gene Network Analysis.

    Science.gov (United States)

    Liu, Ji-Long; Zhao, Miao

    2016-02-01

    Ectopic pregnancy is a very dangerous complication of pregnancy, affecting 1%-2% of all reported pregnancies. Due to ethical constraints on human biopsies and the lack of suitable animal models, there has been little success in identifying functionally important genes in the pathogenesis of ectopic pregnancy. In the present study, we developed a random walk-based computational method named TM-rank to prioritize ectopic pregnancy-related genes based on text mining data and gene network information. Using a defined threshold value, we identified five top-ranked genes: VEGFA (vascular endothelial growth factor A), IL8 (interleukin 8), IL6 (interleukin 6), ESR1 (estrogen receptor 1) and EGFR (epidermal growth factor receptor). These genes are promising candidate genes that can serve as useful diagnostic biomarkers and therapeutic targets. Our approach represents a novel strategy for prioritizing disease susceptibility genes.
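
    The abstract describes TM-rank only as a random walk-based method, so the following is a generic random walk with restart on a gene network, offered as a sketch of that family of techniques rather than as TM-rank itself. The function name, the restart probability of 0.3, the seed set and the toy network are all assumptions.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return the steady-state visiting probability of each node.

    adj maps every node to the set of its neighbours (undirected network).
    At each step the walker jumps back to a seed with probability `restart`
    and otherwise moves to a uniformly chosen neighbour.
    """
    nodes = sorted(adj)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    prob = dict(start)
    for _ in range(max_iter):
        nxt = {n: restart * start[n] for n in nodes}
        for n in nodes:
            moving = (1.0 - restart) * prob[n]
            if adj[n]:
                share = moving / len(adj[n])
                for neighbour in adj[n]:
                    nxt[neighbour] += share
            else:
                nxt[n] += moving  # an isolated node keeps its own mass
        change = sum(abs(nxt[n] - prob[n]) for n in nodes)
        prob = nxt
        if change < tol:
            break
    return prob


# Hypothetical toy network: a chain seed - g1 - g2 - g3.
toy_adj = {
    "seed": {"g1"},
    "g1": {"seed", "g2"},
    "g2": {"g1", "g3"},
    "g3": {"g2"},
}
toy_prob = random_walk_with_restart(toy_adj, seeds={"seed"})
# Candidates are ranked by how often the walker visits them.
toy_ranking = sorted((n for n in toy_adj if n != "seed"),
                     key=lambda n: -toy_prob[n])
```

    In such a scheme the seeds would be genes already linked to the disease (for example through text mining), and candidates closer to the seeds in the network receive higher scores. A threshold on the score would then select the top-ranked genes.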

  8. Constructing gene co-expression networks and predicting functions of unknown genes by random matrix theory

    Directory of Open Access Journals (Sweden)

    Gao Haichun

    2007-08-01

    Background: Large-scale sequencing of entire genomes has ushered in a new age in biology. One of the next grand challenges is to dissect the cellular networks consisting of many individual functional modules. Defining co-expression networks without ambiguity based on genome-wide microarray data is difficult, and current methods are neither robust nor consistent across different data sets. This is particularly problematic for little-understood organisms, since not much existing biological knowledge can be exploited for determining the threshold to differentiate true correlation from random noise. Random matrix theory (RMT), which has been widely and successfully used in physics, is a powerful approach to distinguish system-specific, non-random properties embedded in complex systems from random noise. Here, we have hypothesized that the universal predictions of RMT are also applicable to biological systems and that the correlation threshold can be determined by characterizing the correlation matrix of microarray profiles using random matrix theory. Results: Application of random matrix theory to microarray data of S. oneidensis, E. coli, yeast, A. thaliana, Drosophila, mouse and human indicates that there is a sharp transition of the nearest neighbour spacing distribution (NNSD) of the correlation matrix after gradually removing certain elements inside the matrix. Testing on an in silico modular model has demonstrated that this transition can be used to determine the correlation threshold for revealing modular co-expression networks. The co-expression network derived from yeast cell cycling microarray data is supported by gene annotation. The topological properties of the resulting co-expression network agree well with the general properties of biological networks. Computational evaluations have shown that the RMT approach is sensitive and robust. Furthermore, evaluation on sampled expression data of an in silico modular gene system has shown that under

  9. Toward an orofacial gene regulatory network.

    Science.gov (United States)

    Kousa, Youssef A; Schutte, Brian C

    2016-03-01

    Orofacial clefting is a common birth defect with significant morbidity. A panoply of candidate genes have been discovered through synergy of animal models and human genetics. Among these, variants in interferon regulatory factor 6 (IRF6) cause syndromic orofacial clefting and contribute risk toward isolated cleft lip and palate (1/700 live births). Rare variants in IRF6 can lead to Van der Woude syndrome (1/35,000 live births) and popliteal pterygium syndrome (1/300,000 live births). Furthermore, IRF6 regulates GRHL3 and rare variants in this downstream target can also lead to Van der Woude syndrome. In addition, a common variant (rs642961) in the IRF6 locus is found in 30% of the world's population and contributes risk for isolated orofacial clefting. Biochemical studies revealed that rs642961 abrogates one of four AP-2alpha binding sites. Like IRF6 and GRHL3, rare variants in TFAP2A can also lead to syndromic orofacial clefting with lip pits (branchio-oculo-facial syndrome). The literature suggests that AP-2alpha, IRF6 and GRHL3 are part of a pathway that is essential for lip and palate development. In addition to updating the pathways, players and pursuits, this review will highlight some of the current questions in the study of orofacial clefting.

  10. Evolution of the mammalian embryonic pluripotency gene regulatory network

    Science.gov (United States)

    Fernandez-Tresguerres, Beatriz; Cañon, Susana; Rayon, Teresa; Pernaute, Barbara; Crespo, Miguel; Torroja, Carlos; Manzanares, Miguel

    2010-01-01

    Embryonic pluripotency in the mouse is established and maintained by a gene-regulatory network under the control of a core set of transcription factors that include octamer-binding protein 4 (Oct4; official name POU domain, class 5, transcription factor 1, Pou5f1), sex-determining region Y (SRY)-box containing gene 2 (Sox2), and homeobox protein Nanog. Although this network is largely conserved in eutherian mammals, very little information is available regarding its evolutionary conservation in other vertebrates. We have compared the embryonic pluripotency networks in mouse and chick by means of expression analysis in the pregastrulation chicken embryo, genomic comparisons, and functional assays of pluripotency-related regulatory elements in ES cells and blastocysts. We find that multiple components of the network are either novel to mammals or have acquired novel expression domains in early developmental stages of the mouse. We also find that the downstream action of the mouse core pluripotency factors is mediated largely by genomic sequence elements nonconserved with chick. In the case of Sox2 and Fgf4, we find that elements driving expression in embryonic pluripotent cells have evolved by a small number of nucleotide changes that create novel binding sites for core factors. Our results show that the network in charge of embryonic pluripotency is an evolutionary novelty of mammals that is related to the comparatively extended period during which mammalian embryonic cells need to be maintained in an undetermined state before engaging in early differentiation events. PMID:21048080

  11. Regulation of flowering in rice: two florigen genes, a complex gene network, and natural variation.

    Science.gov (United States)

    Tsuji, Hiroyuki; Taoka, Ken-ichiro; Shimamoto, Ko

    2011-02-01

    Photoperiodic control of flowering time consists of a complicated network that converges into the generation of a mobile flowering signal called florigen. Recent advances identifying the protein FT/Hd3a as the molecule responsible for florigen activity have focused current research on florigen genes as the important output of this complex signaling network. Rice is a model system for short-day plants, and recent progress in elucidating the flowering network in rice and in Arabidopsis, a long-day plant, provides an evolutionarily comparative view of the photoperiodic flowering pathway. This review summarizes photoperiodic flowering control in rice, including the interaction of complex layers of gene networks contributed by evolutionarily unique factors and the regulatory adaptation of conserved factors.

  12. Genes responsive to elevated CO2 concentrations in triploid white poplar and integrated gene network analysis.

    Directory of Open Access Journals (Sweden)

    Juanjuan Liu

    BACKGROUND: The atmospheric CO2 concentration increases every year. While the effects of elevated CO2 on plant growth, physiology and metabolism have been studied, there is now a pressing need to understand the molecular mechanisms of how plants will respond to future increases in CO2 concentration using genomic techniques. PRINCIPAL FINDINGS: Gene expression in triploid white poplar ((Populus tomentosa × P. bolleana) × P. tomentosa) leaves was investigated using the Affymetrix poplar genome gene chip, after three months of growth in controlled environment chambers under three CO2 concentrations. Our physiological findings showed that growth, assessed as stem diameter, was significantly increased and the net photosynthetic rate was decreased in elevated CO2 concentrations. The concentrations of four major endogenous hormones appeared to actively promote plant development. Leaf tissues under elevated CO2 concentrations had 5,127 genes with different expression patterns in comparison to leaves under the ambient CO2 concentration. Among these, 8 genes were finally selected for further investigation by using randomized variance model corrective ANOVA analysis, dynamic gene expression profiling, gene network construction, and quantitative real-time PCR validation. Among the 8 genes in the network, aldehyde dehydrogenase and pyruvate kinase were situated in the core and had interconnections with other genes. CONCLUSIONS: Under elevated CO2 concentrations, 8 significantly changed key genes involved in metabolism and in responding to stimuli from the external environment were identified. These genes play crucial roles in the signal transduction network and show strong correlations with elevated CO2 exposure. This study provides several target genes, further investigation of which could provide an initial step for better understanding the molecular mechanisms of plant acclimation and evolution in future rising CO2 concentrations.

  13. The use of gene interaction networks to improve the identification of cancer driver genes

    Directory of Open Access Journals (Sweden)

    Emilie Ramsahai

    2017-01-01

    Bioinformaticians have implemented different strategies to distinguish cancer driver genes from passenger genes. One of the more recent advances uses a pathway-oriented approach. Methods that employ this strategy are highly dependent on the quality and size of the pathway interaction network employed, and require a powerful statistical environment for analyses. A number of genomic libraries are available in R. DriverNet and DawnRank employ pathway-based methods that use gene interaction graphs in matrix form. We investigated the benefit of combining data from 3 different sources on the prediction outcome of cancer driver genes by DriverNet and DawnRank. An enriched dataset was derived comprising 13,862 genes with 372,250 interactions, which increased their accuracy by 17% and 28%, respectively, compared to their original networks. The study identified 33 new candidate driver genes. Our study highlights the potential of combining networks and weighting edges to provide greater accuracy in the identification of cancer driver genes.

  14. Modifier genes and the plasticity of genetic networks in mice.

    Directory of Open Access Journals (Sweden)

    Bruce A Hamilton

    Modifier genes are an integral part of the genetic landscape in both humans and experimental organisms, but have been less well explored in mammals than other systems. A growing number of modifier genes in mouse models of disease nonetheless illustrate the potential for novel findings, while new technical advances promise many more to come. Modifier genes in mouse models include induced mutations and spontaneous or wild-derived variations captured in inbred strains. Identification of modifiers among wild-derived variants in particular should detect disease modifiers that have been shaped by selection and might therefore be compatible with high fitness and function. Here we review selected examples and argue that modifier genes derived from natural variation may provide a bias for nodes in genetic networks that have greater intrinsic plasticity and whose therapeutic manipulation may therefore be more resilient to side effects than conventional targets.

  15. Optimal Control of Gene Regulatory Networks with Effectiveness of Multiple Drugs: A Boolean Network Approach

    Science.gov (United States)

    Kobayashi, Koichi; Hiraishi, Kunihiko

    2013-01-01

    Developing a control theory of gene regulatory networks is one of the significant topics in the field of systems biology, and the results are expected to be applied to gene therapy technologies in the future. In this paper, a control method using a Boolean network (BN) is studied. A BN is widely used as a model of gene regulatory networks, in which gene expression is represented by a binary value (0 or 1). In the control problem, we assume that the concentration levels of a subset of genes can be set arbitrarily as the control input. However, in some cases no gene satisfies this assumption, and it is important to consider structural control via external stimuli. Furthermore, these controls are realized by multiple drugs, and it is also important to consider multiple effects such as duration of effect and side effects. In this paper, we propose a BN model with two types of control inputs and an optimal control method that accounts for the duration of drug effectiveness. First, a BN model and the duration of drug effectiveness are discussed. Next, the optimal control problem is formulated and reduced to an integer linear programming problem. Finally, numerical simulations are shown. PMID:24058904
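
    As an illustration of the Boolean-network control idea in the abstract above, the following is a minimal sketch only. The three-gene update rules, the single drug input `u`, the cost function, and all function names are hypothetical and are not taken from the paper. The paper reduces the problem to integer linear programming; this sketch swaps in brute-force enumeration over a short horizon, and it does not model the duration of drug effectiveness.

```python
from itertools import product

# Hypothetical target state: all three genes switched off.
TARGET = (0, 0, 0)


def step(state, u):
    """One synchronous update of a toy 3-gene Boolean network.

    Assumed rules (illustrative only):
      x1' = x3 AND NOT u   (the drug u represses gene 1)
      x2' = x1
      x3' = x1 OR x2
    """
    x1, x2, x3 = state
    return (int(x3 and not u), x1, int(x1 or x2))


def simulate(state, controls):
    """Apply a sequence of control inputs and return the final state."""
    for u in controls:
        state = step(state, u)
    return state


def cost(state, controls, miss_penalty=10):
    """Number of drug applications plus a penalty if TARGET is not reached."""
    final = simulate(state, controls)
    return sum(controls) + (0 if final == TARGET else miss_penalty)


def best_controls(state, horizon):
    """Brute-force search over all 2**horizon binary control sequences."""
    return min(product((0, 1), repeat=horizon), key=lambda c: cost(state, c))


if __name__ == "__main__":
    start = (1, 1, 1)
    best = best_controls(start, horizon=4)
    print(best, cost(start, best), simulate(start, best))
```

    Enumeration is exponential in the horizon, so it is only practical for toy instances like this one; an integer-programming formulation, as described in the abstract, is what makes larger problems tractable.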

  16. Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks.

    Directory of Open Access Journals (Sweden)

    Nolan Priedigkeit

    2015-02-01

    Full Text Available Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.

  17. Ethanol modulation of gene networks: implications for alcoholism.

    Science.gov (United States)

    Farris, Sean P; Miles, Michael F

    2012-01-01

    Alcoholism is a complex disease caused by a confluence of environmental and genetic factors influencing multiple brain pathways to produce a variety of behavioral sequelae, including addiction. Genetic factors contribute to over 50% of the risk for alcoholism and recent evidence points to a large number of genes with small effect sizes as the likely molecular basis for this disease. Recent progress in genomics (microarrays or RNA-Seq) and genetics has led to the identification of a large number of potential candidate genes influencing ethanol behaviors or alcoholism itself. To organize this complex information, investigators have begun to focus on the contribution of gene networks, rather than individual genes, for various ethanol-induced behaviors in animal models or behavioral endophenotypes comprising alcoholism. This chapter reviews some of the methods used for constructing gene networks from genomic data and some of the recent progress made in applying such approaches to the study of the neurobiology of ethanol. We show that rapid technology development in gathering genomic data, together with sophisticated experimental design and a growing collection of analysis tools are producing novel insights for understanding the molecular basis of alcoholism and that such approaches promise new opportunities for therapeutic development.

  18. Comparison of evolutionary algorithms in gene regulatory network model inference.

    LENUS (Irish Health Repository)

    2010-01-01

    ABSTRACT: BACKGROUND: The evolution of high throughput technologies that measure gene expression levels has created a data base for inferring GRNs (a process also known as reverse engineering of GRNs). However, the nature of these data has made this process very difficult. At the moment, several methods of discovering qualitative causal relationships between genes with high accuracy from microarray data exist, but large scale quantitative analysis on real biological datasets cannot be performed, to date, as existing approaches are not suitable for real microarray data which are noisy and insufficient. RESULTS: This paper performs an analysis of several existing evolutionary algorithms for quantitative gene regulatory network modelling. The aim is to present the techniques used and offer a comprehensive comparison of approaches, under a common framework. Algorithms are applied to both synthetic and real gene expression data from DNA microarrays, and ability to reproduce biological behaviour, scalability and robustness to noise are assessed and compared. CONCLUSIONS: Presented is a comparison framework for assessment of evolutionary algorithms, used to infer gene regulatory networks. Promising methods are identified and a platform for development of appropriate model formalisms is established.

  19. Optimal Constrained Stationary Intervention in Gene Regulatory Networks

    Directory of Open Access Journals (Sweden)

    Golnaz Vahedi

    2008-05-01

    Full Text Available A key objective of gene network modeling is to develop intervention strategies to alter regulatory dynamics in such a way as to reduce the likelihood of undesirable phenotypes. Optimal stationary intervention policies have been developed for gene regulation in the framework of probabilistic Boolean networks in a number of settings. To mitigate the possibility of detrimental side effects, for instance, in the treatment of cancer, it may be desirable to limit the expected number of treatments beneath some bound. This paper formulates a general constraint approach for optimal therapeutic intervention by suitably adapting the reward function and then applies this formulation to bound the expected number of treatments. A mutated mammalian cell cycle is considered as a case study.

  20. Optimal Constrained Stationary Intervention in Gene Regulatory Networks

    Directory of Open Access Journals (Sweden)

    Faryabi Babak

    2008-01-01

    Full Text Available A key objective of gene network modeling is to develop intervention strategies to alter regulatory dynamics in such a way as to reduce the likelihood of undesirable phenotypes. Optimal stationary intervention policies have been developed for gene regulation in the framework of probabilistic Boolean networks in a number of settings. To mitigate the possibility of detrimental side effects, for instance, in the treatment of cancer, it may be desirable to limit the expected number of treatments beneath some bound. This paper formulates a general constraint approach for optimal therapeutic intervention by suitably adapting the reward function and then applies this formulation to bound the expected number of treatments. A mutated mammalian cell cycle is considered as a case study.

  1. Noise Control in Gene Regulatory Networks with Negative Feedback.

    Science.gov (United States)

    Hinczewski, Michael; Thirumalai, D

    2016-07-01

    Genes and proteins regulate cellular functions through complex circuits of biochemical reactions. Fluctuations in the components of these regulatory networks result in noise that invariably corrupts the signal, possibly compromising function. Here, we create a practical formalism based on ideas introduced by Wiener and Kolmogorov (WK) for filtering noise in engineered communications systems to quantitatively assess the extent to which noise can be controlled in biological processes involving negative feedback. Application of the theory, which reproduces the previously proven scaling of the lower bound for noise suppression in terms of the number of signaling events, shows that a tetracycline repressor-based negative-regulatory gene circuit behaves as a WK filter. For the class of Hill-like nonlinear regulatory functions, this type of filter provides the optimal reduction in noise. Our theoretical approach can be readily combined with experimental measurements of response functions in a wide variety of genetic circuits, to elucidate the general principles by which biological networks minimize noise.

  2. Gene network interconnectedness and the generalized topological overlap measure

    Directory of Open Access Journals (Sweden)

    Horvath Steve

    2007-01-01

    Full Text Available Abstract. Background: Network methods are increasingly used to represent the interactions of genes and/or proteins. Genes or proteins that are directly linked may have a similar biological function or may be part of the same biological pathway. Since the information on the connection (adjacency) between 2 nodes may be noisy or incomplete, it can be desirable to consider alternative measures of pairwise interconnectedness. Here we study a class of measures that are proportional to the number of neighbors that a pair of nodes share in common. For example, the topological overlap measure by Ravasz et al. [1] can be interpreted as a measure of agreement between the m = 1 step neighborhoods of 2 nodes. Several studies have shown that two proteins having a higher topological overlap are more likely to belong to the same functional class than proteins having a lower topological overlap. Here we address the question whether a measure of topological overlap based on higher-order neighborhoods could give rise to a more robust and sensitive measure of interconnectedness. Results: We generalize the topological overlap measure from m = 1 step neighborhoods to m ≥ 2 step neighborhoods. This allows us to define the m-th order generalized topological overlap measure (GTOM) by (i) counting the number of m-step neighbors that a pair of nodes share and (ii) normalizing it to take a value between 0 and 1. Using theoretical arguments, a yeast co-expression network application, and a fly protein network application, we illustrate the usefulness of the proposed measure for module detection and gene neighborhood analysis. Conclusion: Topological overlap can serve as an important filter to counter the effects of spurious or missing connections between network nodes. The m-th order topological overlap measure allows one to trade off sensitivity versus specificity when it comes to defining pairwise interconnectedness and network modules.
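
    The two steps named in the abstract above (count shared m-step neighbors, then normalize to the range 0 to 1) can be sketched as follows. This is a minimal sketch under stated assumptions: the normalization used here, (shared + a_ij) / (min(|N_m(i)|, |N_m(j)|) + 1 - a_ij), is one common formulation of topological overlap for unweighted networks and is assumed rather than confirmed by the abstract. The function names and the toy graph are hypothetical.

```python
from collections import deque


def m_step_neighbors(adj, source, m):
    """Nodes other than `source` reachable in at most m steps (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == m:
            continue  # do not expand beyond m steps
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    del dist[source]
    return set(dist)


def gtom(adj, i, j, m=1):
    """m-th order topological overlap between nodes i and j (assumed formula)."""
    if i == j:
        return 1.0
    n_i = m_step_neighbors(adj, i, m)
    n_j = m_step_neighbors(adj, j, m)
    a_ij = 1 if j in adj[i] else 0       # direct adjacency
    shared = len(n_i & n_j)              # step (i): shared m-step neighbors
    # step (ii): normalize so the value lies between 0 and 1
    return (shared + a_ij) / (min(len(n_i), len(n_j)) + 1 - a_ij)


# Hypothetical undirected toy network as an adjacency dictionary.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

if __name__ == "__main__":
    for pair in [("A", "B"), ("A", "D"), ("B", "D")]:
        print(pair, gtom(toy, *pair, m=1), gtom(toy, *pair, m=2))
```

    In the toy graph, A and B are linked and share their only other neighbor, so their first-order overlap is 1.0, while A and D are not linked and share one neighbor, giving 0.5.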

  3. Prioritisation and network analysis of Crohn's disease susceptibility genes.

    Directory of Open Access Journals (Sweden)

    Daniele Muraro

    Full Text Available Recent Genome-Wide Association Studies (GWAS) have revealed numerous Crohn's disease susceptibility genes and a key challenge now is in understanding how risk polymorphisms in associated genes might contribute to development of this disease. For a gene to contribute to disease phenotype, its risk variant will likely adversely communicate with a variety of other gene products to result in dysregulation of common signaling pathways. A vital challenge is to elucidate pathways of potentially greatest influence on pathological behaviour, in a manner recognizing how multiple relevant genes may yield integrative effect. In this work we apply mathematical analysis of networks involving the list of recently described Crohn's susceptibility genes, to prioritise pathways in relation to their potential contribution to the development of this disease. Prioritisation was performed by applying a text-mining method and a diffusion-based method (GRAIL, GPEC). Prospective biological significance of the resulting prioritised list of proteins is highlighted by changes in their gene expression levels in Crohn's patients' intestinal tissue in comparison with healthy donors.

  4. Dose response relationship in anti-stress gene regulatory networks.

    OpenAIRE

    Qiang Zhang; Andersen, Melvin E.

    2007-01-01

    To maintain a stable intracellular environment, cells utilize complex and specialized defense systems against a variety of external perturbations, such as electrophilic stress, heat shock, and hypoxia, etc. Irrespective of the type of stress, many adaptive mechanisms contributing to cellular homeostasis appear to operate through gene regulatory networks that are organized into negative feedback loops. In general, the degree of deviation of the controlled variables, such as electrophiles, misf...

  5. Topological effects of data incompleteness of gene regulatory networks

    CERN Document Server

    Sanz, J; Borge-Holthoefer, J; Moreno, Y

    2012-01-01

    The topological analysis of biological networks has been a prolific topic in network science during the last decade. A persistent problem with this approach is the inherent uncertainty and noisy nature of the data. One of the cases in which this situation is more marked is that of transcriptional regulatory networks (TRNs) in bacteria. The datasets are incomplete because regulatory pathways associated with a relevant fraction of bacterial genes remain unknown. Furthermore, direction, strengths and signs of the links are sometimes unknown or simply overlooked. Finally, the experimental approaches to infer the regulations are highly heterogeneous, in a way that induces the appearance of systematic experimental-topological correlations. And yet, the quality of the available data increases constantly. In this work we capitalize on these advances to point out the influence of data (in)completeness and quality on some classical results on topological analysis of TRNs, especially regarding modularity at different level...

  6. Phase transitions in the evolution of gene regulatory networks

    Science.gov (United States)

    Skanata, Antun; Kussell, Edo

    The role of gene regulatory networks is to respond to environmental conditions and optimize growth of the cell. A typical example is found in bacteria, where metabolic genes are activated in response to nutrient availability, and are subsequently turned off to conserve energy when their specific substrates are depleted. However, in fluctuating environmental conditions, regulatory networks could experience strong evolutionary pressures not only to turn the right genes on and off, but also to respond optimally under a wide spectrum of fluctuation timescales. The outcome of evolution is predicted by the long-term growth rate, which differentiates between optimal strategies. Here we present an analytic computation of the long-term growth rate in randomly fluctuating environments, by using mean-field and higher order expansion in the environmental history. We find that optimal strategies correspond to distinct regions in the phase space of fluctuations, separated by first and second order phase transitions. The statistics of environmental randomness are shown to dictate the possible evolutionary modes, which either change the structure of the regulatory network abruptly, or gradually modify and tune the interactions between its components.

  7. Identification of the VERNALIZATION 4 gene reveals the origin of spring growth habit in ancient wheats from South Asia

    Science.gov (United States)

    Wheat varieties with a winter growth habit require long exposures to low temperatures (vernalization) to accelerate flowering. Natural variation in the vernalization genes regulating this requirement has favored wheat adaptation to different environments. The main wheat vernalization genes VRN1, V...

  8. Gene network analysis in a pediatric cohort identifies novel lung function genes.

    Directory of Open Access Journals (Sweden)

    Bruce A Ong

    Full Text Available Lung function is a heritable trait and serves as an important clinical predictor of morbidity and mortality for pulmonary conditions in adults; however, despite its importance, no studies have focused on uncovering pediatric-specific loci influencing lung function. To identify novel genetic determinants of pediatric lung function, we conducted a genome-wide association study (GWAS) of four pulmonary function traits, including FVC, FEV1, FEV1/FVC and FEF25-75% in 1556 children. Further, we carried out gene network analyses for each trait including all SNPs with a P-value of <1.0 × 10⁻³ from the individual GWAS. The GWAS identified SNPs with notable trends towards association with the pulmonary function measures, including the previously described INTS12 locus association with FEV1 (pmeta = 1.41 × 10⁻⁷). The gene network analyses identified 34 networks of genes associated with pulmonary function variables in Caucasians. Of those, the glycoprotein gene network reached genome-wide significance for all four variables, with P-values ranging from pmeta = 6.29 × 10⁻⁴ to 2.80 × 10⁻⁸ on meta-analysis. In this study, we report on specific pathways that are significantly associated with pediatric lung function at genome-wide significance. In addition, we report the first loci associated with lung function in both pediatric Caucasian and African American populations.

  9. Phylogenetic analysis of bacterial and archaeal arsC gene sequences suggests an ancient, common origin for arsenate reductase

    Directory of Open Access Journals (Sweden)

    Dugas Sandra L

    2003-07-01

    Full Text Available Abstract. Background: The ars gene system provides arsenic resistance for a variety of microorganisms and can be chromosomal or plasmid-borne. The arsC gene, which codes for an arsenate reductase, is essential for arsenate resistance and transforms arsenate into arsenite, which is extruded from the cell. A survey of GenBank shows that arsC appears to be phylogenetically widespread both in organisms with known arsenic resistance and those organisms that have been sequenced as part of whole genome projects. Results: Phylogenetic analysis of aligned arsC sequences shows broad similarities to the established 16S rRNA phylogeny, with separation of bacterial, archaeal, and subsequently eukaryotic arsC genes. However, inconsistencies between arsC and 16S rRNA are apparent for some taxa. Cyanobacteria and some of the γ-Proteobacteria appear to possess arsC genes that are similar to those of Low GC Gram-positive Bacteria, and other isolated taxa possess arsC genes that would not be expected based on known evolutionary relationships. There is no clear separation of plasmid-borne and chromosomal arsC genes, although a number of the Enterobacteriales (γ-Proteobacteria) possess similar plasmid-encoded arsC sequences. Conclusion: The overall phylogeny of the arsenate reductases suggests a single, early origin of the arsC gene and subsequent sequence divergence to give the distinct arsC classes that exist today. Discrepancies between 16S rRNA and arsC phylogenies support the role of horizontal gene transfer (HGT) in the evolution of arsenate reductases, with a number of instances of HGT early in bacterial arsC evolution. Plasmid-borne arsC genes are not monophyletic, suggesting multiple cases of chromosomal-plasmid exchange and subsequent HGT. Overall, arsC phylogeny is complex and is likely the result of a number of evolutionary mechanisms.

  10. Ancient Duplications and Expression Divergence in the Globin Gene Superfamily of Vertebrates: Insights from the Elephant Shark Genome and Transcriptome.

    Science.gov (United States)

    Opazo, Juan C; Lee, Alison P; Hoffmann, Federico G; Toloza-Villalobos, Jessica; Burmester, Thorsten; Venkatesh, Byrappa; Storz, Jay F

    2015-07-01

    Comparative analyses of vertebrate genomes continue to uncover a surprising diversity of genes in the globin gene superfamily, some of which have very restricted phyletic distributions despite their antiquity. Genomic analysis of the globin gene repertoire of cartilaginous fish (Chondrichthyes) should be especially informative about the duplicative origins and ancestral functions of vertebrate globins, as divergence between Chondrichthyes and bony vertebrates represents the most basal split within the jawed vertebrates. Here, we report a comparative genomic analysis of the vertebrate globin gene family that includes the complete globin gene repertoire of the elephant shark (Callorhinchus milii). Using genomic sequence data from representatives of all major vertebrate classes, integrated analyses of conserved synteny and phylogenetic relationships revealed that the last common ancestor of vertebrates possessed a repertoire of at least seven globin genes: single copies of androglobin and neuroglobin, four paralogous copies of globin X, and the single-copy progenitor of the entire set of vertebrate-specific globins. Combined with expression data, the genomic inventory of elephant shark globins yielded four especially surprising findings: 1) there is no trace of the neuroglobin gene (a highly conserved gene that is present in all other jawed vertebrates that have been examined to date), 2) myoglobin is highly expressed in heart, but not in skeletal muscle (reflecting a possible ancestral condition in vertebrates with single-circuit circulatory systems), 3) elephant shark possesses two highly divergent globin X paralogs, one of which is preferentially expressed in gonads, and 4) elephant shark possesses two structurally distinct α-globin paralogs, one of which is preferentially expressed in the brain. Expression profiles of elephant shark globin genes reveal distinct specializations of function relative to orthologs in bony vertebrates and suggest hypotheses about

  11. Network Security via Biometric Recognition of Patterns of Gene Expression

    Science.gov (United States)

    Shaw, Harry C.

    2016-01-01

    Molecular biology provides the ability to implement forms of information and network security completely outside the bounds of legacy security protocols and algorithms. This paper addresses an approach which instantiates the power of gene expression for security. Molecular biology provides a rich source of gene expression and regulation mechanisms, which can be adopted to use in the information and electronic communication domains. Conventional security protocols are becoming increasingly vulnerable due to more intensive, highly capable attacks on the underlying mathematics of cryptography. Security protocols are being undermined by social engineering and substandard implementations by IT organizations. Molecular biology can provide countermeasures to these weak points with the current security approaches. Future advances in instruments for analyzing assays will also enable this protocol to advance from one of cryptographic algorithms to an integrated system of cryptographic algorithms and real-time expression and assay of gene expression products.

  12. Topology association analysis in weighted protein interaction network for gene prioritization

    Science.gov (United States)

    Wu, Shunyao; Shao, Fengjing; Zhang, Qi; Ji, Jun; Xu, Shaojie; Sun, Rencheng; Sun, Gengxin; Du, Xiangjun; Sui, Yi

    2016-11-01

    Although many algorithms for disease gene prediction have been proposed, the weights of edges are rarely taken into account. In this paper, the strengths of topology associations between disease genes and essential genes are analyzed in a weighted protein interaction network. Empirical analysis demonstrates that, compared to other genes, disease genes are weakly connected with essential genes in the protein interaction network. Based on this finding, a novel global distance measurement for gene prioritization with a weighted protein interaction network is proposed in this paper. Positive and negative flow is allocated to disease and essential genes, respectively. Additionally, the network propagation model is extended to weighted networks. Experimental results on 110 diseases verify the effectiveness and potential of the proposed measurement. Moreover, weak links play a more important role than strong links in gene prioritization, a finding that helps deepen the understanding of protein interaction networks.

  13. Construction of Gene Regulatory Networks Using Recurrent Neural Networks and Swarm Intelligence.

    Science.gov (United States)

    Khan, Abhinandan; Mandal, Sudip; Pal, Rajat Kumar; Saha, Goutam

    2016-01-01

    We have proposed a methodology for the reverse engineering of biologically plausible gene regulatory networks from temporal genetic expression data. We have used established information and the fundamental mathematical theory for this purpose. We have employed the Recurrent Neural Network formalism to extract the underlying dynamics present in the time series expression data accurately. We have introduced a new hybrid swarm intelligence framework for the accurate training of the model parameters. The proposed methodology has been first applied to a small artificial network, and the results obtained suggest that it can produce the best results available in the contemporary literature, to the best of our knowledge. Subsequently, we have implemented our proposed framework on experimental (in vivo) datasets. Finally, we have investigated two medium sized genetic networks (in silico) extracted from GeneNetWeaver, to understand how the proposed algorithm scales up with network size. Additionally, we have implemented our proposed algorithm with half the number of time points. The results indicate that a reduction of 50% in the number of time points does not significantly affect the accuracy of the proposed methodology, with a maximum of just over 15% deterioration in the worst case.

  14. Locus heterogeneity disease genes encode proteins with high interconnectivity in the human protein interaction network.

    Science.gov (United States)

    Keith, Benjamin P; Robertson, David L; Hentges, Kathryn E

    2014-01-01

    Mutations in genes potentially lead to a number of genetic diseases with differing severity. These disease genes have been the focus of research in recent years showing that the disease gene population as a whole is not homogeneous, and can be categorized according to their interactions. Locus heterogeneity describes a single disorder caused by mutations in different genes each acting individually to cause the same disease. Using datasets of experimentally derived human disease genes and protein interactions, we created a protein interaction network to investigate the relationships between the products of genes associated with a disease displaying locus heterogeneity, and use network parameters to suggest properties that distinguish these disease genes from the overall disease gene population. Through the manual curation of known causative genes of 100 diseases displaying locus heterogeneity and 397 single-gene Mendelian disorders, we use network parameters to show that our locus heterogeneity network displays distinct properties from the global disease network and a Mendelian network. Using the global human proteome, through random simulation of the network we show that heterogeneous genes display significant interconnectivity. Further topological analysis of this network revealed clustering of locus heterogeneity genes that cause identical disorders, indicating that these disease genes are involved in similar biological processes. We then use this information to suggest additional genes that may contribute to diseases with locus heterogeneity.

  15. Evolution of a core gene network for skeletogenesis in chordates.

    Directory of Open Access Journals (Sweden)

    Jochen Hecht

    2008-03-01

    Full Text Available The skeleton is one of the most important features for the reconstruction of vertebrate phylogeny but few data are available to understand its molecular origin. In mammals the Runt genes are central regulators of skeletogenesis. Runx2 was shown to be essential for osteoblast differentiation, tooth development, and bone formation. Both Runx2 and Runx3 are essential for chondrocyte maturation. Furthermore, Runx2 directly regulates Indian hedgehog expression, a master coordinator of skeletal development. To clarify the correlation of Runt gene evolution and the emergence of cartilage and bone in vertebrates, we cloned the Runt genes from hagfish as representative of jawless fish (MgRunxA, MgRunxB) and from dogfish as representative of jawed cartilaginous fish (ScRunx1-3). According to our phylogenetic reconstruction the stem species of chordates harboured a single Runt gene and thereafter Runt locus duplications occurred during early vertebrate evolution. All newly isolated Runt genes were expressed in cartilage according to quantitative PCR. In situ hybridisation confirmed high MgRunxA expression in hard cartilage of hagfish. In dogfish ScRunx2 and ScRunx3 were expressed in embryonal cartilage whereas all three Runt genes were detected in teeth and placoid scales. In cephalochordates (lancelets), Runt, Hedgehog and SoxE were strongly expressed in the gill bars and expression of Runt and Hedgehog was found in endo- as well as ectodermal cells. Furthermore we demonstrate that the lancelet Runt protein binds to Runt binding sites in the lancelet Hedgehog promoter and regulates its activity. Together, these results suggest that Runt and Hedgehog were part of a core gene network for cartilage formation, which was already active in the gill bars of the common ancestor of cephalochordates and vertebrates and diversified after Runt duplications had occurred during vertebrate evolution. The similarities in expression patterns of Runt genes support the view

  16. Systematically characterizing and prioritizing chemosensitivity related gene based on Gene Ontology and protein interaction network

    Directory of Open Access Journals (Sweden)

    Chen Xin

    2012-10-01

    Full Text Available Abstract. Background: The identification of genes that predict in vitro cellular chemosensitivity of cancer cells is of great importance. Chemosensitivity related genes (CRGs) have been widely utilized to guide clinical and cancer chemotherapy decisions. In addition, CRGs potentially share functional characteristics and network features in protein interaction networks (PPIN). Methods: In this study, we proposed a method to identify CRGs based on Gene Ontology (GO) and PPIN. Firstly, we documented 150 pairs of drug-CCRG (curated chemosensitivity related gene) from 492 published papers. Secondly, we characterized CCRGs from the perspective of GO and PPIN. Thirdly, we prioritized CRGs based on CCRGs’ GO and network characteristics. Lastly, we evaluated the performance of the proposed method. Results: We found that CCRG enriched GO terms were most often related to chemosensitivity and exhibited higher similarity scores compared to randomly selected genes. Moreover, CCRGs played key roles in maintaining the connectivity and controlling the information flow of PPINs. We then prioritized CRGs using CCRG enriched GO terms and CCRG network characteristics in order to obtain a database of predicted drug-CRGs that included 53 CRGs, 32 of which have been reported to affect susceptibility to drugs. Our proposed method identifies a greater number of drug-CCRGs, and drug-CCRGs are much more significantly enriched in predicted drug-CRGs, compared to a method based on the correlation of gene expression and drug activity. The mean area under ROC curve (AUC) for our method is 65.2%, whereas that for the traditional method is 55.2%. Conclusions: Our method not only identifies CRGs with expression patterns strongly correlated with drug activity, but also identifies CRGs in which expression is weakly correlated with drug activity. This study provides the framework for the identification of signatures that predict in vitro cellular chemosensitivity and offers a valuable

  17. An ancient history of gene duplications, fusions and losses in the evolution of APOBEC3 mutators in mammals

    Directory of Open Access Journals (Sweden)

    Münk Carsten

    2012-05-01

    Full Text Available Abstract Background The APOBEC3 (A3) genes play a key role in innate antiviral defense in mammals by introducing directed mutations in the DNA. The human genome encodes seven A3 genes, with multiple splice alternatives. Different A3 proteins display different substrate specificity, but the very basic question of how they discern self from non-self remains unresolved. Further, the expression of A3 activity/ies shapes the way both viral and host genomes evolve. Results We present here a detailed temporal analysis of the origin and expansion of the A3 repertoire in mammals. Our data support an evolutionary scenario where the genome of the mammalian ancestor encoded at least one ancestral A3 gene, and where the genome of the ancestor of placental mammals (and possibly of the ancestor of all mammals) already encoded an A3Z1-A3Z2-A3Z3 arrangement. Duplication events of the A3 genes have occurred independently in different lineages: humans, cats and horses. In all of them, gene duplication has resulted in changes in enzyme activity and/or substrate specificity, in a paradigmatic example of convergent adaptive evolution at the genomic level. Finally, our results show that evolutionary rates for the three A3Z1, A3Z2 and A3Z3 motifs have significantly decreased in the last 100 Mya. The analysis constitutes a textbook example of the evolution of a gene locus by duplication and sub/neofunctionalization in the context of a virus-host arms race. Conclusions Our results provide a time framework for identifying ancestral and derived genomic arrangements in the APOBEC loci, and for dating the expansion of this gene family in different lineages through time, as a response to changes in viral/retroviral/retrotransposon pressure.

  18. Ancient Origin of the U2 Small Nuclear RNA Gene-Targeting Non-LTR Retrotransposons Utopia.

    Science.gov (United States)

    Kojima, Kenji K; Jurka, Jerzy

    2015-01-01

    Most non-long terminal repeat (non-LTR) retrotransposons encoding a restriction-like endonuclease show target-specific integration into repetitive sequences such as ribosomal RNA genes and microsatellites. However, only a few target-specific lineages of non-LTR retrotransposons are distributed widely and no lineage is found across the eukaryotic kingdoms. Here we report the most widely distributed lineage of target sequence-specific non-LTR retrotransposons, designated Utopia. Utopia is found in three supergroups of eukaryotes: Amoebozoa, SAR, and Opisthokonta. Utopia is inserted into a specific site of U2 small nuclear RNA genes with different strength of specificity for each family. Utopia families from oomycetes and wasps show strong target specificity while only a small number of Utopia copies from reptiles are flanked with U2 snRNA genes. Oomycete Utopia families contain an "archaeal" RNase H domain upstream of reverse transcriptase (RT), which likely originated from a plant RNase H gene. Analysis of Utopia from oomycetes indicates that multiple lineages of Utopia have been maintained inside of U2 genes with few copy numbers. Phylogenetic analysis of RT suggests the monophyly of Utopia, and it likely dates back to the early evolution of eukaryotes.

  19. Cerebellar network plasticity: from genes to fast oscillation.

    Science.gov (United States)

    Cheron, G; Servais, L; Dan, B

    2008-04-22

    The role of the cerebellum has been increasingly recognized not only in motor control but in sensory, cognitive and emotional learning and regulation. Purkinje cells, being the sole output from the cerebellar cortex, occupy an integrative position in this network. Plasticity at this level is known to critically involve calcium signaling. In the last few years, electrophysiological study of genetically engineered mice has demonstrated the topical role of several genes encoding calcium-binding proteins (calretinin, calbindin, parvalbumin). Specific inactivation of these genes results in the emergence of a fast network oscillation (ca. 160 Hz) throughout the cerebellar cortex in alert animals, associated with ataxia. This oscillation is produced by synchronization of Purkinje cells along the parallel fiber beam. It behaves as an electrophysiological arrest rhythm, being blocked by sensorimotor stimulation. Pharmacological manipulations showed that the oscillation is blocked by GABA(A) and NMDA antagonists as well as gap junction blockers. This cerebellar network oscillation has also been documented in mouse models of human conditions with complex developmental cerebellar dysfunction, such as Angelman syndrome and fetal alcohol syndrome. Recent evidence suggests a relationship between fast oscillation and cerebellar long term depression (LTD). This may have major implications for future therapeutic targeting.

  20. [Epigenetics: gene and epigene networks in ontogeny and phylogeny].

    Science.gov (United States)

    Churaev, R N

    2006-09-01

    An attempt was made to systematize theoretical and experimental epigenetic data in the framework of genetics as a science on laws of preservation, coding, transfer, and transformation of heritable information in the living systems. The structure of the total hereditary memory is discussed in context of the theory of epigenes, hereditary units of the next to genes level of complexity. In epigenes as cells of functional hereditary memory, part of the hereditary information is stored, coded, and transmitted to the progeny irrespective of the primary structure of the genomic DNA molecules. The principles of the structure and the general laws of functioning of cellular governing gene networks are presented. The ontogenetic and phylogenetic role of epigene networks as the second level of the hereditary system is considered. Arguments for inheritance of somatic epimutations are presented, as well as the results of in silico and in vivo experiments showing the possibility of an epigenetic mechanism of primary biochemical divergent determination (autodetermination). A network hypothesis on material carriers of the common hereditary memory is formulated.

  1. DREISS: Using State-Space Models to Infer the Dynamics of Gene Expression Driven by External and Internal Regulatory Networks

    Science.gov (United States)

    Gerstein, Mark

    2016-01-01

    Gene expression is controlled by the combinatorial effects of regulatory factors from different biological subsystems such as general transcription factors (TFs), cellular growth factors and microRNAs. A subsystem’s gene expression may be controlled by its internal regulatory factors, exclusively, or by external subsystems, or by both. It is thus useful to distinguish the degree to which a subsystem is regulated internally or externally–e.g., how non-conserved, species-specific TFs affect the expression of conserved, cross-species genes during evolution. We developed a computational method (DREISS, dreiss.gerteinlab.org) for analyzing the Dynamics of gene expression driven by Regulatory networks, both External and Internal based on State Space models. Given a subsystem, the “state” and “control” in the model refer to its own (internal) and another subsystem’s (external) gene expression levels. The state at a given time is determined by the state and control at a previous time. Because typical time-series data do not have enough samples to fully estimate the model’s parameters, DREISS uses dimensionality reduction, and identifies canonical temporal expression trajectories (e.g., degradation, growth and oscillation) representing the regulatory effects emanating from various subsystems. To demonstrate capabilities of DREISS, we study the regulatory effects of evolutionarily conserved vs. divergent TFs across distant species. In particular, we applied DREISS to the time-series gene expression datasets of C. elegans and D. melanogaster during their embryonic development. We analyzed the expression dynamics of the conserved, orthologous genes (orthologs), seeing the degree to which these can be accounted for by orthologous (internal) versus species-specific (external) TFs. We found that between two species, the orthologs have matched, internally driven expression patterns but very different externally driven ones. This is particularly true for genes with

  2. Gene Networks in Plant Ozone Stress Response and Tolerance

    Institute of Scientific and Technical Information of China (English)

    Agnieszka Ludwikow; Jan Sadowski

    2008-01-01

    For many plant species ozone stress has become much more severe in the last decade. The accumulating evidence for the significant effects of ozone pollutant on crop and forest yield situates ozone as one of the most important environmental stress factors that limit plant productivity worldwide. Today, transcriptomic approaches seem to give the best coverage of genome level responses. Therefore, microarray serves as an invaluable tool for global gene expression analyses, unravelling new information about gene pathways, in-species and cross-species gene expression comparison, and for the characterization of unknown relationships between genes. In this review we summarize the recent progress in the transcriptomics of ozone to demonstrate the benefits that can be harvested from the application of integrative and systematic analytical approaches to study ozone stress response. We focused our consideration on microarray analyses identifying gene networks responsible for response and tolerance to elevated ozone concentration. From these analyses it is now possible to notice how plant ozone defense responses depend on the interplay between many complex signaling pathways and metabolite signals.

  3. Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks

    Directory of Open Access Journals (Sweden)

    Alina Sîrbu

    2015-05-01

    Full Text Available Microarray technologies have been the basis of numerous important findings regarding gene expression in the last few decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions). Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, as well as indicating which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.

  4. Autonomous Boolean modelling of developmental gene regulatory networks

    Science.gov (United States)

    Cheng, Xianrui; Sun, Mengyang; Socolar, Joshua E. S.

    2013-01-01

    During early embryonic development, a network of regulatory interactions among genes dynamically determines a pattern of differentiated tissues. We show that important timing information associated with the interactions can be faithfully represented in autonomous Boolean models in which binary variables representing expression levels are updated in continuous time, and that such models can provide a direct insight into features that are difficult to extract from ordinary differential equation (ODE) models. As an application, we model the experimentally well-studied network controlling fly body segmentation. The Boolean model successfully generates the patterns formed in normal and genetically perturbed fly embryos, permits the derivation of constraints on the time delay parameters, clarifies the logic associated with different ODE parameter sets and provides a platform for studying connectivity and robustness in parameter space. By elucidating the role of regulatory time delays in pattern formation, the results suggest new types of experimental measurements in early embryonic development. PMID:23034351

  5. A Network Partition Algorithm for Mining Gene Functional Modules of Colon Cancer from DNA Microarray Data

    Institute of Scientific and Technical Information of China (English)

    Xiao-Gang Ruan; Jin-Lian Wang; Jian-Geng Li

    2006-01-01

    Computational analysis is essential for transforming the masses of microarray data into a mechanistic understanding of cancer. Here we present a method for finding gene functional modules of cancer from microarray data and have applied it to colon cancer. First, a colon cancer gene network and a normal colon tissue gene network were constructed using correlations between the genes. Then the modules that tended to have a homogeneous functional composition were identified by splitting up the network. Analysis of both networks revealed that they are scale-free. Comparison of the gene functional modules for colon cancer and normal tissues showed that the modules' functions changed with their structures.
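
    The two steps named in this abstract (building a gene network from correlations, then splitting it into modules) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the partition step, so connected components are used here as a simple stand-in, and the function names, the correlation threshold and the toy expression matrix are all assumptions made for illustration.

```python
import numpy as np

def correlation_network(expr, threshold=0.8):
    """Boolean adjacency matrix linking genes whose absolute Pearson
    correlation across samples is at least `threshold` (assumed cutoff)."""
    corr = np.corrcoef(expr)          # rows = genes, columns = samples
    adj = np.abs(corr) >= threshold
    np.fill_diagonal(adj, False)      # drop self-loops
    return adj

def connected_modules(adj):
    """Split the network into connected components, a simple stand-in
    for the unspecified module-partition step."""
    seen, modules = set(), []
    for start in range(adj.shape[0]):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:                  # depth-first traversal
            node = stack.pop()
            comp.append(node)
            for nb in np.flatnonzero(adj[node]):
                nb = int(nb)
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        modules.append(sorted(comp))
    return modules

# Hypothetical toy data: 4 genes measured in 4 samples.
expr = np.array([
    [1.0, 2.0, 3.0, 4.0],   # gene 0
    [2.0, 4.1, 6.0, 8.2],   # gene 1, rises with gene 0
    [4.0, 3.0, 2.0, 1.0],   # gene 2, falls as gene 0 rises
    [1.0, 3.0, 1.0, 3.0],   # gene 3, weakly related to the others
])
adj = correlation_network(expr, threshold=0.9)
modules = connected_modules(adj)
print(modules)
```

    Running the sketch on a tumour matrix and on a normal-tissue matrix and comparing the resulting module lists would mirror, in a very simplified way, the comparison described in the abstract.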

  6. Biphasic Hoxd gene expression in shark paired fins reveals an ancient origin of the distal limb domain.

    Directory of Open Access Journals (Sweden)

    Renata Freitas

    Full Text Available The evolutionary transition of fins to limbs involved development of a new suite of distal skeletal structures, the digits. During tetrapod limb development, genes at the 5' end of the HoxD cluster are expressed in two spatiotemporally distinct phases. In the first phase, Hoxd9-13 are activated sequentially and form nested domains along the anteroposterior axis of the limb. This initial phase patterns the limb from its proximal limit to the middle of the forearm. Later in development, a second wave of transcription results in 5' HoxD gene expression along the distal end of the limb bud, which regulates formation of digits. Studies of zebrafish fins showed that the second phase of Hox expression does not occur, leading to the idea that the origin of digits was driven by addition of the distal Hox expression domain in the earliest tetrapods. Here we test this hypothesis by investigating Hoxd gene expression during paired fin development in the shark Scyliorhinus canicula, a member of the most basal lineage of jawed vertebrates. We report that at early stages, 5' Hoxd genes are expressed in anteroposteriorly nested patterns, consistent with the initial wave of Hoxd transcription in teleost and tetrapod paired appendages. Unexpectedly, a second phase of expression occurs at later stages of shark fin development, in which Hoxd12 and Hoxd13 are re-expressed along the distal margin of the fin buds. This second phase is similar to that observed in tetrapod limbs. The results indicate that a second, distal phase of Hoxd gene expression is not uniquely associated with tetrapod digit development, but is more likely a plesiomorphic condition present in the common ancestor of chondrichthyans and osteichthyans. We propose that a temporal extension, rather than de novo activation, of Hoxd expression in the distal part of the fin may have led to the evolution of digits.

  7. Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling

    Directory of Open Access Journals (Sweden)

    Guo Zheng

    2006-01-01

    Full Text Available Abstract Background It is one of the ultimate goals for modern biological research to fully elucidate the intricate interplays and the regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, to name a few, cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. The vast amount of large-scale and genome-wide time-resolved data is becoming increasingly available, which provides the golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox has provided a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for the cell cyclings. Conclusion We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex

  8. A systematic molecular circuit design method for gene networks under biochemical time delays and molecular noises

    Directory of Open Access Journals (Sweden)

    Chang Yu-Te

    2008-11-01

    Full Text Available Abstract Background Gene networks at the nanoscale are nonlinear stochastic processes. Time delays are common and substantial in these biochemical processes due to gene transcription, translation, posttranslational protein modification and diffusion. Molecular noises in gene networks come from intrinsic fluctuations, transmitted noise from upstream genes, and the global noise affecting all genes. Knowledge of molecular noise filtering and biochemical process delay compensation in gene networks is crucial to understanding the signal processing in gene networks and to the design of noise-tolerant and delay-robust gene circuits for synthetic biology. Results A nonlinear stochastic dynamic model with multiple time delays is proposed for describing a gene network under process delays, intrinsic molecular fluctuations, and extrinsic molecular noises. Then, the stochastic biochemical processing scheme of gene regulatory networks for attenuating these molecular noises and compensating process delays is investigated from the nonlinear signal processing perspective. In order to improve the robust stability for delay toleration and noise filtering, a robust gene circuit for nonlinear stochastic time-delay gene networks is engineered based on the nonlinear robust H∞ stochastic filtering scheme. Further, in order to avoid solving these complicated noise-tolerant and delay-robust design problems, based on the Takagi-Sugeno (T-S) fuzzy time-delay model and the linear matrix inequalities (LMIs) technique, a systematic gene circuit design method is proposed to simplify the design procedure. Conclusion The proposed gene circuit design method has much potential for application to systems biology, synthetic biology and drug design when a gene regulatory network has to be designed for improving its robust stability and filtering ability of disease-perturbed gene network or when a synthetic gene network needs to perform robustly under process delays and molecular noises.

  9. Evolution of Vertebrate Adam Genes; Duplication of Testicular Adams from Ancient Adam9/9-like Loci.

    Science.gov (United States)

    Bahudhanapati, Harinath; Bhattacharya, Shashwati; Wei, Shuo

    2015-01-01

    Members of the disintegrin metalloproteinase (ADAM) family have important functions in regulating cell-cell and cell-matrix interactions as well as cell signaling. There are two major types of ADAMs: the somatic ADAMs (sADAMs) that have a significant presence in somatic tissues, and the testicular ADAMs (tADAMs) that are expressed predominantly in the testis. Genes encoding tADAMs can be further divided into two groups: group I (intronless) and group II (intron-containing). To date, tAdams have only been reported in placental mammals, and their evolutionary origin and relationship to sAdams remain largely unknown. Using phylogenetic and syntenic tools, we analyzed the Adam genes in various vertebrates ranging from fishes to placental mammals. Our analyses reveal duplication and loss of some sAdams in certain vertebrate species. In particular, there exists an Adam9-like gene in non-mammalian vertebrates but not mammals. We also identified putative group I and group II tAdams in all amniote species that have been examined. These tAdam homologues are more closely related to Adams 9 and 9-like than to other sAdams. In all amniote species examined, group II tAdams lie in close vicinity to Adam9 and hence likely arose from tandem duplication, whereas group I tAdams likely originated through retroposition because of their lack of introns. Clusters of multiple group I tAdams are also common, suggesting tandem duplication after retroposition. Therefore, Adam9/9-like and some of the derived tAdam loci are likely preferred targets for tandem duplication and/or retroposition. Consistent with this hypothesis, we identified a young retroposed gene that duplicated recently from Adam9 in the opossum. As a result of gene duplication, some tAdams were pseudogenized in certain species, whereas others acquired new expression patterns and functions. The rapid duplication of Adam genes has a major contribution to the diversity of ADAMs in various vertebrate species.

  10. Relative stability of network states in Boolean network models of gene regulation in development.

    Science.gov (United States)

    Zhou, Joseph Xu; Samal, Areejit; d'Hérouël, Aymeric Fouquier; Price, Nathan D; Huang, Sui

    2016-01-01

    Progress in cell type reprogramming has revived the interest in Waddington's concept of the epigenetic landscape. Recently researchers developed the quasi-potential theory to represent Waddington's landscape. The quasi-potential U(x), derived from interactions in the gene regulatory network (GRN) of a cell, quantifies the relative stability of network states, which determine the effort required for state transitions in a multi-stable dynamical system. However, quasi-potential landscapes, originally developed for continuous systems, are not suitable for discrete-valued networks which are important tools to study complex systems. In this paper, we provide a framework to quantify the landscape for discrete Boolean networks (BNs). We apply our framework to study pancreas cell differentiation where an ensemble of BN models is considered based on the structure of a minimal GRN for pancreas development. We impose biologically motivated structural constraints (corresponding to specific type of Boolean functions) and dynamical constraints (corresponding to stable attractor states) to limit the space of BN models for pancreas development. In addition, we enforce a novel functional constraint corresponding to the relative ordering of attractor states in BN models to restrict the space of BN models to the biologically relevant class. We find that BNs with canalyzing/sign-compatible Boolean functions best capture the dynamics of pancreas cell differentiation. This framework can also determine the genes' influence on cell state transitions, and thus can facilitate the rational design of cell reprogramming protocols.
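
    To make the notions of attractor states and their relative weight in a Boolean network concrete, the following Python sketch enumerates the attractors of a small synchronously updated network and counts how many initial states flow into each one. It is only a toy illustration: the two-gene mutual-repression rules are hypothetical, the synchronous update scheme is an assumption, and basin size is used as a crude proxy for stability rather than the landscape measure developed in the paper.

```python
from itertools import product

def step(state, rules):
    """Apply every Boolean rule to the current state (synchronous update)."""
    return tuple(int(rule(state)) for rule in rules)

def attractors_and_basins(rules):
    """Map each attractor (a frozenset of states) to the number of
    initial states whose trajectory ends in it."""
    basins = {}
    for start in product((0, 1), repeat=len(rules)):
        state, order, trajectory = start, {}, []
        while state not in order:          # iterate until a state repeats
            order[state] = len(trajectory)
            trajectory.append(state)
            state = step(state, rules)
        cycle = frozenset(trajectory[order[state]:])
        basins[cycle] = basins.get(cycle, 0) + 1
    return basins

# Hypothetical two-gene toggle: each gene is on only if the other is off.
rules = [
    lambda s: not s[1],   # gene A is repressed by gene B
    lambda s: not s[0],   # gene B is repressed by gene A
]
basins = attractors_and_basins(rules)
for attractor, size in basins.items():
    print(sorted(attractor), size)
```

    In this toy network the two fixed points (one gene on, the other off) each attract only themselves, while the remaining two states alternate in a 2-cycle that is an artefact of synchronous updating; a different update scheme could change that picture.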

  11. A Semiquantitative Framework for Gene Regulatory Networks: Increasing the Time and Quantitative Resolution of Boolean Networks

    Science.gov (United States)

    Kerkhofs, Johan; Geris, Liesbet

    2015-01-01

    Boolean models have been instrumental in predicting general features of gene networks and more recently also as explorative tools in specific biological applications. In this study we introduce a basic quantitative and a limited time resolution to a discrete (Boolean) framework. Quantitative resolution is improved through the employ of normalized variables in unison with an additive approach. Increased time resolution stems from the introduction of two distinct priority classes. Through the implementation of a previously published chondrocyte network and T helper cell network, we show that this addition of quantitative and time resolution broadens the scope of biological behaviour that can be captured by the models. Specifically, the quantitative resolution readily allows models to discern qualitative differences in dosage response to growth factors. The limited time resolution, in turn, can influence the reachability of attractors, delineating the likely long term system behaviour. Importantly, the information required for implementation of these features, such as the nature of an interaction, is typically obtainable from the literature. Nonetheless, a trade-off is always present between additional computational cost of this approach and the likelihood of extending the model’s scope. Indeed, in some cases the inclusion of these features does not yield additional insight. This framework, incorporating increased and readily available time and semi-quantitative resolution, can help in substantiating the litmus test of dynamics for gene networks, firstly by excluding unlikely dynamics and secondly by refining falsifiable predictions on qualitative behaviour. PMID:26067297

  12. The role of master regulators in gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Enrique Hernández Lemus

    2015-05-01

    Full Text Available Gene regulatory networks present a wide variety of dynamical responses to intrinsic and extrinsic perturbations. Arguably, one of the most important of such coordinated responses is the one of amplification cascades, in which activation of a few key-responsive transcription factors (termed master regulators, MRs) leads to a large series of transcriptional activation events. This is so since master regulators are transcription factors controlling the expression of other transcription factor molecules and so on. MRs hold a central position related to transcriptional dynamics and control of gene regulatory networks and are often involved in complex feedback and feedforward loops inducing non-trivial dynamics. Recent studies have pointed to the myocyte enhancing factor 2C (MEF2C, also known as MADS box transcription enhancer factor 2, polypeptide C) as being one of such master regulators involved in the pathogenesis of primary breast cancer. In this work, we perform an integrative genomic analysis of the transcriptional regulation activity of MEF2C and its target genes to evaluate to what extent these molecules induce collective responses leading to gene expression deregulation and carcinogenesis. We also analyzed a number of induced dynamic responses, in particular those associated with transcriptional bursts, and nonlinear cascading to evaluate the influence they may have in malignant phenotypes and cancer. Received: 20 November 2014, Accepted: 24 June 2015; Edited by: C. A. Condat, G. J. Sibona; DOI: http://dx.doi.org/10.4279/PIP.070011 Cite as: E Hernández-Lemus, K Baca-López, R Lemus, R García-Herrera, Papers in Physics 7, 070011 (2015)

  13. CRISPR loci reveal networks of gene exchange in archaea

    Directory of Open Access Journals (Sweden)

    Brodt Avital

    2011-12-01

    Full Text Available Abstract Background CRISPR (Clustered, Regularly, Interspaced, Short, Palindromic Repeats) loci provide prokaryotes with an adaptive immunity against viruses and other mobile genetic elements. CRISPR arrays can be transcribed and processed into small crRNA molecules, which are then used by the cell to target the foreign nucleic acid. Since spacers are accumulated by active CRISPR/Cas systems, the sequences of these spacers provide a record of the past "infection history" of the organism. Results Here we analyzed all currently known spacers present in archaeal genomes and identified their source by DNA similarity. While nearly 50% of archaeal spacers matched mobile genetic elements, such as plasmids or viruses, several others matched chromosomal genes of other organisms, primarily other archaea. Thus, networks of gene exchange between archaeal species were revealed by the spacer analysis, including many cases of inter-genus and inter-species gene transfer events. Spacers that recognize viral sequences tend to be located further away from the leader sequence, implying that there exists a selective pressure for their retention. Conclusions CRISPR spacers provide direct evidence for extensive gene exchange in archaea, especially within genera, and support the current dogma where the primary role of the CRISPR/Cas system is anti-viral and anti-plasmid defense. Open peer review This article was reviewed by: Profs. W. Ford Doolittle, John van der Oost, Christa Schleper (nominated by board member Prof. J Peter Gogarten)

  14. Study on Consumption Preference of Tourists of Ancient Town Zhouzhuang under the Perspective of Network Travel Notes

    Institute of Scientific and Technical Information of China (English)

    王新亮

    2011-01-01

    Taking online travel notes as study samples, this paper analyzes new consumption trends among tourists in the ancient town on the basis of tourist characteristics, the distribution of travel time, the diversity of travel motivations, the content of and preferences for dining, transport, accommodation and shopping, and travel evaluation and tourism satisfaction.

  15. MicroRNAs and deregulated gene expression networks in neurodegeneration.

    Science.gov (United States)

    Sonntag, Kai-Christian

    2010-06-18

    Neurodegeneration is characterized by the progressive loss of neuronal cell types in the nervous system. Although the main cause of cell dysfunction and death in many neurodegenerative diseases is not known, there is increasing evidence that their demise is a result of a combination of genetic and environmental factors which affect key signaling pathways in cell function. This view is supported by recent observations that disease-compromised cells in late-stage neurodegeneration exhibit profound dysregulation of gene expression. MicroRNAs (miRNAs) introduce a novel concept of regulatory control over gene expression and there is increasing evidence that they play a profound role in neuronal cell identity as well as multiple aspects of disease pathogenesis. Here, we review the molecular properties of brain cells derived from patients with neurodegenerative diseases, and discuss how deregulated miRNA/mRNA expression networks could be a mechanism in neurodegeneration. In addition, we emphasize that the dysfunction of these regulatory networks might overlap between different cell systems and suggest that miRNA functions might be common between neurodegeneration and other disease entities.

  16. Data identification for improving gene network inference using computational algebra.

    Science.gov (United States)

    Dimitrova, Elena; Stigler, Brandilyn

    2014-11-01

    Identification of models of gene regulatory networks is sensitive to the amount of data used as input. Considering the substantial costs in conducting experiments, it is of value to have an estimate of the amount of data required to infer the network structure. To minimize wasted resources, it is also beneficial to know which data are necessary to identify the network. Knowledge of the data and knowledge of the terms in polynomial models are often required a priori in model identification. In applications, it is unlikely that the structure of a polynomial model will be known, which may force data sets to be unnecessarily large in order to identify a model. Furthermore, none of the known results provides any strategy for constructing data sets to uniquely identify a model. We provide a specialization of an existing criterion for deciding when a set of data points identifies a minimal polynomial model when its monomial terms have been specified. Then, we relax the requirement of the knowledge of the monomials and present results for model identification given only the data. Finally, we present a method for constructing data sets that identify minimal polynomial models.

  17. Understanding gene expression in coronary artery disease through global profiling, network analysis and independent validation of key candidate genes

    Indian Academy of Sciences (India)

    Prathima Arvind; Shanker Jayashree; Srikarthika Jambunathan; Jiny Nair; Vijay V. Kakkar

    2015-12-01

    The molecular mechanism underlying the pathophysiology of coronary artery disease (CAD) is complex. We used global expression profiling combined with analysis of biological networks to dissect out potential genes and pathways associated with CAD in a representative case–control Asian Indian cohort. We initially performed blood transcriptomics profiling in 20 subjects, including 10 CAD patients and 10 healthy controls on the Agilent microarray platform. Data were analysed with Gene Spring Gx12.5, followed by network analysis using David v 6.7 and Reactome databases. The most significant differentially expressed genes from microarray were independently validated by real time PCR in 97 cases and 97 controls. A total of 190 gene transcripts showed significant differential expression (fold change > 2, P < 0.05) between the cases and the controls of which 142 genes were upregulated and 48 genes were downregulated. Genes associated with inflammation, immune response, cell regulation, proliferation and apoptotic pathways were enriched, while inflammatory and immune response genes were displayed as hubs in the network, having a greater number of interactions with the neighbouring genes. Expression of 1/2/3, 8, 1, 2, 69, , , 4, 42, 58, and 42 genes were independently validated; 1/2/3 and 8 showed >8-fold higher expression in cases relative to the controls implying their important role in CAD. In conclusion, global gene expression profiling combined with network analysis can help in identifying key genes and pathways for CAD.

  18. Gene cloning of the 18S rRNA of an ancient viable moss from the permafrost of northeastern Siberia

    Science.gov (United States)

    Marsic, Damien; Hoover, Richard B.; Gilichinsky, David A.; Ng, Joseph D.

    1999-12-01

    A moss plant dated to as much as 40,000 years old was collected from the permafrost of the Kolyma Lowlands of Northeastern Siberia. The plant tissue was revived and cultured for the extraction of its genomic DNA. Using the polymerase chain reaction technique, the 18S ribosomal RNA gene was cloned and its sequence studied. Comparative sequence analysis of the cloned ribosomal DNA to other known 18S RNA showed very high sequence identity and was revealed to be closest to the moss species Aulacomnium turgidum. The results of this study also show the ability of biological organisms to rest dormant in deep frozen environments where they can be revived and cultured under favorable conditions. This is significant for the notion that celestial icy bodies can be media to preserve biological function and genetic material during long term storage or transport.

  19. Gene, protein, and network of male sterility in rice.

    Science.gov (United States)

    Wang, Kun; Peng, Xiaojue; Ji, Yanxiao; Yang, Pingfang; Zhu, Yingguo; Li, Shaoqing

    2013-01-01

    Rice is one of the most important model crop plants whose heterosis has been well-exploited in commercial hybrid seed production via a variety of types of male-sterile lines. Hybrid rice cultivation area is steadily expanding around the world, especially in Southern Asia. Characterization of genes and proteins related to male sterility aims to understand how and why the male sterility occurs, and which proteins are the key players for microspore abortion. Recently, a series of genes and proteins related to cytoplasmic male sterility (CMS), photoperiod-sensitive male sterility, self-incompatibility, and other types of microspore deterioration have been characterized through genetics or proteomics. The latter in particular offers a powerful, high-throughput approach to discern novel proteins involved in male-sterile pathways, which may help us to breed artificial male-sterile systems. This represents an alternative tool to meet the critical challenge of further development of hybrid rice. In this paper, we review the recent developments in our understanding of male sterility in rice hybrid production across gene, protein, and integrated network levels, and present a perspective on the engineering of male-sterile lines for hybrid rice production.

  20. Gene, protein and network of male sterility in rice

    Directory of Open Access Journals (Sweden)

    Kun Wang

    2013-04-01

    Rice is one of the most important model crop plants whose heterosis has been well exploited in commercial hybrid seed production via a variety of types of male sterile lines. Hybrid rice cultivation area is steadily expanding around the world, especially in Southern Asia. Characterization of genes and proteins related to male sterility aims to understand how and why the male sterility occurs, and which proteins are the key players for microspore abortion. Recently, a series of genes and proteins related to cytoplasmic male sterility, photoperiod sensitive male sterility, self-incompatibility and other types of microspore deterioration have been characterized through genetics or proteomics. The latter in particular offers a powerful, high-throughput approach to discern novel proteins involved in male-sterile pathways, which may help us to breed artificial male-sterile systems. This represents an alternative tool to meet the critical challenge of further development of hybrid rice. In this paper, we review the recent developments in our understanding of male sterility in rice hybrid production across gene, protein and integrated network levels, and present a perspective on the engineering of male sterile lines for hybrid rice production.

  1. The impact of gene expression variation on the robustness and evolvability of a developmental gene regulatory network.

    Directory of Open Access Journals (Sweden)

    David A Garfield

    2013-10-01

    Regulatory interactions buffer development against genetic and environmental perturbations, but adaptation requires phenotypes to change. We investigated the relationship between robustness and evolvability within the gene regulatory network underlying development of the larval skeleton in the sea urchin Strongylocentrotus purpuratus. We find extensive variation in gene expression in this network throughout development in a natural population, some of which has a heritable genetic basis. Switch-like regulatory interactions predominate during early development, buffer expression variation, and may promote the accumulation of cryptic genetic variation affecting early stages. Regulatory interactions during later development are typically more sensitive (linear), allowing variation in expression to affect downstream target genes. Variation in skeletal morphology is associated primarily with expression variation of a few, primarily structural, genes at terminal positions within the network. These results indicate that the position and properties of gene interactions within a network can have important evolutionary consequences independent of their immediate regulatory role.

  2. FUMET: A fuzzy network module extraction technique for gene expression data

    Indian Academy of Sciences (India)

    Priyakshi Mahanta; Hasin Afzal Ahmed; Dhruba Kumar Bhattacharyya; Ashish Ghosh

    2014-06-01

    Construction of co-expression networks and extraction of network modules have been an appealing area of bioinformatics research. This article presents a co-expression network construction technique and a biologically relevant network module extraction technique based on a fuzzy set theoretic approach. The technique is able to handle both positive and negative correlations among genes. The constructed networks for some benchmark gene expression datasets have been validated using topological internal and external measures. The effectiveness of the network module extraction technique has been established in terms of the well-known p-value, Q-value and topological statistics.

  3. Morphology, morphogenesis and gene sequence of a freshwater ciliate, Pseudourostyla cristata (Ciliophora, Urostyloidea) from the ancient Lake Biwa, Japan.

    Science.gov (United States)

    Chen, Xumiao; Li, Zicong; Hu, Xiaozhong; Kusuoka, Yasushi

    2010-01-01

    The urostyloid freshwater ciliate Pseudourostyla cristata was recorded for the first time from Lake Biwa, a 4-million-year-old lake located in Shiga Prefecture, Japan. Its morphology and morphogenesis were investigated using live observation and protargol impregnation, and the SSU ribosomal RNA gene was sequenced. Based on the current observations and previous descriptions, this species is readily recognized mainly by the following characters: body slender or broadly oval to elliptical, and dark grey in color; size in vivo about 170-400 × 40-150 µm; pellicle flexible and contractile, with extrusomes forming a hyaline seam underneath; ciliature comprising about 60-130 adoral membranelles, usually 1 buccal cirrus, 20-24 frontal, 2 frontoterminal, 17-26 pairs of midventral, and 5-16 transverse cirri, 4-6 left and 4-5 right marginal rows, and 8-10 dorsal kineties; 15-83 macronuclear nodules and 2-9 micronuclei; freshwater habitat. The main morphogenetic developments are: (1) the oral primordium for the proter originates de novo on the dorsal wall of the buccal cavity, and the dedifferentiated undulating membranes and some parental proximal membranelles join in the primordial development; the old adoral zone will be partly replaced by new structures; (2) the oral primordium for the opisthe occurs epiapokinetally left of the midventral complex between the adoral zone and the transverse cirri; (3) the fronto-midventral transverse cirral (FVT) anlagen develop separately in both dividers by dedifferentiation of most of the midventral cirri; (4) the single buccal cirrus is generated from the posterior end of FVT anlage II; (5) the leftmost frontal cirrus is derived from the anterior end of the undulating membranes anlage (FVT anlage I); (6) the marginal rows of each side are formed from a single anlage which arises within the rightmost row; (7) the dorsal kineties develop by intrakinetal basal body proliferation; and (8) the most posterior FVT anlage contributes the two

  4. Modular reorganization of the global network of gene regulatory interactions during perinatal human brain development

    OpenAIRE

    Monzón-Sandoval, Jimena; Castillo-Morales, Atahualpa; Urrutia, Araxi O.; Gutierrez, Humberto

    2016-01-01

    Background: During early development of the nervous system, gene expression patterns are known to vary widely depending on the specific developmental trajectories of different structures. Observable changes in gene expression profiles throughout development are determined by an underlying network of precise regulatory interactions between individual genes. Elucidating the organizing principles that shape this gene regulatory network is one of the central goals of developmental biology. Whether...

  5. The R package FANet: sparse factor analysis model for high dimensional gene co-expression networks

    OpenAIRE

    Blum, Anne; Houee-Bigot, Magalie; Lagarrigue, Sandrine; Causeur, David

    2014-01-01

    Inference on gene regulatory networks from high-throughput expression data turns out to be one of the main current challenges in systems biology. Such interaction networks are very insightful for the deep understanding of biological relationships between genes. In particular, a functional characterization of gene modules of highly interacting genes enables the identification of biological processes underlying complex traits such as diseases. Inference on this dependence structure shall...

  6. Nuclear gene sequences confirm an ancient link between New Zealand's short-tailed bat and South American noctilionoid bats.

    Science.gov (United States)

    Teeling, Emma C; Madsen, Ole; Murphy, William J; Springer, Mark S; O'Brien, Stephen J

    2003-08-01

    Molecular and morphological hypotheses disagree on the phylogenetic position of New Zealand's short-tailed bat Mystacina tuberculata. Most morphological analyses place Mystacina in the superfamily Vespertilionoidea, whereas molecular studies unite Mystacina with the Neotropical noctilionoids and imply a shared Gondwanan history. To date, competing hypotheses for the placement of Mystacina have not been addressed with a large concatenation of nuclear protein sequences. We investigated this problem using 7.1 kb of nuclear sequence data that included segments from five nuclear protein-coding genes for representatives of 14 bat families and six laurasiatherian outgroups. We employed the Thorne/Kishino method of molecular dating, allowing for simultaneous constraints from the fossil record and varying rates of molecular evolution on different branches on the phylogenetic tree, to estimate basal divergence times within key chiropteran clades. Maximum likelihood, minimum evolution, maximum parsimony, and Bayesian posterior probabilities all provide robust support for the association of Mystacina with the South American noctilionoids. The basal divergence within Chiroptera was estimated at 67 mya and the mystacinid/noctilionoid split was calculated at 47 mya. Although the mystacinid lineage is too young to have originated in New Zealand before it split from the other Gondwanan landmasses (80 mya), the exact geographic origin of these lineages is still uncertain and will not be answered until more fossils are found. It is most probable that Mystacina dispersed from Australia to New Zealand while other noctilionoid bats either remained in or dispersed to South America.

  7. Mining susceptibility gene modules and disease risk genes from SNP data by combining network topological properties with support vector regression.

    Science.gov (United States)

    Hua, Lin; Zhou, Ping; Liu, Hong; Li, Lin; Yang, Zheng; Liu, Zhi-cheng

    2011-11-21

    Genome-wide association study is a powerful approach to identify disease risk loci. However, the molecular regulatory mechanisms for most complex diseases are still not well understood. Therefore, further investigating the interplay between genetic factors and biological networks is important for elucidating the molecular mechanisms of complex diseases. Here, we proposed a novel framework to identify susceptibility gene modules and disease risk genes by combining network topological properties with support vector regression at the single nucleotide polymorphism (SNP) level. We assigned risk SNPs to genes using the University of California at Santa Cruz (UCSC) genome database, and then mapped these genes to protein-protein interaction (PPI) networks. The gene modules implicated by hub genes were extracted using the PPI networks, and the topological properties of these gene modules were analyzed. For each gene module, risk feature genes were determined by topological property analysis and support vector regression. As a result, five shared risk feature genes, CD80, EGFR, FN1, GSK3B and TRAF6, were found and proven to be associated with rheumatoid arthritis by previous reports. Our approach showed good performance in comparison with other approaches and can be used for prioritizing candidate genes associated with complex diseases.

  8. Construction of citrus gene coexpression networks from microarray data using random matrix theory.

    Science.gov (United States)

    Du, Dongliang; Rawat, Nidhi; Deng, Zhanao; Gmitter, Fred G

    2015-01-01

    After the sequencing of citrus genomes, gene function annotation is becoming a new challenge. Gene coexpression analysis can be employed for function annotation using publicly available microarray data sets. In this study, 230 sweet orange (Citrus sinensis) microarrays were used to construct seven coexpression networks, including one condition-independent and six condition-dependent (Citrus canker, Huanglongbing, leaves, flavedo, albedo, and flesh) networks. In total, these networks contain 37 633 edges among 6256 nodes (genes), which account for 52.11% of the measurable genes of the citrus microarray. Then, these networks were partitioned into functional modules using the Markov Cluster Algorithm. Significantly enriched Gene Ontology biological process terms and KEGG pathway terms were detected for 343 and 60 modules, respectively. Finally, independent verification of these networks was performed using another expression data set of 371 genes. This study provides new targets for further functional analyses in citrus.
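
    As a rough illustration of the general idea of a coexpression network (not the authors' pipeline), the sketch below links two genes when the absolute Pearson correlation of their expression profiles reaches a cutoff. The fixed cutoff of 0.8, the function name `coexpression_edges`, and the toy gene names and values are assumptions made for this example; the study itself chooses its threshold with random matrix theory and partitions the network with the Markov Cluster Algorithm, neither of which is reproduced here.

```python
import numpy as np


def coexpression_edges(expr, gene_names, threshold=0.8):
    """Return (gene_a, gene_b, r) for gene pairs with |Pearson r| >= threshold.

    expr: 2-D array, one row per gene and one column per sample.
    threshold: fixed cutoff (an assumption of this sketch).
    """
    corr = np.corrcoef(expr)  # gene-by-gene Pearson correlation matrix
    edges = []
    n_genes = len(gene_names)
    for i in range(n_genes):
        for j in range(i + 1, n_genes):  # upper triangle only, no self-loops
            if abs(corr[i, j]) >= threshold:
                edges.append((gene_names[i], gene_names[j], float(corr[i, j])))
    return edges


# Toy data (hypothetical): g1 and g2 rise together, g3 is unrelated.
toy_expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.5],
    [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
])
toy_genes = ["g1", "g2", "g3"]
toy_edges = coexpression_edges(toy_expr, toy_genes)
```

    On the toy data only the g1/g2 pair passes the cutoff. A data-driven threshold would replace the `threshold` argument in a more faithful implementation.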

  9. Using evolutionary conserved modules in gene networks as a strategy to leverage high throughput gene expression queries.

    Directory of Open Access Journals (Sweden)

    Jeanne M Serb

    BACKGROUND: Large-scale gene expression studies have not yielded the expected insight into genetic networks that control complex processes. These anticipated discoveries have been limited not by technology, but by a lack of effective strategies to investigate the data in a manageable and meaningful way. Previous work suggests that using a pre-determined seed-network of gene relationships to query large-scale expression datasets is an effective way to generate candidate genes for further study and network expansion or enrichment. Based on the evolutionary conservation of gene relationships, we test the hypothesis that a seed network derived from studies of retinal cell determination in the fly, Drosophila melanogaster, will be an effective way to identify novel candidate genes for their role in mouse retinal development. METHODOLOGY/PRINCIPAL FINDINGS: Our results demonstrate that a number of gene relationships regulating retinal cell differentiation in the fly are identifiable as pairwise correlations between genes from developing mouse retina. In addition, we demonstrate that our extracted seed-network of correlated mouse genes is an effective tool for querying datasets and provides a context to generate hypotheses. Our query identified 46 genes correlated with our extracted seed-network members. Approximately 54% of these candidates had been previously linked to the developing brain and 33% had been previously linked to the developing retina. Five of six candidate genes investigated further were validated by experiments examining spatial and temporal protein expression in the developing retina. CONCLUSIONS/SIGNIFICANCE: We present an effective strategy for pursuing a systems biology approach that utilizes an evolutionary comparative framework between two model organisms, fly and mouse. Future implementation of this strategy will be useful to determine the extent of network conservation, not just gene conservation, between species and will

  10. NetDecoder: a network biology platform that decodes context-specific biological networks and gene activities.

    Science.gov (United States)

    da Rocha, Edroaldo Lummertz; Ung, Choong Yong; McGehee, Cordelia D; Correia, Cristina; Li, Hu

    2016-06-02

    The sequential chain of interactions altering the binary state of a biomolecule represents the 'information flow' within a cellular network that determines phenotypic properties. Given the lack of computational tools to dissect context-dependent networks and gene activities, we developed NetDecoder, a network biology platform that models context-dependent information flows using pairwise phenotypic comparative analyses of protein-protein interactions. Using breast cancer, dyslipidemia and Alzheimer's disease as case studies, we demonstrate that NetDecoder dissects subnetworks to identify key players significantly impacting cell behaviour specific to a given disease context. We further show that genes residing in disease-specific subnetworks are enriched in disease-related signalling pathways and information flow profiles, which drive the resulting disease phenotypes. We also devise a novel scoring scheme to quantify key genes: network routers, which influence many genes; key targets, which are influenced by many genes; and high impact genes, which experience a significant change in regulation. We show the robustness of our results against parameter changes. Our network biology platform includes freely available source code (http://www.NetDecoder.org) for researchers to explore genome-wide context-dependent information flow profiles and key genes, given a set of genes of particular interest and transcriptome data. More importantly, NetDecoder will enable researchers to uncover context-dependent drug targets.

  11. An efficient approach of attractor calculation for large-scale Boolean gene regulatory networks.

    Science.gov (United States)

    He, Qinbin; Xia, Zhile; Lin, Bin

    2016-11-07

    Boolean network models provide an efficient way to study gene regulatory networks. The main dynamics of a Boolean network are determined by its attractors, so attractor calculation plays a key role in analyzing Boolean gene regulatory networks. This study proposes an approach to attractor calculation that improves on the predecessor-based approach. Furthermore, the proposed approach is combined with the identification of constant nodes and with simplified Boolean networks to accelerate attractor calculation. The proposed algorithm is effective for calculating all attractors of large-scale Boolean gene regulatory networks. If the average degree of the network is not too large, the algorithm can find all attractors of a Boolean network with dozens or even hundreds of nodes.
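
    To make the notion of an attractor concrete, the sketch below enumerates every state of a small synchronous Boolean network and records the cycle each trajectory falls into. This is plain brute force over all 2^n states, not the predecessor-based algorithm of the record above, and it is only practical for small n. The three-gene ring (`step`) and the helper name `find_attractors` are hypothetical choices made for this illustration.

```python
from itertools import product


def step(state):
    """Synchronous update of a hypothetical 3-gene ring: c represses a, a activates b, b activates c."""
    a, b, c = state
    return (int(not c), a, b)


def find_attractors(update, n_nodes):
    """Return the set of attractors, each as a frozenset of states, by exhaustive search."""
    attractors = set()
    for start in product((0, 1), repeat=n_nodes):
        seen = {}  # state -> index of first visit along this trajectory
        state = start
        while state not in seen:
            seen[state] = len(seen)
            state = update(state)
        # 'state' is the first repeated state; everything visited from it onward is the cycle.
        first = seen[state]
        attractors.add(frozenset(s for s, i in seen.items() if i >= first))
    return attractors


attractors = find_attractors(step, 3)
for cycle in sorted(attractors, key=len):
    print(len(cycle), sorted(cycle))
```

    For this toy ring the eight states split into two cyclic attractors and no fixed points. Approaches such as the one described in the record avoid visiting every state, which is what makes networks with dozens or hundreds of nodes tractable.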

  12. Spectral analysis of Gene co-expression network of Zebrafish

    CERN Document Server

    Jalan, S; Bhojwani, J; Li, B; Zhang, L; Lan, S H; Gong, Z

    2012-01-01

    We analyze the gene expression data of Zebrafish under the combined framework of complex networks and random matrix theory. The nearest neighbor spacing distribution of the corresponding matrix spectra follows random matrix predictions of Gaussian orthogonal statistics. Based on the eigenvector analysis, we can divide the spectra into two parts: a first part, for which the eigenvector localization properties match the random matrix theory predictions, and a second part, for which they deviate from the theory and hence are useful for understanding system-dependent properties. Spectra with the localized eigenvectors can be characterized into three groups based on the eigenvalues. We explore the position of localized nodes from these different categories. Using an overlap measure, we find that the top contributing nodes in the different groups carry distinguished structural features. Furthermore, the top contributing nodes of the different localized eigenvectors corresponding to the lower eigenvalue reg...

  13. Effects of bidirectional regulation on noises in gene networks.

    Science.gov (United States)

    Zheng, Xiudeng; Tao, Yi

    2010-03-14

    To investigate the effects of bidirectional regulation on the noise in protein concentration, a theoretical and simple three-gene network model is considered. The basic idea behind this model is from Paulsson's proposition (J. Paulsson, Phys. Life Rev. 2005, 2, 157-175), where the synthesis and degradation of a mRNA species corresponding to a target protein are regulated directly and indirectly by a certain sigma-factor, and a random increase in the concentration of the sigma-factor should increase both the synthesis and degradation rates of the mRNA species (bidirectional regulation). Using the standard Omega-expansion technique (linear noise approximation) and Monte Carlo simulation, our main results show clearly that for the steady-state statistics the effects of the noise of the sigma-factor on the stochastic fluctuation of the target protein could partially cancel out.

  14. Analysis of deterministic cyclic gene regulatory network models with delays

    CERN Document Server

    Ahsen, Mehmet Eren; Niculescu, Silviu-Iulian

    2015-01-01

    This brief examines a deterministic, ODE-based model for gene regulatory networks (GRN) that incorporates nonlinearities and time-delayed feedback. An introductory chapter provides some insights into molecular biology and GRNs. The mathematical tools necessary for studying the GRN model are then reviewed, in particular Hill functions and Schwarzian derivatives. One chapter is devoted to the analysis of GRNs under negative feedback with time delays and a special case of a homogenous GRN is considered. Asymptotic stability analysis of GRNs under positive feedback is then considered in a separate chapter, in which conditions leading to bi-stability are derived. Graduate and advanced undergraduate students and researchers in control engineering, applied mathematics, systems biology and synthetic biology will find this brief to be a clear and concise introduction to the modeling and analysis of GRNs.
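
    The ingredients named in this record (a Hill nonlinearity and time-delayed negative feedback) can be sketched for a single self-repressing gene. The equation dx/dt = -x + beta * h(x(t - tau)), the parameter values, the constant initial history, and the forward-Euler scheme below are all assumptions chosen for illustration; they are not taken from the book, which treats cyclic networks of several genes analytically.

```python
def hill_repression(x, k=1.0, n=2.0):
    """Decreasing Hill function: 1 at x = 0, 0.5 at x = k, tending to 0 for large x."""
    return 1.0 / (1.0 + (x / k) ** n)


def simulate_delayed_loop(tau=1.0, dt=0.01, t_end=50.0, x0=0.1, beta=2.0):
    """Forward-Euler integration of dx/dt = -x + beta * hill_repression(x(t - tau)).

    The history on [-tau, 0] is assumed constant and equal to x0.
    Returns the list of sampled values, history included.
    """
    lag = int(round(tau / dt))            # delay expressed in time steps
    xs = [x0] * (lag + 1)                 # constant history buffer
    for _ in range(int(round(t_end / dt))):
        delayed = xs[-1 - lag]            # value of x one delay ago
        xs.append(xs[-1] + dt * (-xs[-1] + beta * hill_repression(delayed)))
    return xs


trajectory = simulate_delayed_loop()
print(min(trajectory), max(trajectory), trajectory[-1])
```

    Varying `tau`, `beta`, or the Hill exponent `n` is a simple way to explore how delay and nonlinearity affect the behaviour of such a loop; a dedicated delay-differential-equation solver would be preferable for quantitative work.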

  15. Gene network and familial analyses uncover a gene network involving Tbx5/Osr1/Pcsk6 interaction in the second heart field for atrial septation.

    Science.gov (United States)

    Zhang, Ke K; Xiang, Menglan; Zhou, Lun; Liu, Jielin; Curry, Nathan; Heine Suñer, Damian; Garcia-Pavia, Pablo; Zhang, Xiaohua; Wang, Qin; Xie, Linglin

    2016-03-15

    Atrial septal defects (ASDs) are a common human congenital heart disease (CHD) that can be induced by genetic abnormalities. Our previous studies have demonstrated a genetic interaction between Tbx5 and Osr1 in the second heart field (SHF) for atrial septation. We hypothesized that Osr1 and Tbx5 share a common signaling network and downstream targets for atrial septation. To identify this molecular network, we acquired the RNA-Seq transcriptome data from the posterior SHF of wild-type, Tbx5(+/-), Osr1(+/-), Osr1(-/-) and Tbx5(+/-)/Osr1(+/-) mutant embryos. Gene set analysis was used to identify the Kyoto Encyclopedia of Genes and Genomes pathways that were affected by the doses of Tbx5 and Osr1. A gene network module involving Tbx5 and Osr1 was identified using a non-parametric distance metric, distance correlation. A subset of 10 core genes and gene-gene interactions in the network module were validated by gene expression alterations in posterior second heart field (pSHF) of Tbx5 and Osr1 transgenic mouse embryos, a time-course gene expression change during P19CL6 cell differentiation. Pcsk6 was one of the network module genes that were linked to Tbx5. We validated the direct regulation of Tbx5 on Pcsk6 using immunohistochemical staining of pSHF, ChIP-quantitative polymerase chain reaction and luciferase reporter assay. Importantly, we identified Pcsk6 as a novel gene associated with ASD via a human genotyping study of an ASD family. In summary, our study implicated a gene network involving Tbx5, Osr1 and Pcsk6 interaction in SHF for atrial septation, providing a molecular framework for understanding the role of Tbx5 in CHD ontogeny.
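
    Distance correlation, the dependence measure mentioned in this record, can be computed directly from pairwise distances. The sketch below implements the standard sample estimator (double-centred distance matrices) with NumPy; the function name and the toy data are assumptions for this example, and nothing here reproduces the study's actual module-detection procedure.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation between two equally long samples (1-D or 2-D arrays)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)

    def centered_distances(z):
        # Pairwise Euclidean distances, then subtract row and column means and add the grand mean.
        d = np.sqrt(((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1))
        return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

    a, b = centered_distances(x), centered_distances(y)
    dcov2 = (a * b).mean()                       # squared distance covariance
    dvar2 = (a * a).mean() * (b * b).mean()      # product of squared distance variances
    if dvar2 == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / np.sqrt(dvar2)))


# Toy comparison (hypothetical data): a symmetric parabola has almost no linear correlation,
# yet the two variables are clearly dependent.
x_toy = np.linspace(-1.0, 1.0, 50)
pearson_parabola = float(np.corrcoef(x_toy, x_toy ** 2)[0, 1])
dcor_parabola = distance_correlation(x_toy, x_toy ** 2)
dcor_linear = distance_correlation(x_toy, 2.0 * x_toy + 1.0)
```

    Unlike the Pearson coefficient, distance correlation responds to non-linear dependence, which is one reason it can be attractive for linking genes whose expression relationship is not a straight line. The estimator above builds full n-by-n distance matrices, so memory grows quadratically with the number of samples.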

  16. Quality assurance of the gene ontology using abstraction networks.

    Science.gov (United States)

    Ochs, Christopher; Perl, Yehoshua; Halper, Michael; Geller, James; Lomax, Jane

    2016-06-01

    The gene ontology (GO) is used extensively in the field of genomics. Like other large and complex ontologies, quality assurance (QA) efforts for GO's content can be laborious and time consuming. Abstraction networks (AbNs) are summarization networks that reveal and highlight high-level structural and hierarchical aggregation patterns in an ontology. They have been shown to successfully support QA work in the context of various ontologies. Two kinds of AbNs, called the area taxonomy and the partial-area taxonomy, are developed for GO hierarchies and derived specifically for the biological process (BP) hierarchy. Within this framework, several QA heuristics, based on the identification of groups of anomalous terms which exhibit certain taxonomy-defined characteristics, are introduced. Such groups are expected to have higher error rates when compared to other terms. Thus, by focusing QA efforts on anomalous terms one would expect to find relatively more erroneous content. By automatically identifying these potential problem areas within an ontology, time and effort will be saved during manual reviews of GO's content. BP is used as a testbed, with samples of three kinds of anomalous BP terms chosen for a taxonomy-based QA review. Additional heuristics for QA are demonstrated. From the results of this QA effort, it is observed that different kinds of inconsistencies in the modeling of GO can be exposed with the use of the proposed heuristics. For comparison, the results of QA work on a sample of terms chosen from GO's general population are presented.

  17. Evolution of gene network activity by tuning the strength of negative-feedback regulation.

    Science.gov (United States)

    Peng, Weilin; Liu, Ping; Xue, Yuan; Acar, Murat

    2015-02-11

    Despite the examples of protein evolution via mutations in coding sequences, we have very limited understanding of gene network evolution via changes in cis-regulatory elements. Using the galactose network as a model, here we show how the regulatory promoters of the network contribute to the evolved network activity between two yeast species. In Saccharomyces cerevisiae, we combinatorially replace all regulatory network promoters by their counterparts from Saccharomyces paradoxus, measure the resulting network inducibility profiles, and model the results. Lowering the relative strength of GAL80-mediated negative feedback by replacing the GAL80 promoter is necessary and sufficient to have high network inducibility levels as in S. paradoxus. This is achieved by increasing OFF-to-ON phenotypic switching rates. Competitions performed among strains with or without the GAL80 promoter replacement show strong relationships between network inducibility and fitness. Our results support the hypothesis that gene network activity can evolve by optimizing the strength of negative-feedback regulation.

  18. Influence of the experimental design of gene expression studies on the inference of gene regulatory networks: environmental factors

    Directory of Open Access Journals (Sweden)

    Frank Emmert-Streib

    2013-02-01

    The inference of gene regulatory networks has gained considerable interest in the biology and biomedical community in recent years. The purpose of this paper is to investigate the influence that environmental conditions can exhibit on the inference performance of network inference algorithms. Specifically, we study five network inference methods, Aracne, BC3NET, CLR, C3NET and MRNET, and compare the results for three different conditions: (I) observational gene expression data: normal environmental condition, (II) interventional gene expression data: growth in rich media, (III) interventional gene expression data: normal environmental condition interrupted by a positive spike-in stimulation. Overall, we find that different statistical inference methods lead to comparable, but condition-specific results. Further, our results suggest that non-steady-state data enhance the inferability of regulatory networks.

  19. Gene Regulatory Networks from Multifactorial Perturbations Using Graphical Lasso: Application to the DREAM4 Challenge

    NARCIS (Netherlands)

    Menéndez, P.; Kourmpetis, Y.I.A.; Braak, ter C.J.F.; Eeuwijk, van F.A.

    2010-01-01

    A major challenge in the field of systems biology consists of predicting gene regulatory networks based on different training data. Within the DREAM4 initiative, we took part in the multifactorial sub-challenge that aimed to predict gene regulatory networks of size 100 from training data consisting

  20. From Gene Regulation to Gene Function: Regulatory Networks in Bacillus Subtilis

    Directory of Open Access Journals (Sweden)

    Ivan Moszer

    2006-04-01

    Full Text Available Bacillus subtilis is a sporulating Gram-positive bacterium that lives primarily in the soil and associated water sources. The publication of the B. subtilis genome sequence and subsequent systematic functional analysis and gene regulation programmes, together with an extensive understanding of its biochemistry and physiology, makes this micro-organism a prime candidate in which to model regulatory networks in silico. In this paper we discuss combined molecular biological and bioinformatical approaches that are being developed to model this organism’s responses to changes in its environment.

  1. CCor: A whole genome network-based similarity measure between two genes.

    Science.gov (United States)

    Hu, Yiming; Zhao, Hongyu

    2016-12-01

    Measuring the similarity between genes is often the starting point for building gene regulatory networks. Most similarity measures used in practice only consider pairwise information, with a few also considering network structure. Although theoretical properties of pairwise measures are well understood in the statistics literature, little is known about the statistical properties of similarity measures based on network structure. In this article, we consider a new whole genome network-based similarity measure, called CCor, that makes use of information of all the genes in the network. We derive a concentration inequality of CCor and compare it with the commonly used Pearson correlation coefficient for inferring network modules. Both theoretical analysis and a real data example demonstrate the advantages of CCor over existing measures for inferring gene modules.

  2. Targeting c-Myc-activated genes with a correlation method: Detection of global changes in large gene expression network dynamics

    Science.gov (United States)

    Remondini, D.; O'Connell, B.; Intrator, N.; Sedivy, J. M.; Neretti, N.; Castellani, G. C.; Cooper, L. N.

    2005-01-01

    This work studies the dynamics of a gene expression time series network. The network, which is obtained from the correlation of gene expressions, exhibits global dynamic properties that emerge after a cell state perturbation. The main features of this network appear to be more robust when compared with those obtained with a network obtained from a linear Markov model. In particular, the network properties strongly depend on the exact time sequence relationships between genes and are destroyed by random temporal data shuffling. We discuss in detail the problem of finding targets of the c-myc protooncogene, which encodes a transcriptional regulator whose inappropriate expression has been correlated with a wide array of malignancies. The data used for network construction are a time series of gene expression, collected by microarray analysis of a rat fibroblast cell line expressing a conditional Myc-estrogen receptor oncoprotein. We show that the correlation-based model can establish a clear relationship between network structure and the cascade of c-myc-activated genes. PMID:15867157

  3. Identifying Gene Regulatory Networks in Arabidopsis by In Silico Prediction, Yeast-1-Hybrid, and Inducible Gene Profiling Assays.

    Science.gov (United States)

    Sparks, Erin E; Benfey, Philip N

    2016-01-01

    A system-wide understanding of gene regulation will provide deep insights into plant development and physiology. In this chapter we describe a threefold approach to identify the gene regulatory networks in Arabidopsis thaliana that function in a specific tissue or biological process. Since no single method is sufficient to establish comprehensive and high-confidence gene regulatory networks, we focus on the integration of three approaches. First, we describe an in silico prediction method of transcription factor-DNA binding, then an in vivo assay of transcription factor-DNA binding by yeast-1-hybrid and lastly the identification of co-expression clusters by transcription factor induction in planta. Each of these methods provides a unique tool to advance our understanding of gene regulation, and together provide a robust model for the generation of gene regulatory networks.

  4. Prediction of disease-gene-drug relationships following a differential network analysis.

    Science.gov (United States)

    Zickenrott, S; Angarica, V E; Upadhyaya, B B; del Sol, A

    2016-01-01

    Great efforts are being devoted to gaining a deeper understanding of disease-related dysregulations, which is central for introducing novel and more effective therapeutics in the clinic. However, most human diseases are highly multifactorial at the molecular level, involving dysregulation of multiple genes and interactions in gene regulatory networks. This issue hinders the elucidation of disease mechanisms, including the identification of disease-causing genes and regulatory interactions. Most current network-based approaches for the study of disease mechanisms do not take into account significant differences in gene regulatory network topology between healthy and disease phenotypes. Moreover, these approaches are not able to efficiently guide database search for connections between drugs, genes and diseases. We propose a differential network-based methodology for identifying candidate target genes and chemical compounds for reverting disease phenotypes. Our method relies on transcriptomics data to reconstruct gene regulatory networks corresponding to healthy and disease states separately. Further, it identifies candidate genes essential for triggering the reversion of the disease phenotype based on network stability determinants underlying differential gene expression. In addition, our method selects and ranks chemical compounds targeting these genes, which could be used as therapeutic interventions for complex diseases.

  5. A consensus network of gene regulatory factors in the human frontal lobe

    Directory of Open Access Journals (Sweden)

    Stefano Berto

    2016-03-01

    Full Text Available Cognitive abilities, such as memory, learning, language, problem solving, and planning, involve the frontal lobe and other brain areas. Not much is known yet about the molecular basis of cognitive abilities, but it seems clear that cognitive abilities are determined by the interplay of many genes. One approach for analyzing the genetic networks involved in cognitive functions is to study the coexpression networks of genes with known importance for proper cognitive functions, such as genes that have been associated with cognitive disorders like intellectual disability (ID) or autism spectrum disorders (ASD). Because many of these genes are gene regulatory factors (GRFs) we aimed to provide insights into the gene regulatory networks active in the human frontal lobe. Using genome wide human frontal lobe expression data from 10 independent data sets, we first derived 10 individual coexpression networks for all GRFs including their potential target genes. We observed a high level of variability among these 10 independently derived networks, pointing out that relying on results from a single study can only provide limited biological insights. To instead focus on the most confident information from these 10 networks we developed a method for integrating such independently derived networks into a consensus network. This consensus network revealed robust GRF interactions that are conserved across the frontal lobes of different healthy human individuals. Within this network, we detected a strong central module that is enriched for 166 GRFs known to be involved in brain development and/or cognitive disorders. Interestingly, several hubs of the consensus network encode for GRFs that have not yet been associated with brain functions. Their central role in the network suggests them as excellent new candidates for playing an essential role in the regulatory network of the human frontal lobe, which should be investigated in future studies.

  6. A Consensus Network of Gene Regulatory Factors in the Human Frontal Lobe.

    Science.gov (United States)

    Berto, Stefano; Perdomo-Sabogal, Alvaro; Gerighausen, Daniel; Qin, Jing; Nowick, Katja

    2016-01-01

    Cognitive abilities, such as memory, learning, language, problem solving, and planning, involve the frontal lobe and other brain areas. Not much is known yet about the molecular basis of cognitive abilities, but it seems clear that cognitive abilities are determined by the interplay of many genes. One approach for analyzing the genetic networks involved in cognitive functions is to study the coexpression networks of genes with known importance for proper cognitive functions, such as genes that have been associated with cognitive disorders like intellectual disability (ID) or autism spectrum disorders (ASD). Because many of these genes are gene regulatory factors (GRFs) we aimed to provide insights into the gene regulatory networks active in the human frontal lobe. Using genome wide human frontal lobe expression data from 10 independent data sets, we first derived 10 individual coexpression networks for all GRFs including their potential target genes. We observed a high level of variability among these 10 independently derived networks, pointing out that relying on results from a single study can only provide limited biological insights. To instead focus on the most confident information from these 10 networks we developed a method for integrating such independently derived networks into a consensus network. This consensus network revealed robust GRF interactions that are conserved across the frontal lobes of different healthy human individuals. Within this network, we detected a strong central module that is enriched for 166 GRFs known to be involved in brain development and/or cognitive disorders. Interestingly, several hubs of the consensus network encode for GRFs that have not yet been associated with brain functions. Their central role in the network suggests them as excellent new candidates for playing an essential role in the regulatory network of the human frontal lobe, which should be investigated in future studies.

  7. LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights.

    Science.gov (United States)

    Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

    2016-01-11

    Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.
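
    The baseline that this abstract contrasts LEGO with, a one-sided Fisher-type test on a plain gene list, can be sketched as a hypergeometric tail probability. This is only an illustration of the unweighted baseline, not of LEGO itself; the function name `ora_p_value` and its parameters are hypothetical.

```python
from math import comb

def ora_p_value(n_universe, n_set, n_hits, n_overlap):
    """One-sided over-representation p-value (hypergeometric upper tail).

    n_universe: number of genes in the background
    n_set:      number of background genes annotated to the gene set
    n_hits:     number of genes in the query list
    n_overlap:  number of query genes that fall in the gene set
    Returns P(X >= n_overlap) when n_hits genes are drawn without replacement.
    """
    total = comb(n_universe, n_hits)
    upper = min(n_set, n_hits)
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 40 of 200 query genes hit a 300-gene set in a 10,000-gene background.
p = ora_p_value(10_000, 300, 200, 40)
```

    Every gene contributes equally here, which is the simplification the abstract says network-based gene weights are meant to address.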

  8. Co-regulation of metabolic genes is better explained by flux coupling than by network distance.

    Directory of Open Access Journals (Sweden)

    Richard A Notebaart

    2008-01-01

    Full Text Available To what extent can modes of gene regulation be explained by systems-level properties of metabolic networks? Prior studies on co-regulation of metabolic genes have mainly focused on graph-theoretical features of metabolic networks and demonstrated a decreasing level of co-expression with increasing network distance, a naïve, but widely used, topological index. Others have suggested that static graph representations can poorly capture dynamic functional associations, e.g., in the form of dependence of metabolic fluxes across genes in the network. Here, we systematically tested the relative importance of metabolic flux coupling and network position on gene co-regulation, using a genome-scale metabolic model of Escherichia coli. After validating the computational method with empirical data on flux correlations, we confirm that genes coupled by their enzymatic fluxes not only show similar expression patterns, but also share transcriptional regulators and frequently reside in the same operon. In contrast, we demonstrate that network distance per se has relatively minor influence on gene co-regulation. Moreover, the type of flux coupling can explain refined properties of the regulatory network that are ignored by simple graph-theoretical indices. Our results underline the importance of studying functional states of cellular networks to define physiologically relevant associations between genes and should stimulate future developments of novel functional genomic tools.

  9. Mapping and characterization of two relevance networks from SNP and gene levels

    Institute of Scientific and Technical Information of China (English)

    Wei Jiang; Lijie Zhang; Bo Na; Lihong Wang; Jiankai Xu; Xia Li; Yadong Wang; Shaoqi Rao

    2009-01-01

    Variations of gene expression and DNA sequence are genetically associated. The goal of this study was to build genetic networks to map from SNPs to gene expressions and to characterize the two different kinds of networks. We employed mutual information to evaluate the strength of SNP-SNP and gene-gene associations based on SNP identity by descent (IBD) data and differences of gene expressions. We applied the approach to one dataset of Genetics of Gene Expression in Humans, and discovered that both the SNP relevance network and the gene relevance network approximated the scale-free topology. We also found that 12.09% of SNP-SNP interactions matched 24.49% of gene-gene interactions, which was consistent with previous studies. Finally, we identified 49 hub SNPs and 115 hub genes in their relevance networks, in which 27 hub SNPs were associated with 25 hub genes.
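
    Mutual information between two discretized profiles, the association score named in this abstract, can be sketched with a plug-in estimator. The abstract does not say how the data were discretized or which estimator was used, so the function `mutual_information` and the toy inputs below are assumptions.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) between two
    equally long sequences of discrete values."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi

# Toy discretized profiles for two hypothetical genes across eight samples.
gene_a = [0, 0, 1, 1, 2, 2, 0, 1]
gene_b = [0, 0, 1, 1, 2, 2, 1, 0]
score = mutual_information(gene_a, gene_b)
```

    In a relevance network, pairs whose score exceeds a chosen threshold would be connected by an edge; the threshold is a modeling choice that the abstract does not specify.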

  10. Bagging statistical network inference from large-scale gene expression data.

    OpenAIRE

    Ricardo de Matos Simoes; Frank Emmert-Streib

    2012-01-01

    Modern biology and medicine aim at hunting molecular and cellular causes of biological functions and diseases. Gene regulatory networks (GRN) inferred from gene expression data are considered an important aid for this research by providing a map of molecular interactions. Hence, GRNs have the potential to enable and enhance basic as well as applied research in the life sciences. In this paper, we introduce a new method called BC3NET for inferring causal gene regulatory networks from large-sc...

  11. Gene regulatory network interactions in sea urchin endomesoderm induction.

    Directory of Open Access Journals (Sweden)

    Aditya J Sethi

    2009-02-01

    Full Text Available A major goal of contemporary studies of embryonic development is to understand large sets of regulatory changes that accompany the phenomenon of embryonic induction. The highly resolved sea urchin pregastrular endomesoderm-gene regulatory network (EM-GRN) provides a unique framework to study the global regulatory interactions underlying endomesoderm induction. Vegetal micromeres of the sea urchin embryo constitute a classic endomesoderm signaling center, whose potential to induce archenteron formation from presumptive ectoderm was demonstrated almost a century ago. In this work, we ectopically activate the primary mesenchyme cell-GRN (PMC-GRN) that operates in micromere progeny by misexpressing the micromere determinant Pmar1 and identify the responding EM-GRN that is induced in animal blastomeres. Using localized loss-of-function analyses in conjunction with expression of endo16, the molecular definition of micromere-dependent endomesoderm specification, we show that the TGFbeta cytokine, ActivinB, is an essential component of this induction in blastomeres that emit this signal, as well as in cells that respond to it. We report that normal pregastrular endomesoderm specification requires activation of the Pmar1-inducible subset of the EM-GRN by the same cytokine, strongly suggesting that early micromere-mediated endomesoderm specification, which regulates timely gastrulation in the sea urchin embryo, is also ActivinB dependent. This study unexpectedly uncovers the existence of an additional uncharacterized micromere signal to endomesoderm progenitors, significantly revising existing models. In one of the first network-level characterizations of an intercellular inductive phenomenon, we describe an important in vivo model of the requirement of ActivinB signaling in the earliest steps of embryonic endomesoderm progenitor specification.

  12. Genetic dissection of acute ethanol responsive gene networks in prefrontal cortex: functional and mechanistic implications.

    Directory of Open Access Journals (Sweden)

    Aaron R Wolen

    Full Text Available Individual differences in initial sensitivity to ethanol are strongly related to the heritable risk of alcoholism in humans. To elucidate key molecular networks that modulate ethanol sensitivity we performed the first systems genetics analysis of ethanol-responsive gene expression in brain regions of the mesocorticolimbic reward circuit (prefrontal cortex, nucleus accumbens, and ventral midbrain) across a highly diverse family of 27 isogenic mouse strains (BXD panel) before and after treatment with ethanol. Acute ethanol altered the expression of ~2,750 genes in one or more regions and 400 transcripts were jointly modulated in all three. Ethanol-responsive gene networks were extracted with a powerful graph theoretical method that efficiently summarized ethanol's effects. These networks correlated with acute behavioral responses to ethanol and other drugs of abuse. As predicted, networks were heavily populated by genes controlling synaptic transmission and neuroplasticity. Several of the most densely interconnected network hubs, including Kcnma1 and Gsk3β, are known to influence behavioral or physiological responses to ethanol, validating our overall approach. Other major hub genes like Grm3, Pten and Nrg3 represent novel targets of ethanol effects. Networks were under strong genetic control by variants that we mapped to a small number of chromosomal loci. Using a novel combination of genetic, bioinformatic and network-based approaches, we identified high priority cis-regulatory candidate genes, including Scn1b, Gria1, Sncb and Nell2. The ethanol-responsive gene networks identified here represent a previously uncharacterized intermediate phenotype between DNA variation and ethanol sensitivity in mice. Networks involved in synaptic transmission were strongly regulated by ethanol and could contribute to behavioral plasticity seen with chronic ethanol. Our novel finding that hub genes and a small number of loci exert major influence over the ethanol

  13. Cell cycle networks link gene expression dysregulation, mutation, and brain maldevelopment in autistic toddlers.

    Science.gov (United States)

    Pramparo, Tiziano; Lombardo, Michael V; Campbell, Kathleen; Barnes, Cynthia Carter; Marinero, Steven; Solso, Stephanie; Young, Julia; Mayo, Maisi; Dale, Anders; Ahrens-Barbeau, Clelia; Murray, Sarah S; Lopez, Linda; Lewis, Nathan; Pierce, Karen; Courchesne, Eric

    2015-12-14

    Genetic mechanisms underlying abnormal early neural development in toddlers with Autism Spectrum Disorder (ASD) remain uncertain due to the impossibility of direct brain gene expression measurement during critical periods of early development. Recent findings from a multi-tissue study demonstrated high expression of many of the same gene networks between blood and brain tissues, in particular with cell cycle functions. We explored relationships between blood gene expression and total brain volume (TBV) in 142 ASD and control male toddlers. In control toddlers, TBV variation significantly correlated with cell cycle and protein folding gene networks, potentially impacting neuron number and synapse development. In ASD toddlers, their correlations with brain size were lost as a result of considerable changes in network organization, while cell adhesion gene networks significantly correlated with TBV variation. Cell cycle networks detected in blood are highly preserved in the human brain and are upregulated during prenatal states of development. Overall, alterations were more pronounced in bigger brains. We identified 23 candidate genes for brain maldevelopment linked to 32 genes frequently mutated in ASD. The integrated network includes genes that are dysregulated in leukocyte and/or postmortem brain tissue of ASD subjects and belong to signaling pathways regulating cell cycle G1/S and G2/M phase transition. Finally, analyses of the CHD8 subnetwork and altered transcript levels from an independent study of CHD8 suppression further confirmed the central role of genes regulating neurogenesis and cell adhesion processes in ASD brain maldevelopment.

  14. Context Specific and Differential Gene Co-expression Networks via Bayesian Biclustering.

    Science.gov (United States)

    Gao, Chuan; McDowell, Ian C; Zhao, Shiwen; Brown, Christopher D; Engelhardt, Barbara E

    2016-07-01

    Identifying latent structure in high-dimensional genomic data is essential for exploring biological processes. Here, we consider recovering gene co-expression networks from gene expression data, where each network encodes relationships between genes that are co-regulated by shared biological mechanisms. To do this, we develop a Bayesian statistical model for biclustering to infer subsets of co-regulated genes that covary in all of the samples or in only a subset of the samples. Our biclustering method, BicMix, allows overcomplete representations of the data, computational tractability, and joint modeling of unknown confounders and biological signals. Across diverse simulation scenarios, BicMix recovers latent structure with higher precision than state-of-the-art biclustering methods. Further, we develop a principled method to recover context specific gene co-expression networks from the estimated sparse biclustering matrices. We apply BicMix to breast cancer gene expression data and to gene expression data from a cardiovascular study cohort, and we recover gene co-expression networks that are differential across ER+ and ER- samples and across male and female samples. We apply BicMix to the Genotype-Tissue Expression (GTEx) pilot data, and we find tissue specific gene networks. We validate these findings by using our tissue specific networks to identify trans-eQTLs specific to one of four primary tissues.

  15. Incorporating gene co-expression network in identification of cancer prognosis markers

    Directory of Open Access Journals (Sweden)

    Li Yang

    2010-05-01

    Full Text Available BACKGROUND: Extensive biomedical studies have shown that clinical and environmental risk factors may not have sufficient predictive power for cancer prognosis. The development of high-throughput profiling technologies makes it possible to survey the whole genome and search for genomic markers with predictive power. Many existing studies assume the interchangeability of gene effects and ignore the coordination among them. RESULTS: We adopt the weighted co-expression network to describe the interplay among genes. Although there are several different ways of defining gene networks, the weighted co-expression network may be preferred because of its computational simplicity, satisfactory empirical performance, and because it does not demand additional biological experiments. For cancer prognosis studies with gene expression measurements, we propose a new marker selection method that can properly incorporate the network connectivity of genes. We analyze six prognosis studies on breast cancer and lymphoma. We find that the proposed approach can identify genes that are significantly different from those using alternatives. We search published literature and find that genes identified using the proposed approach are biologically meaningful. In addition, they have better prediction performance and reproducibility than genes identified using alternatives. CONCLUSIONS: The network contains important information on the functionality of genes. Incorporating the network structure can improve cancer marker identification.
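
    A weighted co-expression network and the per-gene connectivity mentioned above can be sketched as follows. The abstract gives no formula, so this assumes the common soft-threshold construction in which the edge weight is the absolute Pearson correlation raised to a power; the names `pearson`, `weighted_adjacency`, `connectivity` and the value of `beta` are illustrative assumptions, not the authors' settings.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def weighted_adjacency(expr, beta=6):
    """Edge weight |cor(g, h)|**beta for every ordered pair of distinct genes.

    expr maps gene name -> list of expression values across samples.
    beta is an assumed soft-threshold power; larger values suppress weak correlations.
    """
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes for h in genes if g != h
    }

def connectivity(adj, gene):
    """Sum of edge weights attached to one gene."""
    return sum(w for (g, _), w in adj.items() if g == gene)

# Toy expression matrix with hypothetical gene names.
expr = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1], "D": [1, 3, 2, 4]}
adj = weighted_adjacency(expr)
hub_score = connectivity(adj, "A")
```

    Only expression data are needed to build such a network, which matches the abstract's point that no additional biological experiments are required.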

  16. Reconstructing Generalized Logical Networks of Transcriptional Regulation in Mouse Brain from Temporal Gene Expression Data

    Directory of Open Access Journals (Sweden)

    2009-03-01

    Full Text Available Gene expression time course data can be used not only to detect differentially expressed genes but also to find temporal associations among genes. The problem of reconstructing generalized logical networks to account for temporal dependencies among genes and environmental stimuli from transcriptomic data is addressed. A network reconstruction algorithm was developed that uses statistical significance as a criterion for network selection to avoid false-positive interactions arising from pure chance. The multinomial hypothesis testing-based network reconstruction allows for explicit specification of the false-positive rate, unique from all extant network inference algorithms. The method is superior to dynamic Bayesian network modeling in a simulation study. Temporal gene expression data from the brains of alcohol-treated mice in an analysis of the molecular response to alcohol are used for modeling. Genes from major neuronal pathways are identified as putative components of the alcohol response mechanism. Nine of these genes have associations with alcohol reported in literature. Several other potentially relevant genes, compatible with independent results from literature mining, may play a role in the response to alcohol. Additional, previously unknown gene interactions were discovered that, subject to biological verification, may offer new clues in the search for the elusive molecular mechanisms of alcoholism.

  17. Reconstructing Generalized Logical Networks of Transcriptional Regulation in Mouse Brain from Temporal Gene Expression Data

    Directory of Open Access Journals (Sweden)

    Lodowski Kerrie H

    2009-01-01

    Full Text Available Gene expression time course data can be used not only to detect differentially expressed genes but also to find temporal associations among genes. The problem of reconstructing generalized logical networks to account for temporal dependencies among genes and environmental stimuli from transcriptomic data is addressed. A network reconstruction algorithm was developed that uses statistical significance as a criterion for network selection to avoid false-positive interactions arising from pure chance. The multinomial hypothesis testing-based network reconstruction allows for explicit specification of the false-positive rate, unique from all extant network inference algorithms. The method is superior to dynamic Bayesian network modeling in a simulation study. Temporal gene expression data from the brains of alcohol-treated mice in an analysis of the molecular response to alcohol are used for modeling. Genes from major neuronal pathways are identified as putative components of the alcohol response mechanism. Nine of these genes have associations with alcohol reported in literature. Several other potentially relevant genes, compatible with independent results from literature mining, may play a role in the response to alcohol. Additional, previously unknown gene interactions were discovered that, subject to biological verification, may offer new clues in the search for the elusive molecular mechanisms of alcoholism.

  18. Unraveling toxicological mechanisms and predicting toxicity classes with gene dysregulation networks

    NARCIS (Netherlands)

    Pronk, T.E.; Someren, P. van; Stierum, R.H.; Ezendam, J.; Pennings, J.L.A.

    2013-01-01

    The use of genes for distinguishing classes of toxicity has become well established. In this paper we combine the reconstruction of a gene dysregulation network (GDN) with a classifier to assign unseen compounds to their appropriate class. Gene pairs in the GDN are dysregulated in the sense that the

  19. Identification of gene networks and pathways associated with Guillain-Barre syndrome.

    Directory of Open Access Journals (Sweden)

    Kuo-Hsuan Chang

    Full Text Available BACKGROUND: The underlying change of gene network expression of Guillain-Barré syndrome (GBS) remains elusive. We sought to identify GBS-associated gene networks and signaling pathways by analyzing the transcriptional profile of leukocytes in the patients with GBS. METHODS AND FINDINGS: Quantitative global gene expression microarray analysis of peripheral blood leukocytes was performed on 7 patients with GBS and 7 healthy controls. Gene expression profiles were compared between patients and controls after standardization. The set of genes that significantly correlated with GBS was further analyzed by Ingenuity Pathways Analyses. 256 genes and 18 gene networks were significantly associated with GBS (fold change ≥2, P<0.05). FOS, PTGS2, HMGB2 and MMP9 are the top four of 246 significantly up-regulated genes. The most significant disease and altered biological function genes associated with GBS were those involved in inflammatory response, infectious disease, and respiratory disease. Cell death, cellular development and cellular movement were the top significant molecular and cellular functions involved in GBS. Hematological system development and function, immune cell trafficking and organismal survival were the most significant GBS-associated functions in the physiological development and system category. Several hub genes, such as MMP9, PTGS2 and CREB1, were identified in the associated gene networks. Canonical pathway analysis showed that GnRH, corticotrophin-releasing hormone and ERK/MAPK signaling were the most significant pathways in the up-regulated gene set in GBS. CONCLUSIONS: This study reveals the gene networks and canonical pathways associated with GBS. These data provide not only networks between the genes for understanding the pathogenic properties of GBS but also map significant pathways for the future development of novel therapeutic strategies.

  20. A swarm intelligence framework for reconstructing gene networks: searching for biologically plausible architectures.

    Science.gov (United States)

    Kentzoglanakis, Kyriakos; Poole, Matthew

    2012-01-01

    In this paper, we investigate the problem of reverse engineering the topology of gene regulatory networks from temporal gene expression data. We adopt a computational intelligence approach comprising swarm intelligence techniques, namely particle swarm optimization (PSO) and ant colony optimization (ACO). In addition, the recurrent neural network (RNN) formalism is employed for modeling the dynamical behavior of gene regulatory systems. More specifically, ACO is used for searching the discrete space of network architectures and PSO for searching the corresponding continuous space of RNN model parameters. We propose a novel solution construction process in the context of ACO for generating biologically plausible candidate architectures. The objective is to concentrate the search effort into areas of the structure space that contain architectures which are feasible in terms of their topological resemblance to real-world networks. The proposed framework is initially applied to the reconstruction of a small artificial network that has previously been studied in the context of gene network reverse engineering. Subsequently, we consider an artificial data set with added noise for reconstructing a subnetwork of the genetic interaction network of S. cerevisiae (yeast). Finally, the framework is applied to a real-world data set for reverse engineering the SOS response system of the bacterium Escherichia coli. Results demonstrate the relative advantage of utilizing problem-specific knowledge regarding biologically plausible structural properties of gene networks over conducting a problem-agnostic search in the vast space of network architectures.
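
    The continuous half of this scheme, particle swarm optimization over a real-valued parameter vector, can be sketched on a toy objective. In the abstract the objective would be the fit of an RNN model to expression data; here a simple quadratic stands in for it, and the function `pso_minimize`, its inertia and acceleration coefficients, and the swarm size are assumptions rather than the authors' configuration.

```python
import random

def pso_minimize(f, dim, n_particles=20, n_iter=100, seed=0, w=0.7, c1=1.5, c2=1.5):
    """Minimal global-best particle swarm optimizer.

    f maps a list of `dim` floats to a score to minimize. w is the inertia
    weight; c1 and c2 weight the pull toward each particle's own best
    position and toward the swarm's best position.
    """
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]           # personal best positions
    best_val = [f(p) for p in pos]           # personal best scores
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

# Stand-in objective: squared distance from the origin.
def sphere(p):
    return sum(x * x for x in p)

best_params, best_score = pso_minimize(sphere, dim=2)
```

    In the framework described above, one such continuous search would be run for each candidate architecture proposed by the discrete ACO search.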

  1. A parallel implementation of the network identification by multiple regression (NIR) algorithm to reverse-engineer regulatory gene networks.

    Directory of Open Access Journals (Sweden)

    Francesco Gregoretti

    The reverse engineering of gene regulatory networks using gene expression profile data has become crucial to gain novel biological knowledge. Large amounts of data that need to be analyzed are currently being produced due to advances in microarray technologies. Using current reverse engineering algorithms to analyze large data sets can be very computationally intensive. These emerging computational requirements can be met using parallel computing techniques. It has been shown that the Network Identification by multiple Regression (NIR) algorithm performs better than the other ready-to-use reverse engineering software. However, it cannot be used with large networks with thousands of nodes, as is the case in biological networks, due to its high time and space complexity. In this work we overcome this limitation by designing and developing a parallel version of the NIR algorithm. The new implementation of the algorithm reaches a very good accuracy even for large gene networks, improving our understanding of the gene regulatory networks that is crucial for a wide range of biomedical applications.

  2. Inference of gene regulatory networks with sparse structural equation models exploiting genetic perturbations.

    Directory of Open Access Journals (Sweden)

    Xiaodong Cai

    Integrating genetic perturbations with gene expression data not only improves accuracy of regulatory network topology inference, but also enables learning of causal regulatory relations between genes. Although a number of methods have been developed to integrate both types of data, the need for efficient and powerful algorithms remains. In this paper, sparse structural equation models (SEMs) are employed to integrate both gene expression data and cis-expression quantitative trait loci (cis-eQTL), for modeling gene regulatory networks in accordance with biological evidence about genes regulating or being regulated by a small number of genes. A systematic inference method named sparsity-aware maximum likelihood (SML) is developed for SEM estimation. Using simulated directed acyclic or cyclic networks, the SML performance is compared with that of two state-of-the-art algorithms: the adaptive Lasso (AL) based scheme, and the QTL-directed dependency graph (QDG) method. Computer simulations demonstrate that the novel SML algorithm offers significantly better performance than the AL-based and QDG algorithms across all sample sizes from 100 to 1,000, in terms of detection power and false discovery rate, in all the cases tested that include acyclic or cyclic networks of 10, 30 and 300 genes. The SML method is further applied to infer a network of 39 human genes that are related to the immune function and are chosen to have a reliable eQTL per gene. The resulting network consists of 9 genes and 13 edges. Most of the edges represent interactions reasonably expected from experimental evidence, while the remaining may just indicate the emergence of new interactions. The sparse SEM and efficient SML algorithm provide an effective means of exploiting both gene expression and perturbation data to infer gene regulatory networks. An open-source computer program implementing the SML algorithm is freely available upon request.

  3. NetworkAnalyst for statistical, visual and network-based meta-analysis of gene expression data.

    Science.gov (United States)

    Xia, Jianguo; Gill, Erin E; Hancock, Robert E W

    2015-06-01

    Meta-analysis of gene expression data sets is increasingly performed to help identify robust molecular signatures and to gain insights into underlying biological processes. The complicated nature of such analyses requires both advanced statistics and innovative visualization strategies to support efficient data comparison, interpretation and hypothesis generation. NetworkAnalyst (http://www.networkanalyst.ca) is a comprehensive web-based tool designed to allow bench researchers to perform various common and complex meta-analyses of gene expression data via an intuitive web interface. By coupling well-established statistical procedures with state-of-the-art data visualization techniques, NetworkAnalyst allows researchers to easily navigate large complex gene expression data sets to determine important features, patterns, functions and connections, thus leading to the generation of new biological hypotheses. This protocol provides a step-wise description of how to effectively use NetworkAnalyst to perform network analysis and visualization from gene lists; to perform meta-analysis on gene expression data while taking into account multiple metadata parameters; and, finally, to perform a meta-analysis of multiple gene expression data sets. NetworkAnalyst is designed to be accessible to biologists rather than to specialist bioinformaticians. The complete protocol can be executed in ∼1.5 h. Compared with other similar web-based tools, NetworkAnalyst offers a unique visual analytics experience that enables data analysis within the context of protein-protein interaction networks, heatmaps or chord diagrams. All of these analysis methods provide the user with supporting statistical and functional evidence.

  4. Gene expression network reconstruction by convex feature selection when incorporating genetic perturbations.

    Directory of Open Access Journals (Sweden)

    Benjamin A Logsdon

    Cellular gene expression measurements contain regulatory information that can be used to discover novel network relationships. Here, we present a new algorithm for network reconstruction powered by the adaptive lasso, a theoretically and empirically well-behaved method for selecting the regulatory features of a network. Any algorithms designed for network discovery that make use of directed probabilistic graphs require perturbations, produced by either experiments or naturally occurring genetic variation, to successfully infer unique regulatory relationships from gene expression data. Our approach makes use of appropriately selected cis-expression Quantitative Trait Loci (cis-eQTL), which provide a sufficient set of independent perturbations for maximum network resolution. We compare the performance of our network reconstruction algorithm to four other approaches: the PC-algorithm, QTLnet, the QDG algorithm, and the NEO algorithm, all of which have been used to reconstruct directed networks among phenotypes leveraging QTL. We show that the adaptive lasso can outperform these algorithms for networks of ten genes and ten cis-eQTL, and is competitive with the QDG algorithm for networks with thirty genes and thirty cis-eQTL, with rich topologies and hundreds of samples. Using this novel approach, we identify unique sets of directed relationships in Saccharomyces cerevisiae when analyzing genome-wide gene expression data for an intercross between a wild strain and a lab strain. We recover novel putative network relationships between a tyrosine biosynthesis gene (TYR1), and genes involved in endocytosis (RCY1), the spindle checkpoint (BUB2), sulfonate catabolism (JLP1), and cell-cell communication (PRM7). Our algorithm provides a synthesis of feature selection methods and graphical model theory that has the potential to reveal new directed regulatory relationships from the analysis of population level genetic and gene expression data.

  5. Massive-scale gene co-expression network construction and robustness testing using random matrix theory.

    Science.gov (United States)

    Gibson, Scott M; Ficklin, Stephen P; Isaacson, Sven; Luo, Feng; Feltus, Frank A; Smith, Melissa C

    2013-01-01

    The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. For such a test, hundreds of networks are needed and hence a tool to rapidly construct these networks. To examine functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested functional robustness of human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae). We demonstrate dramatic decrease in network construction time and computational requirements and show that despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust.

  6. Massive-scale gene co-expression network construction and robustness testing using random matrix theory.

    Directory of Open Access Journals (Sweden)

    Scott M Gibson

    The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. For such a test, hundreds of networks are needed and hence a tool to rapidly construct these networks. To examine functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested functional robustness of human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae). We demonstrate dramatic decrease in network construction time and computational requirements and show that despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust.

  7. Looking at the origin of phenotypic variation from pattern formation gene networks

    Indian Academy of Sciences (India)

    Isaac Salazar-Ciudad

    2009-10-01

    This article critically reviews some widespread views about the overall functioning of development. Special attention is devoted to views in developmental genetics about the superstructure of developmental gene networks. According to these views gene networks are hierarchic and multilayered. The highest layers partition the embryo in large coarse areas and control downstream genes that subsequently subdivide the embryo into smaller and smaller areas. These views are criticized on the bases of developmental and evolutionary arguments. First, these views, although detailed at the level of gene identities, do not incorporate morphogenetic mechanisms nor do they try to explain how morphology changes during development. Often, they assume that morphogenetic mechanisms are subordinate to cell signaling events. This is in contradiction to the evidence reviewed herein. Experimental evidence on pattern formation also contradicts the view that developmental gene networks are hierarchically multilayered and that their functioning is decodable from promoter analysis. Simple evolutionary arguments suggest that, indeed, developmental gene networks tend to be non-hierarchic. Re-use leads to extensive modularity in gene networks while developmental drift blurs this modularity. Evolutionary opportunism makes developmental gene networks very dependent on epigenetic factors.

  8. Hidden Markov induced Dynamic Bayesian Network for recovering time evolving gene regulatory networks.

    Science.gov (United States)

    Zhu, Shijia; Wang, Yadong

    2015-12-18

    Dynamic Bayesian Networks (DBN) have been widely used to recover gene regulatory relationships from time-series data in computational systems biology. Its standard assumption is 'stationarity', and therefore, several research efforts have been recently proposed to relax this restriction. However, those methods suffer from three challenges: long running time, low accuracy and reliance on parameter settings. To address these problems, we propose a novel non-stationary DBN model by extending each hidden node of Hidden Markov Model into a DBN (called HMDBN), which properly handles the underlying time-evolving networks. Correspondingly, an improved structural EM algorithm is proposed to learn the HMDBN. It dramatically reduces searching space, thereby substantially improving computational efficiency. Additionally, we derived a novel generalized Bayesian Information Criterion under the non-stationary assumption (called BWBIC), which can help significantly improve the reconstruction accuracy and largely reduce over-fitting. Moreover, the re-estimation formulas for all parameters of our model are derived, enabling us to avoid reliance on parameter settings. Compared to the state-of-the-art methods, the experimental evaluation of our proposed method on both synthetic and real biological data demonstrates more stably high prediction accuracy and significantly improved computation efficiency, even with no prior knowledge and parameter settings.

  9. Ancient Astronomy in Armenia

    Science.gov (United States)

    Parsamian, Elma S.

    2007-08-01

    The most important discovery, which enriched our knowledge of ancient astronomy in Armenia, was the complex of platforms for astronomical observations on the Small Hill of Metzamor, which may be called an ancient “observatory”. Investigations on that Hill show that the ancient inhabitants of the Armenian Highlands have left us not only pictures of celestial bodies, but a very ancient complex of platforms for observing the sky. Among the ancient monuments in Armenia there is a megalithic monument that is probably connected with astronomy. 250 km south-east of Yerevan there is a structure, Zorats Kar (Karahunge), dating back to the second millennium B.C. Vertical megaliths, many of which are more than two meters high, form stone rings resembling ancient stone monuments such as the henges in Great Britain and Brittany. Records of medieval observations of comets and novas are found in ancient Armenian manuscripts. In the collection of ancient Armenian manuscripts (Matenadaran) in Yerevan there are many manuscripts with information about observations of astronomical events in medieval Armenia, such as solar and lunar eclipses, comets and novas, bolides and meteorites.

  10. Pareto evolution of gene networks: an algorithm to optimize multiple fitness objectives.

    Science.gov (United States)

    Warmflash, Aryeh; Francois, Paul; Siggia, Eric D

    2012-10-01

    The computational evolution of gene networks functions like a forward genetic screen to generate, without preconceptions, all networks that can be assembled from a defined list of parts to implement a given function. Frequently networks are subject to multiple design criteria that cannot all be optimized simultaneously. To explore how these tradeoffs interact with evolution, we implement Pareto optimization in the context of gene network evolution. In response to a temporal pulse of a signal, we evolve networks whose output turns on slowly after the pulse begins, and shuts down rapidly when the pulse terminates. The best performing networks under our conditions do not fall into categories such as feed forward and negative feedback that also encode the input-output relation we used for selection. Pareto evolution can more efficiently search the space of networks than optimization based on a single ad hoc combination of the design criteria.
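
    The Pareto idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: it only shows how a set of candidate networks scored on two objectives is reduced to its non-dominated (Pareto) set. The score pairs and the names dominates and pareto_front are hypothetical, and lower scores are assumed to be better.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (lower is assumed to be better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Indices of the score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (objective 1, objective 2) scores for five candidate networks.
candidates = [(0.2, 0.9), (0.5, 0.5), (0.9, 0.1), (0.6, 0.6), (0.2, 1.0)]
front = pareto_front(candidates)
print(front)
```

    Keeping the whole front, rather than collapsing the objectives into one weighted sum, is what lets a search retain networks that trade one criterion against the other.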

  11. Digital Signal Processing and Control for the Study of Gene Networks

    Science.gov (United States)

    Shin, Yong-Jun

    2016-04-01

    Thanks to the digital revolution, digital signal processing and control has been widely used in many areas of science and engineering today. It provides practical and powerful tools to model, simulate, analyze, design, measure, and control complex and dynamic systems such as robots and aircrafts. Gene networks are also complex dynamic systems which can be studied via digital signal processing and control. Unlike conventional computational methods, this approach is capable of not only modeling but also controlling gene networks since the experimental environment is mostly digital today. The overall aim of this article is to introduce digital signal processing and control as a useful tool for the study of gene networks.

  12. On the Interplay between Entropy and Robustness of Gene Regulatory Networks

    Directory of Open Access Journals (Sweden)

    Bor-Sen Chen

    2010-05-01

    The interplay between entropy and robustness of gene networks is a core mechanism of systems biology. The entropy is a measure of randomness or disorder of a physical system due to random parameter fluctuation and environmental noises in gene regulatory networks. The robustness of a gene regulatory network, which can be measured as the ability to tolerate the random parameter fluctuation and to attenuate the effect of environmental noise, will be discussed from the robust H∞ stabilization and filtering perspective. In this review, we will also discuss their balancing roles in evolution and potential applications in systems and synthetic biology.

  13. Digital Signal Processing and Control for the Study of Gene Networks.

    Science.gov (United States)

    Shin, Yong-Jun

    2016-04-22

    Thanks to the digital revolution, digital signal processing and control has been widely used in many areas of science and engineering today. It provides practical and powerful tools to model, simulate, analyze, design, measure, and control complex and dynamic systems such as robots and aircrafts. Gene networks are also complex dynamic systems which can be studied via digital signal processing and control. Unlike conventional computational methods, this approach is capable of not only modeling but also controlling gene networks since the experimental environment is mostly digital today. The overall aim of this article is to introduce digital signal processing and control as a useful tool for the study of gene networks.

  14. Integrative analysis for finding genes and networks involved in diabetes and other complex diseases

    DEFF Research Database (Denmark)

    Bergholdt, R.; Størling, Zenia, Marian; Hansen, Kasper Lage;

    2007-01-01

    We have developed an integrative analysis method combining genetic interactions, identified using type 1 diabetes genome scan data, and a high-confidence human protein interaction network. Resulting networks were ranked by the significance of the enrichment of proteins from interacting regions. We identified a number of new protein network modules and novel candidate genes/proteins for type 1 diabetes. We propose this type of integrative analysis as a general method for the elucidation of genes and networks involved in diabetes and other complex diseases.

  15. Refining ensembles of predicted gene regulatory networks based on characteristic interaction sets.

    Directory of Open Access Journals (Sweden)

    Lukas Windhager

    Different ensemble voting approaches have been successfully applied for reverse-engineering of gene regulatory networks. They are based on the assumption that a good approximation of true network structure can be derived by considering the frequencies of individual interactions in a large number of predicted networks. Such approximations are typically superior in terms of prediction quality and robustness as compared to considering a single best scoring network only. Nevertheless, ensemble approaches only work well if the predicted gene regulatory networks are sufficiently similar to each other. If the topologies of predicted networks are considerably different, an ensemble of all networks obscures interesting individual characteristics. Instead, networks should be grouped according to local topological similarities and ensemble voting performed for each group separately. We argue that the presence of sets of co-occurring interactions is a suitable indicator for grouping predicted networks. A stepwise bottom-up procedure is proposed, where first mutual dependencies between pairs of interactions are derived from predicted networks. Pairs of co-occurring interactions are subsequently extended to derive characteristic interaction sets that distinguish groups of networks. Finally, ensemble voting is applied separately to the resulting topologically similar groups of networks to create distinct group-ensembles. Ensembles of topologically similar networks constitute distinct hypotheses about the reference network structure. Such group-ensembles are easier to interpret as their characteristic topology becomes clear and dependencies between interactions are known. The availability of distinct hypotheses facilitates the design of further experiments to distinguish between plausible network structures. The proposed procedure is a reasonable refinement step for non-deterministic reverse-engineering applications that produce a large number of candidate

  16. Reconstruction of large-scale gene regulatory networks using Bayesian model averaging.

    Science.gov (United States)

    Kim, Haseong; Gelenbe, Erol

    2012-09-01

    Gene regulatory networks provide the systematic view of molecular interactions in a complex living system. However, constructing large-scale gene regulatory networks is one of the most challenging problems in systems biology. Also large burst sets of biological data require a proper integration technique for reliable gene regulatory network construction. Here we present a new reverse engineering approach based on Bayesian model averaging which attempts to combine all the appropriate models describing interactions among genes. This Bayesian approach with a prior based on the Gibbs distribution provides an efficient means to integrate multiple sources of biological data. In a simulation study with a maximum of 2000 genes, our method shows better sensitivity than previous elastic-net and Gaussian graphical models, with a fixed specificity of 0.99. The study also shows that the proposed method outperforms the other standard methods for a DREAM dataset generated by nonlinear stochastic models. In brain tumor data analysis, three large-scale networks consisting of 4422 genes were built using the gene expression of non-tumor, low and high grade tumor mRNA expression samples, along with DNA-protein binding affinity information. We found that genes having a large variation of degree distribution among the three tumor networks are the ones that seem most involved in regulatory and developmental processes, which possibly gives a novel insight concerning conventional differentially expressed gene analysis.

  17. Predicting Variabilities in Cardiac Gene Expression with a Boolean Network Incorporating Uncertainty.

    Science.gov (United States)

    Grieb, Melanie; Burkovski, Andre; Sträng, J Eric; Kraus, Johann M; Groß, Alexander; Palm, Günther; Kühl, Michael; Kestler, Hans A

    2015-01-01

    Gene interactions in cells can be represented by gene regulatory networks. A Boolean network models gene interactions according to rules where gene expression is represented by binary values (on / off or {1, 0}). In reality, however, the gene's state can have multiple values due to biological properties. Furthermore, the noisy nature of the experimental design results in uncertainty about a state of the gene. Here we present a new Boolean network paradigm to allow intermediate values on the interval [0, 1]. As in the Boolean network, fixed points or attractors of such a model correspond to biological phenotypes or states. We use our new extension of the Boolean network paradigm to model gene expression in first and second heart field lineages which are cardiac progenitor cell populations involved in early vertebrate heart development. By this we are able to predict additional biological phenotypes that the Boolean model alone is not able to identify without utilizing additional biological knowledge. The additional phenotypes predicted by the model were confirmed by published biological experiments. Furthermore, the new method predicts gene expression propensities for modelled but yet to be analyzed genes.
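
    As background for the abstract above, the following minimal sketch shows the plain Boolean baseline only: a synchronous Boolean network and an exhaustive search for its attractors (fixed points and cycles). It does not implement the authors' extension to intermediate values on [0, 1]. The three genes A, B, C and their update rules are invented for illustration and have no biological meaning.

```python
from itertools import product

# Hypothetical update rules: A and B repress each other, C follows A.
RULES = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
    "C": lambda s: s["A"],
}
GENES = sorted(RULES)


def step(state):
    """Apply every rule at once (synchronous update) to a tuple of booleans."""
    s = dict(zip(GENES, state))
    return tuple(bool(RULES[g](s)) for g in GENES)


def attractor(state):
    """Iterate from `state` until a state repeats; return the cycle reached,
    rotated so that it starts at its smallest state."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    cycle = seen[seen.index(state):]
    k = cycle.index(min(cycle))
    return tuple(cycle[k:] + cycle[:k])


# Enumerate all 2**3 start states and collect the distinct attractors.
attractors = {attractor(s) for s in product([False, True], repeat=len(GENES))}
fixed_points = sorted(a[0] for a in attractors if len(a) == 1)
print(fixed_points)
```

    In this toy network the two fixed points play the role that the abstract assigns to phenotypes; the remaining attractor is a two-state oscillation that arises from the synchronous update scheme.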

  18. Understanding the Role of Housekeeping and Stress-Related Genes in Transcription-Regulatory Networks

    Science.gov (United States)

    Heath, Allison; Kavraki, Lydia; Balázsi, Gábor

    2008-03-01

    Despite the increasing number of completely sequenced genomes, much remains to be learned about how living cells process environmental information and respond to changes in their surroundings. Accumulating evidence indicates that eukaryotic and prokaryotic genes can be classified in two distinct categories that we will call class I and class II. Class I genes are housekeeping genes, often characterized by stable, noise resistant expression levels. In contrast, class II genes are stress-related genes and often have noisy, unstable expression levels. In this work we analyze the large scale transcription-regulatory networks (TRN) of E. coli and S. cerevisiae and preliminary data on H. sapiens. We find that stable, housekeeping genes (class I) are preferentially utilized as transcriptional inputs while stress related, unstable genes (class II) are utilized as transcriptional integrators. This might be the result of convergent evolution that placed the appropriate genes in the appropriate locations within transcriptional networks according to some fundamental principles that govern cellular information processing.

  19. Reconstructing Generalized Logical Networks of Transcriptional Regulation in Mouse Brain from Temporal Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Song, Mingzhou (Joe) [New Mexico State University, Las Cruces; Lewis, Chris K. [New Mexico State University, Las Cruces; Lance, Eric [New Mexico State University, Las Cruces; Chesler, Elissa J [ORNL; Kirova, Roumyana [Bristol-Myers Squibb Pharmaceutical Research & Development, NJ; Langston, Michael A [University of Tennessee, Knoxville (UTK); Bergeson, Susan [Texas Tech University, Lubbock

    2009-01-01

    The problem of reconstructing generalized logical networks to account for temporal dependencies among genes and environmental stimuli from high-throughput transcriptomic data is addressed. A network reconstruction algorithm was developed that uses the statistical significance as a criterion for network selection to avoid false-positive interactions arising from pure chance. Using temporal gene expression data collected from the brains of alcohol-treated mice in an analysis of the molecular response to alcohol, this algorithm identified genes from a major neuronal pathway as putative components of the alcohol response mechanism. Three of these genes have known associations with alcohol in the literature. Several other potentially relevant genes, highlighted and agreeing with independent results from literature mining, may play a role in the response to alcohol. Additional, previously-unknown gene interactions were discovered that, subject to biological verification, may offer new clues in the search for the elusive molecular mechanisms of alcoholism.

  20. Genes and Gene Networks Involved in Sodium Fluoride-Elicited Cell Death Accompanying Endoplasmic Reticulum Stress in Oral Epithelial Cells

    Directory of Open Access Journals (Sweden)

    Yoshiaki Tabuchi

    2014-05-01

    Here, to understand the molecular mechanisms underlying cell death induced by sodium fluoride (NaF), we analyzed gene expression patterns in rat oral epithelial ROE2 cells exposed to NaF using global-scale microarrays and bioinformatics tools. A relatively high concentration of NaF (2 mM) induced cell death concomitant with decreases in mitochondrial membrane potential, chromatin condensation and caspase-3 activation. Using 980 probe sets, we identified 432 up-regulated and 548 down-regulated genes that were differentially expressed by >2.5-fold in the cells treated with 2 mM of NaF and categorized them into 4 groups by K-means clustering. Ingenuity® pathway analysis revealed several gene networks from gene clusters. The gene networks Up-I and Up-II included many up-regulated genes that were mainly associated with the biological function of induction or prevention of cell death, respectively, such as Atf3, Ddit3 and Fos (for Up-I) and Atf4 and Hspa5 (for Up-II). Interestingly, knockdown of Ddit3 and Hspa5 significantly increased and decreased the number of viable cells, respectively. Moreover, several endoplasmic reticulum (ER) stress-related genes, including Ddit3, Atf4 and Hspa5, were observed in these gene networks. These findings will provide further insight into the molecular mechanisms of NaF-induced cell death accompanying ER stress in oral epithelial cells.

  1. A provisional regulatory gene network for specification of endomesoderm in the sea urchin embryo

    Science.gov (United States)

    Davidson, Eric H.; Rast, Jonathan P.; Oliveri, Paola; Ransick, Andrew; Calestani, Cristina; Yuh, Chiou-Hwa; Minokawa, Takuya; Amore, Gabriele; Hinman, Veronica; Arenas-Mena, Cesar; Otim, Ochan; Brown, C. Titus; Livi, Carolina B.; Lee, Pei Yun; Revilla, Roger; Schilstra, Maria J.; Clarke, Peter J C.; Rust, Alistair G.; Pan, Zhengjun; Arnone, Maria I.; Rowen, Lee; Cameron, R. Andrew; McClay, David R.; Hood, Leroy; Bolouri, Hamid

    2002-01-01

    We present the current form of a provisional DNA sequence-based regulatory gene network that explains in outline how endomesodermal specification in the sea urchin embryo is controlled. The model of the network is in a continuous process of revision and growth as new genes are added and new experimental results become available; see http://www.its.caltech.edu/mirsky/endomeso.htm (End-mes Gene Network Update) for the latest version. The network contains over 40 genes at present, many newly uncovered in the course of this work, and most encoding DNA-binding transcriptional regulatory factors. The architecture of the network was approached initially by construction of a logic model that integrated the extensive experimental evidence now available on endomesoderm specification. The internal linkages between genes in the network have been determined functionally, by measurement of the effects of regulatory perturbations on the expression of all relevant genes in the network. Five kinds of perturbation have been applied: (1) use of morpholino antisense oligonucleotides targeted to many of the key regulatory genes in the network; (2) transformation of other regulatory factors into dominant repressors by construction of Engrailed repressor domain fusions; (3) ectopic expression of given regulatory factors, from genetic expression constructs and from injected mRNAs; (4) blockade of the beta-catenin/Tcf pathway by introduction of mRNA encoding the intracellular domain of cadherin; and (5) blockade of the Notch signaling pathway by introduction of mRNA encoding the extracellular domain of the Notch receptor. The network model predicts the cis-regulatory inputs that link each gene into the network. Therefore, its architecture is testable by cis-regulatory analysis. Strongylocentrotus purpuratus and Lytechinus variegatus genomic BAC recombinants that include a large number of the genes in the network have been sequenced and annotated. 
Tests of the cis-regulatory predictions of

  2. Network statistics of genetically-driven gene co-expression modules in mouse crosses

    Directory of Open Access Journals (Sweden)

    Marie-Pier Scott-Boyer

    2013-12-01

    In biology, networks are used in different contexts to represent relationships between entities, such as interactions between genes, proteins or metabolites. Despite progress in the analysis of such networks and their potential to improve understanding of the collective impact of genes on complex traits, one remaining challenge is to establish the biological validity of gene co-expression networks and to determine what governs their organization. We used WGCNA to construct and analyze networks from seven gene expression datasets from several tissues of mouse recombinant inbred strains (RIS). For six of the seven networks, we found that linkage to module QTLs (mQTLs) could be established for 29.3% of the gene co-expression modules detected in the mouse RIS. For about 74.6% of such genetically linked modules, the mQTL was on the same chromosome as the one contributing most genes to the module, with genes originating from that chromosome showing higher connectivity than other genes in the modules. Such modules (which we considered genetically driven) had network statistic properties (density, centralization and heterogeneity) that set them apart from other modules in the network. Altogether, a sizeable portion of gene co-expression modules detected in mouse RIS panels had genetic determinants as their main organizing principle. In addition to providing a biological interpretation and validation of these modules, these genetic determinants imparted particular properties that set them apart from other modules in the network, to the point that they can be predicted to a large extent on the basis of their network statistics.
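The three network statistics named in this abstract (density, centralization and heterogeneity) can be illustrated with a short sketch. The formulas below are the commonly used definitions for weighted networks; the abstract does not spell them out, so treating them as the ones used in the study is an assumption, and the function name `network_statistics` is hypothetical.

```python
from math import sqrt


def network_statistics(adj):
    """Return (density, centralization, heterogeneity) for a symmetric
    adjacency matrix given as a list of lists; the diagonal is ignored.

    Assumed (commonly used) definitions, not taken from the abstract:
      connectivity k_i = sum of adjacencies of node i to all other nodes
      density          = mean off-diagonal adjacency
      centralization   = n/(n-2) * (max(k)/(n-1) - density)
      heterogeneity    = standard deviation of k divided by mean of k
    """
    n = len(adj)
    if n < 3:
        raise ValueError("need at least 3 nodes")
    k = [sum(adj[i][j] for j in range(n) if j != i) for i in range(n)]
    density = sum(k) / (n * (n - 1))
    centralization = n / (n - 2) * (max(k) / (n - 1) - density)
    mean_k = sum(k) / n
    if mean_k == 0:
        # A network without edges has no spread in connectivity.
        return density, centralization, 0.0
    var_k = sum((x - mean_k) ** 2 for x in k) / n
    return density, centralization, sqrt(var_k) / mean_k


# Toy module: one hub gene (index 0) connected to three other genes.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
star_stats = network_statistics(star)
```

A hub-dominated module like the toy star above scores high on centralization and heterogeneity, whereas a fully connected module scores zero on both, which is the kind of contrast such statistics are meant to capture.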

  3. Experimental design for parameter estimation of gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Bernhard Steiert

    Systems biology aims to build quantitative models to address unresolved issues in molecular biology. In order to describe the behavior of biological cells adequately, gene regulatory networks (GRNs) are intensively investigated. As the validity of models built for GRNs depends crucially on the kinetic rates, various methods have been developed to estimate these parameters from experimental data. For this purpose, it is favorable to choose the experimental conditions yielding maximal information. However, existing experimental design principles often rely on unfulfilled mathematical assumptions or become computationally demanding with growing model complexity. To solve this problem, we combined advanced methods for parameter and uncertainty estimation with experimental design considerations. As a showcase, we optimized three simulated GRNs in one of the challenges from the Dialogue for Reverse Engineering Assessment and Methods (DREAM). This article presents our approach, which was awarded the best performing procedure at the DREAM6 Estimation of Model Parameters challenge. For fast and reliable parameter estimation, local deterministic optimization of the likelihood was applied. We analyzed identifiability and precision of the estimates by calculating the profile likelihood. Furthermore, the profiles provided a way to uncover a selection of most informative experiments, from which the optimal one was chosen using additional criteria at every step of the design process. In conclusion, we provide a strategy for optimal experimental design and show its successful application on three highly nonlinear dynamic models. Although presented in the context of the GRNs to be inferred for the DREAM6 challenge, the approach is generic and applicable to most types of quantitative models in systems biology and other disciplines.

  4. The Max-Min High-Order Dynamic Bayesian Network for Learning Gene Regulatory Networks with Time-Delayed Regulations.

    Science.gov (United States)

    Li, Yifeng; Chen, Haifen; Zheng, Jie; Ngom, Alioune

    2016-01-01

    Accurately reconstructing a gene regulatory network (GRN) from gene expression data is a challenging task in systems biology. Although some progress has been made, the performance of GRN reconstruction still has much room for improvement. Because many regulatory events are asynchronous, learning gene interactions with multiple time delays is an effective way to improve the accuracy of GRN reconstruction. Here, we propose a new approach, called Max-Min high-order dynamic Bayesian network (MMHO-DBN), by extending the Max-Min hill-climbing Bayesian network technique originally devised for learning a Bayesian network's structure from static data. Our MMHO-DBN can explicitly model the time lags between regulators and targets in an efficient manner. It first uses constraint-based ideas to limit the space of potential structures, and then applies search-and-score ideas to search for an optimal HO-DBN structure. The performance of MMHO-DBN in GRN reconstruction was evaluated using both synthetic and real gene expression time-series data. Results show that MMHO-DBN is more accurate than current time-delayed GRN learning methods, and has an intermediate computing performance. Furthermore, it is able to learn long time-delayed relationships between genes. We applied sensitivity analysis to our model to study how performance varies across different parameter settings. The result provides hints on the setting of parameters of MMHO-DBN.

  5. Analysis of regulatory networks constructed based on gene coexpression in pituitary adenoma

    Indian Academy of Sciences (India)

    Jie Gong; Bo Diao; Guo Jie Yao; Ying Liu; Guo Zheng Xu

    2013-12-01

    Gene coexpression patterns can reveal gene collections with functional consistency. This study systematically constructs regulatory networks for pituitary tumours by integrating gene coexpression, transcriptional and posttranscriptional regulation. Through network analysis, we elaborate the incidence mechanism of pituitary adenoma. The Pearson’s correlation coefficient was utilized to calculate the level of gene coexpression. By comparing pituitary adenoma samples with normal samples, pituitary adenoma-specific gene coexpression patterns were identified. For pituitary adenoma-specific coexpressed genes, we integrated transcription factor (TF) and microRNA (miRNA) regulation to construct a complex regulatory network from the transcriptional and posttranscriptional perspectives. Network module analysis identified the synergistic regulation of genes by miRNAs and TFs in pituitary adenoma. We identified 142 pituitary adenoma-specific active genes, including 43 TFs and 99 target genes of TFs. Functional enrichment of these 142 genes revealed that the occurrence of pituitary adenoma induced abnormalities in intracellular metabolism and angiogenesis process. These 142 genes were also significantly enriched in adenoma pathway. Module analysis of the systematic regulatory network found that three modules contained elements that were closely related to pituitary adenoma, such as FGF2 and SP1, as well as transcription factors and miRNAs involved in the tumourigenesis. These results show that in the occurrence of pituitary adenoma, miRNA, TF and genes interact with each other. Based on gene expression, the proposed method integrates interaction information from different levels and systematically explains the occurrence of pituitary tumours. It facilitates the tracing of the origin of the disease and can provide basis for early diagnosis of complex diseases or cancer without obvious symptoms.
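The coexpression step described in this abstract (Pearson's correlation per gene pair, then comparison of tumour against normal samples) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the cut-off on the absolute correlation, the function names and the toy expression values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)


def coexpressed_pairs(expr, threshold):
    """Gene pairs whose absolute correlation reaches the threshold.

    expr maps a gene name to its expression values across samples.
    """
    genes = sorted(expr)
    pairs = set()
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= threshold:
                pairs.add((g, h))
    return pairs


def disease_specific_pairs(tumour, normal, threshold=0.9):
    """Pairs coexpressed in tumour samples but not in normal samples.

    The 0.9 default is an assumption made for this sketch.
    """
    return coexpressed_pairs(tumour, threshold) - coexpressed_pairs(normal, threshold)


# Made-up toy data: genes A and B move together only in the tumour samples.
tumour = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 1, 3, 2]}
normal = {"A": [1, 2, 3, 4], "B": [3, 1, 4, 2], "C": [4, 1, 3, 2]}
specific = disease_specific_pairs(tumour, normal)
```

The set difference keeps only pairs that pass the cut-off in one condition, which is one simple way to read "adenoma-specific coexpression"; a real analysis would also need a significance test and multiple-testing correction.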

  6. Analysis of regulatory networks constructed based on gene coexpression in pituitary adenoma.

    Science.gov (United States)

    Gong, Jie; Diao, Bo; Yao, Guo Jie; Liu, Ying; Xu, Guo Zheng

    2013-12-01

    Gene coexpression patterns can reveal gene collections with functional consistency. This study systematically constructs regulatory networks for pituitary tumours by integrating gene coexpression, transcriptional and posttranscriptional regulation. Through network analysis, we elaborate the incidence mechanism of pituitary adenoma. The Pearson's correlation coefficient was utilized to calculate the level of gene coexpression. By comparing pituitary adenoma samples with normal samples, pituitary adenoma-specific gene coexpression patterns were identified. For pituitary adenoma-specific coexpressed genes, we integrated transcription factor (TF) and microRNA (miRNA) regulation to construct a complex regulatory network from the transcriptional and posttranscriptional perspectives. Network module analysis identified the synergistic regulation of genes by miRNAs and TFs in pituitary adenoma. We identified 142 pituitary adenoma-specific active genes, including 43 TFs and 99 target genes of TFs. Functional enrichment of these 142 genes revealed that the occurrence of pituitary adenoma induced abnormalities in intracellular metabolism and angiogenesis process. These 142 genes were also significantly enriched in adenoma pathway. Module analysis of the systematic regulatory network found that three modules contained elements that were closely related to pituitary adenoma, such as FGF2 and SP1, as well as transcription factors and miRNAs involved in the tumourigenesis. These results show that in the occurrence of pituitary adenoma, miRNA, TF and genes interact with each other. Based on gene expression, the proposed method integrates interaction information from different levels and systematically explains the occurrence of pituitary tumours. It facilitates the tracing of the origin of the disease and can provide basis for early diagnosis of complex diseases or cancer without obvious symptoms.

  7. Transcriptomic network analysis of micronuclei-related genes: a case study

    DEFF Research Database (Denmark)

    van Leeuwen, D. M.; Pedersen, Marie; Knudsen, Lisbeth E.;

    2011-01-01

    Mechanistically relevant information on responses of humans to xenobiotic exposure in relation to chemically induced biological effects, such as micronuclei (MN) formation can be obtained through large-scale transcriptomics studies. Network analysis may enhance the analysis and visualisation of such data. Therefore, this study aimed to develop a 'MN formation' network based on a priori knowledge, by using the pathway tool MetaCore. The gene network contained 27 genes and three gene complexes that are related to processes involved in MN formation, e.g. spindle assembly checkpoint, cell cycle checkpoint and aneuploidy. The MN-related gene network was tested against a transcriptomics case study associated with MN measurements. In this case study, transcriptomic data from children and adults differentially exposed to ambient air pollution in the Czech Republic were analysed and visualised...

  8. New insight into genes in association with asthma: literature-based mining and network centrality analysis

    Institute of Scientific and Technical Information of China (English)

    LIANG Rui; WANG Lei; WANG Gang

    2013-01-01

    Background Asthma is a heterogeneous disease for which a strong genetic basis has been firmly established. Until now no studies have been undertaken to systemically explore the network of asthma-related genes using an internally developed literature-based discovery approach. This study was to explore asthma-related genes by using literature-based mining and network centrality analysis. Methods Literature involving asthma-related genes were searched in PubMed from 2001 to 2011. Integration of natural language processing with network centrality analysis was used to identify asthma susceptibility genes and their interaction network. Asthma susceptibility genes were classified into three functional groups by gene ontology (GO) analysis and the key genes were confirmed by establishing asthma-related networks and pathways. Results Three hundred and twenty-six genes related with asthma such as IGHE (IgE), interleukin (IL)-4, 5, 6, 10, 13, 17A, and tumor necrosis factor (TNF)-alpha were identified. GO analysis indicated some biological processes (developmental processes, signal transduction, death, etc.), cellular components (non-structural extracellular, plasma membrane and extracellular matrix), and molecular functions (signal transduction activity) that were involved in asthma. Furthermore, 22 asthma-related pathways such as the Toll-like receptor signaling pathway, hematopoietic cell lineage, JAK-STAT signaling pathway, chemokine signaling pathway, and cytokine-cytokine receptor interaction, and 17 hub genes, such as JAK3, CCR1-3, CCR5-7, CCR8, were found. Conclusions Our study provides a remarkably detailed and comprehensive picture of asthma susceptibility genes and their interacting network. Further identification of these genes and molecular pathways may play a prominent role in establishing rational therapeutic approaches for asthma.

  9. Neural network predicts sequence of TP53 gene based on DNA chip

    DEFF Research Database (Denmark)

    Spicker, J.S.; Wikman, F.; Lu, M.L.;

    2002-01-01

    We have trained an artificial neural network to predict the sequence of the human TP53 tumor suppressor gene based on a p53 GeneChip. The trained neural network uses as input the fluorescence intensities of DNA hybridized to oligonucleotides on the surface of the chip and makes between zero and four errors in the predicted 1300 bp sequence when tested on wild-type TP53 sequence.

  10. Effective Boolean dynamics analysis to identify functionally important genes in large-scale signaling networks.

    Science.gov (United States)

    Trinh, Hung-Cuong; Kwon, Yung-Keun

    2015-11-01

    Efficiently identifying functionally important genes in order to understand the minimal requirements of normal cellular development is challenging. To this end, a variety of structural measures have been proposed and their effectiveness has been investigated in recent literature; however, few studies have shown the effectiveness of dynamics-based measures. This led us to investigate a dynamic measure for identifying functionally important genes, whose effectiveness was verified through application to two large-scale human signaling networks. We specifically consider Boolean sensitivity-based dynamics against an update-rule perturbation (BSU) as a dynamic measure. Through investigations on two large-scale human signaling networks, we found that genes with relatively high BSU values show a slower evolutionary rate and higher proportions of essential genes and drug targets than other genes. Gene-ontology analysis showed clear differences between the former and latter groups of genes. Furthermore, we compare the identification accuracies of essential genes and drug targets via BSU and five well-known structural measures. Although BSU did not always show the best performance, it effectively identified a putative set of genes that is significantly different from the results obtained via the structural measures. Most interestingly, BSU showed the highest synergy effect in identifying the functionally important genes in conjunction with other measures. Our results imply that Boolean-sensitive dynamics can be used as a measure to effectively identify functionally important genes in signaling networks.
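The idea of measuring sensitivity to an update-rule perturbation can be illustrated on a toy Boolean network. The abstract does not define BSU precisely, so the score below (the fraction of initial states whose attractor changes when one gene's update rule is replaced) is a simplified stand-in, not the authors' measure. The synchronous update scheme, the function names and the three-gene network are all assumptions made for this sketch.

```python
from itertools import product


def step(state, rules):
    """Synchronously apply every gene's update rule to the current state."""
    return tuple(rule(state) for rule in rules)


def attractor(state, rules):
    """Follow the trajectory from `state` until it repeats and return the
    set of states on the resulting cycle (a fixed point is a 1-cycle)."""
    seen = {}
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, rules)
    return frozenset(trajectory[seen[state]:])


def update_rule_sensitivity(rules, node, perturbed_rule):
    """Fraction of initial states whose attractor differs once the update
    rule of `node` is replaced by `perturbed_rule` (hypothetical score)."""
    perturbed = list(rules)
    perturbed[node] = perturbed_rule
    states = list(product((False, True), repeat=len(rules)))
    changed = sum(attractor(s, rules) != attractor(s, perturbed) for s in states)
    return changed / len(states)


# Toy network: gene 0 sustains itself, gene 1 copies gene 0,
# gene 2 is active only when genes 0 and 1 are both active.
rules = [
    lambda s: s[0],
    lambda s: s[0],
    lambda s: s[1] and s[0],
]

# Forcing gene 1 permanently on changes the attractor for half of the states.
score_gene1 = update_rule_sensitivity(rules, 1, lambda s: True)
# Swapping AND for OR in gene 2 leaves every attractor unchanged here.
score_gene2 = update_rule_sensitivity(rules, 2, lambda s: s[1] or s[0])
```

Exhaustive enumeration of initial states only works for very small networks; for networks of the size mentioned in the abstract, states would have to be sampled instead.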

  11. NetDiff - Bayesian model selection for differential gene regulatory network inference.

    Science.gov (United States)

    Thorne, Thomas

    2016-12-16

    Differential networks allow us to better understand the changes in cellular processes that are exhibited in conditions of interest, identifying variations in gene regulation or protein interaction between, for example, cases and controls, or in response to external stimuli. Here we present a novel methodology for the inference of differential gene regulatory networks from gene expression microarray data. Specifically we apply a Bayesian model selection approach to compare models of conserved and varying network structure, and use Gaussian graphical models to represent the network structures. We apply a variational inference approach to the learning of Gaussian graphical models of gene regulatory networks, that enables us to perform Bayesian model selection that is significantly more computationally efficient than Markov Chain Monte Carlo approaches. Our method is demonstrated to be more robust than independent analysis of data from multiple conditions when applied to synthetic network data, generating fewer false positive predictions of differential edges. We demonstrate the utility of our approach on real world gene expression microarray data by applying it to existing data from amyotrophic lateral sclerosis cases with and without mutations in C9orf72, and controls, where we are able to identify differential network interactions for further investigation.

  12. Discovering hidden relationships between renal diseases and regulated genes through 3D network visualizations

    Directory of Open Access Journals (Sweden)

    Bhavnani Suresh K

    2010-11-01

    Background: In a recent study, two-dimensional (2D) network layouts were used to visualize and quantitatively analyze the relationship between chronic renal diseases and regulated genes. The results revealed complex relationships between disease type, gene specificity, and gene regulation type, which led to important insights about the underlying biological pathways. Here we describe an attempt to extend our understanding of these complex relationships by reanalyzing the data using three-dimensional (3D) network layouts, displayed through 2D and 3D viewing methods. Findings: The 3D network layout (displayed through the 3D viewing method) revealed that genes implicated in many diseases (non-specific genes) tended to be predominantly down-regulated, whereas genes regulated in a few diseases (disease-specific genes) tended to be up-regulated. This new global relationship was quantitatively validated through comparison to 1000 random permutations of networks of the same size and distribution. Our new finding appeared to be the result of using specific features of the 3D viewing method to analyze the 3D renal network. Conclusions: The global relationship between gene regulation and gene specificity is the first clue from human studies that there exist common mechanisms across several renal diseases, which suggest hypotheses for the underlying mechanisms. Furthermore, the study suggests hypotheses for why the 3D visualization helped to make salient a new regularity that was difficult to detect in 2D. Future research that tests these hypotheses should enable a more systematic understanding of when and how to use 3D network visualizations to reveal complex regularities in biological networks.

  13. Interactogeneous: disease gene prioritization using heterogeneous networks and full topology scores.

    Directory of Open Access Journals (Sweden)

    Joana P Gonçalves

    Disease gene prioritization aims to suggest potential implications of genes in disease susceptibility. Often accomplished in a guilt-by-association scheme, promising candidates are sorted according to their relatedness to known disease genes. Network-based methods have been successfully exploiting this concept by capturing the interaction of genes or proteins into a score. Nonetheless, most current approaches suffer from at least some of the following limitations: (1) networks comprise only curated physical interactions, leading to poor genome coverage and density, and bias toward a particular source; (2) scores focus on adjacencies (direct links) or the most direct paths (shortest paths) within a constrained neighborhood around the disease genes, ignoring potentially informative indirect paths; (3) global clustering is widely applied to partition the network in an unsupervised manner, attributing little importance to prior knowledge; (4) confidence weights and their contribution to edge differentiation and ranking reliability are often disregarded. We hypothesize that network-based prioritization related to local clustering on graphs and considering full topology of weighted gene association networks integrating heterogeneous sources should overcome the above challenges. We term such a strategy Interactogeneous. We conducted cross-validation tests to assess the impact of network sources, alternative path inclusion and confidence weights on the prioritization of putative genes for 29 diseases. Heat diffusion ranking proved the best prioritization method overall, increasing the gap to neighborhood and shortest paths scores mostly on single source networks. Heterogeneous associations consistently delivered superior performance over single source data across the majority of methods. Results on the contribution of confidence weights were inconclusive. Finally, the best Interactogeneous strategy, heat diffusion ranking and associations from the STRING database

  14. Long-term oil contamination alters the molecular ecological networks of soil microbial functional genes

    Directory of Open Access Journals (Sweden)

    Yuting Liang

    2016-02-01

    With knowledge on microbial composition and diversity, investigation of within-community interactions is a further step to elucidate microbial ecological functions, such as the biodegradation of hazardous contaminants. In this work, microbial functional molecular ecological networks were studied in both contaminated and uncontaminated soils to determine the possible influences of oil contamination on microbial interactions and potential functions. Soil samples were obtained from an oil-exploring site located in South China, and the microbial functional genes were analyzed with GeoChip, a high-throughput functional microarray. By building random networks based on null model, we demonstrated that overall network structures and properties were significantly different between contaminated and uncontaminated soils (P < 0.001). Network connectivity, module numbers, and modularity were all reduced with contamination. Moreover, the topological roles of the genes (module hub and connectors) were altered with oil contamination. Subnetworks of genes involved in alkane and polycyclic aromatic hydrocarbon degradation were also constructed. Negative co-occurrence patterns prevailed among functional genes, thereby indicating probable competition relationships. The potential keystone genes, defined as either hubs or genes with highest connectivities in the network, were further identified. The network constructed in this study predicted the potential effects of anthropogenic contamination on microbial community co-occurrence interactions.

  15. Intracompartmental and intercompartmental transcriptional networks coordinate the expression of genes for organellar functions.

    Science.gov (United States)

    Leister, Dario; Wang, Xi; Haberer, Georg; Mayer, Klaus F X; Kleine, Tatjana

    2011-09-01

    Genes for mitochondrial and chloroplast proteins are distributed between the nuclear and organellar genomes. Organelle biogenesis and metabolism, therefore, require appropriate coordination of gene expression in the different compartments to ensure efficient synthesis of essential multiprotein complexes of mixed genetic origin. Whereas organelle-to-nucleus signaling influences nuclear gene expression at the transcriptional level, organellar gene expression (OGE) is thought to be primarily regulated posttranscriptionally. Here, we show that intracompartmental and intercompartmental transcriptional networks coordinate the expression of genes for organellar functions. Nearly 1,300 ATH1 microarray-based transcriptional profiles of nuclear and organellar genes for mitochondrial and chloroplast proteins in the model plant Arabidopsis (Arabidopsis thaliana) were analyzed. The activity of genes involved in organellar energy production (OEP) or OGE in each of the organelles and in the nucleus is highly coordinated. Intracompartmental networks that link the OEP and OGE gene sets serve to synchronize the expression of nucleus- and organelle-encoded proteins. At a higher regulatory level, coexpression of organellar and nuclear OEP/OGE genes typically modulates chloroplast functions but affects mitochondria only when chloroplast functions are perturbed. Under conditions that induce energy shortage, the intercompartmental coregulation of photosynthesis genes can even override intracompartmental networks. We conclude that dynamic intracompartmental and intercompartmental transcriptional networks for OEP and OGE genes adjust the activity of organelles in response to the cellular energy state and environmental stresses, and we identify candidate cis-elements involved in the transcriptional coregulation of nuclear genes. Regarding the transcriptional regulation of chloroplast genes, novel tentative target genes of σ factors are identified.

  16. Prediction of key genes in ovarian cancer treated with decitabine based on network strategy.

    Science.gov (United States)

    Wang, Yu-Zhen; Qiu, Sheng-Chun

    2016-06-01

    The objective of the present study was to predict key genes in ovarian cancer before and after treatment with decitabine utilizing a network approach and to reveal the molecular mechanism. Pathogenic networks of ovarian cancer before and after treatment were identified based on known pathogenic genes (seed genes) and differentially expressed genes (DEGs) detected by the Significance Analysis of Microarrays (SAM) method. A weight was assigned to each gene in the pathogenic network and then candidate genes were evaluated. Topological properties (degree, betweenness, closeness and stress) of candidate genes were analyzed to identify higher-confidence pathogenic genes. Pathway enrichment analysis for candidate and seed genes was conducted. Validation of candidate gene expression in ovarian cancer was performed by reverse transcriptase-polymerase chain reaction (RT-PCR) assays. There were 73 nodes and 147 interactions in the pathogenic network before treatment, and 47 nodes and 66 interactions after treatment. A total of 32 candidate genes were identified in ovarian cancer before treatment, of which 16 were still candidate genes after treatment and the others were silenced. We obtained 5 key genes (PIK3R2, CCNB1, IL2, IL1B and CDC6) for decitabine treatment that were validated by RT-PCR. In conclusion, we successfully identified and validated these 5 key genes, which provide insight into the molecular mechanisms of decitabine treatment and may be potential pathogenic biomarkers for the therapy of ovarian cancer.

  17. PyPanda: a Python package for gene regulatory network reconstruction

    OpenAIRE

    van IJzendoorn, David G. P.; Glass, Kimberly; Quackenbush, John; Kuijjer, Marieke L

    2016-01-01

    Summary: PANDA (Passing Attributes between Networks for Data Assimilation) is a gene regulatory network inference method that uses message-passing to integrate multiple sources of 'omics data. PANDA was originally coded in C++. In this application note we describe PyPanda, the Python version of PANDA. PyPanda runs considerably faster than the C++ version and includes additional features for network analysis. Availability and implementation: The open source PyPanda Python package is freely a...

  18. Gene regulatory network inference using fused LASSO on multiple data sets.

    Science.gov (United States)

    Omranian, Nooshin; Eloundou-Mbebi, Jeanne M O; Mueller-Roeber, Bernd; Nikoloski, Zoran

    2016-02-11

    Devising computational methods to accurately reconstruct gene regulatory networks given gene expression data is key to systems biology applications. Here we propose a method for reconstructing gene regulatory networks by simultaneous consideration of data sets from different perturbation experiments and corresponding controls. The method imposes three biologically meaningful constraints: (1) expression levels of each gene should be explained by the expression levels of a small number of transcription factor coding genes, (2) networks inferred from different data sets should be similar with respect to the type and number of regulatory interactions, and (3) relationships between genes which exhibit similar differential behavior over the considered perturbations should be favored. We demonstrate that these constraints can be transformed into a fused LASSO formulation for the proposed method. The comparative analysis on transcriptomics time-series data from the prokaryotic species Escherichia coli and Mycobacterium tuberculosis, as well as a eukaryotic species, mouse, demonstrated that the proposed method has the advantages of the most recent approaches for regulatory network inference, while obtaining better performance and assigning higher scores to the true regulatory links. The study indicates that the combination of sparse regression techniques with other biologically meaningful constraints is a promising framework for gene regulatory network reconstructions.
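For orientation, a generic fused LASSO objective over several data sets can be written as below. This is the textbook form, given as an assumption rather than the authors' exact formulation. For one target gene, $y^{(k)}$ is its expression in data set $k$, $X^{(k)}$ holds the expression of the candidate transcription factor genes, $\beta^{(k)}$ are the regulatory weights, and $\lambda_1, \lambda_2 \ge 0$ are tuning parameters. All of these symbols are introduced here for illustration.

```latex
\min_{\beta^{(1)},\dots,\beta^{(K)}}
  \sum_{k=1}^{K} \bigl\lVert y^{(k)} - X^{(k)} \beta^{(k)} \bigr\rVert_2^2
  \;+\; \lambda_1 \sum_{k=1}^{K} \bigl\lVert \beta^{(k)} \bigr\rVert_1
  \;+\; \lambda_2 \sum_{k < k'} \bigl\lVert \beta^{(k)} - \beta^{(k')} \bigr\rVert_1
```

The $\lambda_1$ term corresponds to constraint (1), a small number of regulators per gene. The $\lambda_2$ term corresponds to constraint (2), similar networks across data sets. Constraint (3) is not captured by this generic form.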

  19. Medusa structure of the gene regulatory network: dominance of transcription factors in cancer subtype classification.

    Science.gov (United States)

    Guo, Yuchun; Feng, Ying; Trivedi, Niraj S; Huang, Sui

    2011-05-01

    Gene expression profiles consisting of tens of thousands of transcripts are used for clustering of tissue, such as tumors, into subtypes, often without considering the underlying reason that the distinct patterns of expression arise because of constraints in the realization of gene expression profiles imposed by the gene regulatory network. The topology of this network has been suggested to consist of a regulatory core of genes, represented most prominently by transcription factors (TFs) and microRNAs, that influence the expression of other genes, and of a periphery of 'enslaved' effector genes that are regulated but not regulating. This 'medusa' architecture implies that the core genes are much stronger determinants of the realized gene expression profiles. To test this hypothesis, we examined the clustering of gene expression profiles into known tumor types to quantitatively demonstrate that TFs, and even more pronouncedly microRNAs, are much stronger discriminators of tumor type specific gene expression patterns than the same number of randomly selected or metabolic genes. These findings lend support to the hypothesis of a medusa architecture and of the canalizing nature of regulation by microRNAs. They also reveal the degree of freedom for the expression of peripheral genes that are less stringently associated with a tissue type specific global gene expression profile.

  20. Ancient Marital Rites

    Institute of Scientific and Technical Information of China (English)

    1997-01-01

    Clearly defined rites governing speech and actions dominated both the social and domestic activities of ancient Chinese people. Rites not only dominated the lives of men, but were also prominent in the lives of women.

  1. Ancient Chinese Architecture

    Institute of Scientific and Technical Information of China (English)

    1993-01-01

    CHINESE people have accumulated a great deal of experience in architecture, constantly improving building materials and thus creating unique building styles. The history of ancient Chinese architectural development can be

  2. Regulatory network analysis of microRNAs and genes in imatinib-resistant chronic myeloid leukemia.

    Science.gov (United States)

    Soltani, Ismael; Gharbi, Hanen; Hassine, Islem Ben; Bouguerra, Ghada; Douzi, Kais; Teber, Mouheb; Abbes, Salem; Menif, Samia

    2016-09-16

    Targeted therapy in the form of a selective breakpoint cluster region-abelson (BCR/ABL) tyrosine kinase inhibitor (imatinib mesylate) has successfully been introduced in the treatment of chronic myeloid leukemia (CML). However, acquired resistance against imatinib mesylate (IM) has been reported in nearly half of patients and has been recognized as a major issue in clinical practice. Multiple resistance genes and microRNAs (miRNAs) are thought to be involved in the IM resistance process. These resistance genes and miRNAs tend to interact with each other through a regulatory network. Therefore, it is crucial to study the impact of these interactions in the IM resistance process. The present study focused on miRNA and gene network analysis in order to elucidate the role of interacting elements and to understand their functional contribution in therapeutic failure. Unlike previous studies, which were centered only on genes or miRNAs, the prime focus of the present study was on relationships. To this end, three regulatory networks (differentially expressed, related, and global) were constructed and analyzed in search of similarities and differences. Regulatory associations between miRNAs and their target genes, transcription factors and miRNAs, as well as miRNAs and their host genes were also macroscopically investigated. Certain key pathways in the three networks, especially in the differentially expressed network, were featured. The differentially expressed network emerged as a fault map of IM-resistant CML. Theoretically, the IM resistance process could be prevented by correcting the included errors. The present network-based approach to study resistance miRNAs and genes might help in understanding the molecular mechanisms of IM resistance in CML as well as in the improvement of CML therapy.

  3. iRegulon: from a gene list to a gene regulatory network using large motif and track collections.

    Directory of Open Access Journals (Sweden)

    Rekin's Janky

    2014-07-01

    Full Text Available Identifying master regulators of biological processes and mapping their downstream gene networks are key challenges in systems biology. We developed a computational method, called iRegulon, to reverse-engineer the transcriptional regulatory network underlying a co-expressed gene set using cis-regulatory sequence analysis. iRegulon implements a genome-wide ranking-and-recovery approach to detect enriched transcription factor motifs and their optimal sets of direct targets. We increase the accuracy of network inference by using very large motif collections of up to ten thousand position weight matrices collected from various species, and linking these to candidate human TFs via a motif2TF procedure. We validate iRegulon on gene sets derived from ENCODE ChIP-seq data with increasing levels of noise, and we compare iRegulon with existing motif discovery methods. Next, we use iRegulon on more challenging types of gene lists, including microRNA target sets, protein-protein interaction networks, and genetic perturbation data. In particular, we over-activate p53 in breast cancer cells, followed by RNA-seq and ChIP-seq, and could identify an extensive up-regulated network controlled directly by p53. Similarly we map a repressive network with no indication of direct p53 regulation but rather an indirect effect via E2F and NFY. Finally, we generalize our computational framework to include regulatory tracks such as ChIP-seq data and show how motif and track discovery can be combined to map functional regulatory interactions among co-expressed genes. iRegulon is available as a Cytoscape plugin from http://iregulon.aertslab.org.

  4. Comparison of gene regulatory networks of benign and malignant breast cancer samples with normal samples.

    Science.gov (United States)

    Chen, D B; Yang, H J

    2014-11-11

    The aim of this study was to explain the pathogenesis and deterioration process of breast cancer. Breast cancer expression profile data GSE27567 was downloaded from the Gene Expression Omnibus (GEO) database, and breast cancer-related genes were extracted from databases, including Cancer-Resource and Online Mendelian Inheritance In Man (OMIM). Next, h17 transcription factor data were obtained from the University of California, Santa Cruz. Database for Annotation, Visualization, and Integrated Discovery (DAVID)-enrichment analysis was applied and gene-regulatory networks were constructed by double-two-way t-tests in 3 states, including normal, benign, and malignant. Furthermore, network topological properties were compared between 2 states, and breast cancer-related hub genes were ranked according to their different degrees between each of the two states. A total of 2380 breast cancer-related genes and 215 transcription factors were screened by exploring databases; the genes were mainly enriched in functions such as cell apoptosis and proliferation, and in pathways such as p53 signaling and apoptosis, which are related to carcinogenesis. In addition, gene-regulatory networks in the 3 conditions were constructed. By comparing their network topological properties, we found that there is a larger transition of differences between malignant and benign breast cancer. Moreover, 8 hub genes (YBX1, ZFP36, YY1, XRCC5, XRCC4, ZFHX3, ZMAT3, and XPC) were identified in the top 10 genes ranked by different degrees. Through comparative analysis of gene-regulation networks, we identified the link between related genes and the pathogenesis of breast cancer. However, further experiments are needed to confirm our results.
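
    The ranking of hub genes by degree difference between two states can be illustrated with a minimal sketch. It assumes each state's network is given as an undirected edge list; the function names and the toy edges are hypothetical and are not taken from the study or from GSE27567.

```python
from collections import Counter

def degrees(edges):
    """Count how many edges touch each gene in an undirected edge list."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

def rank_by_degree_difference(edges_a, edges_b):
    """Rank genes by |degree in network A - degree in network B|, largest first."""
    deg_a, deg_b = degrees(edges_a), degrees(edges_b)
    genes = set(deg_a) | set(deg_b)
    # A Counter returns 0 for genes that are absent from one of the networks.
    diffs = {g: abs(deg_a[g] - deg_b[g]) for g in genes}
    # Sort by difference (descending), then by name for a stable order.
    return sorted(diffs.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy edge lists (hypothetical values, for illustration only).
benign = [("YY1", "G1"), ("YY1", "G2"), ("G1", "G2")]
malignant = [("YY1", "G1"), ("YY1", "G2"), ("YY1", "G3"), ("YY1", "G4")]
ranking = rank_by_degree_difference(benign, malignant)
```

    In this toy case YY1 heads the ranking because its degree changes most between the two states.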

  5. NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.

    Science.gov (United States)

    Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan

    2014-01-01

    One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available.
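
    The rank-wise averaging of predictions from several methods might be sketched as follows. This is an illustrative reading of the abstract, not the published NIMEFI implementation; it assumes every method scores the same set of candidate edges, and all names and scores are hypothetical.

```python
def to_ranks(scores):
    """Map each edge to its rank (1 = highest importance score)."""
    ordered = sorted(scores, key=lambda e: (-scores[e], e))
    return {edge: i + 1 for i, edge in enumerate(ordered)}

def rankwise_average(score_tables):
    """Average the per-method ranks of every edge and sort by mean rank."""
    rank_tables = [to_ranks(t) for t in score_tables]
    edges = set(rank_tables[0])  # assumption: all tables share the same edges
    mean_rank = {e: sum(r[e] for r in rank_tables) / len(rank_tables) for e in edges}
    return sorted(mean_rank.items(), key=lambda kv: (kv[1], kv[0]))

# Hypothetical importance scores from two feature importance methods.
method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_b = {("g1", "g2"): 0.7, ("g1", "g3"): 0.6, ("g2", "g3"): 0.1}
consensus = rankwise_average([method_a, method_b])
```

    Averaging ranks rather than raw scores avoids having to put importance measures from different algorithms on a common scale.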

  6. NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.

    Directory of Open Access Journals (Sweden)

    Joeri Ruyssinck

    Full Text Available One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made

  7. Preferential duplication of intermodular hub genes: an evolutionary signature in eukaryotes genome networks.

    Directory of Open Access Journals (Sweden)

    Ricardo M Ferreira

    Full Text Available Whole genome protein-protein association networks are not random and their topological properties stem from genome evolution mechanisms. In fact, more connected, but less clustered proteins are related to genes that, in general, present more paralogs as compared to other genes, indicating frequent previous gene duplication episodes. On the other hand, genes related to conserved biological functions present few or no paralogs and yield proteins that are highly connected and clustered. These general network characteristics must have an evolutionary explanation. Considering data from the STRING database, we present here experimental evidence that, beyond not being scale free, protein degree distributions of organisms present an increased probability for high degree nodes. Furthermore, based on this experimental evidence, we propose a simulation model for genome evolution, where genes in a network are either acquired de novo using a preferential attachment rule, or duplicated with a probability that grows linearly with gene degree and decreases with its clustering coefficient. For the first time a model yields results that simultaneously describe different topological distributions. Also, this model correctly predicts that producing protein-protein association networks with numbers of links and nodes in the observed range for Eukaryotes requires 90% gene duplication and 10% de novo gene acquisition. This scenario implies a universal mechanism for genome evolution.
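
    A toy version of such a growth model is sketched below. The 90%/10% split comes from the abstract; the exact duplication weight, degree × (1 − clustering coefficient), and the single-link rule for de novo genes are assumptions made here for illustration and are not the authors' equations.

```python
import random

def clustering(adj, v):
    """Local clustering coefficient of node v in an undirected graph."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k) if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))

def grow(n_nodes, p_dup=0.9, seed=0):
    """Grow a network by gene duplication (prob. p_dup) or de novo acquisition."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # start from a single edge
    while len(adj) < n_nodes:
        nodes = list(adj)
        new = len(adj)
        if rng.random() < p_dup:
            # Duplication: weight grows with degree, shrinks with clustering (assumed form).
            weights = [len(adj[v]) * (1.0 - clustering(adj, v)) + 1e-9 for v in nodes]
            parent = rng.choices(nodes, weights=weights)[0]
            adj[new] = set(adj[parent])  # the copy inherits the parent's links
        else:
            # De novo acquisition: attach to one node chosen in proportion to its degree.
            weights = [len(adj[v]) for v in nodes]
            adj[new] = {rng.choices(nodes, weights=weights)[0]}
        for u in adj[new]:
            adj[u].add(new)  # keep the adjacency symmetric
    return adj

network = grow(50)
```

    The degree distribution of `network` can then be compared with an observed one, for example by tabulating `len(network[v])` over all nodes.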

  8. Statistical inference and reverse engineering of gene regulatory networks from observational expression data.

    Science.gov (United States)

    Emmert-Streib, Frank; Glazko, Galina V; Altay, Gökmen; de Matos Simoes, Ricardo

    2012-01-01

    In this paper, we present a systematic and conceptual overview of methods for inferring gene regulatory networks from observational gene expression data. Further, we discuss two classic approaches to infer causal structures and compare them with contemporary methods by providing a conceptual categorization thereof. We complement the above by surveying global and local evaluation measures for assessing the performance of inference algorithms.

  9. Building gene co-expression networks using transcriptomics data for systems biology investigations

    DEFF Research Database (Denmark)

    Kadarmideen, Haja; Watson-Haigh, Nathan S.

    2012-01-01

    Gene co-expression networks (GCN), built using high-throughput gene expression data are fundamental aspects of systems biology. The main aims of this study were to compare two popular approaches to building and analysing GCN. We use real ovine microarray transcriptomics datasets representing four...

  10. A fast and efficient gene-network reconstruction method from multiple over-expression experiments

    Directory of Open Access Journals (Sweden)

    Thurner Stefan

    2009-08-01

    Full Text Available Background: Reverse engineering of gene regulatory networks presents one of the big challenges in systems biology. Gene regulatory networks are usually inferred from a set of single-gene over-expressions and/or knockout experiments. Functional relationships between genes are retrieved either from the steady state gene expressions or from respective time series. Results: We present a novel algorithm for gene network reconstruction on the basis of steady-state gene-chip data from over-expression experiments. The algorithm is based on a straightforward solution of a linear gene-dynamics equation, where experimental data is fed in as a first predictor for the solution. We compare the algorithm's performance with the NIR algorithm, both on the well-known E. coli experimental data and on in-silico experiments. Conclusion: We show superiority of the proposed algorithm in the number of correctly reconstructed links and discuss computational time and robustness. The proposed algorithm is not limited by combinatorial explosion problems and can be used in principle for large networks.
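
    As a generic illustration of steady-state inference from a linear gene-dynamics model (not the algorithm of this paper, whose details the abstract does not give), assume dx/dt = A x + u. At steady state each over-expression experiment k satisfies A x_k = -u_k, so with as many independent experiments as genes the interaction matrix follows from A X = -U. The helper names below are hypothetical.

```python
def solve(m, b):
    """Solve m @ x = b for x by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(m[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular expression matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def transpose(m):
    return [list(col) for col in zip(*m)]

def infer_interactions(x, u):
    """x[i][k]: steady-state level of gene i in experiment k; u[i][k]: perturbation."""
    neg_ut = [[-v for v in row] for row in transpose(u)]
    # A X = -U is equivalent to X^T A^T = -U^T, which `solve` handles directly.
    return transpose(solve(transpose(x), neg_ut))

# Toy example: two genes, each over-expressed once (unit perturbations).
x_steady = [[1.0, 0.25], [0.0, 0.5]]
u_perturb = [[1.0, 0.0], [0.0, 1.0]]
a_hat = infer_interactions(x_steady, u_perturb)
```

    With noisy data or fewer experiments than genes, the exact solve would have to be replaced by a regularised or least-squares fit.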

  11. Gene regulatory network inference and validation using relative change ratio analysis and time-delayed dynamic Bayesian network.

    Science.gov (United States)

    Li, Peng; Gong, Ping; Li, Haoni; Perkins, Edward J; Wang, Nan; Zhang, Chaoyang

    2014-12-01

    The Dialogue for Reverse Engineering Assessments and Methods (DREAM) project was initiated in 2006 as a community-wide effort for the development of network inference challenges for rigorous assessment of reverse engineering methods for biological networks. We participated in the in silico network inference challenge of DREAM3 in 2008. Here we report the details of our approach and its performance on the synthetic challenge datasets. In our methodology, we first developed a model called relative change ratio (RCR), which took advantage of the heterozygous knockdown data and null-mutant knockout data provided by the challenge, in order to identify the potential regulators for the genes. With this information, a time-delayed dynamic Bayesian network (TDBN) approach was then used to infer gene regulatory networks from time series trajectory datasets. Our approach considerably reduced the search space of TDBN; hence, it gained a much higher efficiency and accuracy. The networks predicted using our approach were evaluated comparatively along with 29 other submissions by two metrics (area under the ROC curve and area under the precision-recall curve). The overall performance of our approach ranked second among all participating teams.
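
    One of the two evaluation metrics, the area under the ROC curve, can be computed for a list of scored candidate edges as sketched below. The sketch uses the rank-sum identity (the probability that a true edge outscores a false one) and is not the DREAM3 scoring code; the example scores are hypothetical.

```python
def auroc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one true and one false edge")
    # Each (true, false) pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical confidence scores for four candidate edges; 1 marks a true edge.
edge_scores = [0.9, 0.8, 0.3, 0.1]
edge_labels = [1, 0, 1, 0]
area = auroc(edge_scores, edge_labels)
```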

  12. CoryneRegNet 4.0 – A reference database for corynebacterial gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Baumbach Jan

    2007-11-01

    Full Text Available Background: Detailed information on DNA-binding transcription factors (the key players in the regulation of gene expression) and on transcriptional regulatory interactions of microorganisms deduced from literature-derived knowledge, computer predictions and global DNA microarray hybridization experiments, has opened the way for the genome-wide analysis of transcriptional regulatory networks. The large-scale reconstruction of these networks allows the in silico analysis of cell behavior in response to changing environmental conditions. We previously published CoryneRegNet, an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. Results: Now we introduce CoryneRegNet release 4.0, which integrates data on the gene regulatory networks of 4 corynebacteria, 2 mycobacteria and the model organism Escherichia coli K12. As in the previous versions, CoryneRegNet provides a web-based user interface to access the database content, to allow various queries, and to support the reconstruction, analysis and visualization of regulatory networks at different hierarchical levels. In this article, we present the further improved database content of CoryneRegNet along with novel analysis features. The network visualization feature GraphVis now allows inter-species comparisons of reconstructed gene regulatory networks and the projection of gene expression levels onto those networks. To this end, we added stimulon data directly into the database, but also provide Web Service access to the DNA microarray analysis platform EMMA. Additionally, CoryneRegNet now provides a SOAP-based Web Service server, which can easily be consumed by other bioinformatics software systems. Stimulons (imported from the database, or uploaded by the user) can be analyzed in the context of known

  13. An integer optimization algorithm for robust identification of non-linear gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Chemmangattuvalappil Nishanth

    2012-09-01

    Full Text Available Background: Reverse engineering gene networks and identifying regulatory interactions are integral to understanding cellular decision making processes. Advancement in high throughput experimental techniques has initiated innovative data driven analysis of gene regulatory networks. However, inherent noise associated with biological systems requires numerous experimental replicates for reliable conclusions. Furthermore, robust algorithms that directly exploit basic biological traits are few. Such algorithms are expected to be efficient in their performance and robust in their prediction. Results: We have developed a network identification algorithm to accurately infer both the topology and strength of regulatory interactions from time series gene expression data in the presence of significant experimental noise and non-linear behavior. In this novel formalism, we have addressed data variability in biological systems by integrating network identification with the bootstrap resampling technique, hence predicting robust interactions from limited experimental replicates subjected to noise. Furthermore, we have incorporated non-linearity in gene dynamics using the S-system formulation. The basic network identification formulation exploits the trait of sparsity of biological interactions. Towards that, the identification algorithm is formulated as an integer-programming problem by introducing binary variables for each network component. The objective function is targeted to minimize the network connections subject to the constraint of maximal agreement between the experimental and predicted gene dynamics. The developed algorithm is validated using both in silico and experimental data-sets. These studies show that the algorithm can accurately predict the topology and connection strength of the in silico networks, as quantified by high precision and recall, and small discrepancy between the actual and predicted kinetic parameters
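
    The S-system formulation mentioned here has the standard form dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij. The sketch below only evaluates that right-hand side for given parameters; it is not the integer-programming identification algorithm of the paper, and the parameter values are hypothetical.

```python
def s_system_rates(x, alpha, g, beta, h):
    """dX_i/dt = alpha_i * prod_j X_j**g_ij - beta_i * prod_j X_j**h_ij."""
    rates = []
    for i in range(len(x)):
        production, degradation = alpha[i], beta[i]
        for j, xj in enumerate(x):
            production *= xj ** g[i][j]   # g_ij = 0 means gene j does not affect synthesis of i
            degradation *= xj ** h[i][j]  # h_ij = 0 means gene j does not affect decay of i
        rates.append(production - degradation)
    return rates

# Two-gene toy system (hypothetical kinetic orders and rate constants).
levels = [2.0, 4.0]
rates = s_system_rates(
    levels,
    alpha=[1.0, 1.0], g=[[0.0, 0.5], [1.0, 0.0]],
    beta=[1.0, 1.0], h=[[1.0, 0.0], [0.0, 1.0]],
)
```

    Sparsity shows up as zero entries in `g` and `h`, which is what a binary variable per network component can switch on or off.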

  14. Ancient and modern DNA reveal dynamics of domestication and cross-continental dispersal of the dromedary.

    Science.gov (United States)

    Almathen, Faisal; Charruau, Pauline; Mohandesan, Elmira; Mwacharo, Joram M; Orozco-terWengel, Pablo; Pitt, Daniel; Abdussamad, Abdussamad M; Uerpmann, Margarethe; Uerpmann, Hans-Peter; De Cupere, Bea; Magee, Peter; Alnaqeeb, Majed A; Salim, Bashir; Raziq, Abdul; Dessie, Tadelle; Abdelhadi, Omer M; Banabazi, Mohammad H; Al-Eknah, Marzook; Walzer, Chris; Faye, Bernard; Hofreiter, Michael; Peters, Joris; Hanotte, Olivier; Burger, Pamela A

    2016-06-14

    Dromedaries have been fundamental to the development of human societies in arid landscapes and for long-distance trade across hostile hot terrains for 3,000 y. Today they continue to be an important livestock resource in marginal agro-ecological zones. However, the history of dromedary domestication and the influence of ancient trading networks on their genetic structure have remained elusive. We combined ancient DNA sequences of wild and early-domesticated dromedary samples from arid regions with nuclear microsatellite and mitochondrial genotype information from 1,083 extant animals collected across the species' range. We observe little phylogeographic signal in the modern population, indicative of extensive gene flow affecting virtually all regions except East Africa, where dromedary populations have remained relatively isolated. In agreement with archaeological findings, we identify wild dromedaries from the southeast Arabian Peninsula among the founders of the domestic dromedary gene pool. Approximate Bayesian computations further support the "restocking from the wild" hypothesis, with an initial domestication followed by introgression from individuals from wild, now-extinct populations. Compared with other livestock, which show a long history of gene flow with their wild ancestors, we find a high initial diversity relative to the native distribution of the wild ancestor on the Arabian Peninsula and to the brief coexistence of early-domesticated and wild individuals. This study also demonstrates the potential to retrieve ancient DNA sequences from osseous remains excavated in hot and dry desert environments.

  15. A gene regulatory network for root epidermis cell differentiation in Arabidopsis.

    Directory of Open Access Journals (Sweden)

    Angela Bruex

    2012-01-01

    Full Text Available The root epidermis of Arabidopsis provides an exceptional model for studying the molecular basis of cell fate and differentiation. To obtain a systems-level view of root epidermal cell differentiation, we used a genome-wide transcriptome approach to define and organize a large set of genes into a transcriptional regulatory network. Using cell fate mutants that produce only one of the two epidermal cell types, together with fluorescence-activated cell-sorting to preferentially analyze the root epidermis transcriptome, we identified 1,582 genes differentially expressed in the root-hair or non-hair cell types, including a set of 208 "core" root epidermal genes. The organization of the core genes into a network was accomplished by using 17 distinct root epidermis mutants and 2 hormone treatments to perturb the system and assess the effects on each gene's transcript accumulation. In addition, temporal gene expression information from a developmental time series dataset and predicted gene associations derived from a Bayesian modeling approach were used to aid the positioning of genes within the network. Further, a detailed functional analysis of likely bHLH regulatory genes within the network, including MYC1, bHLH54, bHLH66, and bHLH82, showed that three distinct subfamilies of bHLH proteins participate in root epidermis development in a stage-specific manner. The integration of genetic, genomic, and computational analyses provides a new view of the composition, architecture, and logic of the root epidermal transcriptional network, and it demonstrates the utility of a comprehensive systems approach for dissecting a complex regulatory network.

  16. Evaluation of gene association methods for coexpression network construction and biological knowledge discovery.

    Directory of Open Access Journals (Sweden)

    Sapna Kumari

    Full Text Available BACKGROUND: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. METHODS AND RESULTS: In this study, we compared eight gene association methods (Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson) and focused on their true knowledge discovery rates in associating pathway genes and constructing coordination networks of regulatory genes. We also examined the behaviors of different methods on microarray data with different properties, and whether the biological processes affect the efficiency of different methods. CONCLUSIONS: We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method that can find gene pairs of multiple relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods behave differently from the other six methods. The efficiencies of different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best performing method for gene association and coexpression network construction.
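
    Why a rank-based measure can behave differently from Pearson correlation is easy to see on a toy pair of expression profiles. The sketch below implements both measures with the standard library only; the profiles are invented for illustration and do not come from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def ranks(values):
    """Average ranks (tied values share the mean of their positions)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return out

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [v ** 3 for v in gene_a]  # monotonic but non-linear relationship
r_pearson = pearson(gene_a, gene_b)
r_spearman = spearman(gene_a, gene_b)
```

    Here the Spearman coefficient is 1 because the relationship is perfectly monotonic, while the Pearson coefficient is lower because the relationship is not linear.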

  17. Genome analysis of a simultaneously predatory and prey-independent, novel Bdellovibrio bacteriovorus from the River Tiber, supports in silico predictions of both ancient and recent lateral gene transfer from diverse bacteria

    Directory of Open Access Journals (Sweden)

    Hobley Laura

    2012-11-01

    Full Text Available Background: Evolution equipped Bdellovibrio bacteriovorus predatory bacteria to invade other bacteria, digesting and replicating, sealed within them thus preventing nutrient-sharing with organisms in the surrounding environment. Bdellovibrio were previously described as “obligate predators” because only by mutations, often in gene bd0108, are 1 in ~1×10^7 of predatory lab strains of Bdellovibrio converted to prey-independent growth. A previous genomic analysis of B. bacteriovorus strain HD100 suggested that predatory consumption of prey DNA by lytic enzymes made Bdellovibrio less likely than other bacteria to acquire DNA by lateral gene transfer (LGT). However, the Doolittle and Pan groups predicted, in silico, both ancient and recent lateral gene transfer into the B. bacteriovorus HD100 genome. Results: To test these predictions, we isolated a predatory bacterium from the River Tiber (a good potential source of LGT, as it is rich in diverse bacteria and organic pollutants) by enrichment culturing with E. coli prey cells. The isolate was identified as B. bacteriovorus and named as strain Tiberius. Unusually, this Tiberius strain showed simultaneous prey-independent growth on organic nutrients and predatory growth on live prey. Despite the prey-independent growth, the homolog of bd0108 did not have typical prey-independent-type mutations. The dual growth mode may reflect the high carbon content of the river, and gives B. bacteriovorus Tiberius extended non-predatory contact with the other bacteria present. The HD100 and Tiberius genomes were extensively syntenic despite their different cultured-terrestrial/freshly-isolated aquatic histories; but there were significant differences in gene content indicative of genomic flux and LGT. Gene content comparisons support previously published in silico predictions for LGT in strain HD100 with substantial conservation of genes predicted to have ancient LGT origins but little conservation of AT

  18. Fast algorithm for the reconciliation of gene trees and LGT networks.

    Science.gov (United States)

    Scornavacca, Celine; Mayol, Joan Carles Pons; Cardona, Gabriel

    2017-04-07

    In phylogenomics, reconciliations aim at explaining the discrepancies between the evolutionary histories of genes and species. Several reconciliation models are available when the evolution of the species of interest is modelled via phylogenetic trees; the most commonly used are the DL model, accounting for duplications and losses in gene evolution and yielding polynomially-solvable problems, and the DTL model, which also accounts for gene transfers and implies NP-hard problems. However, when dealing with non-tree-like evolutionary events such as hybridisations, phylogenetic networks, and not phylogenetic trees, should be used to model species evolution. Reconciliation models involving phylogenetic networks are still in their early days. In this paper, we propose a new reconciliation model in which the evolution of species is modelled by a special kind of phylogenetic network, the LGT network. Our model considers duplications, losses and transfers of genes, but restricts transfers to happen through some specific arcs of the network, called secondary arcs. Moreover, we provide a polynomial algorithm to compute the most parsimonious reconciliation between a gene tree and an LGT network under this model. Our method, when combined with quartet decomposition methods to detect putative "highways" of transfers, makes it possible to refine their analyses by examining the two possible directions of a highway and even considering combinations of highways.

  19. Detection of gene communities in multi-networks reveals cancer drivers

    Science.gov (United States)

    Cantini, Laura; Medico, Enzo; Fortunato, Santo; Caselle, Michele

    2015-12-01

    We propose a new multi-network-based strategy to integrate different layers of genomic information and use them in a coordinated way to identify driving cancer genes. The multi-networks that we consider combine transcription factor co-targeting, microRNA co-targeting, protein-protein interaction and gene co-expression networks. The rationale behind this choice is that gene co-expression and protein-protein interactions require a tight coregulation of the partners and that such a fine-tuned regulation can be obtained only by combining both the transcriptional and post-transcriptional layers of regulation. To extract the relevant biological information from the multi-network we studied its partition into communities. To this end we applied a consensus clustering algorithm based on state-of-the-art community detection methods. Although our procedure is valid in principle for any pathology, in this work we concentrate on gastric, lung, pancreas and colorectal cancer and identify, from the enrichment analysis of the multi-network communities, a set of candidate driver cancer genes. Some of them were already known oncogenes while a few are new. The combination of the different layers of information allowed us to extract from the multi-network indications on the regulatory pattern and functional role of both the already known and the new candidate driver genes.

  20. Improved reconstruction of in silico gene regulatory networks by integrating knockout and perturbation data.

    Directory of Open Access Journals (Sweden)

    Kevin Y Yip

    Full Text Available We performed computational reconstruction of the in silico gene regulatory networks in the DREAM3 Challenges. Our task was to learn the networks from two types of data, namely gene expression profiles in deletion strains (the 'deletion data') and time series trajectories of gene expression after some initial perturbation (the 'perturbation data'). In the course of developing the prediction method, we observed that the two types of data contained different and complementary information about the underlying network. In particular, deletion data allow for the detection of direct regulatory activities with strong responses upon the deletion of the regulator while perturbation data provide richer information for the identification of weaker and more complex types of regulation. We applied different techniques to learn the regulation from the two types of data. For deletion data, we learned a noise model to distinguish real signals from random fluctuations using an iterative method. For perturbation data, we used differential equations to model the change of expression levels of a gene along the trajectories due to the regulation of other genes. We tried different models, and combined their predictions. The final predictions were obtained by merging the results from the two types of data. A comparison with the actual regulatory networks suggests that our approach is effective for networks with a range of different sizes. The success of the approach demonstrates the importance of integrating heterogeneous data in network reconstruction.

  1. PINTA: a web server for network-based gene prioritization from expression data

    DEFF Research Database (Denmark)

    Nitsch, Daniela; Tranchevent, Léon-Charles; Goncalves, Joana P.

    2011-01-01

    network. Our strategy is meant for biological and medical researchers aiming at identifying novel disease genes using disease specific expression data. PINTA supports both candidate gene prioritization (starting from a user defined set of candidate genes) as well as genome-wide gene prioritization...... and is available for five species (human, mouse, rat, worm and yeast). As input data, PINTA only requires disease specific expression data, whereas various platforms (e.g. Affymetrix) are supported. As a result, PINTA computes a gene ranking and presents the results as a table that can easily be browsed...

  2. Disease-aging network reveals significant roles of aging genes in connecting genetic diseases.

    Science.gov (United States)

    Wang, Jiguang; Zhang, Shihua; Wang, Yong; Chen, Luonan; Zhang, Xiang-Sun

    2009-09-01

    One of the challenging problems in biology and medicine is exploring the underlying mechanisms of genetic diseases. Recent studies suggest that the relationship between genetic diseases and the aging process is important in understanding the molecular mechanisms of complex diseases. Although some intricate associations have been investigated for a long time, the studies are still in their early stages. In this paper, we construct a human disease-aging network to study the relationship among aging genes and genetic disease genes. Specifically, we integrate human protein-protein interactions (PPIs), disease-gene associations, aging-gene associations, and physiological system-based genetic disease classification information in a single graph-theoretic framework and find that (1) human disease genes are much closer to aging genes than expected by chance; and (2) diseases can be categorized into two types according to their relationships with aging. Type I diseases have their genes significantly close to aging genes, while type II diseases do not. Furthermore, we examine the topological characters of the disease-aging network from a systems perspective. Theoretical results reveal that the genes of type I diseases are in a central position of a PPI network while type II are not; (3) more importantly, we define an asymmetric closeness based on the PPI network to describe relationships between diseases, and find that aging genes make a significant contribution to associations among diseases, especially among type I diseases. In conclusion, the network-based study provides not only evidence for the intricate relationship between the aging process and genetic diseases, but also biological implications for prying into the nature of human diseases.

  3. Reverse engineering gene regulatory networks related to quorum sensing in the plant pathogen Pectobacterium atrosepticum.

    Science.gov (United States)

    Lin, Kuang; Husmeier, Dirk; Dondelinger, Frank; Mayer, Claus D; Liu, Hui; Prichard, Leighton; Salmond, George P C; Toth, Ian K; Birch, Paul R J

    2010-01-01

    The objective of the project reported in the present chapter was the reverse engineering of gene regulatory networks related to quorum sensing in the plant pathogen Pectobacterium atrosepticum from microarray gene expression profiles, obtained from the wild-type and eight knockout strains. To this end, we have applied various recent methods from multivariate statistics and machine learning: graphical Gaussian models, sparse Bayesian regression, LASSO (least absolute shrinkage and selection operator), Bayesian networks, and nested effects models. We have investigated the degree of similarity between the predictions obtained with the different approaches, and we have assessed the consistency of the reconstructed networks in terms of global topological network properties, based on the node degree distribution. The chapter concludes with a biological evaluation of the predicted network structures.

  4. Drivers of structural features in gene regulatory networks: From biophysical constraints to biological function

    Science.gov (United States)

    Martin, O. C.; Krzywicki, A.; Zagorski, M.

    2016-07-01

    Living cells can maintain their internal states, react to changing environments, grow, differentiate, divide, etc. All these processes are tightly controlled by what can be called a regulatory program. The logic of the underlying control can sometimes be guessed at by examining the network of influences amongst genetic components. Some associated gene regulatory networks have been studied in prokaryotes and eukaryotes, unveiling various structural features ranging from broad distributions of out-degrees to recurrent "motifs", that is small subgraphs having a specific pattern of interactions. To understand what factors may be driving such structuring, a number of groups have introduced frameworks to model the dynamics of gene regulatory networks. In that context, we review here such in silico approaches and show how selection for phenotypes, i.e., network function, can shape network structure.

  5. Inference of gene regulatory networks with the strong-inhibition Boolean model

    Energy Technology Data Exchange (ETDEWEB)

    Xia Qinzhi; Liu Lulu; Ye Weiming; Hu Gang, E-mail: ganghu@bnu.edu.cn [Department of Physics, Beijing Normal University, Beijing 100875 (China)

    2011-08-15

    The inference of gene regulatory networks (GRNs) is an important topic in biology. In this paper, a logic-based algorithm that infers the strong-inhibition Boolean genetic regulatory networks (where regulation by any single repressor can definitely suppress the expression of the gene regulated) from time series is discussed. By properly ordering various computation steps, we derive for the first time explicit formulae for the probabilities at which different interactions can be inferred given a certain number of data. With the formulae, we can predict the precision of reconstructions of regulation networks when the data are insufficient. Numerical simulations coincide well with the analytical results. The method and results are expected to be applicable to a wide range of general dynamic networks, where logic algorithms play essential roles in the network dynamics and the probabilities of various logics can be estimated well.
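
    The strong-inhibition rule described above can be illustrated with a minimal sketch. It assumes a synchronous update and assumes that, when no repressor is active, a gene is on only if at least one of its activators is on; the three-gene wiring is hypothetical and is not taken from the paper.

```python
def step(state, activators, repressors):
    """One synchronous update of a strong-inhibition Boolean network."""
    new_state = {}
    for gene in state:
        repressed = any(state[r] for r in repressors.get(gene, ()))
        activated = any(state[a] for a in activators.get(gene, ()))
        # Strong inhibition: one active repressor overrides every activator.
        new_state[gene] = activated and not repressed
    return new_state

# Hypothetical wiring: A sustains itself, A activates B and C, B represses C.
activators = {"A": ["A"], "B": ["A"], "C": ["A"]}
repressors = {"C": ["B"]}

trajectory = [{"A": True, "B": False, "C": False}]
for _ in range(3):
    trajectory.append(step(trajectory[-1], activators, repressors))

for t, s in enumerate(trajectory):
    print(t, s)
```

    In this toy run C switches on briefly and is then shut off as soon as its single repressor B becomes active, which is the behavior the model's name refers to.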

  6. Characterization of differentially expressed genes using high-dimensional co-expression networks

    DEFF Research Database (Denmark)

    Coelho Goncalves de Abreu, Gabriel; Labouriau, Rodrigo S.

    2010-01-01

    of spurious information along the network are avoided. The proposed inference procedure is based on the minimization of the Bayesian Information Criterion (BIC) in the class of decomposable graphical models. This class of models can be used to represent complex relationships and has suitable properties...... that allow to make effective inference in problems with high degree of complexity (e.g. several thousands of genes) and small number of observations (e.g. 10-100) as typically occurs in high throughput gene expression studies. Taking advantage of the internal structure of decomposable graphical models, we...... construct a compact representation of the co-expression network that allows to identify the regions with high concentration of differentially expressed genes. It is argued that differentially expressed genes located in highly interconnected regions of the co-expression network are less informative than...

  7. Gene perturbation and intervention in context-sensitive stochastic Boolean networks

    Science.gov (United States)

    2014-01-01

    Background: In a gene regulatory network (GRN), gene expressions are affected by noise, and stochastic fluctuations exist in the interactions among genes. These stochastic interactions are context dependent, thus it becomes important to consider noise in a context-sensitive manner in a network model. As a logical model, context-sensitive probabilistic Boolean networks (CSPBNs) account for molecular and genetic noise in the temporal context of gene functions. In a CSPBN with n genes and k contexts, however, a computational complexity of $O(nk^2 2^{2n})$ (or $O(nk2^n)$) is required for an accurate (or approximate) computation of the state transition matrix (STM) of the size $(2^n \cdot k) \times (2^n \cdot k)$ (or $2^n \times 2^n$). The evaluation of a steady state distribution (SSD) is more challenging. Recently, stochastic Boolean networks (SBNs) have been proposed as an efficient implementation of an instantaneous PBN. Results: The notion of stochastic Boolean networks (SBNs) is extended for the general model of PBNs, i.e., CSPBNs. This yields a novel structure of context-sensitive SBNs (CSSBNs) for modeling the stochasticity in a GRN. A CSSBN enables an efficient simulation of a CSPBN with a complexity of $O(nLk2^n)$ for computing the state transition matrix, where L is a factor related to the required sequence length in CSSBN for achieving a desired accuracy. A time-frame expanded CSSBN can further efficiently simulate the stationary behavior of a CSPBN and allow for a tunable tradeoff between accuracy and efficiency. The CSSBN approach is more efficient than an analytical method and more accurate than an approximate analysis. Conclusions: Context-sensitive stochastic Boolean networks (CSSBNs) are proposed as an efficient approach to modeling the effects of gene perturbation and intervention in gene regulatory networks. A CSSBN analysis provides biologically meaningful insights into the oscillatory dynamics of the p53-Mdm2 network in a context-switching environment. It is shown that

  8. Reconstruction of the Regulatory Network for Bacillus subtilis and Reconciliation with Gene Expression Data

    Science.gov (United States)

    Faria, José P.; Overbeek, Ross; Taylor, Ronald C.; Conrad, Neal; Vonstein, Veronika; Goelzer, Anne; Fromion, Vincent; Rocha, Miguel; Rocha, Isabel; Henry, Christopher S.

    2016-01-01

    We introduce a manually constructed and curated regulatory network model that describes the current state of knowledge of transcriptional regulation of Bacillus subtilis. The model corresponds to an updated and enlarged version of the regulatory model of central metabolism originally proposed in 2008. We extended the original network to the whole genome by integration of information from DBTBS, a compendium of regulatory data that includes promoters, transcription factors (TFs), binding sites, motifs, and regulated operons. Additionally, we consolidated our network with all the information on regulation included in the SporeWeb and Subtiwiki community-curated resources on B. subtilis. Finally, we reconciled our network with data from RegPrecise, which recently released their own less comprehensive reconstruction of the regulatory network for B. subtilis. Our model describes 275 regulators and their target genes, representing 30 different mechanisms of regulation such as TFs, RNA switches, Riboswitches, and small regulatory RNAs. Overall, regulatory information is included in the model for ∼2500 of the ∼4200 genes in B. subtilis 168. In an effort to further expand our knowledge of B. subtilis regulation, we reconciled our model with expression data. For this process, we reconstructed the Atomic Regulons (ARs) for B. subtilis, which are the sets of genes that share the same “ON” and “OFF” gene expression profiles across multiple samples of experimental data. We show how ARs for B. subtilis are able to capture many sets of genes corresponding to regulated operons in our manually curated network. Additionally, we demonstrate how ARs can be used to help expand or validate the knowledge of the regulatory networks by looking at highly correlated genes in the ARs for which regulatory information is lacking. During this process, we were also able to infer novel stimuli for hypothetical genes by exploring the genome expression metadata relating to experimental

  9. Reconstruction of the Regulatory Network for Bacillus subtilis and Reconciliation with Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Faria, José P.; Overbeek, Ross; Taylor, Ronald C.; Conrad, Neal; Vonstein, Veronika; Goelzer, Anne; Fromion, Vincent; Rocha, Miguel; Rocha, Isabel; Henry, Christopher S.

    2016-03-18

    We introduce a manually constructed and curated regulatory network model that describes the current state of knowledge of transcriptional regulation of B. subtilis. The model corresponds to an updated and enlarged version of the regulatory model of central metabolism originally proposed in 2008. We extended the original network to the whole genome by integration of information from DBTBS, a compendium of regulatory data that includes promoters, transcription factors (TFs), binding sites, motifs and regulated operons. Additionally, we consolidated our network with all the information on regulation included in the SporeWeb and Subtiwiki community-curated resources on B. subtilis. Finally, we reconciled our network with data from RegPrecise, which recently released their own less comprehensive reconstruction of the regulatory network for B. subtilis. Our model describes 275 regulators and their target genes, representing 30 different mechanisms of regulation such as TFs, RNA switches, Riboswitches and small regulatory RNAs. Overall, regulatory information is included in the model for approximately 2500 of the ~4200 genes in B. subtilis 168. In an effort to further expand our knowledge of B. subtilis regulation, we reconciled our model with expression data. For this process, we reconstructed the Atomic Regulons (ARs) for B. subtilis, which are the sets of genes that share the same “ON” and “OFF” gene expression profiles across multiple samples of experimental data. We show how atomic regulons for B. subtilis are able to capture many sets of genes corresponding to regulated operons in our manually curated network. Additionally, we demonstrate how atomic regulons can be used to help expand or validate the knowledge of the regulatory networks by looking at highly correlated genes in the ARs for which regulatory information is lacking. During this process, we were also able to infer novel stimuli for hypothetical genes by exploring the genome expression metadata

  10. Multiobjective H2/H∞ synthetic gene network design based on promoter libraries.

    Science.gov (United States)

    Wu, Chih-Hung; Zhang, Weihei; Chen, Bor-Sen

    2011-10-01

    Some current promoter libraries have been developed for synthetic gene networks. But an efficient method to engineer a synthetic gene network with some desired behaviors by selecting adequate promoters from these promoter libraries has not been presented. Thus developing a systematic method to efficiently employ promoter libraries to improve the engineering of synthetic gene networks with desired behaviors is appealing for synthetic biologists. In this study, a synthetic gene network with intrinsic parameter fluctuations and environmental disturbances in vivo is modeled by a nonlinear stochastic system. In order to engineer a synthetic gene network with a desired behavior despite intrinsic parameter fluctuations and environmental disturbances in vivo, a multiobjective H2/H∞ reference tracking (H2 optimal tracking and H∞ noise filtering) design is introduced. The H2 optimal tracking can make the tracking errors between the behaviors of a synthetic gene network and the desired behaviors as small as possible from the minimum mean square error point of view, and the H∞ noise filtering can attenuate all possible noises, from the worst-case noise effect point of view, to achieve a desired noise filtering ability. If the multiobjective H2/H∞ reference tracking design is satisfied, the synthetic gene network can robustly and optimally track the desired behaviors, simultaneously. First, based on the dynamic gene regulation, the existing promoter libraries are redefined by their promoter activities so that they can be efficiently selected in the design procedure. Then a systematic method is developed to select an adequate promoter set from the redefined promoter libraries to synthesize a gene network satisfying these two design objectives. But the multiobjective H2/H∞ reference tracking design problem needs to solve a difficult Hamilton-Jacobi Inequality (HJI)-constrained optimization problem. Therefore, the fuzzy approximation method is

  11. Large-scale modeling of condition-specific gene regulatory networks by information integration and inference.

    Science.gov (United States)

    Ellwanger, Daniel Christian; Leonhardt, Jörn Florian; Mewes, Hans-Werner

    2014-12-01

    Understanding how regulatory networks globally coordinate the response of a cell to changing conditions, such as perturbations by shifting environments, is an elementary challenge in systems biology which has yet to be met. Genome-wide gene expression measurements are high dimensional as these are reflecting the condition-specific interplay of thousands of cellular components. The integration of prior biological knowledge into the modeling process of systems-wide gene regulation enables the large-scale interpretation of gene expression signals in the context of known regulatory relations. We developed COGERE (http://mips.helmholtz-muenchen.de/cogere), a method for the inference of condition-specific gene regulatory networks in human and mouse. We integrated existing knowledge of regulatory interactions from multiple sources to a comprehensive model of prior information. COGERE infers condition-specific regulation by evaluating the mutual dependency between regulator (transcription factor or miRNA) and target gene expression using prior information. This dependency is scored by the non-parametric, nonlinear correlation coefficient η² (eta squared) that is derived by a two-way analysis of variance. We show that COGERE significantly outperforms alternative methods in predicting condition-specific gene regulatory networks on simulated data sets. Furthermore, by inferring the cancer-specific gene regulatory network from the NCI-60 expression study, we demonstrate the utility of COGERE to promote hypothesis-driven clinical research.

  12. Investigating meta-approaches for reconstructing gene networks in a mammalian cellular context.

    Directory of Open Access Journals (Sweden)

    Azree Nazri

    Full Text Available The output of state-of-the-art reverse-engineering methods for biological networks is often based on the fitting of a mathematical model to the data. Typically, different datasets do not give single consistent network predictions but rather an ensemble of inconsistent networks inferred under the same reverse-engineering method that are only consistent with the specific experimentally measured data. Here, we focus on an alternative approach for combining the information contained within such an ensemble of inconsistent gene networks called meta-analysis, to make more accurate predictions and to estimate the reliability of these predictions. We review two existing meta-analysis approaches, the Fisher transformation combined coefficient test (FTCCT) and Fisher's inverse combined probability test (FICPT), and compare their performance with five well-known methods: ARACNe, Context Likelihood of Relatedness network (CLR), Maximum Relevance Minimum Redundancy (MRNET), Relevance Network (RN) and Bayesian Network (BN). We conducted in-depth numerical ensemble simulations and demonstrated for biological expression data that the meta-analysis approaches consistently outperformed the best gene regulatory network inference (GRNI) methods in the literature. Furthermore, the meta-analysis approaches have a low computational complexity. We conclude that the meta-analysis approaches are a powerful tool for integrating different datasets to give more accurate and reliable predictions for biological networks.
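
    As a rough illustration of the inverse combined probability idea, the sketch below applies the standard form of Fisher's method to the p-values that several datasets assign to one candidate edge. It assumes the p-values are independent; the function name and the example p-values are hypothetical, and the paper's own implementation may differ.

```python
import math

def fisher_combined_pvalue(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    # For even degrees of freedom 2k the chi-square tail has a closed form:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = statistic / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total

# Hypothetical p-values for one edge, inferred from three separate datasets.
edge_p_values = [0.04, 0.10, 0.07]
statistic, combined = fisher_combined_pvalue(edge_p_values)
print(f"X = {statistic:.3f}, combined p = {combined:.4f}")
```

    Three individually weak p-values combine into stronger evidence for the edge, which is why pooling an ensemble of inconsistent networks can improve on any single inference.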

  13. Candidate gene prioritization by network analysis of differential expression using machine learning approaches

    Directory of Open Access Journals (Sweden)

    Nitsch Daniela

    2010-09-01

    Full Text Available Abstract Background Discovering novel disease genes is still challenging for diseases for which no prior knowledge - such as known disease genes or disease-related pathways - is available. Performing genetic studies frequently results in large lists of candidate genes of which only few can be followed up for further investigation. We have recently developed a computational method for constitutional genetic disorders that identifies the most promising candidate genes by replacing prior knowledge by experimental data of differential gene expression between affected and healthy individuals. To improve the performance of our prioritization strategy, we have extended our previous work by applying different machine learning approaches that identify promising candidate genes by determining whether a gene is surrounded by highly differentially expressed genes in a functional association or protein-protein interaction network. Results We have proposed three strategies scoring disease candidate genes relying on network-based machine learning approaches, such as kernel ridge regression, heat kernel, and Arnoldi kernel approximation. For comparison purposes, a local measure based on the expression of the direct neighbors is also computed. We have benchmarked these strategies on 40 publicly available knockout experiments in mice, and performance was assessed against results obtained using a standard procedure in genetics that ranks candidate genes based solely on their differential expression levels (Simple Expression Ranking). Our results showed that our four strategies could outperform this standard procedure and that the best results were obtained using the Heat Kernel Diffusion Ranking leading to an average ranking position of 8 out of 100 genes, an AUC value of 92.3% and an error reduction of 52.8% relative to the standard procedure approach which ranked the knockout gene on average at position 17 with an AUC value of 83.7%. Conclusion In this study we
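
    The heat kernel idea can be sketched as follows: differential expression scores are diffused over the network, so a candidate surrounded by strongly differentially expressed genes ends up with a high score. This is only an illustration under stated assumptions: it uses the unnormalized graph Laplacian, a truncated Taylor series for the matrix exponential, and a hypothetical four-gene path network with an arbitrary diffusion parameter `beta`. It is not the ranking procedure benchmarked in the abstract.

```python
def heat_diffusion(adjacency, scores, beta=0.5, terms=30):
    """Approximate exp(-beta * L) @ scores, where L = D - A is the graph Laplacian."""
    n = len(scores)
    degree = [sum(row) for row in adjacency]

    def laplacian_times(v):
        # (L v)_i = deg_i * v_i - sum_j A_ij * v_j
        return [degree[i] * v[i] - sum(adjacency[i][j] * v[j] for j in range(n))
                for i in range(n)]

    result = list(scores)
    term = list(scores)
    for k in range(1, terms):
        # term_k = (-beta * L)^k @ scores / k!, built from the previous term
        term = [-beta / k * x for x in laplacian_times(term)]
        result = [r + t for r, t in zip(result, term)]
    return result

# Hypothetical path network g0 - g1 - g2 - g3; only g0 is differentially expressed.
genes = ["g0", "g1", "g2", "g3"]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
de_scores = [2.0, 0.0, 0.0, 0.0]

diffused = heat_diffusion(adjacency, de_scores)
# Rank two hypothetical candidates by the heat they receive from the network.
ranking = sorted(["g1", "g3"], key=lambda g: diffused[genes.index(g)], reverse=True)
print(ranking)
```

    The candidate next to the differentially expressed gene ranks first even though its own expression score is zero, which is the property that separates network-based ranking from ranking on expression alone.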

  14. Identify the signature genes for diagnose of uveal melanoma by weight gene co-expression network analysis

    Institute of Scientific and Technical Information of China (English)

    Kai Shi; Zhi-Tong Bing; Gui-Qun Cao; Ling Guo; Ya-Na Cao; Hai-Ou Jiang; Mei-Xia Zhang

    2015-01-01

    AIM: To identify and understand the relationship between co-expression patterns and clinical traits in uveal melanoma, weighted gene co-expression network analysis (WGCNA) is applied to investigate gene expression levels and patient clinical features. Uveal melanoma is the most common primary eye tumor in adults. Although many studies have identified important genes and pathways relevant to the progression of uveal melanoma, the relationship between co-expression and clinical traits at the systems level remains unclear. We employ WGCNA to investigate the relationship between molecular and phenotypic features in this study. METHODS: Gene expression profiles of uveal melanoma and patient clinical traits were collected from the Gene Expression Omnibus (GEO) database. Gene co-expression is calculated with WGCNA, an R package used to analyze the correlation between pairs of gene expression levels. The functions of the genes were annotated by gene ontology (GO). RESULTS: In this study, we identified four co-expression modules significantly correlated with clinical traits. Module blue positively correlates with radiotherapy treatment. Module purple positively correlates with tumor location (sclera) and negatively correlates with patient age. Module red positively correlates with sclera and negatively correlates with thickness of tumor. Module black positively correlates with the largest tumor diameter (LTD). Additionally, we identified the hub gene (top connectivity with other genes) in each module. The hub genes RPS15A, PTGDS, CD53 and MSI2 might play a vital role in the progression of uveal melanoma. CONCLUSION: From WGCNA analysis and hub gene calculation, we identified RPS15A, PTGDS, CD53 and MSI2 as potential targets or diagnostic markers for uveal melanoma.
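
    The notions of pairwise correlation and hub gene used above can be made concrete with a small sketch. The study itself used the WGCNA R package; the Python code below is only a stand-in that assumes an unsigned network with adjacency |r|^power, and both the soft-threshold power of 6 and the expression values are illustrative assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def connectivity(expression, power=6):
    """Sum of soft-thresholded adjacencies |r|^power for every gene."""
    genes = list(expression)
    conn = {g: 0.0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expression[g], expression[h])) ** power
            conn[g] += a
            conn[h] += a
    return conn

# Hypothetical profiles over four samples; g1 and g2 are noisy copies of "hub".
expression = {
    "hub": [1.0, 2.0, 3.0, 4.0],
    "g1": [1.5, 1.5, 2.5, 4.5],
    "g2": [0.75, 2.75, 2.25, 4.25],
}
conn = connectivity(expression)
hub_gene = max(conn, key=conn.get)
print(hub_gene, conn)
```

    Raising correlations to a power suppresses weak links, so the gene with the highest summed adjacency within a module is the natural hub candidate.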

  15. Gene switching rate determines response to extrinsic perturbations in the self-activation transcriptional network motif

    Science.gov (United States)

    de Franciscis, Sebastiano; Caravagna, Giulio; Mauri, Giancarlo; D’Onofrio, Alberto

    2016-06-01

    Gene switching dynamics is a major source of randomness in genetic networks, also in the case of large concentrations of the transcription factors. In this work, we consider a common network motif - the positive feedback of a transcription factor on its own synthesis - and assess its response to extrinsic noises perturbing gene deactivation in a variety of settings where the network might operate. These settings are representative of distinct cellular types, abundance of transcription factors and ratio between gene switching and protein synthesis rates. By investigating noise-induced transitions among the different network operative states, our results suggest that gene switching rates are key parameters to shape network response to external perturbations, and that such response depends on the particular biological setting, i.e. the characteristic time scales and protein abundance. These results might have implications on our understanding of irreversible transitions for noise-related phenomena such as cellular differentiation. In addition these evidences suggest to adopt the appropriate mathematical model of the network in order to analyze the system consistently to the reference biological setting.

  16. Enhancing the prioritization of disease-causing genes through tissue specific protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Oded Magger

    Full Text Available The prioritization of candidate disease-causing genes is a fundamental challenge in the post-genomic era. Current state of the art methods exploit a protein-protein interaction (PPI) network for this task. They are based on the observation that genes causing phenotypically-similar diseases tend to lie close to one another in a PPI network. However, to date, these methods have used a static picture of human PPIs, while diseases impact specific tissues in which the PPI networks may be dramatically different. Here, for the first time, we perform a large-scale assessment of the contribution of tissue-specific information to gene prioritization. By integrating tissue-specific gene expression data with PPI information, we construct tissue-specific PPI networks for 60 tissues and investigate their prioritization power. We find that tissue-specific PPI networks considerably improve the prioritization results compared to those obtained using a generic PPI network. Furthermore, they allow predicting novel disease-tissue associations, pointing to sub-clinical tissue effects that may escape early detection.

  17. Identifying disease-specific genes based on their topological significance in protein networks

    Directory of Open Access Journals (Sweden)

    Cherba David

    2009-03-01

    Full Text Available Abstract Background The identification of key target nodes within complex molecular networks remains a common objective in scientific research. The results of pathway analyses are usually sets of fairly complex networks or functional processes that are deemed relevant to the condition represented by the molecular profile. To be useful in a research or clinical laboratory, the results need to be translated to the level of testable hypotheses about individual genes and proteins within the condition of interest. Results In this paper we describe a novel computational methodology capable of predicting key regulatory genes and proteins in disease- and condition-specific biological networks. The algorithm builds a shortest-path network connecting condition-specific genes (e.g. differentially expressed genes) using a global database of protein interactions from MetaCore. We evaluate the number of all paths traversing each node in the shortest-path network in relation to the total number of paths going via the same node in the global network. Using these numbers and the relative size of the initial data set, we determine the statistical significance of the network connectivity provided through each node. We applied this method to gene expression data from psoriasis patients and identified many confirmed biological targets of psoriasis and suggested several new targets. Using predicted regulatory nodes we were able to reconstruct disease pathways that are in excellent agreement with the current knowledge on the pathogenesis of psoriasis. Conclusion The systematic and automated approach described in this paper is readily applicable to uncovering high-quality therapeutic targets, and holds great promise for developing network-based combinational treatment strategies for a wide range of diseases.

  18. A boolean model of the cardiac gene regulatory network determining first and second heart field identity.

    Directory of Open Access Journals (Sweden)

    Franziska Herrmann

    Full Text Available Two types of distinct cardiac progenitor cell populations can be identified during early heart development: the first heart field (FHF) and second heart field (SHF) lineage that later form the mature heart. They can be characterized by differential expression of transcription and signaling factors. These regulatory factors influence each other forming a gene regulatory network. Here, we present a core gene regulatory network for early cardiac development based on published temporal and spatial expression data of genes and their interactions. This gene regulatory network was implemented in a Boolean computational model. Simulations reveal stable states within the network model, which correspond to the regulatory states of the FHF and the SHF lineages. Furthermore, we are able to reproduce the expected temporal expression patterns of early cardiac factors mimicking developmental progression. Additionally, simulations of knock-down experiments within our model resemble published phenotypes of mutant mice. Consequently, this gene regulatory network retraces the early steps and requirements of cardiogenic mesoderm determination in a way appropriate to enhance the understanding of heart development.
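
    To show what "stable states" means in a Boolean model, the sketch below enumerates every state of a toy network and keeps those that a synchronous update leaves unchanged. The two mutually repressing factors X and Y are hypothetical placeholders chosen to give two alternative stable states; they are not the cardiac network of the paper.

```python
from itertools import product

def fixed_points(rules):
    """Return all states that every update rule maps to itself."""
    genes = sorted(rules)
    stable = []
    for values in product([False, True], repeat=len(genes)):
        state = dict(zip(genes, values))
        # A state is stable if applying each gene's rule changes nothing.
        if all(rules[g](state) == state[g] for g in genes):
            stable.append(state)
    return stable

# Hypothetical toggle: each factor is expressed only when the other is absent.
rules = {
    "X": lambda s: not s["Y"],
    "Y": lambda s: not s["X"],
}

stable_states = fixed_points(rules)
print(stable_states)
```

    Exhaustive enumeration visits 2^n states, so it is practical only for small core networks; a simulated knock-down can be modeled by replacing one rule with a constant False and recomputing the stable states.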

  19. A Boolean Model of the Cardiac Gene Regulatory Network Determining First and Second Heart Field Identity

    Science.gov (United States)

    Zhou, Dao; Kestler, Hans A.; Kühl, Michael

    2012-01-01

    Two types of distinct cardiac progenitor cell populations can be identified during early heart development: the first heart field (FHF) and second heart field (SHF) lineage that later form the mature heart. They can be characterized by differential expression of transcription and signaling factors. These regulatory factors influence each other forming a gene regulatory network. Here, we present a core gene regulatory network for early cardiac development based on published temporal and spatial expression data of genes and their interactions. This gene regulatory network was implemented in a Boolean computational model. Simulations reveal stable states within the network model, which correspond to the regulatory states of the FHF and the SHF lineages. Furthermore, we are able to reproduce the expected temporal expression patterns of early cardiac factors mimicking developmental progression. Additionally, simulations of knock-down experiments within our model resemble published phenotypes of mutant mice. Consequently, this gene regulatory network retraces the early steps and requirements of cardiogenic mesoderm determination in a way appropriate to enhance the understanding of heart development. PMID:23056457

  20. A network-based gene-weighting approach for pathway analysis

    Institute of Scientific and Technical Information of China (English)

    Zhaoyuan Fang; Weidong Tian; Hongbin Ji

    2012-01-01

    Classical algorithms for identifying biological pathways significantly related to the conditions under study frequently reduce pathways to gene sets, ignoring the constitutive non-equivalence of the genes within a defined pathway. Here we designed a network-based method to determine such non-equivalence in terms of gene weights. The gene weights determined are biologically consistent and robust to network perturbations. By integrating the gene weights into classical gene set analysis, with a subsequent correction for the “over-counting” bias associated with multi-subunit proteins, we have developed a novel gene-weighted pathway analysis approach, implemented in an R package called “Gene Association Network-based Pathway Analysis” (GANPA). Through analysis of several microarray datasets, including the p53 dataset, an asthma dataset and three breast cancer datasets, we demonstrated that our approach is biologically reliable and reproducible, and therefore helpful for microarray data interpretation and hypothesis generation.

  1. The Relationship between Gene Network Structure and Expression Variation among Individuals and Species.

    Directory of Open Access Journals (Sweden)

    Karen E Sears

    2015-08-01

    Full Text Available. Variation among individuals is a prerequisite of evolution by natural selection. As such, identifying the origins of variation is a fundamental goal of biology. We investigated the link between gene interactions and variation in gene expression among individuals and species using the mammalian limb as a model system. We first built interaction networks for key genes regulating early (outgrowth; E9.5-11) and late (expansion and elongation; E11-13) limb development in mouse. This resulted in an Early (ESN) and Late (LSN) Stage Network. Computational perturbations of these networks suggest that the ESN is more robust. We then quantified levels of the same key genes among mouse individuals and found that they vary less at earlier limb stages and that variation in gene expression is heritable. Finally, we quantified variation in gene expression levels among four mammals with divergent limbs (bat, opossum, mouse and pig) and found that levels vary less among species at earlier limb stages. We also found that variation in gene expression levels among individuals and species are correlated for earlier and later limb development. In conclusion, results are consistent with the robustness of the ESN buffering among-individual variation in gene expression levels early in mammalian limb development, and constraining the evolution of early limb development among mammalian species.

  2. The effect of negative feedback on noise propagation in transcriptional gene networks

    Science.gov (United States)

    Hooshangi, Sara; Weiss, Ron

    2006-06-01

    This paper analyzes how the delay and repression strength of negative feedback in single-gene and multigene transcriptional networks influences intrinsic noise propagation and oscillatory behavior. We simulate a variety of transcriptional networks using a stochastic model and report two main findings. First, intrinsic noise is not attenuated by the addition of negative or positive feedback to transcriptional cascades. Second, for multigene negative feedback networks, synchrony in oscillations among a cell population can be improved by increasing network depth and tightening the regulation at one of the repression stages. Our long term goal is to understand how the noise characteristics of complex networks can be derived from the properties of modules that are used to compose these networks.

  3. Assigning numbers to the arrows: Parameterizing a gene regulation network by using accurate expression kinetics

    Science.gov (United States)

    Ronen, Michal; Rosenberg, Revital; Shraiman, Boris I.; Alon, Uri

    2002-08-01

    A basic challenge in systems biology is to understand the dynamical behavior of gene regulation networks. Current approaches aim at determining the network structure based on genomic-scale data. However, the network connectivity alone is not sufficient to define its dynamics; one needs to also specify the kinetic parameters for the regulation reactions. Here, we ask whether effective kinetic parameters can be assigned to a transcriptional network based on expression data. We present a combined experimental and theoretical approach based on accurate high temporal-resolution measurement of promoter activities from living cells by using green fluorescent protein (GFP) reporter plasmids. We present algorithms that use these data to assign effective kinetic parameters within a mathematical model of the network. To demonstrate this, we employ a well defined network, the SOS DNA repair system of Escherichia coli. We find a strikingly detailed temporal program of expression that correlates with the functional role of the SOS genes and is driven by a hierarchy of effective kinetic parameter strengths for the various promoters. The calculated parameters can be used to determine the kinetics of all SOS genes given the expression profile of just one representative, allowing a significant reduction in complexity. The concentration profile of the master SOS transcriptional repressor can be calculated, demonstrating that relative protein levels may be determined from purely transcriptional data. This finding opens the possibility of assigning kinetic parameters to transcriptional networks on a genomic scale.

  4. Controllability analysis of the directed human protein interaction network identifies disease genes and drug targets.

    Science.gov (United States)

    Vinayagam, Arunachalam; Gibson, Travis E; Lee, Ho-Joon; Yilmazel, Bahar; Roesel, Charles; Hu, Yanhui; Kwon, Young; Sharma, Amitabh; Liu, Yang-Yu; Perrimon, Norbert; Barabási, Albert-László

    2016-05-03

    The protein-protein interaction (PPI) network is crucial for cellular information processing and decision-making. With suitable inputs, PPI networks drive the cells to diverse functional outcomes such as cell proliferation or cell death. Here, we characterize the structural controllability of a large directed human PPI network comprising 6,339 proteins and 34,813 interactions. This network allows us to classify proteins as "indispensable," "neutral," or "dispensable," which correlates to increasing, no effect, or decreasing the number of driver nodes in the network upon removal of that protein. We find that 21% of the proteins in the PPI network are indispensable. Interestingly, these indispensable proteins are the primary targets of disease-causing mutations, human viruses, and drugs, suggesting that altering a network's control property is critical for the transition between healthy and disease states. Furthermore, analyzing copy number alterations data from 1,547 cancer patients reveals that 56 genes that are frequently amplified or deleted in nine different cancers are indispensable. Among the 56 genes, 46 of them have not been previously associated with cancer. This suggests that controllability analysis is very useful in identifying novel disease genes and potential drug targets.
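
    The classification described above (a node is "indispensable," "neutral," or "dispensable" according to whether removing it increases, leaves unchanged, or decreases the number of driver nodes) can be sketched on a toy graph. This sketch assumes the standard maximum-matching formulation of structural controllability, in which the minimum number of driver nodes is max(N − |M|, 1) for a maximum matching M; the four-node graph and all function names are hypothetical and unrelated to the PPI network in the study.

```python
def max_matching_size(nodes, edges):
    """Size of a maximum matching in the bipartite view of a directed graph.

    Each edge u -> v links the 'out' copy of u to the 'in' copy of v.
    Kuhn's augmenting-path algorithm is used for simplicity.
    """
    succ = {u: [] for u in nodes}
    for u, v in edges:
        succ[u].append(v)
    matched_to = {}  # in-copy v -> out-copy u currently matched to it

    def try_augment(u, seen):
        for v in succ[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in matched_to or try_augment(matched_to[v], seen):
                matched_to[v] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in nodes)


def n_drivers(nodes, edges):
    """Minimum number of driver nodes: unmatched nodes, but at least one."""
    return max(len(nodes) - max_matching_size(nodes, edges), 1)


def classify(nodes, edges):
    """Label each node by how its removal changes the driver-node count."""
    base = n_drivers(nodes, edges)
    labels = {}
    for x in nodes:
        rest = [n for n in nodes if n != x]
        kept = [(u, v) for u, v in edges if x not in (u, v)]
        after = n_drivers(rest, kept)
        if after > base:
            labels[x] = "indispensable"
        elif after < base:
            labels[x] = "dispensable"
        else:
            labels[x] = "neutral"
    return labels


if __name__ == "__main__":
    toy_nodes = ["a", "b", "c", "d"]
    toy_edges = [("a", "b"), ("b", "c"), ("b", "d")]  # hypothetical graph
    print(n_drivers(toy_nodes, toy_edges), classify(toy_nodes, toy_edges))
```

    On this toy graph, removing the branching node `b` disconnects everything and raises the driver count, while removing either leaf lowers it. Recomputing the matching for every removal is acceptable for a small example but would be slow for a network with thousands of proteins.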

  5. Quantitative utilization of prior biological knowledge in the Bayesian network modeling of gene expression data

    Directory of Open Access Journals (Sweden)

    Gao Shouguo

    2011-08-01

    Full Text Available. Background: Bayesian Network (BN) is a powerful approach to reconstructing genetic regulatory networks from gene expression data. However, expression data by itself suffers from high noise and lack of power. Incorporating prior biological knowledge can improve the performance. As each type of prior knowledge on its own may be incomplete or limited by quality issues, integrating multiple sources of prior knowledge to utilize their consensus is desirable. Results: We introduce a new method to incorporate the quantitative information from multiple sources of prior knowledge. It first uses the Naïve Bayesian classifier to assess the likelihood of functional linkage between gene pairs based on prior knowledge. In this study we included cocitation in PubMed and schematic similarity in Gene Ontology annotation. A candidate network edge reservoir is then created in which the copy number of each edge is proportional to the estimated likelihood of linkage between the two corresponding genes. In network simulation, the Markov Chain Monte Carlo sampling algorithm is adopted, sampling from this reservoir at each iteration to generate new candidate networks. We evaluated the new algorithm using both simulated and real gene expression data including that from a yeast cell cycle and a mouse pancreas development/growth study. Incorporating prior knowledge led to a ~2 fold increase in the number of known transcription regulations recovered, without significant change in false positive rate. In contrast, without the prior knowledge BN modeling is not always better than a random selection, demonstrating the necessity in network modeling to supplement the gene expression data with additional information. Conclusion: Our new development provides a statistical means to utilize the quantitative information in prior biological knowledge in the BN modeling of gene expression data, which significantly improves the performance.

  6. Compensatory evolution for a gene deletion is not limited to its immediate functional network

    Directory of Open Access Journals (Sweden)

    Bull JJ

    2009-05-01

    Full Text Available. Background: Genetic disruption of an important phenotype should favor compensatory mutations that restore the phenotype. If the genetic basis of the phenotype is modular, with a network of interacting genes whose functions are specific to that phenotype, compensatory mutations are expected among the genes of the affected network. This perspective was tested in the bacteriophage T3 using a genome deleted of its DNA ligase gene, disrupting DNA metabolism. Results: In two replicate, long-term adaptations, phage compensatory evolution accommodated the low ligase level provided by the host without reinventing its own ligase. In both lines, fitness increased substantially but remained well below that of the intact genome. Each line accumulated over a dozen compensating mutations during long-term adaptation, and as expected, many of the compensatory changes were within the DNA metabolism network. However, several compensatory changes were outside the network and defy any role in DNA metabolism or biochemical connection to the disruption. In one line, these extra-network changes were essential to the recovery. The genes experiencing compensatory changes were moderately conserved between T3 and its relative T7 (25% diverged), but the involvement of extra-network changes was greater in T3. Conclusion: Compensatory evolution was only partly limited to the known functionally interacting partners of the deleted gene. Thus gene interactions contributing to fitness were more extensive than suggested by the functional properties currently ascribed to the genes. Compensatory evolution offers an easy method of discovering genome interactions among specific elements that does not rest on an a priori knowledge of those elements or their interactions.

  7. Mitochondrial DNA analysis of ancient Sampula population in Xinjiang

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    The archaeological site of Sampula cemetery is located about 14 km to the southwest of the Luo County in Xinjiang Khotan, China, and belonged to the ancient Yutian kingdom. 14C analysis showed that this cemetery was used from 217 B.C. to 283 A.D. Ancient DNA was analyzed by 364 bp of the mitochondrial DNA hypervariable region Ⅰ (mtDNA HVR-Ⅰ), and by six restriction fragment length polymorphism (RFLP) sites of the mtDNA coding region. We successfully extracted and sequenced intact stretches of maternally inherited mtDNA from 13 out of 16 ancient Sampula samples. The analysis of mtDNA haplogroup distribution showed that the ancient Sampula was a complex population with both European and Asian characteristics. A median joining network of the U3 sub-haplogroup and multi-dimensional scaling analysis both showed that the ancient Sampula had a maternal relationship with Ossetians and Iranians.

  8. Integrated Weighted Gene Co-expression Network Analysis with an Application to Chronic Fatigue Syndrome

    Directory of Open Access Journals (Sweden)

    Rajeevan Mangalathu S

    2008-11-01

    Full Text Available. Background: Systems biologic approaches such as Weighted Gene Co-expression Network Analysis (WGCNA) can effectively integrate gene expression and trait data to identify pathways and candidate biomarkers. Here we show that the additional inclusion of genetic marker data allows one to characterize network relationships as causal or reactive in a chronic fatigue syndrome (CFS) data set. Results: We combine WGCNA with genetic marker data to identify a disease-related pathway and its causal drivers, an analysis which we refer to as "Integrated WGCNA" or IWGCNA. Specifically, we present the following IWGCNA approach: (1) construct a co-expression network, (2) identify trait-related modules within the network, (3) use a trait-related genetic marker to prioritize genes within the module, (4) apply an integrated gene screening strategy to identify candidate genes and (5) carry out causality testing to verify and/or prioritize results. By applying this strategy to a CFS data set consisting of microarray, SNP and clinical trait data, we identify a module of 299 highly correlated genes that is associated with CFS severity. Our integrated gene screening strategy results in 20 candidate genes. We show that our approach yields biologically interesting genes that function in the same pathway and are causal drivers for their parent module. We use a separate data set to replicate findings and use Ingenuity Pathways Analysis software to functionally annotate the candidate gene pathways. Conclusion: We show how WGCNA can be combined with genetic marker data to identify disease-related pathways and the causal drivers within them. The systems genetics approach described here can easily be used to generate testable genetic hypotheses in other complex disease studies.

  9. EcoliNet: a database of cofunctional gene network for Escherichia coli.

    Science.gov (United States)

    Kim, Hanhae; Shim, Jung Eun; Shin, Junha; Lee, Insuk

    2015-01-01

    During the past several decades, Escherichia coli has been a treasure chest for molecular biology. The molecular mechanisms of many fundamental cellular processes have been discovered through research on this bacterium. Although much basic research now focuses on more complex model organisms, E. coli still remains important in metabolic engineering and synthetic biology. Despite its long history as a subject of molecular investigation, more than one-third of the E. coli genome has no pathway annotation supported by either experimental evidence or manual curation. Recently, a network-assisted genetics approach to the efficient identification of novel gene functions has increased in popularity. To accelerate the speed of pathway annotation for the remaining uncharacterized part of the E. coli genome, we have constructed a database of cofunctional gene network with near-complete genome coverage of the organism, dubbed EcoliNet. We find that EcoliNet is highly predictive for diverse bacterial phenotypes, including antibiotic response, indicating that it will be useful in prioritizing novel candidate genes for a wide spectrum of bacterial phenotypes. We have implemented a web server where biologists can easily run network algorithms over EcoliNet to predict novel genes involved in a pathway or novel functions for a gene. All integrated cofunctional associations can be downloaded, enabling orthology-based reconstruction of gene networks for other bacterial species as well. Database URL: http://www.inetbio.org/ecolinet.

  10. Dentistry in ancient mesopotamia.

    Science.gov (United States)

    Neiburger, E J

    2000-01-01

    Sumer, an empire in ancient Mesopotamia (southern Iraq), is well known as the cradle of our modern civilization and the home of biblical Abraham. An analysis of skeletal remains from cemeteries at the ancient cities of Ur and Kish (circa 2000 B.C.) shows a genetically homogeneous, diseased, and short-lived population. These ancient Mesopotamians suffered severe dental attrition (95 percent), periodontal disease (42 percent), and caries (2 percent). Many oral congenital and neoplastic lesions were noted. During this period, the "local dentists" knew only a few modern dental techniques. Skeletal (dental) evidence indicates that the population suffered from chronic malnutrition. Malnutrition was probably caused by famine, which is substantiated in historic cuneiform and biblical writings, geologic strata samples, and analysis of skeletal and forensic dental pathology. These people had modern dentition but relatively poor dental health. The population's lack of malocclusions, caries, and TMJ problems appears to be due to flat plane occlusion.

  11. A model of gene expression based on random dynamical systems reveals modularity properties of gene regulatory networks.

    Science.gov (United States)

    Antoneli, Fernando; Ferreira, Renata C; Briones, Marcelo R S

    2016-06-01

    Here we propose a new approach to modeling gene expression based on the theory of random dynamical systems (RDS) that provides a general coupling prescription between the nodes of any given regulatory network, given that the dynamics of each node is modeled by an RDS. The main virtues of this approach are the following: (i) it provides a natural way to obtain arbitrarily large networks by coupling together simple basic pieces, thus revealing the modularity of regulatory networks; (ii) the assumptions about the stochastic processes used in the modeling are fairly general, in the sense that the only requirement is stationarity; (iii) there is a well developed mathematical theory, which is a blend of smooth dynamical systems theory, ergodic theory and stochastic analysis that allows one to extract relevant dynamical and statistical information without solving the system; (iv) one may obtain the classical rate equations from the corresponding stochastic version by averaging the dynamic random variables (small noise limit). It is important to emphasize that unlike the deterministic case, where coupling two equations is a trivial matter, coupling two RDS is non-trivial, especially in our case, where the coupling is performed between a state variable of one gene and the switching stochastic process of another gene and, hence, it is not a priori true that the resulting coupled system will satisfy the definition of a random dynamical system. We shall provide the necessary arguments that ensure that our coupling prescription does indeed furnish a coupled regulatory network of random dynamical systems. Finally, the fact that classical rate equations are the small noise limit of our stochastic model ensures that any validation or prediction made on the basis of the classical theory is also a validation or prediction of our model. We illustrate our framework with some simple examples of single-gene systems and network motifs.

  12. FastGCN: a GPU accelerated tool for fast gene co-expression networks.

    Directory of Open Access Journals (Sweden)

    Meimei Liang

    Full Text Available. Gene co-expression networks comprise one type of valuable biological networks. Many methods and tools have been published to construct gene co-expression networks; however, most of these tools and methods are inconvenient and time consuming for large datasets. We have developed a user-friendly, accelerated and optimized tool for constructing gene co-expression networks that can fully harness the parallel nature of GPU (Graphic Processing Unit) architectures. Genetic entropies were exploited to filter out genes with no or small expression changes in the raw data preprocessing step. Pearson correlation coefficients were then calculated. After that, we normalized these coefficients and employed the False Discovery Rate to control the multiple tests. Finally, module identification was conducted to construct the co-expression networks. All of these calculations were implemented on a GPU. We also compressed the coefficient matrix to save space. We compared the performance of the GPU implementation with those of multi-core CPU implementations with 16 CPU threads, single-thread C/C++ implementation and single-thread R implementation. Our results show that GPU implementation largely outperforms single-thread C/C++ implementation and single-thread R implementation, and GPU implementation outperforms multi-core CPU implementation when the number of genes increases. With the test dataset containing 16,000 genes and 590 individuals, we can achieve greater than 63 times the speed using a GPU implementation compared with a single-thread R implementation when 50 percent of genes were filtered out and about 80 times the speed when no genes were filtered out.

  13. FastGCN: a GPU accelerated tool for fast gene co-expression networks.

    Science.gov (United States)

    Liang, Meimei; Zhang, Futao; Jin, Gulei; Zhu, Jun

    2015-01-01

    Gene co-expression networks comprise one type of valuable biological networks. Many methods and tools have been published to construct gene co-expression networks; however, most of these tools and methods are inconvenient and time consuming for large datasets. We have developed a user-friendly, accelerated and optimized tool for constructing gene co-expression networks that can fully harness the parallel nature of GPU (Graphic Processing Unit) architectures. Genetic entropies were exploited to filter out genes with no or small expression changes in the raw data preprocessing step. Pearson correlation coefficients were then calculated. After that, we normalized these coefficients and employed the False Discovery Rate to control the multiple tests. Finally, module identification was conducted to construct the co-expression networks. All of these calculations were implemented on a GPU. We also compressed the coefficient matrix to save space. We compared the performance of the GPU implementation with those of multi-core CPU implementations with 16 CPU threads, single-thread C/C++ implementation and single-thread R implementation. Our results show that GPU implementation largely outperforms single-thread C/C++ implementation and single-thread R implementation, and GPU implementation outperforms multi-core CPU implementation when the number of genes increases. With the test dataset containing 16,000 genes and 590 individuals, we can achieve greater than 63 times the speed using a GPU implementation compared with a single-thread R implementation when 50 percent of genes were filtered out and about 80 times the speed when no genes were filtered out.
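
    The correlation and multiple-testing steps named in this abstract can be illustrated on the CPU with a small pure-Python sketch. It is not the FastGCN implementation: reading "normalized these coefficients" as a Fisher z-transform with a normal approximation is an assumption, the Benjamini-Hochberg procedure is one common way to control the False Discovery Rate, and the three-gene expression table is invented for illustration. Genes with constant expression are assumed to have been filtered out beforehand.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_p_value(r, n):
    """Two-sided p-value for r via the Fisher z-transform (normal approx.)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite when |r| == 1
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return the set of test indices rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            cutoff = rank  # largest rank that passes its threshold
    return set(order[:cutoff])


def coexpression_edges(expr, alpha=0.05):
    """Gene pairs whose correlation survives FDR control."""
    pairs = list(combinations(sorted(expr), 2))
    n = len(next(iter(expr.values())))
    p = [fisher_p_value(pearson(expr[a], expr[b]), n) for a, b in pairs]
    keep = benjamini_hochberg(p, alpha)
    return [pairs[i] for i in sorted(keep)]


if __name__ == "__main__":
    toy = {  # hypothetical expression values, 8 samples per gene
        "g1": [1, 2, 3, 4, 5, 6, 7, 8],
        "g2": [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1],
        "g3": [5, 3, 6, 2, 7, 1, 8, 4],
    }
    print(coexpression_edges(toy))
```

    The pairwise loop is quadratic in the number of genes, which is the part a GPU implementation would parallelize; this sketch only shows the statistics.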

  14. Plant gravitropic signal transduction: A network analysis leads to gene discovery

    Science.gov (United States)

    Wyatt, Sarah

    Gravity plays a fundamental role in plant growth and development. Although a significant body of research has helped define the events of gravity perception, the role of the plant growth regulator auxin, and the mechanisms resulting in the gravity response, the events of signal transduction, those that link the biophysical action of perception to a biochemical signal that results in auxin redistribution, those that regulate the gravitropic effects on plant growth, remain, for the most part, a “black box.” Using a cold effect, dubbed the gravity persistent signal (GPS) response, we developed a mutant screen to specifically identify components of the signal transduction pathway. Cloning of the GPS genes has identified new proteins involved in gravitropic signaling. We have further exploited the GPS response using a multi-faceted approach including gene expression microarrays, proteomics analysis, bioinformatics analysis and continued mutant analysis to identify additional genes and physiological and biochemical processes. Gene expression data provided the foundation of a regulatory network for gravitropic signaling. Based on these gene expression data and related data sets/information from the literature/repositories, we constructed a gravitropic signaling network for Arabidopsis inflorescence stems. To generate the network, both a dynamic Bayesian network approach and a time-lagged correlation coefficient approach were used. The dynamic Bayesian network added existing information of protein-protein interaction while the time-lagged correlation coefficient allowed incorporation of temporal regulation and thus could incorporate the time-course metric from the data set. Thus the methods complemented each other and provided us with a more comprehensive evaluation of connections. Each method generated a list of possible interactions associated with a statistical significance value. The two networks were then overlaid to generate a more rigorous, intersected

  15. Dwarfs in ancient Egypt.

    Science.gov (United States)

    Kozma, Chahira

    2006-02-15

    Ancient Egypt was one of the most advanced and productive civilizations in antiquity, spanning 3000 years before the "Christian" era. Ancient Egyptians built colossal temples and magnificent tombs to honor their gods and religious leaders. Their hieroglyphic language, system of organization, and recording of events give contemporary researchers insights into their daily activities. Based on the record left by their art, the ancient Egyptians documented the presence of dwarfs in almost every facet of life. Due to the hot dry climate and natural and artificial mummification, Egypt is a major source of information on achondroplasia in the old world. The remains of dwarfs are abundant and include complete and partial skeletons. Dwarfs were employed as personal attendants, animal tenders, jewelers, and entertainers. Several high-ranking dwarfs, especially from the Old Kingdom (2700-2190 BCE), achieved important status and had lavish burial places close to the pyramids. Their costly tombs in the royal cemeteries and the inscriptions on their statues indicate their high-ranking position in Egyptian society and their close relation to the king. Some of them were Seneb, Pereniankh, Khnumhotpe, and Djeder. There were at least two dwarf gods, Ptah and Bes. The god Ptah was associated with regeneration and rejuvenation. The god Bes was a protector of sexuality, childbirth, women, and children. He was a favored deity particularly during the Greco-Roman period. His temple was recently excavated in the Baharia oasis in the middle of Egypt. The burial sites and artistic sources provide glimpses of the positions of dwarfs in daily life in ancient Egypt. Dwarfs were accepted in ancient Egypt; their recorded daily activities suggest assimilation into daily life, and their disorder was not shown as a physical handicap. Wisdom writings and moral teachings in ancient Egypt commanded respect for dwarfs and other individuals with disabilities.

  16. Identification of gene networks underlying dystocia in dairy cattle

    Science.gov (United States)

    Dystocia is a trait with a high impact in the dairy industry. Among its risk factors are calf weight, gestation length, breed and conformation. Biological networks have been proposed to capture the genetic architecture of complex traits, where GWAS show limitations. The objective of this study was t...

  17. Integration of biological networks and gene expression data using Cytoscape

    DEFF Research Database (Denmark)

    Cline, M.S.; Smoot, M.; Cerami, E.

    2007-01-01

    Cytoscape is a free software package for visualizing, modeling and analyzing molecular and genetic interaction networks. This protocol explains how to use Cytoscape to analyze the results of mRNA expression profiling, and other functional genomics and proteomics experiments, in the context...

  18. Identification of hub genes of pneumocyte senescence induced by thoracic irradiation using weighted gene co-expression network analysis

    Science.gov (United States)

    XING, YONGHUA; ZHANG, JUNLING; LU, LU; LI, DEGUAN; WANG, YUEYING; HUANG, SONG; LI, CHENGCHENG; ZHANG, ZHUBO; LI, JIANGUO; MENG, AIMIN

    2016-01-01

    Irradiation commonly causes pneumocyte senescence, which may lead to severe fatal lung injury characterized by pulmonary dysfunction and respiratory failure. However, the molecular mechanism underlying the induction of pneumocyte senescence by irradiation remains to be elucidated. In the present study, weighted gene co-expression network analysis (WGCNA) was used to screen for differentially expressed genes, and to identify the hub genes and gene modules, which may be critical for senescence. A total of 2,916 differentially expressed genes were identified between the senescence and non-senescence groups following thoracic irradiation. In total, 10 gene modules associated with cell senescence were detected, and six hub genes were identified, including B-cell scaffold protein with ankyrin repeats 1, translocase of outer mitochondrial membrane 70 homolog A, actin filament-associated protein 1, Cd84, Nuf2 and nuclear factor erythroid 2. These genes were markedly associated with cell proliferation, cell division and cell cycle arrest. The results of the present study demonstrated that WGCNA of microarray data may provide further insight into the molecular mechanism underlying pneumocyte senescence. PMID:26572216

  19. Developmental evolution in social insects: regulatory networks from genes to societies.

    Science.gov (United States)

    Linksvayer, Timothy A; Fewell, Jennifer H; Gadau, Jürgen; Laubichler, Manfred D

    2012-05-01

    The evolution and development of complex phenotypes in social insect colonies, such as queen-worker dimorphism or division of labor, can, in our opinion, only be fully understood within an expanded mechanistic framework of Developmental Evolution. Conversely, social insects offer a fertile research area in which fundamental questions of Developmental Evolution can be addressed empirically. We review the concept of gene regulatory networks (GRNs) that aims to fully describe the battery of interacting genomic modules that are differentially expressed during the development of individual organisms. We discuss how distinct types of network models have been used to study different levels of biological organization in social insects, from GRNs to social networks. We propose that these hierarchical networks spanning different organizational levels from genes to societies should be integrated and incorporated into full GRN models to elucidate the evolutionary and developmental mechanisms underlying social insect phenotypes. Finally, we discuss prospects and approaches to achieve such an integration.

  20. Transcriptional profiles of supragranular-enriched genes associate with corticocortical network architecture in the human brain.

    Science.gov (United States)

    Krienen, Fenna M; Yeo, B T Thomas; Ge, Tian; Buckner, Randy L; Sherwood, Chet C

    2016-01-26

    The human brain is patterned with disproportionately large, distributed cerebral networks that connect multiple association zones in the frontal, temporal, and parietal lobes. The expansion of the cortical surface, along with the emergence of long-range connectivity networks, may be reflected in changes to the underlying molecular architecture. Using the Allen Institute's human brain transcriptional atlas, we demonstrate that genes particularly enriched in supragranular layers of the human cerebral cortex relative to mouse distinguish major cortical classes. The topography of transcriptional expression reflects large-scale brain network organization consistent with estimates from functional connectivity MRI and anatomical tracing in nonhuman primates. Microarray expression data for genes preferentially expressed in human upper layers (II/III), but enriched only in lower layers (V/VI) of mouse, were cross-correlated to identify molecular profiles across the cerebral cortex of postmortem human brains (n = 6). Unimodal sensory and motor zones have similar molecular profiles, despite being distributed across the cortical mantle. Sensory/motor profiles were anticorrelated with paralimbic and certain distributed association network profiles. Tests of alternative gene sets did not consistently distinguish sensory and motor regions from paralimbic and association regions: (i) genes enriched in supragranular layers in both humans and mice, (ii) genes cortically enriched in humans relative to nonhuman primates, (iii) genes related to connectivity in rodents, (iv) genes associated with human and mouse connectivity, and (v) 1,454 gene sets curated from known gene ontologies. Molecular innovations of upper cortical layers may be an important component in the evolution of long-range corticocortical projections.

  1. Distilling a Visual Network of Retinitis Pigmentosa Gene-Protein Interactions to Uncover New Disease Candidates.

    Directory of Open Access Journals (Sweden)

    Daniel Boloc

    Retinitis pigmentosa (RP) is a highly heterogeneous genetic visual disorder with more than 70 known causative genes, some of them shared with other non-syndromic retinal dystrophies (e.g. Leber congenital amaurosis, LCA). The identification of RP genes has increased steadily during the last decade, and the 30% of the cases that still remain unassigned will soon decrease after the advent of exome/genome sequencing. A considerable amount of genetic and functional data on single RD genes and mutations has been gathered, but a comprehensive view of the RP genes and their interacting partners is still very fragmentary. This is the main gap that needs to be filled in order to understand how mutations relate to progressive blinding disorders and devise effective therapies. We have built an RP-specific network (RPGeNet) by merging data from different sources: high-throughput data from BioGRID and STRING databases, manually curated data for interactions retrieved from iHOP, as well as interactions filtered out by syntactical parsing from up-to-date abstracts and full-text papers related to the RP research field. The paths emerging when known RP genes were used as baits over the whole interactome have been analysed, and the minimal number of connections among the RP genes and their close neighbors was distilled in order to simplify the search space. In contrast to the analysis of single isolated genes, finding the networks linking disease genes renders powerful etiopathological insights. We here provide an interactive interface, RPGeNet, for the molecular biologist to explore the network centered on the non-syndromic and syndromic RP and LCA causative genes. By integrating tissue-specific expression levels and phenotypic data on top of that network, a more comprehensive biological view will highlight key molecular players of retinal degeneration and unveil new RP disease candidates.

  2. Linear fuzzy gene network models obtained from microarray data by exhaustive search

    Directory of Open Access Journals (Sweden)

    Quong Judy N

    2004-08-01

    Background: Recent technological advances in high-throughput data collection allow for experimental study of increasingly complex systems on the scale of the whole cellular genome and proteome. Gene network models are needed to interpret the resulting large and complex data sets. Rationally designed perturbations (e.g., gene knock-outs) can be used to iteratively refine hypothetical models, suggesting an approach for high-throughput biological system analysis. We introduce an approach to gene network modeling based on a scalable linear variant of fuzzy logic: a framework with greater resolution than Boolean logic models, but which, while still semi-quantitative, does not require the precise parameter measurement needed for chemical kinetics-based modeling. Results: We demonstrated our approach with exhaustive search for fuzzy gene interaction models that best fit transcription measurements by microarray of twelve selected genes regulating the yeast cell cycle. Applying an efficient, universally applicable data normalization and fuzzification scheme, the search converged to a small number of models that individually predict experimental data within an error tolerance. Because only gene transcription levels are used to develop the models, they include both direct and indirect regulation of genes. Conclusion: Biological relationships in the best-fitting fuzzy gene network models successfully recover direct and indirect interactions predicted from previous knowledge to result in transcriptional correlation. Fuzzy models fit on one yeast cell cycle data set robustly predict another experimental data set for the same system. Linear fuzzy gene networks and exhaustive rule search are the first steps towards a framework for an integrated modeling and experiment approach to high-throughput "reverse engineering" of complex biological systems.

  3. Learning Biological Networks via Bootstrapping with Optimized GO-based Gene Similarity

    Energy Technology Data Exchange (ETDEWEB)

    Taylor, Ronald C.; Sanfilippo, Antonio P.; McDermott, Jason E.; Baddeley, Robert L.; Riensche, Roderick M.; Jensen, Russell S.; Verhagen, Marc

    2010-08-02

    Microarray gene expression data provide a unique information resource for learning biological networks using "reverse engineering" methods. However, there are a variety of cases in which we know which genes are involved in a given pathology of interest, but we do not have enough experimental evidence to support the use of fully-supervised/reverse-engineering learning methods. In this paper, we explore a novel semi-supervised approach in which biological networks are learned from a reference list of genes and a partial set of links for these genes extracted automatically from PubMed abstracts, using a knowledge-driven bootstrapping algorithm. We show how new relevant links across genes can be iteratively derived using a gene similarity measure based on the Gene Ontology that is optimized on the input network at each iteration. We describe an application of this approach to the TGFB pathway as a case study and show how the ensuing results prove the feasibility of the approach as an alternate or complementary technique to fully supervised methods.

  4. Network Based Integrated Analysis of Phenotype-Genotype Data for Prioritization of Candidate Symptom Genes

    Directory of Open Access Journals (Sweden)

    Xing Li

    2014-01-01

    Background. Symptoms and signs (symptoms in brief) are the essential clinical manifestations for individualized diagnosis and treatment in traditional Chinese medicine (TCM). To gain insights into the molecular mechanism of symptoms, we develop a computational approach to identify the candidate genes of symptoms. Methods. This paper presents a network-based approach for the integrated analysis of multiple phenotype-genotype data sources and the prioritization of genes for the associated symptoms. The method first calculates the similarities between symptoms and diseases based on the symptom-disease relationships retrieved from the PubMed bibliographic database. Then the disease-gene associations and protein-protein interactions are utilized to construct a phenotype-genotype network. The PRINCE algorithm is finally used to rank the potential genes for the associated symptoms. Results. The proposed method produces a reliable gene rank list, with an AUC (area under the curve) of 0.616 in classification. Some novel genes like CALCA, ESR1, and MTHFR were predicted to be associated with headache symptoms; these are not recorded in the benchmark data set, but have been reported in recently published literature. Conclusions. Our study demonstrated that integrating phenotype-genotype relationships into a complex network framework provides an effective approach to identifying candidate genes of symptoms.
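
    The ranking step can be illustrated with a small network-propagation sketch. The abstract names PRINCE but gives no formula, so the update rule below (prior knowledge mixed with degree-normalized neighbor scores) is the commonly described form of PRINCE-style propagation, not the authors' code; the toy network, the prior vector and `alpha = 0.9` are assumptions made for illustration.

```python
import numpy as np

def propagate(weights, prior, alpha=0.9, tol=1e-9, max_iter=1000):
    """Iterate F <- alpha * W' F + (1 - alpha) * Y until the scores stop changing.

    W' is the symmetrically normalized weight matrix D^-1/2 W D^-1/2.
    alpha (assumed value) balances network smoothing against the prior Y.
    """
    w = np.asarray(weights, dtype=float)
    degree = w.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    connected = degree > 0
    inv_sqrt[connected] = degree[connected] ** -0.5
    w_norm = w * inv_sqrt[:, None] * inv_sqrt[None, :]

    prior = np.asarray(prior, dtype=float)
    scores = prior.copy()
    for _ in range(max_iter):
        updated = alpha * (w_norm @ scores) + (1 - alpha) * prior
        if np.abs(updated - scores).max() < tol:
            return updated
        scores = updated
    return scores

# Hypothetical 4-protein chain 0 - 1 - 2 - 3; only protein 0 has prior evidence.
toy_weights = np.array([[0, 1, 0, 0],
                        [1, 0, 1, 0],
                        [0, 1, 0, 1],
                        [0, 0, 1, 0]], dtype=float)
toy_prior = np.array([1.0, 0.0, 0.0, 0.0])
toy_scores = propagate(toy_weights, toy_prior)
```

    Genes are then ranked by their entry in `toy_scores`; in this toy chain the score falls off with distance from the protein that carries the prior evidence.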

  5. Enriching regulatory networks by bootstrap learning using optimised GO-based gene similarity and gene links mined from PubMed abstracts

    Energy Technology Data Exchange (ETDEWEB)

    Taylor, Ronald C.; Sanfilippo, Antonio P.; McDermott, Jason E.; Baddeley, Robert L.; Riensche, Roderick M.; Jensen, Russell S.; Verhagen, Marc; Pustejovsky, James

    2011-02-18

    Transcriptional regulatory networks are being determined using “reverse engineering” methods that infer connections based on correlations in gene state. Corroboration of such networks through independent means such as evidence from the biomedical literature is desirable. Here, we explore a novel approach, a bootstrapping version of our previous Cross-Ontological Analytic method (XOA) that can be used for semi-automated annotation and verification of inferred regulatory connections, as well as for discovery of additional functional relationships between the genes. First, we use our annotation and network expansion method on a biological network learned entirely from the literature. We show how new relevant links between genes can be iteratively derived using a gene similarity measure based on the Gene Ontology that is optimized on the input network at each iteration. Second, we apply our method to annotation, verification, and expansion of a set of regulatory connections found by the Context Likelihood of Relatedness algorithm.

  6. Epigenetic Modulation of Brain Gene Networks for Cocaine and Alcohol Abuse

    Directory of Open Access Journals (Sweden)

    Sean P Farris

    2015-05-01

    Cocaine and alcohol are two substances of abuse that prominently affect the central nervous system (CNS). Repeated exposure to cocaine and alcohol leads to longstanding changes in gene expression, and subsequent functional CNS plasticity, throughout multiple brain regions. Epigenetic modifications of histones are one proposed mechanism guiding these enduring changes to the transcriptome. Characterizing the large number of available biological relationships as network models can reveal unexpected biochemical relationships. Clustering analysis of variation from whole-genome sequencing of gene expression (RNA-Seq) and histone H3 lysine 4 trimethylation (H3K4me3) events (ChIP-Seq) revealed the underlying structure of the transcriptional and epigenomic landscape within hippocampal postmortem brain tissue of drug abusers and control cases. Distinct sets of interrelated networks for cocaine and alcohol abuse were determined for each abusive substance. The network approach identified subsets of functionally related genes that are regulated in agreement with H3K4me3 changes, suggesting cause and effect relationships between this epigenetic mark and gene expression. Gene expression networks consisted of recognized substrates for addiction, such as the dopamine- and cAMP-regulated neuronal phosphoprotein PPP1R1B / DARPP-32 and the vesicular glutamate transporter SLC17A7 / VGLUT1 as well as potentially novel molecular targets for substance abuse. Through a systems biology based approach our results illustrate the utility of integrating epigenetic and transcript expression to establish relevant biological networks in the human brain for addiction. Future work with laboratory models may clarify the functional relevance of these gene networks for cocaine and alcohol, and provide a framework for the development of medications for the treatment of addiction.

  7. Diversification in the genetic architecture of gene expression and transcriptional networks in organ differentiation of Populus.

    Science.gov (United States)

    Drost, Derek R; Benedict, Catherine I; Berg, Arthur; Novaes, Evandro; Novaes, Carolina R D B; Yu, Qibin; Dervinis, Christopher; Maia, Jessica M; Yap, John; Miles, Brianna; Kirst, Matias

    2010-05-04

    A fundamental goal of systems biology is to identify genetic elements that contribute to complex phenotypes and to understand how they interact in networks predictive of system response to genetic variation. Few studies in plants have developed such networks, and none have examined their conservation among functionally specialized organs. Here we used genetical genomics in an interspecific hybrid population of the model hardwood plant Populus to uncover transcriptional networks in xylem, leaves, and roots. Pleiotropic eQTL hotspots were detected and used to construct coexpression networks a posteriori, for which regulators were predicted based on cis-acting expression regulation. Networks were shown to be enriched for groups of genes that function in biologically coherent processes and for cis-acting promoter motifs with known roles in regulating common groups of genes. When contrasted among xylem, leaves, and roots, transcriptional networks were frequently conserved in composition, but almost invariably regulated by different loci. Similarly, the genetic architecture of gene expression regulation is highly diversified among plant organs, with less than one-third of genes with eQTL detected in two organs being regulated by the same locus. However, colocalization in eQTL position increases to 50% when they are detected in all three organs, suggesting conservation in the genetic regulation is a function of ubiquitous expression. Genes conserved in their genetic regulation among all organs are primarily cis regulated (approximately 92%), whereas genes with eQTL in only one organ are largely trans regulated. Trans-acting regulation may therefore be the primary driver of differentiation in function between plant organs.

  8. High Dimensional ODEs Coupled with Mixed-Effects Modeling Techniques for Dynamic Gene Regulatory Network Identification.

    Science.gov (United States)

    Lu, Tao; Liang, Hua; Li, Hongzhe; Wu, Hulin

    2011-01-01

    Gene regulation is a complicated process. The interaction of many genes and their products forms an intricate biological network. Identification of this dynamic network will help us understand the biological process in a systematic way. However, the construction of such a dynamic network is very challenging for a high-dimensional system. In this article we propose to use a set of ordinary differential equations (ODE), coupled with dimensional reduction by clustering and mixed-effects modeling techniques, to model the dynamic gene regulatory network (GRN). The ODE models allow us to quantify both positive and negative gene regulations as well as feedback effects of one set of genes in a functional module on the dynamic expression changes of the genes in another functional module, which results in a directed graph network. A five-step procedure, Clustering, Smoothing, regulation Identification, parameter Estimates refining and Function enrichment analysis (CSIEF) is developed to identify the ODE-based dynamic GRN. In the proposed CSIEF procedure, a series of cutting-edge statistical methods and techniques are employed, that include non-parametric mixed-effects models with a mixture distribution for clustering, nonparametric mixed-effects smoothing-based methods for ODE models, the smoothly clipped absolute deviation (SCAD)-based variable selection, and stochastic approximation EM (SAEM) approach for mixed-effects ODE model parameter estimation. The key step, the SCAD-based variable selection of the proposed procedure is justified by investigating its asymptotic properties and validated by Monte Carlo simulations. We apply the proposed method to identify the dynamic GRN for yeast cell cycle progression data. We are able to annotate the identified modules through function enrichment analyses. Some interesting biological findings are discussed. The proposed procedure is a promising tool for constructing a general dynamic GRN and more complicated dynamic networks.

  9. Epigenetic modulation of brain gene networks for cocaine and alcohol abuse.

    Science.gov (United States)

    Farris, Sean P; Harris, Robert A; Ponomarev, Igor

    2015-01-01

    Cocaine and alcohol are two substances of abuse that prominently affect the central nervous system (CNS). Repeated exposure to cocaine and alcohol leads to longstanding changes in gene expression, and subsequent functional CNS plasticity, throughout multiple brain regions. Epigenetic modifications of histones are one proposed mechanism guiding these enduring changes to the transcriptome. Characterizing the large number of available biological relationships as network models can reveal unexpected biochemical relationships. Clustering analysis of variation from whole-genome sequencing of gene expression (RNA-Seq) and histone H3 lysine 4 trimethylation (H3K4me3) events (ChIP-Seq) revealed the underlying structure of the transcriptional and epigenomic landscape within hippocampal postmortem brain tissue of drug abusers and control cases. Distinct sets of interrelated networks for cocaine and alcohol abuse were determined for each abusive substance. The network approach identified subsets of functionally related genes that are regulated in agreement with H3K4me3 changes, suggesting cause and effect relationships between this epigenetic mark and gene expression. Gene expression networks consisted of recognized substrates for addiction, such as the dopamine- and cAMP-regulated neuronal phosphoprotein PPP1R1B/DARPP-32 and the vesicular glutamate transporter SLC17A7/VGLUT1 as well as potentially novel molecular targets for substance abuse. Through a systems biology based approach our results illustrate the utility of integrating epigenetic and transcript expression to establish relevant biological networks in the human brain for addiction. Future work with laboratory models may clarify the functional relevance of these gene networks for cocaine and alcohol, and provide a framework for the development of medications for the treatment of addiction.

  10. A general co-expression network-based approach to gene expression analysis: comparison and applications

    Directory of Open Access Journals (Sweden)

    Zhang Weixiong

    2010-02-01

    Background: Co-expression network-based approaches have become popular in analyzing microarray data, such as for detecting functional gene modules. However, co-expression networks are often constructed by ad hoc methods, and network-based analyses have not been shown to outperform the conventional cluster analyses, partially due to the lack of an unbiased evaluation metric. Results: Here, we develop a general co-expression network-based approach for analyzing both genes and samples in microarray data. Our approach consists of a simple but robust rank-based network construction method, a parameter-free module discovery algorithm and a novel reference network-based metric for module evaluation. We report some interesting topological properties of rank-based co-expression networks that are very different from those of value-based networks in the literature. Using a large set of synthetic and real microarray data, we demonstrate the superior performance of our approach over several popular existing algorithms. Applications of our approach to yeast, Arabidopsis and human cancer microarray data reveal many interesting modules, including a fatal subtype of lymphoma and a gene module regulating yeast telomere integrity, which were missed by the existing methods. Conclusions: We demonstrated that our novel approach is very effective in discovering the modular structures in microarray data, both for genes and for samples. As the method is essentially parameter-free, it may be applied to large data sets where the number of clusters is difficult to estimate. The method is also very general and can be applied to other types of data. A MATLAB implementation of our algorithm can be downloaded from http://cs.utsa.edu/~jruan/Software.html.
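
    The abstract does not spell out the rank-based construction rule, so the following is only one plausible reading, written in Python rather than the authors' MATLAB: each gene is linked to the `top_k` genes it correlates with most strongly, so edges depend on correlation ranks rather than on an absolute cutoff. The function name, the `top_k` parameter and the toy expression matrix are all hypothetical.

```python
import numpy as np

def rank_based_network(expression, top_k):
    """Link every gene to its top_k most correlated genes (assumed rule).

    expression: array of shape (genes, samples).
    Returns a symmetric boolean adjacency matrix without self-loops.
    """
    corr = np.corrcoef(expression)      # gene-by-gene Pearson correlation
    np.fill_diagonal(corr, -np.inf)     # a gene never selects itself
    n_genes = corr.shape[0]
    adjacency = np.zeros((n_genes, n_genes), dtype=bool)
    for gene in range(n_genes):
        best = np.argsort(corr[gene])[::-1][:top_k]
        adjacency[gene, best] = True
    return adjacency | adjacency.T      # keep an edge if either end chose it

# Hypothetical toy data: 4 genes measured in 5 samples.
toy_expression = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
                           [2.0, 4.0, 6.0, 8.0, 10.5],
                           [5.0, 4.0, 3.0, 2.0, 1.0],
                           [1.0, 3.0, 2.0, 5.0, 4.0]])
toy_network = rank_based_network(toy_expression, top_k=1)
```

    Because every gene contributes the same number of outgoing choices, weakly correlated genes still join the network, which is one reason rank-based graphs can differ topologically from graphs built with a fixed correlation threshold.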

  11. Prediction and validation of gene-disease associations using methods inspired by social network analyses.

    Directory of Open Access Journals (Sweden)

    U Martin Singh-Blom

    Correctly identifying associations of genes with diseases has long been a goal in biology. With the emergence of large-scale gene-phenotype association datasets in biology, we can leverage statistical and machine learning methods to help us achieve this goal. In this paper, we present two methods for predicting gene-disease associations based on functional gene associations and gene-phenotype associations in model organisms. The first method, the Katz measure, is motivated by its success in social network link prediction, and is very closely related to some of the recent methods proposed for gene-disease association inference. The second method, called Catapult (Combining dATa Across species using Positive-Unlabeled Learning Techniques), is a supervised machine learning method that uses a biased support vector machine where the features are derived from walks in a heterogeneous gene-trait network. We study the performance of the proposed methods and related state-of-the-art methods using two different evaluation strategies, on two distinct data sets, namely OMIM phenotypes and drug-target interactions. Finally, by measuring the performance of the methods using two different evaluation strategies, we show that even though both methods perform very well, the Katz measure is better at identifying associations between traits and poorly studied genes, whereas Catapult is better suited to correctly identifying gene-trait associations overall [corrected].
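
    As a rough illustration of the first method, the Katz measure scores a pair of nodes by counting all walks between them, with walks of length l damped by beta**l. The sketch below uses the standard closed form on a single adjacency matrix; the three-node toy network and `beta = 0.1` are assumptions, and the paper's heterogeneous gene-trait network is not reproduced here.

```python
import numpy as np

def katz_scores(adjacency, beta):
    """Katz similarity: sum over l >= 1 of beta**l * A**l = (I - beta*A)^-1 - I."""
    a = np.asarray(adjacency, dtype=float)
    spectral_radius = np.abs(np.linalg.eigvals(a)).max()
    if beta * spectral_radius >= 1.0:
        raise ValueError("beta must be smaller than 1 / spectral radius")
    identity = np.eye(a.shape[0])
    return np.linalg.inv(identity - beta * a) - identity

# Hypothetical toy network: gene0 - gene1 (functional link), gene1 - trait (known association).
GENE0, GENE1, TRAIT = 0, 1, 2
toy_adjacency = np.array([[0, 1, 0],
                          [1, 0, 1],
                          [0, 1, 0]], dtype=float)
toy_katz = katz_scores(toy_adjacency, beta=0.1)
```

    Here `toy_katz[GENE0, TRAIT]` is positive even though gene0 has no direct link to the trait, because walks through gene1 contribute; this is how a walk-based score can reach poorly studied genes.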

  12. Network analysis of differential expression for the identification of disease-causing genes.

    Directory of Open Access Journals (Sweden)

    Daniela Nitsch

    Genetic studies (in particular linkage and association studies) identify chromosomal regions involved in a disease or phenotype of interest, but those regions often contain many candidate genes, only a few of which can be followed up for biological validation. Recently, computational methods to identify (prioritize) the most promising candidates within a region have been proposed, but they are usually not applicable to cases where little is known about the phenotype (no or few confirmed disease genes, fragmentary understanding of the biological cascades involved). We seek to overcome this limitation by replacing knowledge about the biological process with experimental data on differential gene expression between affected and healthy individuals. Considering the problem from the perspective of a gene/protein network, we assess a candidate gene by considering the level of differential expression in its neighborhood under the assumption that strong candidates will tend to be surrounded by differentially expressed neighbors. We define a notion of soft neighborhood where each gene is given a contributing weight, which decreases with the distance from the candidate gene on the protein network. To account for multiple paths between genes, we define the distance using the Laplacian exponential diffusion kernel. We score candidates by aggregating the differential expression of neighbors weighted as a function of distance. Through a randomization procedure, we rank candidates by p-values. We illustrate our approach on four monogenic diseases and successfully prioritize the known disease-causing genes.
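
    The scoring idea can be sketched as follows, assuming the simplest possible aggregation: the kernel K = exp(-beta * L), with L the graph Laplacian, supplies the soft-neighborhood weights, a candidate's score is the kernel-weighted sum of differential expression, and p-values come from shuffling expression values across genes. The abstract does not give the exact weighting or randomization scheme, so `beta`, the aggregation rule and the toy data are assumptions.

```python
import numpy as np

def diffusion_kernel(adjacency, beta):
    """Laplacian exponential diffusion kernel exp(-beta * L) via eigendecomposition."""
    a = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(a.sum(axis=1)) - a
    eigvals, eigvecs = np.linalg.eigh(laplacian)   # L is symmetric
    return (eigvecs * np.exp(-beta * eigvals)) @ eigvecs.T

def permutation_p_values(kernel, diff_expr, n_perm=200, seed=0):
    """Share of shuffles whose neighborhood score reaches the observed one."""
    rng = np.random.default_rng(seed)
    observed = kernel @ diff_expr
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        exceed += (kernel @ rng.permutation(diff_expr)) >= observed
    return (exceed + 1) / (n_perm + 1)

# Hypothetical chain of 5 genes; differential expression is concentrated at one end.
toy_adjacency = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
toy_diff_expr = np.array([2.0, 3.0, 2.0, 0.0, 0.0])
toy_kernel = diffusion_kernel(toy_adjacency, beta=0.5)
toy_scores = toy_kernel @ toy_diff_expr
toy_p = permutation_p_values(toy_kernel, toy_diff_expr)
```

    Each kernel row sums to one, so a score is a weighted average of differential expression around the candidate; genes near the differentially expressed end of the chain score higher than those at the far end.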

  13. Chronic ethanol exposure produces time- and brain region-dependent changes in gene coexpression networks.

    Directory of Open Access Journals (Sweden)

    Elizabeth A Osterndorff-Kahanek

    Repeated ethanol exposure and withdrawal in mice increases voluntary drinking and represents an animal model of physical dependence. We examined time- and brain region-dependent changes in gene coexpression networks in amygdala (AMY), nucleus accumbens (NAC), prefrontal cortex (PFC), and liver after four weekly cycles of chronic intermittent ethanol (CIE) vapor exposure in C57BL/6J mice. Microarrays were used to compare gene expression profiles at 0, 8, and 120 hours following the last ethanol exposure. Each brain region exhibited a large number of differentially expressed genes (2,000-3,000) at the 0- and 8-hour time points, but fewer changes were detected at the 120-hour time point (400-600). Within each region, there was little gene overlap across time (~20%). All brain regions were significantly enriched with differentially expressed immune-related genes at the 8-hour time point. Weighted gene correlation network analysis identified modules that were highly enriched with differentially expressed genes at the 0- and 8-hour time points with virtually no enrichment at 120 hours. Modules enriched for both ethanol-responsive and cell-specific genes were identified in each brain region. These results indicate that chronic alcohol exposure causes global 'rewiring' of coexpression systems involving glial and immune signaling as well as neuronal genes.

  14. Gene-metabolite network analysis in different nonalcoholic fatty liver disease phenotypes

    Science.gov (United States)

    Liu, Xiao-Lin; Ming, Ya-Nan; Zhang, Jing-Yi; Chen, Xiao-Yu; Zeng, Min-De; Mao, Yi-Min

    2017-01-01

    We sought to identify common key regulators and build a gene-metabolite network in different nonalcoholic fatty liver disease (NAFLD) phenotypes. We used a high-fat diet (HFD), a methionine-choline-deficient diet (MCDD) and streptozocin (STZ) to establish nonalcoholic fatty liver (NAFL), nonalcoholic steatohepatitis (NASH) and NAFL+type 2 diabetes mellitus (T2DM) in rat models, respectively. Transcriptomics and metabolomics analyses were performed in rat livers and serum. A functional network-based regulation model was constructed using Cytoscape with information derived from transcriptomics and metabolomics. The results revealed that 96 genes, 17 liver metabolites and 4 serum metabolites consistently changed in different NAFLD phenotypes (>2-fold, P<...). Gene-metabolite network analysis identified ccl2 and jun as hubs with the largest connections to other genes, which were mainly involved in tumor necrosis factor, P53, nuclear factor-kappa B, chemokine, peroxisome proliferator activated receptor and Toll-like receptor signaling pathways. The specifically regulated genes and metabolites in different NAFLD phenotypes constructed their own networks, which were mainly involved in the lipid and fatty acid metabolism in HFD models, the inflammatory and immune response in MCDD models, and the AMPK signaling pathway and response to insulin in HFD+STZ models. Our study identified networks showing the general and specific characteristics in different NAFLD phenotypes, complementing the genetic and metabolic features in NAFLD with hepatic and extra-hepatic manifestations. PMID:28082742

  15. iSLIM: a comprehensive approach to mapping and characterizing gene regulatory networks.

    Science.gov (United States)

    Rockel, Sylvie; Geertz, Marcel; Hens, Korneel; Deplancke, Bart; Maerkl, Sebastian J

    2013-02-01

    Mapping gene regulatory networks is a significant challenge in systems biology, yet only a few methods are currently capable of systems-level identification of transcription factors (TFs) that bind a specific regulatory element. We developed a microfluidic method for integrated systems-level interaction mapping of TF-DNA interactions, generating and interrogating an array of 423 full-length Drosophila TFs. With integrated systems-level interaction mapping, it is now possible to rapidly and quantitatively map gene regulatory networks of higher eukaryotes.

  16. Comparative analysis of the transcription-factor gene regulatory networks of E. coli and S. cerevisiae

    Directory of Open Access Journals (Sweden)

    Santillán Moisés

    2008-01-01

    Background: The regulatory interactions between transcription factors (TF) and regulated genes (RG) in a species genome can be lumped together in a single directed graph. The TFs and RGs form the nodes of this graph, while links are drawn whenever a transcription factor regulates a gene's expression. Projections onto TF nodes can be constructed by linking every two nodes regulating a common gene. Similarly, projections onto RG nodes can be made by linking every two regulated genes sharing at least one common regulator. Recent studies of the connectivity pattern in the transcription-factor regulatory network of many organisms have revealed some interesting properties. However, the differences between TF and RG nodes have not been widely explored. Results: After analysing the RG and TF projections of the transcription-factor gene regulatory networks of Escherichia coli and Saccharomyces cerevisiae, we found several common characteristics as well as some noticeable differences. To better understand these differences, we compared the properties of the E. coli and S. cerevisiae RG- and TF-projected networks with those of the corresponding projections built from randomized versions of the original bipartite networks. These last results indicate that the observed differences are mostly due to the very different ratios of TF to RG counts of the E. coli and S. cerevisiae bipartite networks, rather than to their having different connectivity patterns. Conclusion: Since E. coli is a prokaryotic organism while S. cerevisiae is eukaryotic, there are important differences between them concerning processing of mRNA before translation, DNA packing, amount of junk DNA, and gene regulation. From the results in this paper we conclude that the most important effect such differences have had on the development of the corresponding transcription-factor gene regulatory networks is their very different ratios of TF to RG numbers.
This ratio is more than three
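
    The two projections described in this abstract can be sketched in a few lines. The edge list and the node names (`tfA`, `g1`, and so on) are hypothetical; only the projection rules (link two TFs that regulate a common gene, link two regulated genes that share a regulator) come from the abstract.

```python
from collections import defaultdict
from itertools import combinations

def project(edges):
    """Build the TF and RG projections of a bipartite regulatory network.

    edges: iterable of (transcription_factor, regulated_gene) pairs.
    Returns two sets of unordered pairs (frozensets).
    """
    targets = defaultdict(set)      # TF -> genes it regulates
    regulators = defaultdict(set)   # gene -> TFs regulating it
    for tf, gene in edges:
        targets[tf].add(gene)
        regulators[gene].add(tf)

    # Two TFs are linked when they regulate at least one common gene.
    tf_links = {frozenset(pair)
                for tfs in regulators.values()
                for pair in combinations(sorted(tfs), 2)}
    # Two regulated genes are linked when they share at least one regulator.
    rg_links = {frozenset(pair)
                for genes in targets.values()
                for pair in combinations(sorted(genes), 2)}
    return tf_links, rg_links

# Hypothetical toy network with three TFs and four regulated genes.
toy_edges = [("tfA", "g1"), ("tfA", "g2"), ("tfB", "g2"), ("tfB", "g3"), ("tfC", "g4")]
toy_tf_links, toy_rg_links = project(toy_edges)
```

    In this toy case the TF projection contains the single link tfA-tfB (both regulate g2), while the RG projection contains g1-g2 and g2-g3. A TF with many targets adds a full clique to the RG projection, which hints at why the ratio of TF to RG counts can shape the projected networks so strongly.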

  17. Fractal gene regulatory networks for robust locomotion control of modular robots

    DEFF Research Database (Denmark)

    Zahadat, Payam; Christensen, David Johan; Schultz, Ulrik Pagh;

    2010-01-01

    Designing controllers for modular robots is difficult due to the distributed and dynamic nature of the robots. In this paper fractal gene regulatory networks are evolved to control modular robots in a distributed way. Experiments with different morphologies of modular robot are performed and the ...

  18. Sensor-coupled fractal gene regulatory networks for locomotion control of a modular snake robot

    DEFF Research Database (Denmark)

    Zahadat, Payam; Christensen, David Johan; Katebi, Serajeddin

    2013-01-01

    In this paper we study fractal gene regulatory network (FGRN) controllers based on sensory information. The FGRN controllers are evolved to control a snake robot consisting of seven simulated ATRON modules. Each module contains three tilt sensors which represent the direction of gravity in the co...

  19. An improved, bias-reduced probabilistic functional gene network of baker's yeast, Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Insuk Lee

    BACKGROUND: Probabilistic functional gene networks are powerful theoretical frameworks for integrating heterogeneous functional genomics and proteomics data into objective models of cellular systems. Such networks provide syntheses of millions of discrete experimental observations, spanning DNA microarray experiments, physical protein interactions, genetic interactions, and comparative genomics; the resulting networks can then be easily applied to generate testable hypotheses regarding specific gene functions and associations. METHODOLOGY/PRINCIPAL FINDINGS: We report a significantly improved version (v. 2) of a probabilistic functional gene network of the baker's yeast, Saccharomyces cerevisiae. We describe our optimization methods and illustrate their effects in three major areas: the reduction of functional bias in network training reference sets, the application of a probabilistic model for calculating confidences in pair-wise protein physical or genetic interactions, and the introduction of simple thresholds that eliminate many false positive mRNA co-expression relationships. Using the network, we predict and experimentally verify the function of the yeast RNA binding protein Puf6 in 60S ribosomal subunit biogenesis. CONCLUSIONS/SIGNIFICANCE: YeastNet v. 2, constructed using these optimizations together with additional data, shows significant reduction in bias and improvements in precision and recall, in total covering 102,803 linkages among 5,483 yeast proteins (95% of the validated proteome). YeastNet is available from http://www.yeastnet.org.

  20. Identification of therapeutic targets for Alzheimer's disease via differentially expressed gene and weighted gene co-expression network analyses.

    Science.gov (United States)

    Jia, Yujie; Nie, Kun; Li, Jing; Liang, Xinyue; Zhang, Xuezhu

    2016-11-01

    To investigate the pathogenic targets and associated biological processes of Alzheimer's disease, mRNA expression profiles (GSE28146) and microRNA (miRNA) expression profiles (GSE16759) were downloaded from the Gene Expression Omnibus database. GSE28146 included eight control samples and Alzheimer's disease samples comprising seven incipient, eight moderate, and seven severe cases. The Affy package in R was used for background correction and normalization of the raw microarray data. The differentially expressed genes (DEGs) and differentially expressed miRNAs were identified using the Limma package. In addition, mRNAs were clustered using weighted gene correlation network analysis, and modules found to be significantly associated with the stages of Alzheimer's disease were screened out. The Database for Annotation, Visualization, and Integrated Discovery was used to perform Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway analyses. The target genes of the differentially expressed miRNAs were identified using the miRWalk database. Compared with the control samples, 175, 59, and 90 DEGs were identified in the incipient, moderate, and severe Alzheimer's disease samples, respectively. A module containing 1,592 genes was found to be closely associated with the stage of Alzheimer's disease and biological processes. In addition, pathways associated with Alzheimer's disease and other neurological diseases were found to be enriched in those genes. A total of 139 overlapping genes were identified between those genes and the DEGs in the three groups. From the miRNA expression profiles, 189 miRNAs were found to be differentially expressed in the samples from patients with Alzheimer's disease, and 1,647 target genes were obtained. In addition, five overlapping genes were identified between those 1,647 target genes and the 139 genes, and these genes may be important pathogenic targets for Alzheimer

  1. MS/MS networking guided analysis of molecule and gene cluster families.

    Science.gov (United States)

    Nguyen, Don Duy; Wu, Cheng-Hsuan; Moree, Wilna J; Lamsa, Anne; Medema, Marnix H; Zhao, Xiling; Gavilan, Ronnie G; Aparicio, Marystella; Atencio, Librada; Jackson, Chanaye; Ballesteros, Javier; Sanchez, Joel; Watrous, Jeramie D; Phelan, Vanessa V; van de Wiel, Corine; Kersten, Roland D; Mehnaz, Samina; De Mot, René; Shank, Elizabeth A; Charusanti, Pep; Nagarajan, Harish; Duggan, Brendan M; Moore, Bradley S; Bandeira, Nuno; Palsson, Bernhard Ø; Pogliano, Kit; Gutiérrez, Marcelino; Dorrestein, Pieter C

    2013-07-09

    The ability to correlate the production of specialized metabolites to the genetic capacity of the organism that produces such molecules has become an invaluable tool in aiding the discovery of biotechnologically applicable molecules. Here, we accomplish this task by matching molecular families with gene cluster families, making these correlations to 60 microbes at one time instead of connecting one molecule to one organism at a time, as is traditionally done. We can correlate these families through the use of nanospray desorption electrospray ionization MS/MS, an ambient pressure MS technique, in conjunction with MS/MS networking and peptidogenomics. We matched the molecular families of peptide natural products produced by 42 bacilli and 18 pseudomonads through the generation of amino acid sequence tags from MS/MS data of specific clusters found in the MS/MS network. These sequence tags were then linked to biosynthetic gene clusters in publicly accessible genomes, providing us with the ability to link particular molecules with the genes that produced them. As an example of its use, this approach was applied to two unsequenced Pseudoalteromonas species, leading to the discovery of the gene cluster for a molecular family, the bromoalterochromides, in the previously sequenced strain P. piscicida JCM 20779(T). The approach itself is not limited to 60 related strains, because spectral networking can be readily adopted to look at molecular family-gene cluster families of hundreds or more diverse organisms in one single MS/MS network.

  2. DNA-Binding Kinetics Determines the Mechanism of Noise-Induced Switching in Gene Networks.

    Science.gov (United States)

    Tse, Margaret J; Chu, Brian K; Roy, Mahua; Read, Elizabeth L

    2015-10-20

    Gene regulatory networks are multistable dynamical systems in which attractor states represent cell phenotypes. Spontaneous, noise-induced transitions between these states are thought to underlie critical cellular processes, including cell developmental fate decisions, phenotypic plasticity in fluctuating environments, and carcinogenesis. As such, there is increasing interest in the development of theoretical and computational approaches that can shed light on the dynamics of these stochastic state transitions in multistable gene networks. We applied a numerical rare-event sampling algorithm to study transition paths of spontaneous noise-induced switching for a ubiquitous gene regulatory network motif, the bistable toggle switch, in which two mutually repressive genes compete for dominant expression. We find that the method can efficiently uncover detailed switching mechanisms that involve fluctuations both in occupancies of DNA regulatory sites and copy numbers of protein products. In addition, we show that the rate parameters governing binding and unbinding of regulatory proteins to DNA strongly influence the switching mechanism. In a regime of slow DNA-binding/unbinding kinetics, spontaneous switching occurs relatively frequently and is driven primarily by fluctuations in DNA-site occupancies. In contrast, in a regime of fast DNA-binding/unbinding kinetics, switching occurs rarely and is driven by fluctuations in levels of expressed protein. Our results demonstrate how spontaneous cell phenotype transitions involve collective behavior of both regulatory proteins and DNA. Computational approaches capable of simulating dynamics over many system variables are thus well suited to exploring dynamic mechanisms in gene networks.
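
    The toggle-switch model described in this abstract can be illustrated with a minimal sketch. It uses a plain Gillespie stochastic simulation, not the rare-event sampling algorithm the authors applied, and tracks both promoter occupancy and protein copy numbers for two mutually repressive genes. The function name `simulate_toggle`, every rate parameter (`g`, `g0`, `k`, `h`, `f`) and all default values are assumptions made for illustration; none come from the paper.

```python
import random


def simulate_toggle(t_end, g=20.0, g0=0.5, k=1.0, h=0.01, f=0.1, seed=0):
    """Gillespie simulation of a two-gene toggle switch with explicit promoter states.

    All rate names and default values are hypothetical, chosen only for illustration:
    g  = production rate from a free promoter, g0 = leaky rate from a repressed promoter,
    k  = per-molecule degradation rate, h = repressor binding rate, f = unbinding rate.
    """
    rng = random.Random(seed)
    n = [0, 0]       # protein copy numbers for genes 0 and 1
    bound = [0, 0]   # bound[i] == 1: promoter i is occupied by the other gene's protein
    t, events = 0.0, 0
    while True:
        rates = []
        for i in (0, 1):
            j = 1 - i
            rates.append(g0 if bound[i] else g)                           # production
            rates.append(k * n[i])                                        # degradation
            rates.append(0.0 if bound[i] else h * n[j] * (n[j] - 1) / 2)  # dimer binding
            rates.append(f if bound[i] else 0.0)                          # unbinding
        total = sum(rates)            # always > 0 because production never stops
        t += rng.expovariate(total)   # waiting time to the next reaction
        if t >= t_end:
            return n, bound, events
        # Pick one reaction with probability proportional to its rate.
        r = rng.random() * total
        idx, acc = 0, rates[0]
        while acc <= r and idx < len(rates) - 1:
            idx += 1
            acc += rates[idx]
        gene, kind = divmod(idx, 4)
        if kind == 0:
            n[gene] += 1
        elif kind == 1:
            n[gene] = max(0, n[gene] - 1)
        elif kind == 2:
            bound[gene] = 1   # simplification: binding does not consume proteins
        else:
            bound[gene] = 0
        events += 1


if __name__ == "__main__":
    # Compare a slow and a fast DNA-binding regime (values are illustrative only).
    print(simulate_toggle(200.0, h=0.001, f=0.01, seed=1))
    print(simulate_toggle(200.0, h=1.0, f=10.0, seed=1))
```

    Scaling `h` and `f` together changes the binding kinetics while keeping their ratio fixed, which is one simple way to move between the slow and fast regimes the abstract contrasts.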

  3. Creative Ventures: Ancient Civilizations.

    Science.gov (United States)

    Stark, Rebecca

    The open-ended activities in this book are designed to extend the imagination and creativity of students and encourage students to examine their feelings and values about historic eras. Civilizations addressed include ancient Egypt, Greece, Rome, Mayan, Stonehenge, and Mesopotamia. The activities focus upon the cognitive and affective pupil…

  4. Ancient Egypt: Personal Perspectives.

    Science.gov (United States)

    Wolinski, Arelene

    This teacher resource book provides information on ancient Egypt via short essays, photographs, maps, charts, and drawings. Egyptian social and religious life, including writing, art, architecture, and even the practice of mummification, is conveniently summarized for the teacher or other practitioner in a series of one to three page articles with…

  5. Cloning Ancient Trees

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    West of Tiananmen Square in Beijing, in Zhongshan Park, there stand several ancient cypress trees, each more than 1,000 years old. Their leafy crowns are all more than 20 meters high, while four have trunks that are 6 meters in circumference. The most unique of these

  6. Ancient ports of Kalinga

    Digital Repository Service at National Institute of Oceanography (India)

    Tripati, S.

    The ancient Kingdom of Kalinga mentioned in the Hathigumpha inscription of Kharavela (1st century B.C.) extended from the mouths of the Ganges to the estuary of Godavari river on the East Coast. Ptolemy (100 A.D.) mentions that Paluru (District...

  7. Ancient deforestation revisited.

    Science.gov (United States)

    Hughes, J Donald

    2011-01-01

    The image of the classical Mediterranean environment of the Greeks and Romans had a formative influence on the art, literature, and historical perception of modern Europe and America. How closely is this image congruent with the ancient environment as it in reality existed? In particular, how forested was the ancient Mediterranean world, was there deforestation, and if so, what were its effects? The consensus of historians, geographers, and other scholars from the mid-nineteenth century through the first three quarters of the twentieth century was that human activities had depleted the forests to a major extent and caused severe erosion. My research confirmed this general picture. Since then, revisionist historians have questioned these conclusions, maintaining instead that little environmental damage was done to forests and soils in ancient Greco-Roman times. In a reconsideration of the question, this paper looks at recent scientific work providing proxy evidence for the condition of forests at various times in ancient history. I look at three scientific methodologies, namely anthracology, palynology, and computer modeling. Each of these avenues of research offers support for the concept of forest change, both in abundance and species composition, and episodes of deforestation and erosion, and confirms my earlier work.

  8. Printing Ancient Terracotta Warriors

    Science.gov (United States)

    Gadecki, Victoria L.

    2010-01-01

    Standing in awe in Xian, China, at the Terra Cotta warrior archaeological site, the author thought of sharing this experience and excitement with her sixth-grade students. She decided to let her students carve patterns of the ancient soldiers to understand their place in Chinese history. They would make block prints and print multiple soldiers on…

  9. Chemical-gene interaction networks and causal reasoning for ...

    Science.gov (United States)

    Evaluating the potential human health and ecological risks associated with exposures to complex chemical mixtures in the environment is one of the main challenges of chemical safety assessment and environmental protection. There is a need for approaches that can help to integrate chemical monitoring and biological effects data to evaluate risks associated with chemicals present in the environment. Here, we used prior knowledge about chemical-gene interactions to develop a knowledge assembly model for detected chemicals at five locations near the North Branch and Chisago wastewater treatment plants (WWTP) in the St. Croix River Basin, MN and WI. The assembly model was used to generate hypotheses about the biological impacts of the chemicals at each location. The hypotheses were tested using empirical hepatic gene expression data from fathead minnows exposed for 12 d at each location. Empirical gene expression data were also mapped to the assembly models to evaluate the likelihood of a chemical contributing to the observed biological responses using richness and concordance statistics. The prior knowledge approach was able to predict the observed biological pathways impacted at one site but not the other. Atrazine was identified as a potential contributor to the observed gene expression responses at a location upstream of the North Branch WWTP. Four chemicals were identified as contributors to the observed biological responses at the effluent and downstream o

  10. How Molecular Competition Influences Fluxes in Gene Expression Networks

    NARCIS (Netherlands)

    De Vos, Dirk; Bruggeman, Frank J.; Westerhoff, Hans V.; Bakker, Barbara M.

    2011-01-01

    Often, in living cells different molecular species compete for binding to the same molecular target. Typical examples are the competition of genes for the transcription machinery or the competition of mRNAs for the translation machinery. Here we show that such systems have specific regulatory featur

  11. How molecular competition influences fluxes in gene expression networks

    NARCIS (Netherlands)

    Vos, D. de; Bruggeman, F.J.; Westerhoff, H.V.; Bakker, B.M.

    2011-01-01

    Often, in living cells different molecular species compete for binding to the same molecular target. Typical examples are the competition of genes for the transcription machinery or the competition of mRNAs for the translation machinery. Here we show that such systems have specific regulatory featur

  12. PyPanda: a Python package for gene regulatory network reconstruction

    Science.gov (United States)

    van IJzendoorn, David G.P.; Glass, Kimberly; Quackenbush, John; Kuijjer, Marieke L.

    2016-01-01

    Summary: PANDA (Passing Attributes between Networks for Data Assimilation) is a gene regulatory network inference method that uses message-passing to integrate multiple sources of ‘omics data. PANDA was originally coded in C++. In this application note we describe PyPanda, the Python version of PANDA. PyPanda runs considerably faster than the C++ version and includes additional features for network analysis. Availability and implementation: The open source PyPanda Python package is freely available at http://github.com/davidvi/pypanda. Contact: mkuijjer@jimmy.harvard.edu or d.g.p.van_ijzendoorn@lumc.nl PMID:27402905

  13. The Genome-Wide Interaction Network of Nutrient Stress Genes in Escherichia coli

    Directory of Open Access Journals (Sweden)

    Jean-Philippe Côté

    2016-11-01

    Full Text Available Conventional efforts to describe essential genes in bacteria have typically emphasized nutrient-rich growth conditions. Of note, however, are the set of genes that become essential when bacteria are grown under nutrient stress. For example, more than 100 genes become indispensable when the model bacterium Escherichia coli is grown on nutrient-limited media, and many of these nutrient stress genes have also been shown to be important for the growth of various bacterial pathogens in vivo. To better understand the genetic network that underpins nutrient stress in E. coli, we performed a genome-scale cross of strains harboring deletions in some 82 nutrient stress genes with the entire E. coli gene deletion collection (Keio) to create 315,400 double deletion mutants. An analysis of the growth of the resulting strains on rich microbiological media revealed an average of 23 synthetic sick or lethal genetic interactions for each nutrient stress gene, suggesting that the network defining nutrient stress is surprisingly complex. A vast majority of these interactions involved genes of unknown function or genes of unrelated pathways. The most profound synthetic lethal interactions were between nutrient acquisition and biosynthesis. Further, the interaction map reveals remarkable metabolic robustness in E. coli through pathway redundancies. In all, the genetic interaction network provides a powerful tool to mine and identify missing links in nutrient synthesis and to further characterize genes of unknown function in E. coli. Moreover, understanding of bacterial growth under nutrient stress could aid in the development of novel antibiotic discovery platforms.

  14. "Every Gene Is Everywhere but the Environment Selects": Global Geolocalization of Gene Sharing in Environmental Samples through Network Analysis.

    Science.gov (United States)

    Fondi, Marco; Karkman, Antti; Tamminen, Manu V; Bosi, Emanuele; Virta, Marko; Fani, Renato; Alm, Eric; McInerney, James O

    2016-05-13

    The spatial distribution of microbes on our planet is famously formulated in the Baas Becking hypothesis as "everything is everywhere but the environment selects." While this hypothesis does not strictly rule out patterns caused by geographical effects on ecology and historical founder effects, it does propose that the remarkable dispersal potential of microbes leads to distributions generally shaped by environmental factors rather than geographical distance. By constructing sequence similarity networks from uncultured environmental samples, we show that microbial gene pool distributions are not influenced nearly as much by geography as ecology, thus extending the Baas Becking hypothesis from whole organisms to microbial genes. We find that gene pools are shaped by their broad ecological niche (such as sea water, fresh water, host, and airborne). We find that freshwater habitats act as a gene exchange bridge between otherwise disconnected habitats. Finally, certain antibiotic resistance genes deviate from the general trend of habitat specificity by exhibiting a high degree of cross-habitat mobility. The strong cross-habitat mobility of antibiotic resistance genes is a cause for concern and provides a paradigmatic example of the rate by which genes colonize new habitats when new selective forces emerge.

  15. Targeting single neuronal networks for gene expression and cell labeling in vivo.

    Science.gov (United States)

    Marshel, James H; Mori, Takuma; Nielsen, Kristina J; Callaway, Edward M

    2010-08-26

    To understand fine-scale structure and function of single mammalian neuronal networks, we developed and validated a strategy to genetically target and trace monosynaptic inputs to a single neuron in vitro and in vivo. The strategy independently targets a neuron and its presynaptic network for specific gene expression and fine-scale labeling, using single-cell electroporation of DNA to target infection and monosynaptic retrograde spread of a genetically modifiable rabies virus. The technique is highly reliable, with transsynaptic labeling occurring in every electroporated neuron infected by the virus. Targeting single neocortical neuronal networks in vivo, we found clusters of both spiny and aspiny neurons surrounding the electroporated neuron in each case, in addition to intricately labeled distal cortical and subcortical inputs. This technique, broadly applicable for probing and manipulating single neuronal networks with single-cell resolution in vivo, may help shed new light on fundamental mechanisms underlying circuit development and information processing by neuronal networks throughout the brain.

  16. Intrinsic noise and deviations from criticality in Boolean gene-regulatory networks

    Science.gov (United States)

    Villegas, Pablo; Ruiz-Franco, José; Hidalgo, Jorge; Muñoz, Miguel A.

    2016-01-01

    Gene regulatory networks can be successfully modeled as Boolean networks. A much discussed hypothesis says that such model networks reproduce empirical findings the best if they are tuned to operate at criticality, i.e. at the borderline between their ordered and disordered phases. Critical networks have been argued to lead to a number of functional advantages such as maximal dynamical range, maximal sensitivity to environmental changes, as well as to an excellent tradeoff between stability and flexibility. Here, we study the effect of noise within the context of Boolean networks trained to learn complex tasks under supervision. We verify that quasi-critical networks are the ones learning in the fastest possible way, even for asynchronous updating rules, and that the larger the task complexity the smaller the distance to criticality. On the other hand, when additional sources of intrinsic noise in the network states and/or in its wiring pattern are introduced, the optimally performing networks become clearly subcritical. These results suggest that in order to compensate for inherent stochasticity, regulatory and other types of biological networks might become subcritical rather than being critical, all the more so if the task to be performed has limited complexity. PMID:27713479
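
    The distinction between ordered, critical and disordered Boolean networks mentioned in this abstract can be illustrated with a minimal sketch. It is not the supervised-learning setup of the paper; it only measures one-step damage spreading in random Boolean networks, where each node has `k` inputs and a random truth table with bias `p`. In the standard annealed approximation the expected damage after flipping one node is 2kp(1-p), with a value of 1 marking criticality. All function names and parameter values below are assumptions for illustration.

```python
import random


def random_boolean_network(n, k, p, rng):
    """Return (inputs, tables): k distinct inputs and a random truth table per node."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.random() < p for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronous update: every node reads its inputs and looks up its new value."""
    new_state = []
    for ins, table in zip(inputs, tables):
        idx = 0
        for src in ins:
            idx = (idx << 1) | state[src]   # build the truth-table row index
        new_state.append(int(table[idx]))
    return new_state


def one_step_damage(n, k, p, trials, seed=0):
    """Average Hamming distance after one update between states differing in one bit."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        inputs, tables = random_boolean_network(n, k, p, rng)
        a = [rng.randint(0, 1) for _ in range(n)]
        b = list(a)
        b[rng.randrange(n)] ^= 1            # flip a single node
        next_a = step(a, inputs, tables)
        next_b = step(b, inputs, tables)
        total += sum(x != y for x, y in zip(next_a, next_b))
    return total / trials


if __name__ == "__main__":
    for k in (1, 2, 4):
        print(k, one_step_damage(n=30, k=k, p=0.5, trials=1000))
```

    With `p = 0.5`, values below 1 (here `k = 1`) correspond to the ordered regime, values near 1 (`k = 2`) to criticality, and values above 1 (`k = 4`) to the disordered regime.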

  17. Intrinsic noise and deviations from criticality in Boolean gene-regulatory networks

    Science.gov (United States)

    Villegas, Pablo; Ruiz-Franco, José; Hidalgo, Jorge; Muñoz, Miguel A.

    2016-10-01

    Gene regulatory networks can be successfully modeled as Boolean networks. A much discussed hypothesis says that such model networks reproduce empirical findings the best if they are tuned to operate at criticality, i.e. at the borderline between their ordered and disordered phases. Critical networks have been argued to lead to a number of functional advantages such as maximal dynamical range, maximal sensitivity to environmental changes, as well as to an excellent tradeoff between stability and flexibility. Here, we study the effect of noise within the context of Boolean networks trained to learn complex tasks under supervision. We verify that quasi-critical networks are the ones learning in the fastest possible way, even for asynchronous updating rules, and that the larger the task complexity the smaller the distance to criticality. On the other hand, when additional sources of intrinsic noise in the network states and/or in its wiring pattern are introduced, the optimally performing networks become clearly subcritical. These results suggest that in order to compensate for inherent stochasticity, regulatory and other types of biological networks might become subcritical rather than being critical, all the more so if the task to be performed has limited complexity.

  18. Global analysis of phase locking in gene expression during cell cycle: the potential in network modeling

    Directory of Open Access Journals (Sweden)

    Hessner Martin J

    2010-12-01

    Full Text Available Background: In nonlinear dynamic systems, synchrony through oscillation and frequency modulation is a general control strategy to coordinate multiple modules in response to external signals. Conversely, the synchrony information can be utilized to infer interaction. Increasing evidence suggests that frequency modulation is also common in transcription regulation. Results: In this study, we investigate the potential of phase locking analysis, a technique to study the synchrony patterns, in the transcription network modeling of time course gene expression data. Using the yeast cell cycle data, we show that significant phase locking exists between transcription factors and their targets, between gene pairs with prior evidence of physical or genetic interactions, and among cell cycle genes. When compared with simple correlation, we found that the phase locking metric can identify gene pairs that interact with each other more efficiently. In addition, it can automatically address issues of arbitrary time lags or different dynamic time scales in different genes, without the need for alignment. Interestingly, many of the phase locked gene pairs exhibit higher order than 1:1 locking, and significant phase lags with respect to each other. Based on these findings we propose a new phase locking metric for network reconstruction using time course gene expression data. We show that it is efficient at identifying network modules of focused biological themes that are important to cell cycle regulation. Conclusions: Our result demonstrates the potential of phase locking analysis in transcription network modeling. It also suggests the importance of understanding the dynamics underlying the gene expression patterns.
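
    The idea of n:m phase locking with a phase lag can be made concrete with a minimal sketch. The abstract does not give the authors' metric, so the code below uses the common mean-phase-coherence definition, the magnitude of the average of exp(i(n·φx − m·φy)), as a stand-in. It assumes the instantaneous phases have already been extracted from the expression time courses (for example with a Hilbert transform, which is not shown). The function name `phase_locking_index` is hypothetical.

```python
import cmath


def phase_locking_index(phi_x, phi_y, n=1, m=1):
    """Return (index, lag) for n:m locking between two phase series (radians).

    index is in [0, 1]; 1 means the generalized phase difference n*phi_x - m*phi_y
    is constant. lag is the mean generalized phase difference in (-pi, pi].
    """
    if len(phi_x) != len(phi_y) or not phi_x:
        raise ValueError("phase series must be non-empty and of equal length")
    mean_vector = sum(
        cmath.exp(1j * (n * a - m * b)) for a, b in zip(phi_x, phi_y)
    ) / len(phi_x)
    return abs(mean_vector), cmath.phase(mean_vector)


if __name__ == "__main__":
    # Toy phases: gene y oscillates twice as fast as gene x, shifted by 0.7 rad.
    times = [0.1 * i for i in range(200)]
    phi_x = [t for t in times]
    phi_y = [2.0 * t + 0.7 for t in times]
    print(phase_locking_index(phi_x, phi_y, n=1, m=1))  # weak 1:1 locking
    print(phase_locking_index(phi_x, phi_y, n=2, m=1))  # strong 2:1 locking
```

    Because the index depends only on the stability of the phase difference, a constant time lag between two genes does not reduce it, which matches the abstract's remark that no alignment is needed.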

  19. Snapshot of iron response in Shewanella oneidensis by gene network reconstruction

    Energy Technology Data Exchange (ETDEWEB)

    Yang, Yunfeng; Harris, Daniel P.; Luo, Feng; Xiong, Wenlu; Joachimiak, Marcin; Wu, Liyou; Dehal, Paramvir; Jacobsen, Janet; Yang, Zamin; Palumbo, Anthony V.; Arkin, Adam P.; Zhou, Jizhong

    2008-10-09

    Background: Iron homeostasis of Shewanella oneidensis, a gamma-proteobacterium possessing high iron content, is regulated by a global transcription factor Fur. However, knowledge is incomplete about other biological pathways that respond to changes in iron concentration, as well as details of the responses. In this work, we integrate physiological, transcriptomics and genetic approaches to delineate the iron response of S. oneidensis. Results: We show that the iron response in S. oneidensis is a rapid process. Temporal gene expression profiles were examined for iron depletion and repletion, and a gene co-expression network was reconstructed. Modules of iron acquisition systems, anaerobic energy metabolism and protein degradation were the most noteworthy in the gene network. Bioinformatics analyses suggested that genes in each of the modules might be regulated by DNA-binding proteins Fur, CRP and RpoH, respectively. Closer inspection of these modules revealed a transcriptional regulator (SO2426) involved in iron acquisition and ten transcriptional factors involved in anaerobic energy metabolism. Selected genes in the network were analyzed by genetic studies. Disruption of genes encoding a putative alcaligin biosynthesis protein (SO3032) and a gene previously implicated in protein degradation (SO2017) led to severe growth deficiency under iron depletion conditions. Disruption of a novel transcriptional factor (SO1415) caused deficiency in both anaerobic iron reduction and growth with thiosulfate or TMAO as an electron acceptor, suggesting that SO1415 is required for specific branches of anaerobic energy metabolism pathways. Conclusions: Using a reconstructed gene network, we identified major biological pathways that were differentially expressed during iron depletion and repletion. Genetic studies not only demonstrated the importance of iron acquisition and protein degradation for iron depletion, but also characterized a novel transcriptional factor (SO1415) with a

  20. Refining Dynamics of Gene Regulatory Networks in a Stochastic π-Calculus Framework

    OpenAIRE

    Paulevé, Loïc; Magnin, Morgan; Roux, Olivier

    2011-01-01

    In this paper, we introduce a framework for efficiently modelling and analysing Gene Regulatory Networks in their temporal and stochastic aspects. The analysis of stable states and inference of René Thomas' discrete parameters derives from this logical formalism. We offer a compositional approach which comes with a natural translation to the Stochastic π-Calculus. The method we propose consists in successive refinements of generalized dynamics of Gene Regulatory Netw...

  1. Network-based gene prediction for Plasmodium falciparum malaria towards genetics-based drug discovery

    OpenAIRE

    Chen, Yang; Xu, Rong

    2015-01-01

    Background: Malaria is the most deadly parasitic infectious disease. Existing drug treatments have limited efficacy in malaria elimination, and the complex pathogenesis of the disease is not fully understood. Detecting novel malaria-associated genes not only contributes to revealing the disease pathogenesis, but also facilitates discovering new targets for anti-malaria drugs. Methods: In this study, we developed a network-based approach to predict malaria-associated genes. We constructed a cros...

  2. Elucidating gene function and function evolution through comparison of co-expression networks in plants

    Directory of Open Access Journals (Sweden)

    Marek Mutwil

    2014-08-01

    Full Text Available The analysis of gene expression data has shown that transcriptionally coordinated (co-expressed) genes are often functionally related, enabling scientists to use expression data in gene function prediction. This Focused Review discusses our original paper (Large-scale co-expression approach to dissect secondary cell wall formation across plant species, Frontiers in Plant Science 2:23). In this paper we applied cross-species analysis to co-expression networks of genes involved in cellulose biosynthesis. We show that the co-expression networks from different species are highly similar, indicating that whole biological pathways are conserved across species. This finding has two important implications. First, the analysis can transfer gene function annotation from well-studied plants, such as Arabidopsis, to other, uncharacterized plant species. As the analysis finds genes that have similar sequence and similar expression pattern across different organisms, functionally equivalent genes can be identified. Second, since co-expression analyses are often noisy, a comparative analysis should have higher performance, as parts of co-expression networks that are conserved are more likely to be functionally relevant. In this Focused Review, we outline the comparative analysis done in the original paper and comment on the recent advances and approaches that allow comparative analyses of co-function networks. We hypothesize that, in comparison to simple co-expression analysis, comparative analysis would yield more accurate gene function predictions. Finally, by combining comparative analysis with genomic information of green plants, we propose a possible composition of cellulose biosynthesis machinery during earlier stages of plant evolution.

  3. Conservation and diversification of an ancestral chordate gene regulatory network for dorsoventral patterning.

    Directory of Open Access Journals (Sweden)

    Iryna Kozmikova

    Full Text Available Formation of a dorsoventral axis is a key event in the early development of most animal embryos. It is well established that bone morphogenetic proteins (Bmps) and Wnts are key mediators of dorsoventral patterning in vertebrates. In the cephalochordate amphioxus, genes encoding Bmps and transcription factors downstream of Bmp signaling such as Vent are expressed in patterns reminiscent of those of their vertebrate orthologues. However, the key question is whether the conservation of expression patterns of network constituents implies conservation of functional network interactions, and if so, how an increased functional complexity can evolve. Using heterologous systems, namely by reporter gene assays in mammalian cell lines and by transgenesis in medaka fish, we have compared the gene regulatory network implicated in dorsoventral patterning of the basal chordate amphioxus and vertebrates. We found that Bmp but not canonical Wnt signaling regulates promoters of genes encoding homeodomain proteins AmphiVent1 and AmphiVent2. Furthermore, AmphiVent1 and AmphiVent2 promoters appear to be correctly regulated in the context of a vertebrate embryo. Finally, we show that AmphiVent1 is able to directly repress promoters of AmphiGoosecoid and AmphiChordin genes. Repression of genes encoding dorsal-specific signaling molecule Chordin and transcription factor Goosecoid by Xenopus and zebrafish Vent genes represents a key regulatory interaction during vertebrate axis formation. Our data indicate high evolutionary conservation of a core Bmp-triggered gene regulatory network for dorsoventral patterning in chordates and suggest that co-option of the canonical Wnt signaling pathway for dorsoventral patterning in vertebrates represents one of the innovations through which an increased morphological complexity of the vertebrate embryo is achieved.

  4. Enrichment of brain-related genes on the mammalian X chromosome is ancient and predates the divergence of synapsid and sauropsid lineages.

    Science.gov (United States)

    Kemkemer, Claus; Kohn, Matthias; Kehrer-Sawatzki, Hildegard; Fundele, Reinald H; Hameister, Horst

    2009-01-01

    Previous studies have revealed an enrichment of reproduction- and brain-related genes on the human X chromosome. In the present study, we investigated the evolutionary history that underlies this functional specialization. To do so, we analyzed the orthologous building blocks of the mammalian X chromosome in the chicken genome. We used Affymetrix chicken genome microarrays to determine tissue-selective gene expression in several tissues of the chicken, including testis and brain. Subsequently, chromosomal distribution of genes with tissue-selective expression was determined. These analyses provided several new findings. Firstly, they showed that chicken chromosomes orthologous to the mammalian X chromosome exhibited an increased concentration of genes expressed selectively in brain. More specifically, the highest concentration of brain-selectively expressed genes was found on chicken chromosome GGA12, which shows orthology to the X chromosomal regions with the highest enrichment of non-syndromic X-linked mental retardation (MRX) genes. Secondly, and in contrast to the first finding, no enrichment of testis-selective genes could be detected on these chicken chromosomes. These findings indicate that the accumulation of brain-related genes on the prospective mammalian X chromosome antedates the divergence of sauropsid and synapsid lineages 315 million years ago, whereas the accumulation of testis-related genes on the mammalian X chromosome is more recent and due to adaptational changes.

  5. Bottleneck genes and community structure in the cell cycle network of S. pombe.

    Directory of Open Access Journals (Sweden)

    Cécile Caretta-Cartozo

    2007-06-01

    The identification of cell cycle-related genes is still a difficult task, even for organisms with relatively few genes such as the fission yeast. Several gene expression studies have been published on S. pombe showing similarities but also discrepancies in their results. We introduce a network in which the weight of each link is a function of the phase difference between the expression peaks of two genes. The analysis of the stability of the clustering through the computation of an entropy parameter reveals a structure made of four clusters, the first one corresponding to a robustly connected M-G1 component, the second to genes in the S phase, and the third and fourth to two G2 components. They are separated by bottleneck structures that appear to correspond to cell cycle checkpoints. We identify a number of genes that are located on these bottlenecks. They represent a novel group of cell cycle regulatory genes. They all show interesting functions, and they are supposed to be involved in the regulation of the transition from one phase to the next. We therefore present a comparison of the available studies on the fission yeast cell cycle and a general statistical bioinformatics methodology to find bottlenecks and gene community structures based on recent developments in network theory.
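
    The phase-difference weighting described in this abstract can be illustrated with a minimal sketch. The abstract does not give the weight function, so the linear mapping below (weight 1 for identical peak phases, 0 for opposite phases) is an assumption, as are the function names and the placeholder gene labels; peak phases are expressed as fractions of one cell cycle.

```python
def circular_phase_difference(phase_a, phase_b):
    """Smallest distance between two peak phases on the cell-cycle circle.

    Phases are fractions of one cycle in [0, 1); the result lies in [0, 0.5].
    """
    d = abs(phase_a - phase_b) % 1.0
    return min(d, 1.0 - d)


def link_weight(phase_a, phase_b):
    """Assumed weight: 1.0 for coincident peaks, 0.0 for peaks half a cycle apart."""
    return 1.0 - 2.0 * circular_phase_difference(phase_a, phase_b)


def build_weighted_network(peak_phases):
    """Return {(gene_a, gene_b): weight} for every unordered gene pair."""
    genes = sorted(peak_phases)
    return {
        (a, b): link_weight(peak_phases[a], peak_phases[b])
        for i, a in enumerate(genes)
        for b in genes[i + 1:]
    }


# Hypothetical peak phases for three placeholder genes.
network = build_weighted_network({"gene_a": 0.95, "gene_b": 0.05, "gene_c": 0.45})
```

    Treating the phase as circular matters here: peaks at 0.95 and 0.05 are only a tenth of a cycle apart, so that pair receives a high weight even though the raw difference is 0.9.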

  6. Identification of hepatocellular carcinoma-related genes with a machine learning and network analysis.

    Science.gov (United States)

    Gui, Tuantuan; Dong, Xiao; Li, Rudong; Li, Yixue; Wang, Zhen

    2015-01-01

    Liver cancer is one of the leading causes of cancer mortality worldwide. Hepatocellular carcinoma (HCC) is the main type of liver cancer. We applied a machine learning approach with maximum-relevance-minimum-redundancy (mRMR) algorithm followed by incremental feature selection (IFS) to a set of microarray data generated from 43 tumor and 52 nontumor samples. With the machine learning approach, we identified 117 gene probes that could optimally separate tumor and nontumor samples. These genes not only include known HCC-relevant genes such as MT1X, BMI1, and CAP2, but also include cancer genes that were not found previously to be closely related to HCC, such as TACSTD2. Then, we constructed a molecular interaction network based on the protein-protein interaction (PPI) data from the STRING database and identified 187 genes on the shortest paths among the genes identified with the machine learning approach. Network analysis reveals new potential roles of ubiquitin C in the pathogenesis of HCC. Based on gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, we showed that the identified subnetwork is significantly enriched in biological processes related to cell death. These results bring new insights into the process of HCC.
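
    The shortest-path step can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the interaction network is treated as undirected and unweighted (the abstract does not say whether STRING confidence scores were used as edge weights), and the node names are placeholders.

```python
from collections import deque
from itertools import combinations


def bfs_distances(graph, source):
    """Hop distances from source to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def genes_on_shortest_paths(graph, seeds):
    """Non-seed nodes lying on at least one shortest path between two seeds.

    graph must be an undirected adjacency dict in which every node is a key.
    A node lies on a shortest path between a and b exactly when
    dist(a, node) + dist(node, b) == dist(a, b).
    """
    seeds = sorted(seeds)
    dist = {s: bfs_distances(graph, s) for s in seeds}
    found = set()
    for a, b in combinations(seeds, 2):
        if b not in dist[a]:
            continue  # the two seeds are not connected
        for node in graph:
            if node in dist[a] and node in dist[b]:
                if dist[a][node] + dist[b][node] == dist[a][b]:
                    found.add(node)
    return found - set(seeds)


# Toy undirected network: A-X-B is shorter than A-Y-Z-B.
toy = {
    "A": ["X", "Y"], "X": ["A", "B"], "Y": ["A", "Z"],
    "Z": ["Y", "B"], "B": ["X", "Z", "C"], "C": ["B"],
}
path_genes = genes_on_shortest_paths(toy, {"A", "B", "C"})
```

    In the toy network only X is returned, since Y and Z sit on a longer detour between A and B.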

  7. Reverse Engineering Sparse Gene Regulatory Networks Using Cubature Kalman Filter and Compressed Sensing

    Directory of Open Access Journals (Sweden)

    Amina Noor

    2013-01-01

    This paper proposes a novel algorithm for inferring gene regulatory networks which makes use of cubature Kalman filter (CKF) and Kalman filter (KF) techniques in conjunction with compressed sensing methods. The gene network is described using a state-space model. A nonlinear model for the evolution of gene expression is considered, while the gene expression data is assumed to follow a linear Gaussian model. The hidden states are estimated using CKF. The system parameters are modeled as a Gauss-Markov process and are estimated using compressed sensing-based KF. These parameters provide insight into the regulatory relations among the genes. The Cramér-Rao lower bound of the parameter estimates is calculated for the system model and used as a benchmark to assess the estimation accuracy. The proposed algorithm is evaluated rigorously using synthetic data in different scenarios which include different numbers of genes and varying numbers of sample points. In addition, the algorithm is tested on the DREAM4 in silico data sets as well as the in vivo data sets from IRMA network. The proposed algorithm shows superior performance in terms of accuracy, robustness, and scalability.

  8. A pathway-based network analysis of hypertension-related genes

    Science.gov (United States)

    Wang, Huan; Hu, Jing-Bo; Xu, Chuan-Yun; Zhang, De-Hai; Yan, Qian; Xu, Ming; Cao, Ke-Fei; Zhang, Xu-Sheng

    2016-02-01

    The complex network approach has become an effective way to describe interrelationships among large amounts of biological data, which is especially useful in finding core functions and global behavior of biological systems. Hypertension is a complex disease with many causes, including genetic, physiological, psychological and even social factors. In this paper, based on the information of biological pathways, we construct a network model of hypertension-related genes of the salt-sensitive rat to explore the interrelationship between genes. Statistical and topological characteristics show that the network has the small-world but not scale-free property, and exhibits a modular structure, revealing compact and complex connections among these genes. By the threshold of integrated centrality larger than 0.71, seven key hub genes are found: Jun, Rps6kb1, Cycs, Creb312, Cdk4, Actg1 and RT1-Da. These genes should play an important role in hypertension, suggesting that the treatment of hypertension should focus on the combination of drugs on multiple genes.

  9. Structural influence of gene networks on their inference: analysis of C3NET

    Directory of Open Access Journals (Sweden)

    Emmert-Streib Frank

    2011-06-01

    Background: The availability of large-scale high-throughput data poses considerable challenges toward their functional analysis. For this reason gene network inference methods gained considerable interest. However, our current knowledge, especially about the influence of the structure of a gene network on its inference, is limited. Results: In this paper we present a comprehensive investigation of the structural influence of gene networks on the inferential characteristics of C3NET - a recently introduced gene network inference algorithm. We employ local as well as global performance metrics in combination with an ensemble approach. The results from our numerical study for various biological and synthetic network structures and simulation conditions, also comparing C3NET with other inference algorithms, lead to a multitude of theoretical and practical insights into the working behavior of C3NET. In addition, in order to facilitate the practical usage of C3NET we provide a user-friendly R package, called c3net, and describe its functionality. It is available from https://r-forge.r-project.org/projects/c3net and from the CRAN package repository. Conclusions: The availability of gene network inference algorithms with known inferential properties opens a new era of large-scale screening experiments that could be equally beneficial for basic biological and biomedical research with auspicious prospects. The availability of our easy to use software package c3net may contribute to the popularization of such methods. Reviewers: This article was reviewed by Lev Klebanov, Joel Bader and Yuriy Gusev.

  10. A Hh-driven gene network controls specification, pattern and size of the Drosophila simple eyes.

    Science.gov (United States)

    Aguilar-Hidalgo, Daniel; Domínguez-Cejudo, María A; Amore, Gabriele; Brockmann, Anette; Lemos, María C; Córdoba, Antonio; Casares, Fernando

    2013-01-01

    During development, extracellular signaling molecules interact with intracellular gene networks to control the specification, pattern and size of organs. One such signaling molecule is Hedgehog (Hh). Hh is known to act as a morphogen, instructing different fates depending on the distance to its source. However, how Hh, when signaling across a cell field, impacts organ-specific transcriptional networks is still poorly understood. Here, we investigate this issue during the development of the Drosophila ocellar complex. The development of this sensory structure, which is composed of three simple eyes (or ocelli) located at the vertices of a triangular patch of cuticle on the dorsal head, depends on Hh signaling and on the definition of three domains: two areas of eya and so expression--the prospective anterior and posterior ocelli--and the intervening interocellar domain. Our results highlight the role of the homeodomain transcription factor engrailed (en) both as a target and as a transcriptional repressor of hh signaling in the prospective interocellar region. Furthermore, we identify a requirement for the Notch pathway in the establishment of en maintenance in a Hh-independent manner. Therefore, hh signals transiently during the specification of the interocellar domain, with en being required here for hh signaling attenuation. Computational analysis further suggests that this network design confers robustness to signaling noise and constrains phenotypic variation. In summary, using genetics and modeling we have expanded the ocellar gene network to explain how the interaction between the Hh gradient and this gene network results in the generation of stable mutually exclusive gene expression domains. In addition, we discuss some general implications our model may have in some Hh-driven gene networks.

  11. Mosaic gene network modelling identified new regulatory mechanisms in HCV infection.

    Science.gov (United States)

    Popik, Olga V; Petrovskiy, Evgeny D; Mishchenko, Elena L; Lavrik, Inna N; Ivanisenko, Vladimir A

    2016-06-15

    Modelling of gene networks is widely used in systems biology to study the functioning of complex biological systems. Most of the existing mathematical modelling techniques are useful for analysis of well-studied biological processes, for which information on rates of reactions is available. However, complex biological processes such as those determining the phenotypic traits of organisms or pathological disease processes, including pathogen-host interactions, involve complicated cross-talk between interacting networks. Furthermore, the intrinsic details of the interactions between these networks are often missing. In this study, we developed an approach, which we call mosaic network modelling, that allows the combination of independent mathematical models of gene regulatory networks and, thereby, description of complex biological systems. The advantage of this approach is that it allows us to generate the integrated model despite the fact that information on molecular interactions between parts of the model (so-called mosaic fragments) might be missing. To generate a mosaic mathematical model, we used control theory and mathematical models, written in the form of a system of ordinary differential equations (ODEs). In the present study, we investigated the efficiency of this method in modelling the dynamics of more than 10,000 simulated mosaic regulatory networks consisting of two pieces. Analysis revealed that this approach was highly efficient, as the mean deviation of the dynamics of mosaic network elements from the behaviour of the initial parts of the model was less than 10%. It turned out that for construction of the control functional, data on perturbation of one or two vertices of the mosaic piece are sufficient. Further, we used the developed method to construct a mosaic gene regulatory network including hepatitis C virus (HCV) as the first piece and the tumour necrosis factor (TNF)-induced apoptosis and NF-κB induction pathways as the second piece. Thus

  12. Gene identification for risk of relapse in stage I lung adenocarcinoma patients: a combined methodology of gene expression profiling and computational gene network analysis.

    Science.gov (United States)

    Ludovini, Vienna; Bianconi, Fortunato; Siggillino, Annamaria; Piobbico, Danilo; Vannucci, Jacopo; Metro, Giulio; Chiari, Rita; Bellezza, Guido; Puma, Francesco; Della Fazia, Maria Agnese; Servillo, Giuseppe; Crinò, Lucio

    2016-05-24

    Risk assessment and treatment choice remain a challenge in early non-small-cell lung cancer (NSCLC). The aim of this study was to identify novel genes involved in the risk of early relapse (ER) compared to no relapse (NR) in resected lung adenocarcinoma (AD) patients using a combination of high throughput technology and computational analysis. We identified 18 patients (13 NR and 5 ER) with stage I AD. Frozen samples of patients in ER, NR and corresponding normal lung (NL) were subjected to microarray technology and quantitative-PCR (Q-PCR). A gene network computational analysis was performed to select predictive genes. An independent set of 79 stage I AD samples was used to validate selected genes by Q-PCR. From microarray analysis we selected 50 genes, using the fold change ratio of ER versus NR. They were validated both in pooled and in individual patient samples (ER and NR) by Q-PCR. Fourteen increased and 25 decreased genes showed concordance between the two methods. They were used to perform a computational gene network analysis that identified 4 increased (HOXA10, CLCA2, AKR1B10, FABP3) and 6 decreased (SCGB1A1, PGC, TFF1, PSCA, SPRR1B and PRSS1) genes. Moreover, in an independent dataset of AD samples, we showed that both high FABP3 expression and low SCGB1A1 expression were associated with a worse disease-free survival (DFS). Our results indicate that it is possible to define, through gene expression and computational analysis, a characteristic gene profile of patients with an increased risk of relapse that may become a tool for patient selection for adjuvant therapy.

  13. Ancestral regulatory circuits governing ectoderm patterning downstream of Nodal and BMP2/4 revealed by gene regulatory network analysis in an echinoderm.

    Directory of Open Access Journals (Sweden)

    Alexandra Saudemont

    Echinoderms, which are phylogenetically related to vertebrates and produce large numbers of transparent embryos that can be experimentally manipulated, offer many advantages for the analysis of the gene regulatory networks (GRN) regulating germ layer formation. During development of the sea urchin embryo, the ectoderm is the source of signals that pattern all three germ layers along the dorsal-ventral axis. How this signaling center controls patterning and morphogenesis of the embryo is not understood. Here, we report a large-scale analysis of the GRN deployed in response to the activity of this signaling center in the embryos of the Mediterranean sea urchin Paracentrotus lividus, in which studies with high spatial resolution are possible. By using a combination of in situ hybridization screening, overexpression of mRNA, recombinant ligand treatments, and morpholino-based loss-of-function studies, we identified a cohort of transcription factors and signaling molecules expressed in the ventral ectoderm, dorsal ectoderm, and interposed neurogenic ("ciliary band") region in response to the known key signaling molecules Nodal and BMP2/4 and defined the epistatic relationships between the most important genes. The resultant GRN showed a number of striking features. First, Nodal was found to be essential for the expression of all ventral and dorsal marker genes, and BMP2/4 for all dorsal genes. Second, goosecoid was identified as a central player in a regulatory sub-circuit controlling mouth formation, while tbx2/3 emerged as a critical factor for differentiation of the dorsal ectoderm. Finally, and unexpectedly, a neurogenic ectoderm regulatory circuit characterized by expression of "ciliary band" genes was triggered in the absence of TGF beta signaling. We propose a novel model for ectoderm regionalization, in which neural ectoderm is the default fate in the absence of TGF beta signaling, and suggest that the stomodeal and neural subcircuits that we

  14. Transcriptional networks driving enhancer function in the CFTR gene.

    Science.gov (United States)

    Kerschner, Jenny L; Harris, Ann

    2012-09-01

    A critical cis-regulatory element for the CFTR (cystic fibrosis transmembrane conductance regulator) gene is located in intron 11, 100 kb distal to the promoter, with which it interacts. This sequence contains an intestine-selective enhancer and associates with enhancer signature proteins, such as p300, in addition to tissue-specific TFs (transcription factors). In the present study we identify critical TFs that are recruited to this element and demonstrate their importance in regulating CFTR expression. In vitro DNase I footprinting and EMSAs (electrophoretic mobility-shift assays) identified four cell-type-selective regions that bound TFs in vitro. ChIP (chromatin immunoprecipitation) identified FOXA1/A2 (forkhead box A1/A2), HNF1 (hepatocyte nuclear factor 1) and CDX2 (caudal-type homeobox 2) as in vivo trans-interacting factors. Mutation of their binding sites in the intron 11 core compromised its enhancer activity when measured by reporter gene assay. Moreover, siRNA (small interfering RNA)-mediated knockdown of CDX2 caused a significant reduction in endogenous CFTR transcription in intestinal cells, suggesting that this factor is critical for the maintenance of high levels of CFTR expression in these cells. The ChIP data also demonstrate that these TFs interact with multiple cis-regulatory elements across the CFTR locus, implicating a more global role in intestinal expression of the gene.

  15. BisoGenet: a new tool for gene network building, visualization and analysis

    Directory of Open Access Journals (Sweden)

    Miranda Jamilet

    2010-02-01

    Background: The increasing availability and diversity of omics data in the post-genomic era offers new perspectives in most areas of biomedical research. Graph-based biological network models capture the topology of the functional relationships between molecular entities such as genes, proteins and small compounds and provide a suitable framework for integrating and analyzing omics data. The development of software tools capable of integrating data from different sources and providing flexible methods to reconstruct, represent and analyze topological networks is an active field of research in bioinformatics. Results: BisoGenet is a multi-tier application for visualization and analysis of biomolecular relationships. The system consists of three tiers. In the data tier, an in-house database stores genomics information, protein-protein interactions, protein-DNA interactions, gene ontology and metabolic pathways. In the middle tier, a global network is created at server startup, representing the whole data on bioentities and their relationships retrieved from the database. The client tier is a Cytoscape plugin, which manages user input, communication with the Web Service, visualization and analysis of the resulting network. Conclusion: BisoGenet is able to build and visualize biological networks in a fast and user-friendly manner. A feature of BisoGenet is the option to include coding relations to distinguish between genes and their products. This feature could be instrumental in achieving a finer-grained representation of the bioentities and their relationships. The client application includes network analysis tools and interactive network expansion capabilities. In addition, an option is provided to allow other networks to be converted to BisoGenet. This feature facilitates the integration of our software with other tools available in the Cytoscape platform. BisoGenet is available at http://bio.cigb.edu.cu/bisogenet-cytoscape/.

  16. Harnessing diversity towards the reconstructing of large scale gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Takeshi Hase

    Elucidating gene regulatory networks (GRNs) from large scale experimental data remains a central challenge in systems biology. Recently, numerous techniques, particularly consensus-driven approaches combining different algorithms, have emerged as potentially promising strategies for inferring accurate GRNs. Here, we develop a novel consensus inference algorithm, TopkNet, which can integrate multiple algorithms to infer GRNs. Comprehensive performance benchmarking on a cloud computing framework demonstrated that (i) a simple strategy to combine many algorithms does not always lead to performance improvement compared to the cost of consensus and (ii) TopkNet integrating only high-performance algorithms provides significant performance improvement compared to the best individual algorithms and community prediction. These results suggest that a priori determination of high-performance algorithms is key to reconstructing an unknown regulatory network. Similarity among gene-expression datasets can be useful to determine potential optimal algorithms for reconstruction of unknown regulatory networks, i.e., if the expression data associated with a known regulatory network are similar to those associated with an unknown regulatory network, optimal algorithms determined for the known regulatory network can be repurposed to infer the unknown regulatory network. Based on this observation, we developed a quantitative measure of similarity among gene-expression datasets and demonstrated that, if similarity between the two expression datasets is high, TopkNet integrating algorithms that are optimal for the known dataset performs well on the unknown dataset. The consensus framework, TopkNet, together with the similarity measure proposed in this study provides a powerful strategy towards harnessing the wisdom of the crowds in reconstruction of unknown regulatory networks.
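
    To make the idea of a consensus over several inference algorithms concrete, here is a minimal sketch of plain average-rank aggregation. It is not TopkNet, whose integration rule the abstract does not spell out; the function name, the algorithm labels and the edge scores are all hypothetical.

```python
def average_rank_consensus(edge_scores_by_algorithm):
    """Combine per-algorithm edge scores into one ranking by average rank.

    edge_scores_by_algorithm maps an algorithm name to {edge: confidence}.
    Edges an algorithm did not report are ranked after all edges it did report.
    Returns a list of (average_rank, edge) sorted from best to worst.
    """
    all_edges = set().union(*(s.keys() for s in edge_scores_by_algorithm.values()))
    rank_sums = {edge: 0.0 for edge in all_edges}
    for scores in edge_scores_by_algorithm.values():
        # Highest confidence first; ties and missing edges ordered by edge name.
        ordered = sorted(all_edges, key=lambda e: (-scores.get(e, float("-inf")), e))
        for rank, edge in enumerate(ordered, start=1):
            rank_sums[edge] += rank
    n_algorithms = len(edge_scores_by_algorithm)
    return sorted((total / n_algorithms, edge) for edge, total in rank_sums.items())


# Two hypothetical algorithms scoring candidate regulatory edges.
predictions = {
    "alg1": {("g1", "g2"): 0.9, ("g2", "g3"): 0.1},
    "alg2": {("g1", "g2"): 0.7, ("g1", "g3"): 0.6},
}
consensus = average_rank_consensus(predictions)
```

    Restricting `predictions` to a subset of algorithms before aggregating would mimic, in spirit, the abstract's point that combining only well-performing methods can matter more than combining many.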

  17. Mouse Social Network Dynamics and Community Structure are Associated with Plasticity-Related Brain Gene Expression.

    Science.gov (United States)

    Williamson, Cait M; Franks, Becca; Curley, James P

    2016-01-01

    Laboratory studies of social behavior have typically focused on dyadic interactions occurring within a limited spatiotemporal context. However, this strategy prevents analyses of the dynamics of group social behavior and constrains identification of the biological pathways mediating individual differences in behavior. In the current study, we aimed to identify the spatiotemporal dynamics and hierarchical organization of a large social network of male mice. We also sought to determine if standard assays of social and exploratory behavior are predictive of social behavior in this social network and whether individual network position was associated with the mRNA expression of two plasticity-related genes, DNA methyltransferase 1 and 3a. Mice were observed to form a hierarchically organized social network and self-organized into two separate social network communities. Members of both communities exhibited distinct patterns of socio-spatial organization within the vivaria that were not limited to only agonistic interactions. We further established that exploratory and social behaviors in standard behavioral assays conducted prior to placing the mice into the large group were predictive of initial network position and behavior but were not associated with final social network position. Finally, we determined that social network position is associated with variation in mRNA levels of two neural plasticity genes, DNMT1 and DNMT3a, in the hippocampus but not the mPOA. This work demonstrates the importance of understanding the role of social context and complex social dynamics in determining the relationship between individual differences in social behavior and brain gene expression.

  18. An extended gene protein/products boolean network model including post-transcriptional regulation

    Science.gov (United States)

    2014-01-01

    Background: Network biology allows the study of complex interactions between biological systems using formal, well structured, and computationally friendly models. Several different network models can be created, depending on the type of interactions that need to be investigated. Gene Regulatory Networks (GRN) are an effective model commonly used to study the complex regulatory mechanisms of a cell. Unfortunately, given their intrinsic complexity and non-discrete nature, the computational study of realistic-sized complex GRNs requires some abstractions. Boolean Networks (BNs), for example, are a reliable model that can be used to represent networks where the possible state of a node is a boolean value (0 or 1). Despite this strong simplification, BNs have been used to study both structural and dynamic properties of real as well as randomly generated GRNs. Results: In this paper we show how it is possible to include the post-transcriptional regulation mechanism (a key process mediated by small non-coding RNA molecules like the miRNAs) into the BN model of a GRN. The enhanced BN model is implemented in a software toolkit (EBNT) that allows boolean GRNs to be analyzed from both a structural and a dynamic point of view. The open-source toolkit is compatible with available visualization tools like Cytoscape and supports detailed analysis of the network topology as well as of its attractors, trajectories, and state-space. In the paper, a small GRN built around the mTOR gene is used to demonstrate the main capabilities of the toolkit. Conclusions: The extended model proposed in this paper opens new opportunities in the study of gene regulation. Several successful BN-based studies of the high-level characteristics of regulatory networks can now be extended to better understand the role of post-transcriptional regulation, for example as a network-wide noise-reduction or stabilization mechanism. PMID:25080304
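
    A toy example may help show what adding post-transcriptional regulation to a Boolean network can look like. The sketch below does not use EBNT or its file formats; the four-node network, its update rules and all names are hypothetical, and synchronous updating is an assumed scheme. A transcription factor switches on both a target mRNA and a miRNA, and the protein is produced only when the mRNA is present and the miRNA is absent.

```python
from itertools import product

# Hypothetical node order for the state tuples.
NODES = ("tf", "target_mrna", "mirna", "target_protein")


def step(state):
    """One synchronous update of the toy network; state is a tuple of 0/1 values."""
    s = dict(zip(NODES, state))
    nxt = {
        "tf": s["tf"],                    # external input, held constant
        "target_mrna": s["tf"],           # transcription activated by tf
        "mirna": s["tf"],                 # miRNA induced by the same factor
        # Post-transcriptional rule: translation is blocked when the miRNA is on.
        "target_protein": s["target_mrna"] and not s["mirna"],
    }
    return tuple(int(nxt[name]) for name in NODES)


def attractor_from(state):
    """Follow the trajectory from state and return the cycle it falls into."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return seen[seen.index(state):]


def all_attractors():
    """Enumerate attractors over the full state space (2**4 initial states)."""
    found = set()
    for init in product((0, 1), repeat=len(NODES)):
        cycle = attractor_from(init)
        k = cycle.index(min(cycle))  # rotate to a canonical starting state
        found.add(tuple(cycle[k:] + cycle[:k]))
    return found


attractors = all_attractors()
```

    In this toy network both attractors are fixed points, and in the one with the transcription factor on the protein stays off: the miRNA silences a transcript that a purely transcriptional model would report as expressed.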

  19. A computational method based on the integration of heterogeneous networks for predicting disease-gene associations.

    Directory of Open Access Journals (Sweden)

    Xingli Guo

    The identification of disease-causing genes is a fundamental challenge in human health and of great importance in improving medical care, and provides a better understanding of gene functions. Recent computational approaches based on the interactions among human proteins and disease similarities have shown their power in tackling the issue. In this paper, a novel systematic and global method that integrates two heterogeneous networks for prioritizing candidate disease-causing genes is provided, based on the observation that genes causing the same or similar diseases tend to lie close to one another in a network of protein-protein interactions. In this method, the association score function between a query disease and a candidate gene is defined as the weighted sum of all the association scores between similar diseases and neighbouring genes. Moreover, the topological correlation of these two heterogeneous networks can be incorporated into the definition of the score function, and finally an iterative algorithm is designed for this issue. This method was tested with 10-fold cross-validation on all 1,126 diseases that have at least one known causal gene, and it ranked the correct gene as one of the top ten in 622 of all the 1,428 cases, significantly outperforming a state-of-the-art method called PRINCE. The results brought about by this method were applied to study three multi-factorial disorders: breast cancer, Alzheimer disease and diabetes mellitus type 2, and some suggestions of novel causal genes and candidate disease-causing subnetworks were provided for further investigation.
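
    The weighted-sum score can be sketched as a single propagation step. The abstract does not give the exact weights, the topological correlation term or the iteration, so the product of disease similarity and interaction weight used below is an assumption, and every identifier and number is a placeholder.

```python
def association_scores(disease_similarity, known_associations, gene_adjacency, query_disease):
    """Score genes for a query disease by one weighted-sum step.

    Assumed form: each known (similar disease, gene) association contributes
    similarity(query, disease) * weight(gene, neighbour) to every neighbour
    of that gene in the protein-protein interaction network.
    """
    scores = {}
    for disease, similarity in disease_similarity[query_disease].items():
        for gene in known_associations.get(disease, ()):
            for neighbour, weight in gene_adjacency.get(gene, {}).items():
                scores[neighbour] = scores.get(neighbour, 0.0) + similarity * weight
    return scores


def rank_candidates(scores, candidates):
    """Order candidate genes by descending score, breaking ties by name."""
    return sorted(candidates, key=lambda g: (-scores.get(g, 0.0), g))


# Placeholder inputs: two diseases similar to the query, each with one known gene.
similarity = {"d_query": {"d1": 0.8, "d2": 0.2}}
known = {"d1": ["g1"], "d2": ["g2"]}
ppi = {"g1": {"g3": 1.0, "g4": 0.5}, "g2": {"g4": 1.0}}

scores = association_scores(similarity, known, ppi, "d_query")
ranking = rank_candidates(scores, ["g3", "g4", "g5"])
```

    Here g3 outranks g4 because it neighbours the gene of the more similar disease, and g5, which has no supporting neighbours, falls to the bottom.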

  20. Identification of genes and networks driving cardiovascular and metabolic phenotypes in a mouse F2 intercross.

    Directory of Open Access Journals (Sweden)

    Jonathan M J Derry

    To identify the genes and pathways that underlie cardiovascular and metabolic phenotypes we performed an integrated analysis of a mouse C57BL/6JxA/J F2 (B6AF2) cross by relating genome-wide gene expression data from adipose, kidney, and liver tissues to physiological endpoints measured in the population. We have identified a large number of trait QTLs including loci driving variation in cardiac function on chromosomes 2 and 6 and a hotspot for adiposity, energy metabolism, and glucose traits on chromosome 8. Integration of adipose gene expression data identified a core set of genes that drive the chromosome 8 adiposity QTL. This chromosome 8 trans eQTL signature contains genes associated with mitochondrial function and oxidative phosphorylation and maps to a subnetwork with conserved function in humans that was previously implicated in human obesity. In addition, human eSNPs corresponding to orthologous genes from the signature show enrichment for association to type II diabetes in the DIAGRAM cohort, supporting the idea that the chromosome 8 locus perturbs a molecular network that in humans senses variations in DNA and in turn affects metabolic disease risk. We functionally validate predictions from this approach by demonstrating metabolic phenotypes in knockout mice for three genes from the trans eQTL signature, Akr1b8, Emr1, and Rgs2. In addition we show that the transcriptional signatures for knockout of two of these genes, Akr1b8 and Rgs2, map to the F2 network modules associated with the chromosome 8 trans eQTL signature and that these modules are in turn very significantly correlated with adiposity in the F2 population. Overall this study demonstrates how integrating gene expression data with QTL analysis in a network-based framework can aid in the elucidation of the molecular drivers of disease that can be translated from mice to humans.

  1. Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks

    Directory of Open Access Journals (Sweden)

    Kohane Isaac S

    2005-09-01

    Background: Biological processes are carried out by coordinated modules of interacting molecules. As clustering methods demonstrate that genes with similar expression display increased likelihood of being associated with a common functional module, networks of coexpressed genes provide one framework for assigning gene function. This has informed the guilt-by-association (GBA) heuristic, widely invoked in functional genomics. Yet although the idea of GBA is accepted, the breadth of GBA applicability is uncertain. Results: We developed methods to systematically explore the breadth of GBA across a large and varied corpus of expression data to answer the following question: To what extent is the GBA heuristic broadly applicable to the transcriptome and conversely how broadly is GBA captured by a priori knowledge represented in the Gene Ontology (GO)? Our study provides an investigation of the functional organization of five coexpression networks using data from three mammalian organisms. Our method calculates a probabilistic score between each gene and each Gene Ontology category that reflects coexpression enrichment of a GO module. For each GO category we use receiver operating characteristic (ROC) curves to assess whether these probabilistic scores reflect GBA. This methodology applied to five different coexpression networks demonstrates that the signature of guilt-by-association is ubiquitous and reproducible and that the GBA heuristic is broadly applicable across the population of nine hundred Gene Ontology categories. We also demonstrate the existence of highly reproducible patterns of coexpression between some pairs of GO categories. Conclusion: We conclude that GBA has universal value and that transcriptional control may be more modular than previously realized. Our analyses also suggest that methodologies combining coexpression measurements across multiple genes in a biologically-defined module can aid in characterizing gene function or in characterizing
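
    The ROC-based assessment can be illustrated with a short sketch that computes the area under the ROC curve for one category. The abstract does not describe how its probabilistic scores are derived, so the scores below are placeholders, and summarising the curve by its area (via the equivalent pairwise-comparison formula) is an assumption about how such curves might be compared.

```python
def roc_auc(scores, members):
    """Area under the ROC curve for one annotation category.

    scores maps each gene to its score for the category; members is the set
    of genes annotated to it. The AUC equals the probability that a randomly
    chosen member outscores a randomly chosen non-member (ties count half).
    """
    positives = [s for gene, s in scores.items() if gene in members]
    negatives = [s for gene, s in scores.items() if gene not in members]
    if not positives or not negatives:
        raise ValueError("need at least one member and one non-member")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Placeholder scores for four genes against one hypothetical category.
category_scores = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.1}
auc_perfect = roc_auc(category_scores, {"a", "b"})
auc_chance = roc_auc(category_scores, {"a", "d"})
```

    An AUC near 1 means coexpression-based scores separate annotated from unannotated genes well for that category, while a value near 0.5 indicates no guilt-by-association signal.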

  2. An algorithm for network-based gene prioritization that encodes knowledge both in nodes and in links.

    Directory of Open Access Journals (Sweden)

    Chad Kimmel

    Full Text Available BACKGROUND: Candidate gene prioritization aims to identify promising new genes associated with a disease or a biological process from a larger set of candidate genes. In recent years, network-based methods - which utilize a knowledge network derived from biological knowledge - have been utilized for gene prioritization. Biological knowledge can be encoded either through the network's links or nodes. Current network-based methods can only encode knowledge through links. This paper describes a new network-based method that can encode knowledge in links as well as in nodes. RESULTS: We developed a new network inference algorithm called the Knowledge Network Gene Prioritization (KNGP algorithm which can incorporate both link and node knowledge. The performance of the KNGP algorithm was evaluated on both synthetic networks and on networks incorporating biological knowledge. The results showed that the combination of link knowledge and node knowledge provided a significant benefit across 19 experimental diseases over using link knowledge alone or node knowledge alone. CONCLUSIONS: The KNGP algorithm provides an advance over current network-based algorithms, because the algorithm can encode both link and node knowledge. We hope the algorithm will aid researchers with gene prioritization.

  3. The MYB98 subcircuit of the synergid gene regulatory network includes genes directly and indirectly regulated by MYB98.

    Science.gov (United States)

    Punwani, Jayson A; Rabiger, David S; Lloyd, Alan; Drews, Gary N

    2008-08-01

    The female gametophyte contains two synergid cells that play a role in many steps of the angiosperm reproductive process, including pollen tube guidance. At their micropylar poles, the synergid cells have a thickened and elaborated cell wall: the filiform apparatus that is thought to play a role in the secretion of the pollen tube attractant(s). MYB98 regulates an important subcircuit of the synergid gene regulatory network (GRN) that functions to activate the expression of genes required for pollen tube guidance and filiform apparatus formation. The MYB98 subcircuit comprises at least 83 downstream genes, including 48 genes within four gene families (CRP810, CRP3700, CRP3730 and CRP3740) that encode Cys-rich proteins. We show that the 11 CRP3700 genes, which include DD11 and DD18, are regulated by a common cis-element, GTAACNT, and that a multimer of this sequence confers MYB98-dependent synergid expression. The GTAACNT element contains the MYB98-binding site identified in vitro, suggesting that the 11 CRP3700 genes are direct targets of MYB98. We also show that five of the CRP810 genes, which include DD2, lack a functional GTAACNT element, suggesting that they are not directly regulated by MYB98. In addition, we show that the five CRP810 genes are regulated by the cis-element AACGT, and that a multimer of this sequence confers synergid expression. Together, these results suggest that the MYB98 branch of the synergid GRN is multi-tiered and, therefore, contains at least one additional downstream transcription factor.

  4. Reliable transfer of transcriptional gene regulatory networks between taxonomically related organisms

    Directory of Open Access Journals (Sweden)

    Tauch Andreas

    2009-01-01

    Full Text Available Abstract Background Transcriptional regulation of gene activity is essential for any living organism. Transcription factors therefore recognize specific binding sites within the DNA to regulate the expression of particular target genes. The genome-scale reconstruction of the emerging regulatory networks is important for biotechnology and human medicine but cost-intensive, time-consuming, and impossible to perform for any species separately. By using bioinformatics methods one can partially transfer networks from well-studied model organisms to closely related species. However, the prediction quality is limited by the low level of evolutionary conservation of the transcription factor binding sites, even within organisms of the same genus. Results Here we present an integrated bioinformatics workflow that assures the reliability of transferred gene regulatory networks. Our approach combines three methods that can be applied on a large-scale: re-assessment of annotated binding sites, subsequent binding site prediction, and homology detection. A gene regulatory interaction is considered to be conserved if (1) the transcription factor, (2) the adjusted binding site, and (3) the target gene are conserved. The power of the approach is demonstrated by transferring gene regulations from the model organism Corynebacterium glutamicum to the human pathogens C. diphtheriae, C. jeikeium, and the biotechnologically relevant C. efficiens. For these three organisms we identified reliable transcriptional regulations for ~40% of the common transcription factors, compared to ~5% for which knowledge was available before. Conclusion Our results suggest that trustworthy genome-scale transfer of gene regulatory networks between organisms is feasible in general but still limited by the level of evolutionary conservation.

  5. Protein Interaction Networks Reveal Novel Autism Risk Genes within GWAS Statistical Noise

    Science.gov (United States)

    Correia, Catarina; Oliveira, Guiomar; Vicente, Astrid M.

    2014-01-01

    Genome-wide association studies (GWAS) for Autism Spectrum Disorder (ASD) thus far met limited success in the identification of common risk variants, consistent with the notion that variants with small individual effects cannot be detected individually in single SNP analysis. To further capture disease risk gene information from ASD association studies, we applied a network-based strategy to the Autism Genome Project (AGP) and the Autism Genetics Resource Exchange GWAS datasets, combining family-based association data with Human Protein-Protein interaction (PPI) data. Our analysis showed that autism-associated proteins at higher than conventional levels of significance (P<0.1) directly interact more than random expectation and are involved in a limited number of interconnected biological processes, indicating that they are functionally related. The functionally coherent networks generated by this approach contain ASD-relevant disease biology, as demonstrated by an improved positive predictive value and sensitivity in retrieving known ASD candidate genes relative to the top associated genes from either GWAS, as well as a higher gene overlap between the two ASD datasets. Analysis of the intersection between the networks obtained from the two ASD GWAS and six unrelated disease datasets identified fourteen genes exclusively present in the ASD networks. These are mostly novel genes involved in abnormal nervous system phenotypes in animal models, and in fundamental biological processes previously implicated in ASD, such as axon guidance, cell adhesion or cytoskeleton organization. Overall, our results highlighted novel susceptibility genes previously hidden within GWAS statistical “noise” that warrant further analysis for causal variants. PMID:25409314

  6. Discovering missing reactions of metabolic networks by using gene co-expression data

    Science.gov (United States)

    Hosseini, Zhaleh; Marashi, Sayed-Amir

    2017-02-01

    Flux coupling analysis is a computational method which is able to explain co-expression of metabolic genes by analyzing the topological structure of a metabolic network. It has been suggested that if genes in two seemingly fully-coupled reactions are not highly co-expressed, then these two reactions are not fully coupled in reality, and hence, there is a gap or missing reaction in the network. Here, we present GAUGE as a novel approach for gap filling of metabolic networks, which is a two-step algorithm based on a mixed integer linear programming formulation. In GAUGE, the discrepancies between experimental co-expression data and predicted flux coupling relations are minimized by adding a minimum number of reactions to the network. We show that GAUGE is able to predict missing reactions of E. coli metabolism that are not detectable by other popular gap filling approaches. We propose that our algorithm may be used as a complementary strategy for the gap filling problem of metabolic networks. Since GAUGE relies only on gene expression data, it can be potentially useful for exploring missing reactions in the metabolism of non-model organisms, which are often poorly characterized, cannot grow in the laboratory, and lack genetic tools for generating knockouts.

  7. Modeling the effect of transcriptional noise on switching in gene networks in a genetic bistable switch.

    Science.gov (United States)

    Chaudhury, Srabanti

    2015-06-01

    Gene regulatory networks in cells allow transitions between gene expression states under the influence of both intrinsic and extrinsic noise. Here we introduce a new theoretical method to study the dynamics of switching in a two-state gene expression model with positive feedback by explicitly accounting for the transcriptional noise. Within this theoretical framework, we employ a semi-classical path integral technique to calculate the mean switching time starting from either an active or inactive promoter state. Our analytical predictions are in good agreement with Monte Carlo simulations and experimental observations.

  8. Insights gained from the reverse engineering of gene networks in keloid fibroblasts

    Directory of Open Access Journals (Sweden)

    Phan Toan

    2011-05-01

    Full Text Available Abstract Background Keloids are protrusive claw-like scars that have a propensity to recur even after surgery, and its molecular etiology remains elusive. The goal of reverse engineering is to infer gene networks from observational data, thus providing insight into the inner workings of a cell. However, most attempts at modeling biological networks have been done using simulated data. This study aims to highlight some of the issues involved in working with experimental data, and at the same time gain some insights into the transcriptional regulatory mechanism present in keloid fibroblasts. Methods Microarray data from our previous study was combined with microarray data obtained from the literature as well as new microarray data generated by our group. For the physical approach, we used the fREDUCE algorithm for correlating expression values to binding motifs. For the influence approach, we compared the Bayesian algorithm BANJO with the information theoretic method ARACNE in terms of performance in recovering known influence networks obtained from the KEGG database. In addition, we also compared the performance of different normalization methods as well as different types of gene networks. Results Using the physical approach, we found consensus sequences that were active in the keloid condition, as well as some sequences that were responsive to steroids, a commonly used treatment for keloids. From the influence approach, we found that BANJO was better at recovering the gene networks compared to ARACNE and that transcriptional networks were better suited for network recovery compared to cytokine-receptor interaction networks and intracellular signaling networks. We also found that the NFKB transcriptional network that was inferred from normal fibroblast data was more accurate compared to that inferred from keloid data, suggesting a more robust network in the keloid condition. Conclusions Consensus sequences that were found from this study are

  9. Regulatory Network Construction in Arabidopsis using genome-wide gene expression QTLs

    NARCIS (Netherlands)

    Keurentjes, J.J.B.; Fu, J.J.; Terpstra, I.R.; Garcia, J.M.; van den Ackerveken, G.; Snoek, L.B.; Peeters, A.J.M.; Vreugdenhil, D.; Koornneef, M.; Jansen, R.C.

    2007-01-01

    Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci. Keurentjes JJ, Fu J, Terpstra IR, Garcia JM, van den Ackerveken G, Snoek LB, Peeters AJ, Vreugdenhil D, Koornneef M, Jansen RC. Laboratory of Genetics, Wageningen University, Arboretumlaan 4,

  10. Bottom-up GGM algorithm for constructing multiple layered hierarchical gene regulatory networks

    Science.gov (United States)

    Multilayered hierarchical gene regulatory networks (ML-hGRNs) are very important for understanding genetics regulation of biological pathways. However, there are currently no computational algorithms available for directly building ML-hGRNs that regulate biological pathways. A bottom-up graphic Gaus...

  11. Classification of Cancer Gene Selection Using Random Forest and Neural Network Based Ensemble Classifier

    Directory of Open Access Journals (Sweden)

    Jogendra Kushwah

    2013-06-01

    Full Text Available The free radical gene classification of cancer diseases is a challenging job in biomedical data engineering. Various classifiers are used to improve the classification of gene selection for cancer diseases, but the classifications these classifiers produce are not validated. An ensemble classifier is therefore used for cancer gene classification, combining a neural network classifier with a random forest tree. The random forest tree is an ensembling technique in which a number of classifiers are ensembled by the classes at their leaf nodes. In this paper we combined a neural network with a random forest ensemble classifier for classification of cancer gene selection for diagnostic analysis of cancer diseases. The proposed method is different from most ensemble classifier methods, which follow an input-output paradigm of neural networks, where the members of the ensemble are selected from a set of neural network classifiers. The number of classifiers is determined during the growing procedure of the forest. Furthermore, the proposed method produces an ensemble that is not only accurate but also diverse, ensuring the two important properties that should characterize an ensemble classifier. For empirical evaluation of our proposed method we used the UCI cancer diseases data set for classification. Our experimental results show better results in comparison to random forest tree classification.

  12. Gene Network Analysis and Functional Studies of Senescence-associated Genes Reveal Novel Regulators of Arabidopsis Leaf Senescence

    Institute of Scientific and Technical Information of China (English)

    Zhonghai Li; Jinying Peng; Xing Wen; Hongwei Guo

    2012-01-01

    Plant leaf senescence has been recognized as the last phase of plant development, a highly ordered process regulated by genes known as senescence associated genes (SAGs). However, the function of most of SAGs in regulating leaf senescence as well as regulators of those functionally known SAGs are still unclear. We have previously developed a curated database of genes potentially associated with leaf senescence, the Leaf Senescence Database (LSD). In this study, we built gene networks to identify common regulators of leaf senescence in Arabidopsis thaliana using promoting or delaying senescence genes in LSD. Our results demonstrated that plant hormones cytokinin, auxin, nitric oxide as well as small molecules, such as Ca2+, delay leaf senescence. By contrast, ethylene, ABA, SA and JA as well as small molecules, such as oxygen, promote leaf senescence, altogether supporting the idea that phytohormones play a critical role in regulating leaf senescence. Functional analysis of candidate SAGs in LSD revealed that a WRKY transcription factor WRKY75 and a Cys2/His2-type transcription factor AZF2 are positive regulators of leaf senescence and loss-of-function of WRKY75 or AZF2 delayed leaf senescence. We also found that silencing of a protein phosphatase, AtMKP2, promoted early senescence. Collectively, LSD can serve as a comprehensive resource for systematic study of the molecular mechanism of leaf senescence as well as offer candidate genes for functional analyses.

  13. A novel mutual information-based Boolean network inference method from time-series gene expression data

    Science.gov (United States)

    Barman, Shohag; Kwon, Yung-Keun

    2017-01-01

    Background Inferring a gene regulatory network from time-series gene expression data in systems biology is a challenging problem. Many methods have been suggested, most of which have a scalability limitation due to the combinatorial cost of searching a regulatory set of genes. In addition, they have focused on the accurate inference of a network structure only. Therefore, there is a pressing need to develop a network inference method to search regulatory genes efficiently and to predict the network dynamics accurately. Results In this study, we employed a Boolean network model with a restricted update rule scheme to capture coarse-grained dynamics, and propose a novel mutual information-based Boolean network inference (MIBNI) method. Given time-series gene expression data as an input, the method first identifies a set of initial regulatory genes using mutual information-based feature selection, and then improves the dynamics prediction accuracy by iteratively swapping a pair of genes between sets of the selected regulatory genes and the other genes. Through extensive simulations with artificial datasets, MIBNI showed consistently better performance than six well-known existing methods, REVEAL, Best-Fit, RelNet, CST, CLR, and BIBN in terms of both structural and dynamics prediction accuracy. We further tested the proposed method with two real gene expression datasets for an Escherichia coli gene regulatory network and a fission yeast cell cycle network, and also observed better results using MIBNI compared to the six other methods. Conclusions Taken together, MIBNI is a promising tool for predicting both the structure and the dynamics of a gene regulatory network. PMID:28178334
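
    The mutual information-based feature selection step mentioned in this abstract can be illustrated with a minimal sketch. It is not the MIBNI implementation: the one-step time lag, the function names and the toy Boolean series are all assumptions made for illustration, and the iterative gene-swapping refinement is not shown.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def rank_regulators(series, target):
    """Rank candidate regulators by MI between regulator at t and target at t+1."""
    future = series[target][1:]
    scores = {
        gene: mutual_information(values[:-1], future)
        for gene, values in series.items()
        if gene != target
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy Boolean time series (hypothetical data): C copies A with a one-step delay.
series = {
    "A": [0, 1, 1, 0, 1, 0, 0, 1],
    "B": [1, 1, 0, 0, 1, 1, 0, 0],
    "C": [0, 0, 1, 1, 0, 1, 0, 0],
}
ranking = rank_regulators(series, "C")
```

    In this toy case gene A carries far more information about the next state of C than gene B does, so it would be picked as the initial regulator.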

  14. Improving functional modules discovery by enriching interaction networks with gene profiles

    KAUST Repository

    Salem, Saeed

    2013-05-01

    Recent advances in proteomic and transcriptomic technologies resulted in the accumulation of vast amounts of high-throughput data that span multiple biological processes and characteristics in different organisms. Much of the data come in the form of interaction networks and mRNA expression arrays. An important task in systems biology is functional modules discovery where the goal is to uncover well-connected sub-networks (modules). These discovered modules help to unravel the underlying mechanisms of the observed biological processes. While most of the existing module discovery methods use only the interaction data, in this work we propose CLARM, which discovers biological modules by incorporating gene profiles data with protein-protein interaction networks. We demonstrate the effectiveness of CLARM on Yeast and Human interaction datasets, and gene expression and molecular function profiles. Experiments on these real datasets show that the CLARM approach is competitive with well established functional module discovery methods.

  15. Information theory in systems biology. Part I: Gene regulatory and metabolic networks.

    Science.gov (United States)

    Mousavian, Zaynab; Kavousi, Kaveh; Masoudi-Nejad, Ali

    2016-03-01

    "A Mathematical Theory of Communication" was published in 1948 by Claude Shannon to establish a framework that is now known as information theory. In recent decades, information theory has gained much attention in the area of systems biology. The aim of this paper is to provide a systematic review of those contributions that have applied information theory to inferring or understanding biological systems. Based on the type of system components and the interactions between them, we classify the biological systems into 4 main classes: gene regulatory, metabolic, protein-protein interaction and signaling networks. In the first part of this review, we attempt to introduce most of the existing studies on two types of biological networks, including gene regulatory and metabolic networks, which are founded on the concepts of information theory.

  16. Computational design and designability of gene regulatory networks

    OpenAIRE

    Rodrigo Tarrega, Guillermo

    2012-01-01

    Our knowledge of molecular interactions has led us today toward an engineering perspective, in which designs and implementations of artificial regulatory systems aim to provide fundamental instructions for cellular reprogramming. Here we address the design of gene networks as a way to deepen the understanding of natural regulation. We also address the problem of designability given a library of compatible elements. With this ...

  17. Integrative gene network construction to analyze cancer recurrence using semi-supervised learning.

    Directory of Open Access Journals (Sweden)

    Chihyun Park

    Full Text Available BACKGROUND: The prognosis of cancer recurrence is an important research area in bioinformatics and is challenging due to the small sample sizes compared to the vast number of genes. There have been several attempts to predict cancer recurrence. Most studies employed a supervised approach, which uses only a few labeled samples. Semi-supervised learning can be a great alternative to solve this problem. There have been few attempts based on manifold assumptions to reveal the detailed roles of identified cancer genes in recurrence. RESULTS: In order to predict cancer recurrence, we proposed a novel semi-supervised learning algorithm based on a graph regularization approach. We transformed the gene expression data into a graph structure for semi-supervised learning and integrated protein interaction data with the gene expression data to select functionally-related gene pairs. Then, we predicted the recurrence of cancer by applying a regularization approach to the constructed graph containing both labeled and unlabeled nodes. CONCLUSIONS: The average improvement rate of accuracy for three different cancer datasets was 24.9% compared to existing supervised and semi-supervised methods. We performed functional enrichment on the gene networks used for learning. We identified that those gene networks are significantly associated with cancer-recurrence-related biological functions. Our algorithm was developed with standard C++ and is available in Linux and MS Windows formats in the STL library. The executable program is freely available at: http://embio.yonsei.ac.kr/~Park/ssl.php.
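
    The idea of predicting labels on a graph that mixes labeled and unlabeled nodes can be sketched as follows. This is a generic harmonic label propagation, a standard form of graph regularization, and not the authors' algorithm; the sample names, edge weights and the +1/-1 label coding are assumptions for illustration.

```python
def propagate_labels(weights, labels, iterations=200):
    """Harmonic label propagation on a weighted, undirected graph.

    weights: dict mapping node -> {neighbor: edge weight} (symmetric).
    labels:  dict mapping labeled nodes to +1.0 or -1.0.
    Unlabeled nodes converge to the weighted average of their neighbors,
    which minimizes sum_ij w_ij * (f_i - f_j)**2 with labeled nodes fixed.
    """
    scores = {node: labels.get(node, 0.0) for node in weights}
    for _ in range(iterations):
        for node, nbrs in weights.items():
            if node in labels or not nbrs:
                continue  # labeled nodes stay clamped to their known label
            total = sum(nbrs.values())
            scores[node] = sum(w * scores[n] for n, w in nbrs.items()) / total
    return scores


# Hypothetical sample-similarity graph: a chain s1-s2-s3-s4-s5 with one weak edge.
graph = {
    "s1": {"s2": 1.0},
    "s2": {"s1": 1.0, "s3": 0.2},
    "s3": {"s2": 0.2, "s4": 1.0},
    "s4": {"s3": 1.0, "s5": 1.0},
    "s5": {"s4": 1.0},
}
known = {"s1": 1.0, "s5": -1.0}  # +1 = recurrence, -1 = no recurrence
scores = propagate_labels(graph, known)
predicted = {n: ("recurrence" if s > 0 else "no recurrence") for n, s in scores.items()}
```

    The weak s2-s3 edge is where the predicted label changes sign, which shows how edge weights derived from expression or interaction data would shape the decision boundary.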

  18. Master regulators in development: Views from the Drosophila retinal determination and mammalian pluripotency gene networks.

    Science.gov (United States)

    Davis, Trevor L; Rebay, Ilaria

    2017-01-15

    Among the mechanisms that steer cells to their correct fate during development, master regulatory networks are unique in their sufficiency to trigger a developmental program outside of its normal context. In this review we discuss the key features that underlie master regulatory potency during normal and ectopic development, focusing on two examples, the retinal determination gene network (RDGN) that directs eye development in the fruit fly and the pluripotency gene network (PGN) that maintains cell fate competency in the early mammalian embryo. In addition to the hierarchical transcriptional activation, extensive positive transcriptional feedback, and cooperative protein-protein interactions that enable master regulators to override competing cellular programs, recent evidence suggests that network topology must also be dynamic, with extensive rewiring of the interactions and feedback loops required to navigate the correct sequence of developmental transitions to reach a final fate. By synthesizing the in vivo evidence provided by the RDGN with the extensive mechanistic insight gleaned from the PGN, we highlight the unique regulatory capabilities that continual reorganization into new hierarchies confers on master control networks. We suggest that deeper understanding of such dynamics should be a priority, as accurate spatiotemporal remodeling of network topology will undoubtedly be essential for successful stem cell based therapeutic efforts.

  19. The effects of a DTNBP1 gene variant on attention networks: an fMRI study

    Directory of Open Access Journals (Sweden)

    Thimm Markus

    2010-09-01

    Full Text Available Abstract Background Attention deficits belong to the main cognitive symptoms of schizophrenia and come along with altered neural activity in previously described cerebral networks. Given the high heritability of schizophrenia the question arises if impaired function of these networks is modulated by susceptibility genes and detectable in healthy risk allele carriers. Methods The present event-related fMRI study investigated the effect of the single nucleotide polymorphism (SNP) rs1018381 of the DTNBP1 (dystrobrevin-binding protein 1) gene on brain activity in 80 subjects while performing the attention network test (ANT). In this reaction time task three domains of attention are probed simultaneously: alerting, orienting and executive control of attention. Results Risk allele carriers showed impaired performance in the executive control condition associated with reduced neural activity in the left superior frontal gyrus [Brodmann area (BA) 9]. Risk allele carriers did not show alterations in the alerting and orienting networks. Conclusions BA 9 is a key region of schizophrenia pathology and belongs to a network that has been shown previously to be involved in impaired executive control mechanisms in schizophrenia. Our results identified the impact of DTNBP1 on the development of a specific attention deficit via modulation of a left prefrontal network.

  20. Reconstruction of a functional human gene network, with an application for prioritizing positional candidate genes.

    NARCIS (Netherlands)

    Franke, L.; Bakel, H. van; Fokkens, L.; Jong, E.D. de; Egmont-Peterson, M.; Wijmenga, C.

    2006-01-01

    Most common genetic disorders have a complex inheritance and may result from variants in many genes, each contributing only weak effects to the disease. Pinpointing these disease genes within the myriad of susceptibility loci identified in linkage studies is difficult because these loci may contain

  1. Resolving candidate genes of mouse skeletal muscle QTL via RNA-Seq and expression network analyses

    Directory of Open Access Journals (Sweden)

    Lionikas Arimantas

    2012-11-01

    Full Text Available Abstract Background We have recently identified a number of Quantitative Trait Loci (QTL) contributing to the 2-fold muscle weight difference between the LG/J and SM/J mouse strains and refined their confidence intervals. To facilitate nomination of the candidate genes responsible for these differences we examined the transcriptome of the tibialis anterior (TA) muscle of each strain by RNA-Seq. Results 13,726 genes were expressed in mouse skeletal muscle. Intersection of a set of 1061 differentially expressed transcripts with a mouse muscle Bayesian Network identified a coherent set of differentially expressed genes that we term the LG/J and SM/J Regulatory Network (LSRN). The integration of the QTL, transcriptome and the network analyses identified eight key drivers of the LSRN (Kdr, Plbd1, Mgp, Fah, Prss23, 2310014F06Rik, Grtp1, Stk10) residing within five QTL regions, which were either polymorphic or differentially expressed between the two strains and are strong candidates for quantitative trait genes (QTGs) underlying muscle mass. The insight gained from network analysis including the ability to make testable predictions is illustrated by annotating the LSRN with knowledge-based signatures and showing that the SM/J state of the network corresponds to a more oxidative state. We validated this prediction by NADH tetrazolium reductase staining in the TA muscle revealing higher oxidative potential of the SM/J compared to the LG/J strain (p …). Conclusion Thus, integration of fine resolution QTL mapping, RNA-Seq transcriptome information and mouse muscle Bayesian Network analysis provides a novel and unbiased strategy for nomination of muscle QTGs.

  2. Dynamics of simple gene-network motifs subject to extrinsic fluctuations

    Science.gov (United States)

    Roberts, Elijah; Be'er, Shay; Bohrer, Chris; Sharma, Rati; Assaf, Michael

    2015-12-01

    Cellular processes do not follow deterministic rules; even in identical environments genetically identical cells can make random choices leading to different phenotypes. This randomness originates from fluctuations present in the biomolecular interaction networks. Most previous work has been focused on the intrinsic noise (IN) of these networks. Yet, especially for high-copy-number biomolecules, extrinsic or environmental noise (EN) has been experimentally shown to dominate the variation. Here, we develop an analytical formalism that allows for calculation of the effect of EN on gene-expression motifs. We introduce a method for modeling bounded EN as an auxiliary species in the master equation. The method is fully generic and is not limited to systems with small EN magnitudes. We focus our study on motifs that can be viewed as the building blocks of genetic switches: a nonregulated gene, a self-inhibiting gene, and a self-promoting gene. The role of the EN properties (magnitude, correlation time, and distribution) on the statistics of interest is systematically investigated, and the effect of fluctuations in different reaction rates is compared. Due to its analytical nature, our formalism can be used to quantify the effect of EN on the dynamics of biochemical networks and can also be used to improve the interpretation of data from single-cell gene-expression experiments.
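
    To make the notion of bounded extrinsic noise as an auxiliary species concrete, here is a small numerical sketch for the nonregulated gene. It is a stochastic simulation, not the analytical formalism of the paper, and the two-state environment, all rate values and the function name are assumptions chosen for illustration.

```python
import random


def simulate(k=20.0, gamma=1.0, eps=0.5, switch=0.1, t_end=5000.0, seed=1):
    """Gillespie simulation of a nonregulated gene with bounded extrinsic noise.

    The environment is an auxiliary two-state species e in {-1, +1} that
    rescales the production rate to k * (1 + eps * e). Returns the
    time-averaged mean and variance of the protein copy number.
    """
    rng = random.Random(seed)
    t, n, e = 0.0, 0, 1
    sum_n = sum_n2 = 0.0
    while t < t_end:
        rates = (k * (1 + eps * e), gamma * n, switch)  # birth, death, env flip
        total = sum(rates)
        dt = min(rng.expovariate(total), t_end - t)
        sum_n += n * dt  # accumulate time-weighted moments
        sum_n2 += n * n * dt
        t += dt
        if t >= t_end:
            break
        u = rng.random() * total
        if u < rates[0]:
            n += 1
        elif u < rates[0] + rates[1]:
            n -= 1
        else:
            e = -e
    mean = sum_n / t_end
    return mean, sum_n2 / t_end - mean ** 2


mean, variance = simulate()
fano = variance / mean  # equals 1 for a pure birth-death (Poisson) process
```

    With `eps=0` the model reduces to a plain birth-death process, so comparing the Fano factor with and without the environment variable gives a rough feel for how much of the variation is extrinsic.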

  3. Gene networks associated with conditional fear in mice identified using a systems genetics approach

    Directory of Open Access Journals (Sweden)

    Eskin Eleazar

    2011-03-01

    Full Text Available Abstract Background Our understanding of the genetic basis of learning and memory remains shrouded in mystery. To explore the genetic networks governing the biology of conditional fear, we used a systems genetics approach to analyze a hybrid mouse diversity panel (HMDP) with high mapping resolution. Results A total of 27 behavioral quantitative trait loci were mapped with a false discovery rate of 5%. By integrating fear phenotypes, transcript profiling data from hippocampus and striatum and also genotype information, two gene co-expression networks correlated with context-dependent immobility were identified. We prioritized the key markers and genes in these pathways using intramodular connectivity measures and structural equation modeling. Highly connected genes in the context fear modules included Psmd6, Ube2a and Usp33, suggesting an important role for ubiquitination in learning and memory. In addition, we surveyed the architecture of brain transcript regulation and demonstrated preservation of gene co-expression modules in hippocampus and striatum, while also highlighting important differences. Rps15a, Kif3a, Stard7, 6330503K22RIK, and Plvap were among the individual genes whose transcript abundance was strongly associated with fear phenotypes. Conclusion Application of our multi-faceted mapping strategy permits an increasingly detailed characterization of the genetic networks underlying behavior.

  4. Mining the Medication Law of Ancient Analgesic Formulas Based on Complex Network

    Institute of Scientific and Technical Information of China (English)

    孟凡红; 李明; 李敬华; 牛亚华

    2013-01-01

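    Objective To summarize the core medicinals, composition law and medication characteristics of ancient analgesic formulas through complex network mining, and to provide a reference for the clinical treatment of pain and for new drug development. Methods Fourteen representative formula books from the Han dynasty to the Jin-Yuan period were screened and 2 746 analgesic formulas were collected; a database of Chinese medicine analgesic formulas was built and its terminology standardized. Using the scale-free network property of herb combinations in compound formulas, a complex network of analgesic formulas was constructed to analyze their core medicinals and composition law. Results Classified by the location of pain, the top 10 high-frequency single herbs and herb pairs were mined for abdominal pain, chest and heart pain, headache, limb and joint pain, lumbar pain, hypochondriac pain, eye pain, sore throat, whole-body pain and toothache in the Han-Tang and Jin-Song-Yuan periods. Conclusion Complex network mining yielded the core medicinals, herb pairs and medication characteristics used for various pain syndromes in the Han-Tang and Song-Jin-Yuan periods, and serves as a demonstration for further in-depth mining of the medication and composition law of analgesic formulas through the ages.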
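
    The kind of frequency and pairing analysis described in this abstract can be sketched with a simple co-occurrence count. The formulas and herb names below are invented for illustration and are not taken from the study's database; the study's own network construction may differ.

```python
from collections import Counter
from itertools import combinations

# Hypothetical formulas for illustration; not taken from the study's database.
formulas = [
    {"licorice", "peony", "cinnamon"},
    {"licorice", "peony", "ginger"},
    {"licorice", "angelica", "peony"},
    {"cinnamon", "ginger"},
]

# How often each herb appears across formulas (high-frequency single herbs).
herb_freq = Counter(h for f in formulas for h in sorted(f))

# How often each pair of herbs appears together (edges of the network).
pair_freq = Counter(
    frozenset(p) for f in formulas for p in combinations(sorted(f), 2)
)

# Node degree in the co-occurrence network = number of distinct partner herbs.
degree = Counter(h for pair in pair_freq for h in pair)

top_herbs = herb_freq.most_common(2)
top_pair, top_pair_count = pair_freq.most_common(1)[0]
```

    Herbs with both high frequency and high degree would be candidates for the "core medicinals", and the most frequent edges correspond to herb pairs.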