WorldWideScience

Sample records for gene network reconstruction

  1. The Reconstruction and Analysis of Gene Regulatory Networks.

    Science.gov (United States)

    Zheng, Guangyong; Huang, Tao

    2018-01-01

    In the post-genomic era, an important task is to explore the function of individual biological molecules (i.e., genes, noncoding RNAs, proteins, metabolites) and their organization in living cells. To this end, gene regulatory networks (GRNs) are constructed to show the relationships between biological molecules: the vertices of the network denote biological molecules and the edges represent connections between them (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). By interpreting GRNs, biologists can understand not only the function of biological molecules but also the organization of the components of living cells, since a gene regulatory network is a comprehensive physiological map of living cells and reflects the influence of genetic and epigenetic factors (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). In this paper, we review inference methods for GRN reconstruction and approaches for analyzing network structure. Because the network method is a powerful tool for studying complex diseases and biological processes, its applications in pathway analysis and disease gene identification are also introduced.

  2. Gene expression network reconstruction by convex feature selection when incorporating genetic perturbations.

    Directory of Open Access Journals (Sweden)

    Benjamin A Logsdon

    Cellular gene expression measurements contain regulatory information that can be used to discover novel network relationships. Here, we present a new algorithm for network reconstruction powered by the adaptive lasso, a theoretically and empirically well-behaved method for selecting the regulatory features of a network. Any algorithm designed for network discovery that makes use of directed probabilistic graphs requires perturbations, produced by either experiments or naturally occurring genetic variation, to successfully infer unique regulatory relationships from gene expression data. Our approach makes use of appropriately selected cis-expression Quantitative Trait Loci (cis-eQTL), which provide a sufficient set of independent perturbations for maximum network resolution. We compare the performance of our network reconstruction algorithm to four other approaches: the PC-algorithm, QTLnet, the QDG algorithm, and the NEO algorithm, all of which have been used to reconstruct directed networks among phenotypes leveraging QTL. We show that the adaptive lasso can outperform these algorithms for networks of ten genes and ten cis-eQTL, and is competitive with the QDG algorithm for networks with thirty genes and thirty cis-eQTL, with rich topologies and hundreds of samples. Using this novel approach, we identify unique sets of directed relationships in Saccharomyces cerevisiae when analyzing genome-wide gene expression data for an intercross between a wild strain and a lab strain. We recover novel putative network relationships between a tyrosine biosynthesis gene (TYR1) and genes involved in endocytosis (RCY1), the spindle checkpoint (BUB2), sulfonate catabolism (JLP1), and cell-cell communication (PRM7). Our algorithm provides a synthesis of feature selection methods and graphical model theory that has the potential to reveal new directed regulatory relationships from the analysis of population-level genetic and gene expression data.
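
    The adaptive lasso step named in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it only shows the generic two-stage idea (an unpenalized fit, then a lasso whose per-coefficient penalty is inversely proportional to the first-stage estimate), applied to one target gene regressed on candidate regulators. The function names, the coordinate-descent solver, and the toy data are all assumptions made for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def coordinate_descent(X, y, penalties, sweeps=100):
    """Minimize 0.5*||y - X b||^2 + sum_j penalties[j]*|b_j| by cyclic updates."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # squared norm of column j
            for i in range(n):
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho, penalties[j]) / z
    return beta


def adaptive_lasso(X, y, lam, gamma=1.0, eps=1e-8):
    """Stage 1: unpenalized fit. Stage 2: lasso with penalties lam / |b_init|^gamma."""
    p = len(X[0])
    initial = coordinate_descent(X, y, [0.0] * p)
    penalties = [lam / (abs(b) ** gamma + eps) for b in initial]
    return coordinate_descent(X, y, penalties)


# Toy data (hypothetical): the target depends on regulator 0 only, plus small noise.
x0 = [1, -1, 1, -1, 1, -1, 1, -1]
x1 = [1, 1, -1, -1, 1, 1, -1, -1]
noise = [0.1, -0.1, -0.1, 0.1, 0.1, 0.1, -0.1, -0.1]
X = [[a, b] for a, b in zip(x0, x1)]
y = [2 * a + e for a, e in zip(x0, noise)]
beta = adaptive_lasso(X, y, lam=1.0)
```

    In this toy case the weak first-stage coefficient of the second regulator receives a large penalty and is set exactly to zero, while the strong regulator is only mildly shrunk; that asymmetry is what makes the adaptive variant attractive for selecting edges.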

  3. Harnessing diversity towards the reconstructing of large scale gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Takeshi Hase

    Elucidating gene regulatory networks (GRNs) from large-scale experimental data remains a central challenge in systems biology. Recently, numerous techniques, particularly consensus-driven approaches combining different algorithms, have become a potentially promising strategy to infer accurate GRNs. Here, we develop a novel consensus inference algorithm, TopkNet, that can integrate multiple algorithms to infer GRNs. Comprehensive performance benchmarking on a cloud computing framework demonstrated that (i) a simple strategy to combine many algorithms does not always lead to performance improvement compared to the cost of consensus and (ii) TopkNet integrating only high-performance algorithms provides significant performance improvement compared to the best individual algorithms and community prediction. These results suggest that a priori determination of high-performance algorithms is key to reconstructing an unknown regulatory network. Similarity among gene-expression datasets can be useful to determine potential optimal algorithms for reconstruction of unknown regulatory networks, i.e., if expression data associated with a known regulatory network is similar to that associated with an unknown regulatory network, optimal algorithms determined for the known regulatory network can be repurposed to infer the unknown regulatory network. Based on this observation, we developed a quantitative measure of similarity among gene-expression datasets and demonstrated that, if similarity between the two expression datasets is high, TopkNet integrating algorithms that are optimal for the known dataset performs well on the unknown dataset. The consensus framework, TopkNet, together with the similarity measure proposed in this study provides a powerful strategy towards harnessing the wisdom of the crowds in reconstruction of unknown regulatory networks.
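
    The general idea of consensus inference can be sketched as average-rank aggregation over the edge rankings produced by several algorithms. This is an assumed, simplified stand-in and not TopkNet's actual integration rule, which the abstract does not specify; `consensus_ranking` and the example rankings are hypothetical.

```python
def consensus_ranking(rankings):
    """Combine edge rankings (best edge first) by mean rank across algorithms.

    An edge missing from a ranking is given the worst possible rank for it.
    Ties are broken alphabetically so the output is deterministic.
    """
    edges = {edge for ranking in rankings for edge in ranking}

    def mean_rank(edge):
        total = 0
        for ranking in rankings:
            total += ranking.index(edge) if edge in ranking else len(ranking)
        return total / len(rankings)

    return sorted(edges, key=lambda edge: (mean_rank(edge), edge))


# Three hypothetical algorithms ranking the same three candidate edges.
rankings = [
    [("A", "B"), ("B", "C"), ("A", "C")],
    [("A", "B"), ("A", "C"), ("B", "C")],
    [("B", "C"), ("A", "B"), ("A", "C")],
]
consensus = consensus_ranking(rankings)
```

    Restricting `rankings` to the algorithms that performed best on a similar, known dataset mirrors the abstract's observation that combining only high-performance algorithms helps more than combining everything.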

  4. A fast and efficient gene-network reconstruction method from multiple over-expression experiments

    Directory of Open Access Journals (Sweden)

    Thurner Stefan

    2009-08-01

    Background: Reverse engineering of gene regulatory networks is one of the big challenges in systems biology. Gene regulatory networks are usually inferred from a set of single-gene over-expression and/or knockout experiments. Functional relationships between genes are retrieved either from steady-state gene expression or from the respective time series. Results: We present a novel algorithm for gene network reconstruction on the basis of steady-state gene-chip data from over-expression experiments. The algorithm is based on a straightforward solution of a linear gene-dynamics equation, where experimental data is fed in as a first predictor for the solution. We compare the algorithm's performance with the NIR algorithm, both on the well-known E. coli experimental data and on in-silico experiments. Conclusion: We show superiority of the proposed algorithm in the number of correctly reconstructed links and discuss computational time and robustness. The proposed algorithm is not limited by combinatorial explosion problems and can in principle be used for large networks.
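
    A linear gene-dynamics model at steady state can be inverted directly, which the following sketch illustrates. It assumes the common textbook form dx/dt = A x + u, so that each over-expression experiment satisfies A x = -u at steady state and, with as many independent experiments as genes, A = -U X^(-1). Whether this matches the authors' exact equation is an assumption; the helper names and the 2-gene example are hypothetical.

```python
def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def reconstruct_network(X, U):
    """Solve A X = -U for A; columns of X and U are individual experiments."""
    product = matmul(U, invert(X))
    return [[-v for v in row] for row in product]


# Hypothetical 2-gene example: each experiment over-expresses one gene (U = I).
true_A = [[-1.0, 0.5], [0.0, -2.0]]
U = [[1.0, 0.0], [0.0, 1.0]]
X = [[1.0, 0.25], [0.0, 0.5]]  # steady states, i.e. X = -inverse(true_A) * U
estimated_A = reconstruct_network(X, U)
```

    With noisy data or fewer experiments than genes the system is no longer exactly invertible, which is where regression-based schemes such as NIR, or the authors' approach, come in.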

  5. A swarm intelligence framework for reconstructing gene networks: searching for biologically plausible architectures.

    Science.gov (United States)

    Kentzoglanakis, Kyriakos; Poole, Matthew

    2012-01-01

    In this paper, we investigate the problem of reverse engineering the topology of gene regulatory networks from temporal gene expression data. We adopt a computational intelligence approach comprising swarm intelligence techniques, namely particle swarm optimization (PSO) and ant colony optimization (ACO). In addition, the recurrent neural network (RNN) formalism is employed for modeling the dynamical behavior of gene regulatory systems. More specifically, ACO is used for searching the discrete space of network architectures and PSO for searching the corresponding continuous space of RNN model parameters. We propose a novel solution construction process in the context of ACO for generating biologically plausible candidate architectures. The objective is to concentrate the search effort into areas of the structure space that contain architectures which are feasible in terms of their topological resemblance to real-world networks. The proposed framework is initially applied to the reconstruction of a small artificial network that has previously been studied in the context of gene network reverse engineering. Subsequently, we consider an artificial data set with added noise for reconstructing a subnetwork of the genetic interaction network of S. cerevisiae (yeast). Finally, the framework is applied to a real-world data set for reverse engineering the SOS response system of the bacterium Escherichia coli. Results demonstrate the relative advantage of utilizing problem-specific knowledge regarding biologically plausible structural properties of gene networks over conducting a problem-agnostic search in the vast space of network architectures.
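
    The continuous half of the search described above, particle swarm optimization over model parameters, can be sketched as follows. This is a generic global-best PSO with assumed settings (inertia 0.7, acceleration coefficients 1.5), not the authors' implementation; the objective is a hypothetical stand-in that measures squared distance to a known parameter vector instead of an RNN fitting error.

```python
import random


def pso(objective, dim, n_particles=20, iters=200, seed=0, w=0.7, c1=1.5, c2=1.5):
    """Minimize objective over R^dim with a basic global-best particle swarm."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # each particle's best position so far
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                # Inertia plus pulls toward the personal and the global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val


# Hypothetical target parameters standing in for "true" model weights.
target = [1.0, -2.0]


def squared_error(params):
    return sum((p - t) ** 2 for p, t in zip(params, target))


found, error = pso(squared_error, dim=2)
```

    In the framework described in the abstract, a discrete search (ACO) would propose a network architecture and a continuous optimizer of this kind would then fit the parameters of that architecture.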

  6. Snapshot of iron response in Shewanella oneidensis by gene network reconstruction

    Energy Technology Data Exchange (ETDEWEB)

    Yang, Yunfeng; Harris, Daniel P.; Luo, Feng; Xiong, Wenlu; Joachimiak, Marcin; Wu, Liyou; Dehal, Paramvir; Jacobsen, Janet; Yang, Zamin; Palumbo, Anthony V.; Arkin, Adam P.; Zhou, Jizhong

    2008-10-09

    Background: Iron homeostasis of Shewanella oneidensis, a gamma-proteobacterium possessing high iron content, is regulated by a global transcription factor Fur. However, knowledge is incomplete about other biological pathways that respond to changes in iron concentration, as well as details of the responses. In this work, we integrate physiological, transcriptomics and genetic approaches to delineate the iron response of S. oneidensis. Results: We show that the iron response in S. oneidensis is a rapid process. Temporal gene expression profiles were examined for iron depletion and repletion, and a gene co-expression network was reconstructed. Modules of iron acquisition systems, anaerobic energy metabolism and protein degradation were the most noteworthy in the gene network. Bioinformatics analyses suggested that genes in each of the modules might be regulated by DNA-binding proteins Fur, CRP and RpoH, respectively. Closer inspection of these modules revealed a transcriptional regulator (SO2426) involved in iron acquisition and ten transcription factors involved in anaerobic energy metabolism. Selected genes in the network were analyzed by genetic studies. Disruption of genes encoding a putative alcaligin biosynthesis protein (SO3032) and a gene previously implicated in protein degradation (SO2017) led to severe growth deficiency under iron depletion conditions. Disruption of a novel transcription factor (SO1415) caused deficiency in both anaerobic iron reduction and growth with thiosulfate or TMAO as an electron acceptor, suggesting that SO1415 is required for specific branches of anaerobic energy metabolism pathways. Conclusions: Using a reconstructed gene network, we identified major biological pathways that were differentially expressed during iron depletion and repletion. Genetic studies not only demonstrated the importance of iron acquisition and protein degradation for iron depletion, but also characterized a novel transcription factor (SO1415) with a

  7. Recurrent neural network based hybrid model for reconstructing gene regulatory network.

    Science.gov (United States)

    Raza, Khalid; Alam, Mansaf

    2016-10-01

    One of the exciting problems in systems biology research is to decipher how the genome controls the development of complex biological systems. Gene regulatory networks (GRNs) help in the identification of regulatory interactions between genes and offer useful information about the functional role of individual genes in a cellular system. Discovering GRNs leads to a wide range of applications: it identifies disease-related pathways that provide novel tentative drug targets, helps to predict disease response, and assists in diagnosing various diseases including cancer. Reconstruction of GRNs from available biological data is still an open problem. This paper proposes a recurrent neural network (RNN) based model of GRNs, hybridized with a generalized extended Kalman filter for weight update in the backpropagation-through-time training algorithm. The RNN is a complex neural network that offers a good compromise between biological closeness and mathematical flexibility for modeling GRNs, and it is also able to capture complex, non-linear and dynamic relationships among variables. Gene expression data are inherently noisy, and the Kalman filter performs well for estimation problems even with noisy data. Hence, we applied a non-linear version of the Kalman filter, known as the generalized extended Kalman filter, for weight update during RNN training. The developed model has been tested on four benchmark networks: the DNA SOS repair network, the IRMA network, and two synthetic networks from the DREAM Challenge. A comparison of our results with other state-of-the-art techniques shows the superiority of our proposed model. Further, 5% Gaussian noise was added to the dataset, and the results of the proposed model show a negligible effect of noise, demonstrating the noise tolerance of the model.
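
    The RNN formalism for gene regulation can be illustrated with a short simulation. The sketch assumes the widely used form tau_i * dx_i/dt = sigmoid(sum_j w_ij * x_j + b_i) - k_i * x_i, integrated with a forward Euler step; the paper's exact model and its Kalman-filter training are not reproduced here, and every name and parameter value below is hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_rnn(weights, biases, decay, tau, x0, dt, steps):
    """Euler-integrate tau_i dx_i/dt = sigmoid(sum_j w_ij x_j + b_i) - k_i x_i.

    weights[i][j] is the regulatory influence of gene j on gene i:
    positive for activation, negative for repression, zero for no edge.
    """
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(steps):
        inputs = [sum(w * xj for w, xj in zip(row, x)) + b
                  for row, b in zip(weights, biases)]
        x = [xi + dt / t * (sigmoid(z) - k * xi)
             for xi, z, k, t in zip(x, inputs, decay, tau)]
        trajectory.append(list(x))
    return trajectory


# Single unregulated gene: expression should relax to sigmoid(0) / k = 0.5.
single = simulate_rnn([[0.0]], [0.0], [1.0], [1.0], [0.0], dt=0.1, steps=200)

# Two genes: gene 0 activates gene 1, gene 1 represses gene 0.
pair = simulate_rnn([[0.0, -2.0], [3.0, 0.0]], [0.0, -1.0],
                    [1.0, 1.0], [1.0, 1.0], [0.1, 0.1], dt=0.1, steps=100)
```

    Network reconstruction then amounts to estimating the weight matrix from observed time series: the nonzero entries of `weights` are the inferred regulatory edges.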

  8. Reconstructing Generalized Logical Networks of Transcriptional Regulation in Mouse Brain from Temporal Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Song, Mingzhou (Joe) [New Mexico State University, Las Cruces; Lewis, Chris K. [New Mexico State University, Las Cruces; Lance, Eric [New Mexico State University, Las Cruces; Chesler, Elissa J [ORNL; Kirova, Roumyana [Bristol-Myers Squibb Pharmaceutical Research & Development, NJ; Langston, Michael A [University of Tennessee, Knoxville (UTK); Bergeson, Susan [Texas Tech University, Lubbock

    2009-01-01

    The problem of reconstructing generalized logical networks to account for temporal dependencies among genes and environmental stimuli from high-throughput transcriptomic data is addressed. A network reconstruction algorithm was developed that uses the statistical significance as a criterion for network selection to avoid false-positive interactions arising from pure chance. Using temporal gene expression data collected from the brains of alcohol-treated mice in an analysis of the molecular response to alcohol, this algorithm identified genes from a major neuronal pathway as putative components of the alcohol response mechanism. Three of these genes have known associations with alcohol in the literature. Several other potentially relevant genes, highlighted and agreeing with independent results from literature mining, may play a role in the response to alcohol. Additional, previously-unknown gene interactions were discovered that, subject to biological verification, may offer new clues in the search for the elusive molecular mechanisms of alcoholism.

  9. Differential reconstructed gene interaction networks for deriving toxicity threshold in chemical risk assessment.

    Science.gov (United States)

    Yang, Yi; Maxwell, Andrew; Zhang, Xiaowei; Wang, Nan; Perkins, Edward J; Zhang, Chaoyang; Gong, Ping

    2013-01-01

    Pathway alterations reflected as changes in gene expression regulation and gene interaction can result from cellular exposure to toxicants. Such information is often used to elucidate toxicological modes of action. From a risk assessment perspective, alterations in biological pathways are a rich resource for setting toxicant thresholds, which may be more sensitive and mechanism-informed than traditional toxicity endpoints. Here we developed a novel differential networks (DNs) approach to connect pathway perturbation with toxicity threshold setting. Our DNs approach consists of 6 steps: time-series gene expression data collection, identification of altered genes, gene interaction network reconstruction, differential edge inference, mapping of genes with differential edges to pathways, and establishment of causal relationships between chemical concentration and perturbed pathways. A one-sample Gaussian process model and a linear regression model were used to identify genes that exhibited significant profile changes across an entire time course and between treatments, respectively. Interaction networks of differentially expressed (DE) genes were reconstructed for different treatments using a state space model and then compared to infer differential edges/interactions. DE genes possessing differential edges were mapped to biological pathways in databases such as KEGG pathways. Using the DNs approach, we analyzed a time-series Escherichia coli live cell gene expression dataset consisting of 4 treatments (control, 10, 100, 1000 mg/L naphthenic acids, NAs) and 18 time points. Through comparison of reconstructed networks and construction of differential networks, 80 genes were identified as DE genes with a significant number of differential edges, and 22 KEGG pathways were altered in a concentration-dependent manner. Some of these pathways were perturbed to a degree as high as 70% even at the lowest exposure concentration, implying a high sensitivity of our DNs approach

  10. Dynamic network reconstruction from gene expression data applied to immune response during bacterial infection.

    Science.gov (United States)

    Guthke, Reinhard; Möller, Ulrich; Hoffmann, Martin; Thies, Frank; Töpfer, Susanne

    2005-04-15

    The immune response to bacterial infection represents a complex network of dynamic gene and protein interactions. We present an optimized reverse engineering strategy aimed at reconstructing this kind of interaction network. The proposed approach is based on both microarray data and available biological knowledge. The main kinetics of the immune response were identified by fuzzy clustering of gene expression profiles (time series). The number of clusters was optimized using various evaluation criteria. For each cluster a representative gene with a high fuzzy membership was chosen in accordance with available physiological knowledge. Then hypothetical network structures were identified by seeking systems of ordinary differential equations whose simulated kinetics could fit the gene expression profiles of the cluster-representative genes. For the construction of hypothetical network structures, singular value decomposition (SVD) based methods and a newly introduced heuristic Network Generation Method were compared. It turned out that the proposed novel method could find sparser networks and gave better fits to the experimental data.

  11. Stability indicators in network reconstruction.

    Directory of Open Access Journals (Sweden)

    Michele Filosi

    The number of available algorithms to infer a biological network from a dataset of high-throughput measurements is overwhelming and keeps growing. However, evaluating their performance is unfeasible unless a 'gold standard' is available to measure how close the reconstructed network is to the ground truth. One measure of this is the stability of these predictions to data resampling approaches. We introduce NetSI, a family of Network Stability Indicators, to assess quantitatively the stability of a reconstructed network in terms of inference variability due to data subsampling. In order to evaluate network stability, the main NetSI methods use a global/local network metric in combination with a resampling (bootstrap or cross-validation) procedure. In addition, we provide two normalized variability scores over data resampling to measure edge weight stability and node degree stability, and then introduce a stability ranking for edges and nodes. A complete implementation of the NetSI indicators, including the Hamming-Ipsen-Mikhailov (HIM) network distance adopted in this paper, is available with the R package nettools. We demonstrate the use of the NetSI family by measuring network stability on four datasets against alternative network reconstruction methods. First, the effect of sample size on stability of inferred networks is studied in a gold standard framework on yeast-like data from the Gene Net Weaver simulator. We also consider the impact of varying modularity on a set of structurally different networks (50 nodes, from 2 to 10 modules), and then of complex feature covariance structure, showing the different behaviours of standard reconstruction methods based on Pearson correlation, Maximum Information Coefficient (MIC) and False Discovery Rate (FDR) strategy. Finally, we demonstrate a strong combined effect of different reconstruction methods and phenotype subgroups on a hepatocellular carcinoma miRNA microarray dataset (240 subjects), and we
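
    The resampling idea behind a stability indicator can be sketched as follows. This is a deliberately simplified stand-in for NetSI: it uses thresholded Pearson correlation as the inference method, bootstrap resampling, and only the normalized Hamming distance (not the full HIM distance, and not the nettools API). All function names and the toy data are assumptions.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def infer_edges(data, threshold):
    """Undirected edges between genes whose |correlation| reaches the threshold."""
    return {frozenset((g, h)) for g, h in combinations(sorted(data), 2)
            if abs(pearson(data[g], data[h])) >= threshold}


def hamming(edges_a, edges_b, n_nodes):
    """Fraction of node pairs on which two undirected networks disagree."""
    return len(edges_a ^ edges_b) / (n_nodes * (n_nodes - 1) / 2)


def stability(data, threshold, n_resamples=50, seed=0):
    """Mean distance between the full-data network and bootstrap networks."""
    rng = random.Random(seed)
    reference = infer_edges(data, threshold)
    n_samples = len(next(iter(data.values())))
    total = 0.0
    for _ in range(n_resamples):
        idx = [rng.randrange(n_samples) for _ in range(n_samples)]
        resampled = {g: [v[i] for i in idx] for g, v in data.items()}
        total += hamming(reference, infer_edges(resampled, threshold), len(data))
    return total / n_resamples


# Hypothetical expression data: g1 and g2 are perfectly correlated, g3 is not.
data = {"g1": [1, 2, 3, 4, 5, 6], "g2": [2, 4, 6, 8, 10, 12], "g3": [3, 1, 4, 1, 5, 2]}
score = stability(data, threshold=0.9)
```

    A score near 0 means the inferred network barely changes under resampling; larger values indicate that the reconstruction depends strongly on which samples were drawn.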

  12. Quartet-based methods to reconstruct phylogenetic networks.

    Science.gov (United States)

    Yang, Jialiang; Grünewald, Stefan; Xu, Yifei; Wan, Xiu-Feng

    2014-02-20

    Phylogenetic networks are employed to visualize evolutionary relationships among a group of nucleotide sequences, genes or species when reticulate events like hybridization, recombination, reassortment and horizontal gene transfer are believed to be involved. In comparison to traditional distance-based methods, quartet-based methods consider more information in the reconstruction process and thus have the potential to be more accurate. We introduce QuartetSuite, which includes a set of new quartet-based methods, namely QuartetS, QuartetA, and QuartetM, to reconstruct phylogenetic networks from nucleotide sequences. We tested their performance and compared them with other popular methods on two simulated nucleotide sequence data sets: one generated from a tree topology and the other from a complicated evolutionary history containing three reticulate events. We further validated these methods on two real data sets: a bacterial data set consisting of seven concatenated genes of 36 bacterial species and an influenza data set related to recently emerging H7N9 low pathogenic avian influenza viruses in China. QuartetS, QuartetA, and QuartetM have the potential to accurately reconstruct evolutionary scenarios from simple branching trees to complicated networks containing many reticulate events. These methods could provide insights into the understanding of complicated biological evolutionary processes such as bacterial taxonomy and reassortment of influenza viruses.
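
    The basic building block of quartet-based methods, deciding how four taxa are split, can be sketched with the classical four-point condition: among the three possible pairings, the one with the smallest sum of within-pair distances is the supported split. This is a textbook criterion and not the QuartetS, QuartetA or QuartetM procedure; the function name and distances are hypothetical.

```python
def quartet_topology(a, b, c, d, dist):
    """Return the pairing ((x, y), (z, w)) best supported by pairwise distances.

    dist maps frozenset({taxon1, taxon2}) to a distance. For tree-like
    distances, the true split has the strictly smallest pair-sum.
    """
    def distance(x, y):
        return dist[frozenset((x, y))]

    pairings = {
        ((a, b), (c, d)): distance(a, b) + distance(c, d),
        ((a, c), (b, d)): distance(a, c) + distance(b, d),
        ((a, d), (b, c)): distance(a, d) + distance(b, c),
    }
    return min(pairings, key=pairings.get)


# Distances from a tree with cherries (A, B) and (C, D):
# pendant branches of length 1 and an internal branch of length 2.
dist = {
    frozenset(("A", "B")): 2, frozenset(("C", "D")): 2,
    frozenset(("A", "C")): 4, frozenset(("A", "D")): 4,
    frozenset(("B", "C")): 4, frozenset(("B", "D")): 4,
}
split = quartet_topology("A", "B", "C", "D", dist)
```

    When reticulate events are present, different genes or sites may support different splits for the same four taxa, and that conflict is the signal that network methods exploit.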

  13. Reconstructing Genetic Regulatory Networks Using Two-Step Algorithms with the Differential Equation Models of Neural Networks.

    Science.gov (United States)

    Chen, Chi-Kan

    2017-07-26

    The identification of genetic regulatory networks (GRNs) provides insights into complex cellular processes. A class of recurrent neural networks (RNNs) captures the dynamics of GRNs. Algorithms combining the RNN and machine learning schemes were proposed to reconstruct small-scale GRNs using gene expression time series. We present new GRN reconstruction methods with neural networks. The RNN is extended to a class of recurrent multilayer perceptrons (RMLPs) with latent nodes. Our methods contain two steps: the edge rank assignment step and the network construction step. The former assigns ranks to all possible edges by a recursive procedure based on the estimated weights of wires of the RNN/RMLP (RE_RNN/RE_RMLP), and the latter constructs a network consisting of top-ranked edges under which the optimized RNN simulates the gene expression time series. Particle swarm optimization (PSO) is applied to optimize the parameters of RNNs and RMLPs in a two-step algorithm. The proposed RE_RNN-RNN and RE_RMLP-RNN algorithms are tested on synthetic and experimental gene expression time series of small GRNs of about 10 genes. The experimental time series are from the studies of yeast cell cycle regulated genes and E. coli DNA repair genes. The unstable estimation of RNNs using experimental time series having limited data points can lead to fairly arbitrary predicted GRNs. Our methods incorporate RNN and RMLP into a two-step structure learning procedure. Results show that the RE_RMLP, using the RMLP with a suitable number of latent nodes to reduce the parameter dimension, often results in more accurate edge ranks than the RE_RNN using the regularized RNN on short simulated time series. When the networks derived by the RE_RMLP-RNN with different numbers of latent nodes in step one are combined by a weighted majority voting rule to infer the GRN, the method performs consistently and outperforms published algorithms for GRN reconstruction on most benchmark time series. The framework of two

  14. Reconstructing phylogenetic networks using maximum parsimony.

    Science.gov (United States)

    Nakhleh, Luay; Jin, Guohua; Zhao, Fengmei; Mellor-Crummey, John

    2005-01-01

    Phylogenies - the evolutionary histories of groups of organisms - are one of the most widely used tools throughout the life sciences, as well as objects of research within systematics, evolutionary biology, epidemiology, etc. Almost every tool devised to date to reconstruct phylogenies produces trees; yet it is widely understood and accepted that trees oversimplify the evolutionary histories of many groups of organisms, most prominently bacteria (because of horizontal gene transfer) and plants (because of hybrid speciation). Various methods and criteria have been introduced for phylogenetic tree reconstruction. Parsimony is one of the most widely used and studied criteria, and various accurate and efficient heuristics for reconstructing trees based on parsimony have been devised. Jotun Hein suggested a straightforward extension of the parsimony criterion to phylogenetic networks. In this paper we formalize this concept, and provide the first experimental study of the quality of parsimony as a criterion for constructing and evaluating phylogenetic networks. Our results show that, when extended to phylogenetic networks, the parsimony criterion produces promising results. In a great majority of the cases in our experiments, the parsimony criterion accurately predicts the numbers and placements of non-tree events.
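
    The parsimony criterion and its extension to networks can be sketched in a few lines. The tree score below uses Fitch's algorithm; the network score assumes one common formalisation, in which each site is scored on the best of the trees displayed by the network and the site scores are summed. The abstract does not spell out its definition, so treat this as an assumption; the trees and sites are hypothetical, and enumerating displayed trees from a network is left out.

```python
def fitch(tree, states):
    """Fitch's algorithm for one site on a rooted binary tree.

    tree is a leaf name (str) or a pair (left, right); states maps each
    leaf name to its character state. Returns (candidate ancestral
    states, minimum number of substitutions).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_states, left_cost = fitch(left, states)
    right_states, right_cost = fitch(right, states)
    common = left_states & right_states
    if common:
        return common, left_cost + right_cost
    return left_states | right_states, left_cost + right_cost + 1


def network_parsimony(displayed_trees, sites):
    """Sum over sites of the best (lowest) Fitch score among displayed trees."""
    return sum(min(fitch(tree, site)[1] for tree in displayed_trees)
               for site in sites)


# Two hypothetical trees displayed by a network with one reticulation.
tree1 = (("A", "B"), ("C", "D"))
tree2 = (("A", "C"), ("B", "D"))
sites = [
    {"A": "a", "B": "a", "C": "g", "D": "g"},  # fits tree1 with 1 change
    {"A": "a", "B": "g", "C": "a", "D": "g"},  # fits tree2 with 1 change
]
tree_only = network_parsimony([tree1], sites)
network = network_parsimony([tree1, tree2], sites)
```

    Because adding reticulations can never increase this score, comparisons across networks typically need to weigh the score improvement against the number of non-tree events introduced.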

  15. Reconstructible phylogenetic networks: do not distinguish the indistinguishable.

    Science.gov (United States)

    Pardi, Fabio; Scornavacca, Celine

    2015-04-01

    Phylogenetic networks represent the evolution of organisms that have undergone reticulate events, such as recombination, hybrid speciation or lateral gene transfer. An important way to interpret a phylogenetic network is in terms of the trees it displays, which represent all the possible histories of the characters carried by the organisms in the network. Interestingly, however, different networks may display exactly the same set of trees, an observation that poses a problem for network reconstruction: from the perspective of many inference methods such networks are "indistinguishable". This is true for all methods that evaluate a phylogenetic network solely on the basis of how well the displayed trees fit the available data, including all methods based on input data consisting of clades, triples, quartets, or trees with any number of taxa, and also sequence-based approaches such as popular formalisations of maximum parsimony and maximum likelihood for networks. This identifiability problem is partially solved by accounting for branch lengths, although this merely reduces the frequency of the problem. Here we propose that network inference methods should only attempt to reconstruct what they can uniquely identify. To this end, we introduce a novel definition of what constitutes a uniquely reconstructible network. For any given set of indistinguishable networks, we define a canonical network that, under mild assumptions, is unique and thus representative of the entire set. Given data that underwent reticulate evolution, only the canonical form of the underlying phylogenetic network can be uniquely reconstructed. While on the methodological side this will imply a drastic reduction of the solution space in network inference, for the study of reticulate evolution this is a fundamental limitation that will require an important change of perspective when interpreting phylogenetic networks.

  17. Cross disease analysis of co-functional microRNA pairs on a reconstructed network of disease-gene-microRNA tripartite.

    Science.gov (United States)

    Peng, Hui; Lan, Chaowang; Zheng, Yi; Hutvagner, Gyorgy; Tao, Dacheng; Li, Jinyan

    2017-03-24

    MicroRNAs always function cooperatively in their regulation of gene expression. Dysfunctions of these co-functional microRNAs can play significant roles in disease development. We are interested in those multi-disease associated co-functional microRNAs that regulate their common dysfunctional target genes cooperatively in the development of multiple diseases. The research is potentially useful for human disease studies at the transcriptional level and for the study of multi-purpose microRNA therapeutics. We designed a computational method to detect multi-disease associated co-functional microRNA pairs and conducted cross-disease analysis on a reconstructed disease-gene-microRNA (DGR) tripartite network. The DGR tripartite network is constructed by integrating newly predicted disease-microRNA associations with the relationships among diseases, microRNAs and genes maintained by existing databases. The prediction method uses a set of reliable negative samples of disease-microRNA association and a pre-computed kernel matrix instead of kernel functions. From this reconstructed DGR tripartite network, multi-disease associated co-functional microRNA pairs are detected together with their common dysfunctional target genes and ranked by a novel scoring method. We also conducted proof-of-concept case studies on cancer-related co-functional microRNA pairs as well as on non-cancer disease-related microRNA pairs. With the prioritization of the co-functional microRNAs that relate to a series of diseases, we found that the co-function phenomenon is not unusual. We also confirmed that the regulation by microRNAs in the development of cancers is more complex and has more unique properties than that in non-cancer diseases.

  18. Identifying time-delayed gene regulatory networks via an evolvable hierarchical recurrent neural network.

    Science.gov (United States)

    Kordmahalleh, Mina Moradi; Sefidmazgi, Mohammad Gorji; Harrison, Scott H; Homaifar, Abdollah

    2017-01-01

    The modeling of genetic interactions within a cell is crucial for a basic understanding of physiology and for applied areas such as drug design. Interactions in gene regulatory networks (GRNs) include effects of transcription factors, repressors, small metabolites, and microRNA species. In addition, the effects of regulatory interactions are not always simultaneous, but can occur after a finite time delay, or as a combined outcome of simultaneous and time delayed interactions. Powerful biotechnologies have been rapidly and successfully measuring levels of genetic expression to illuminate different states of biological systems. This has led to an ensuing challenge to improve the identification of specific regulatory mechanisms through regulatory network reconstructions. Solutions to this challenge will ultimately help to spur forward efforts based on the usage of regulatory network reconstructions in systems biology applications. We have developed a hierarchical recurrent neural network (HRNN) that identifies time-delayed gene interactions using time-course data. A customized genetic algorithm (GA) was used to optimize hierarchical connectivity of regulatory genes and a target gene. The proposed design provides a non-fully connected network with the flexibility of using recurrent connections inside the network. These features and the non-linearity of the HRNN facilitate the process of identifying temporal patterns of a GRN. Our HRNN method was implemented with the Python language. It was first evaluated on simulated data representing linear and nonlinear time-delayed gene-gene interaction models across a range of network sizes and variances of noise. We then further demonstrated the capability of our method in reconstructing GRNs of the Saccharomyces cerevisiae synthetic network for in vivo benchmarking of reverse-engineering and modeling approaches (IRMA). We compared the performance of our method to TD-ARACNE, HCC-CLINDE, TSNI and ebdbNet across different network

  19. An Integrative Bioinformatics Framework for Genome-scale Multiple Level Network Reconstruction of Rice

    Directory of Open Access Journals (Sweden)

    Liu Lili

    2013-06-01

    Full Text Available Understanding how metabolic reactions translate the genome of an organism into its phenotype is a grand challenge in biology. Genome-wide association studies (GWAS) statistically connect genotypes to phenotypes, without any recourse to known molecular interactions, whereas a molecular mechanistic description ties gene function to phenotype through gene regulatory networks (GRNs), protein-protein interactions (PPIs) and molecular pathways. Integration of different regulatory information levels of an organism is expected to provide a good way for mapping genotypes to phenotypes. However, the lack of a curated metabolic model of rice is blocking the exploration of genome-scale multi-level network reconstruction. Here, we have merged GRNs, PPIs and genome-scale metabolic networks (GSMNs) approaches into a single framework for rice via omics’ regulatory information reconstruction and integration. Firstly, we reconstructed a genome-scale metabolic model, containing 4,462 function genes, 2,986 metabolites involved in 3,316 reactions, and compartmentalized into ten subcellular locations. Furthermore, 90,358 pairs of protein-protein interactions, 662,936 pairs of gene regulations and 1,763 microRNA-target interactions were integrated into the metabolic model. Eventually, a database was developed for systematically storing and retrieving the genome-scale multi-level network of rice. This provides a reference for understanding the genotype-phenotype relationship of rice, and for analysis of its molecular regulatory network.

  20. Learning gene regulatory networks from only positive and unlabeled data

    Directory of Open Access Journals (Sweden)

    Elkan Charles

    2010-05-01

    Full Text Available Abstract Background Recently, supervised learning methods have been exploited to reconstruct gene regulatory networks from gene expression data. The reconstruction of a network is modeled as a binary classification problem for each pair of genes. A statistical classifier is trained to recognize the relationships between the activation profiles of gene pairs. This approach has been proven to outperform previous unsupervised methods. However, the supervised approach raises open questions. In particular, although known regulatory connections can safely be assumed to be positive training examples, obtaining negative examples is not straightforward, because definite knowledge is typically not available that a given pair of genes do not interact. Results A recent advance in research on data mining is a method capable of learning a classifier from only positive and unlabeled examples, that does not need labeled negative examples. Applied to the reconstruction of gene regulatory networks, we show that this method significantly outperforms the current state of the art of machine learning methods. We assess the new method using both simulated and experimental data, and obtain major performance improvement. Conclusions Compared to unsupervised methods for gene network inference, supervised methods are potentially more accurate, but for training they need a complete set of known regulatory connections. A supervised method that can be trained using only positive and unlabeled data, as presented in this paper, is especially beneficial for the task of inferring gene regulatory networks, because only an incomplete set of known regulatory connections is available in public databases such as RegulonDB, TRRD, KEGG, Transfac, and IPA.

  1. The transcriptional and gene regulatory network of Lactococcus lactis MG1363 during growth in milk.

    Directory of Open Access Journals (Sweden)

    Anne de Jong

    Full Text Available In the present study we examine the changes in the expression of genes of Lactococcus lactis subspecies cremoris MG1363 during growth in milk. To reveal which specific classes of genes (pathways, operons, regulons, COGs) are important, we performed a transcriptome time series experiment. Global analysis of gene expression over time showed that L. lactis adapted quickly to the environmental changes. Using upstream sequences of genes with correlated gene expression profiles, we uncovered a substantial number of putative DNA binding motifs that may be relevant for L. lactis fermentative growth in milk. All available novel and literature-derived data were integrated into network reconstruction building blocks, which were used to reconstruct and visualize the L. lactis gene regulatory network. This network enables easy mining in the chrono-transcriptomics data. A freely available website at http://milkts.molgenrug.nl gives full access to all transcriptome data, to the reconstructed network and to the individual network building blocks.

  2. Reconstruction and validation of RefRec: a global model for the yeast molecular interaction network.

    Directory of Open Access Journals (Sweden)

    Tommi Aho

    2010-05-01

    Full Text Available Molecular interaction networks establish all cell biological processes. The networks are under intensive research that is facilitated by new high-throughput measurement techniques for the detection, quantification, and characterization of molecules and their physical interactions. For the common model organism yeast Saccharomyces cerevisiae, public databases store a significant part of the accumulated information and, on the way to better understanding of the cellular processes, there is a need to integrate this information into a consistent reconstruction of the molecular interaction network. This work presents and validates RefRec, the most comprehensive molecular interaction network reconstruction currently available for yeast. The reconstruction integrates protein synthesis pathways, a metabolic network, and a protein-protein interaction network from major biological databases. The core of the reconstruction is based on a reference object approach in which genes, transcripts, and proteins are identified using their primary sequences. This enables their unambiguous identification and non-redundant integration. The obtained total number of different molecular species and their connecting interactions is approximately 67,000. In order to demonstrate the capacity of RefRec for functional predictions, it was used for simulating the gene knockout damage propagation in the molecular interaction network in approximately 590,000 experimentally validated mutant strains. Based on the simulation results, a statistical classifier was subsequently able to correctly predict the viability of most of the strains. The results also showed that the usage of different types of molecular species in the reconstruction is important for accurate phenotype prediction. In general, the findings demonstrate the benefits of global reconstructions of molecular interaction networks. With all the molecular species and their physical interactions explicitly modeled, our

  3. Comparative genomic reconstruction of transcriptional networks controlling central metabolism in the Shewanella genus

    Directory of Open Access Journals (Sweden)

    Kovaleva Galina

    2011-06-01

    Full Text Available Abstract Background Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in bacteria is one of the critical tasks of modern genomics. The Shewanella genus is comprised of metabolically versatile gamma-proteobacteria, whose lifestyles and natural environments are substantially different from Escherichia coli and other model bacterial species. The comparative genomics approaches and computational identification of regulatory sites are useful for the in silico reconstruction of transcriptional regulatory networks in bacteria. Results To explore conservation and variations in the Shewanella transcriptional networks we analyzed the repertoire of transcription factors and performed genomics-based reconstruction and comparative analysis of regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors and their DNA binding sites, 8 riboswitches and 6 translational attenuators. Forty five regulons were newly inferred from the genome context analysis, whereas others were propagated from previously characterized regulons in the Enterobacteria and Pseudomonas spp. Multiple variations in regulatory strategies between the Shewanella spp. and E. coli include regulon contraction and expansion (as in the case of PdhR, HexR, FadR), numerous cases of recruiting non-orthologous regulators to control equivalent pathways (e.g. PsrA for fatty acid degradation) and, conversely, orthologous regulators to control distinct pathways (e.g. TyrR, ArgR, Crp). Conclusions We tentatively defined the first reference collection of ~100 transcriptional regulons in 16 Shewanella genomes. The resulting regulatory network contains ~600 regulated genes per genome that are mostly involved in metabolism of carbohydrates, amino acids, fatty acids, vitamins, metals, and stress responses. Several reconstructed regulons including NagR for N-acetylglucosamine catabolism were experimentally validated in S

  4. Quartet-net: a quartet-based method to reconstruct phylogenetic networks.

    Science.gov (United States)

    Yang, Jialiang; Grünewald, Stefan; Wan, Xiu-Feng

    2013-05-01

    Phylogenetic networks can model reticulate evolutionary events such as hybridization, recombination, and horizontal gene transfer. However, reconstructing such networks is not trivial. Popular character-based methods are computationally inefficient, whereas distance-based methods cannot guarantee reconstruction accuracy because pairwise genetic distances only reflect partial information about a reticulate phylogeny. To balance accuracy and computational efficiency, here we introduce a quartet-based method to construct a phylogenetic network from a multiple sequence alignment. Unlike distances that only reflect the relationship between a pair of taxa, quartets contain information on the relationships among four taxa; these quartets provide adequate capacity to infer a more accurate phylogenetic network. In applications to simulated and biological data sets, we demonstrate that this novel method is robust and effective in reconstructing reticulate evolutionary events and it has the potential to infer more accurate phylogenetic distances than other conventional phylogenetic network construction methods such as Neighbor-Joining, Neighbor-Net, and Split Decomposition. This method can be used in constructing phylogenetic networks from simple evolutionary events involving a few reticulate events to complex evolutionary histories involving a large number of reticulate events. A software called "Quartet-Net" is implemented and available at http://sysbio.cvm.msstate.edu/QuartetNet/.

  5. Multiple Linear Regression for Reconstruction of Gene Regulatory Networks in Solving Cascade Error Problems.

    Science.gov (United States)

    Salleh, Faridah Hani Mohamed; Zainudin, Suhaila; Arif, Shereena M

    2017-01-01

    Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods had been inaccurate prediction of cascade motifs. Cascade error is defined as the wrong prediction of cascade motifs, where an indirect interaction is misinterpreted as a direct interaction. Despite the active research on various GRN prediction methods, the discussion on specific methods to solve problems related to cascade errors is still lacking. In fact, the experiments conducted by the past studies were not specifically geared towards proving the ability of GRN prediction methods in avoiding the occurrences of cascade errors. Hence, this research aims to propose Multiple Linear Regression (MLR) to infer GRN from gene expression data and to avoid wrongly inferring an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations of the real experiment datasets was far less than the number of predictors, some predictors were eliminated by extracting the random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade error by using a novel experimental procedure that had been proposed in this work. The experiment revealed that the number of cascade errors had been very minimal. Apart from that, the Belsley collinearity test proved that multicollinearity did affect the datasets used in this experiment greatly. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5.
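
    The cascade-error idea in this abstract can be illustrated with a small sketch. It is not the authors' procedure: the three simulated genes, the effect sizes, the noise levels and the helper names (`simple_slope`, `ols_two_predictors`) are all assumptions made for illustration. It simulates a cascade A → B → C and compares the slope of A when C is regressed on A alone with the slope of A when B is included as a second predictor.

```python
import random


def simple_slope(x, y):
    """Least-squares slope of y regressed on x alone (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def ols_two_predictors(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 (with intercept), via the normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centered sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Hypothetical cascade A -> B -> C; A has no direct effect on C.
rng = random.Random(0)
gene_a = [rng.gauss(0, 1) for _ in range(200)]
gene_b = [0.8 * a + rng.gauss(0, 0.5) for a in gene_a]
gene_c = [1.5 * b + rng.gauss(0, 0.1) for b in gene_b]

marginal_a = simple_slope(gene_a, gene_c)                        # A looks like a regulator of C
direct_a, direct_b = ols_two_predictors(gene_a, gene_b, gene_c)  # A's slope shrinks once B is included

print(f"C ~ A:      slope of A = {marginal_a:.3f}")
print(f"C ~ A + B:  slope of A = {direct_a:.3f}, slope of B = {direct_b:.3f}")
```

    In this toy setting the pairwise regression would suggest a direct edge A → C, whereas the multiple regression attributes the effect to B, which is the kind of indirect interaction the abstract calls a cascade error.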

  6. Constructing an integrated gene similarity network for the identification of disease genes.

    Science.gov (United States)

    Tian, Zhen; Guo, Maozu; Wang, Chunyu; Xing, LinLin; Wang, Lei; Zhang, Yin

    2017-09-20

    Discovering novel genes that are involved in human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are mainly based on protein-protein interaction (PPI) networks. However, since these PPI networks contain false positives and cover less than half of known human genes, their reliability and coverage are very low. Therefore, it is highly necessary to fuse multiple genomic data to construct a credible gene similarity network and then infer disease genes on the whole genomic scale. We proposed a novel method, named RWRB, to infer causal genes of diseases of interest. First, we construct five individual gene (protein) similarity networks based on multiple genomic data of human genes. Then, an integrated gene similarity network (IGSN) is reconstructed based on the similarity network fusion (SNF) method. Finally, we employ the random walk with restart algorithm on the phenotype-gene bilayer network, which combines the phenotype similarity network, the IGSN and the phenotype-gene association network, to prioritize candidate disease genes. We investigate the effectiveness of RWRB through leave-one-out cross-validation methods in inferring phenotype-gene relationships. Results show that RWRB is more accurate than state-of-the-art methods on most evaluation metrics. Further analysis shows that the success of RWRB benefits from the IGSN, which has wider coverage and higher reliability compared with current PPI networks. Moreover, we conduct a comprehensive case study for Alzheimer's disease and predict some novel disease genes that are supported by the literature. RWRB is an effective and reliable algorithm in prioritizing candidate disease genes on the genomic scale. Software and supplementary information are available at http://nclab.hit.edu.cn/~tianzhen/RWRB/ .
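
    The random walk with restart step mentioned above can be sketched on a single small graph. This is a generic illustration, not the RWRB implementation: the abstract applies the walk to a phenotype-gene bilayer network, whereas the four-node undirected path graph, the node names, the restart probability of 0.5 and the function name `random_walk_with_restart` are assumptions chosen here.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until the L1 change is below tol.

    adjacency maps each node to the list of its neighbours (undirected, no isolated nodes);
    W spreads the probability at a node evenly over that node's edges.
    """
    nodes = sorted(adjacency)
    degree = {v: len(adjacency[v]) for v in nodes}
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for v in nodes:
            # Probability flowing into v from each neighbour u.
            inflow = sum(p[u] / degree[u] for u in adjacency[v])
            nxt[v] = (1 - restart) * inflow + restart * p0[v]
        change = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical path graph g0 - g1 - g2 - g3 with g0 as the known (seed) gene.
toy_graph = {"g0": ["g1"], "g1": ["g0", "g2"], "g2": ["g1", "g3"], "g3": ["g2"]}
scores = random_walk_with_restart(toy_graph, seeds={"g0"})

for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 4))
```

    Candidates are ranked by their steady-state probability, so nodes closer to the seed in the network receive higher scores; on a real network the ranking also depends on edge weights, which this sketch omits.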

  7. Craniofacial Reconstruction Evaluation by Geodesic Network

    OpenAIRE

    Zhao, Junli; Liu, Cuiting; Wu, Zhongke; Duan, Fuqing; Wang, Kang; Jia, Taorui; Liu, Quansheng

    2014-01-01

    Craniofacial reconstruction is to estimate an individual’s face model from its skull. It has a widespread application in forensic medicine, archeology, medical cosmetic surgery, and so forth. However, little attention is paid to the evaluation of craniofacial reconstruction. This paper proposes an objective method to evaluate globally and locally the reconstructed craniofacial faces based on the geodesic network. Firstly, the geodesic networks of the reconstructed craniofacial face and the or...

  8. Inferring Gene Regulatory Networks Using Conditional Regulation Pattern to Guide Candidate Genes.

    Directory of Open Access Journals (Sweden)

    Fei Xiao

    Full Text Available Path consistency (PC) algorithms combined with conditional mutual information (CMI) are widely used in the reconstruction of gene regulatory networks. CMI has many advantages over the Pearson correlation coefficient in measuring non-linear dependence to infer gene regulatory networks. It can also discriminate direct regulations from indirect ones. However, it is still a challenge to select the conditional genes in an optimal way, which affects the performance and computational complexity of the PC algorithm. In this study, we develop a novel conditional mutual information-based algorithm, namely RPNI (Regulation Pattern based Network Inference), to infer gene regulatory networks. For conditional gene selection, we define the co-regulation pattern, indirect-regulation pattern and mixture-regulation pattern as three candidate patterns to guide the selection of candidate genes. To demonstrate the potential of our algorithm, we apply it to gene expression data from the DREAM challenge. Experimental results show that RPNI outperforms existing conditional mutual information-based methods in both accuracy and time complexity for different sizes of gene samples. Furthermore, the robustness of our algorithm is demonstrated by noisy interference analysis using different types of noise.
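
    How conditioning separates direct from indirect regulation can be shown with a minimal sketch. It is not the RPNI algorithm: it assumes jointly Gaussian expression values, for which mutual information and first-order conditional mutual information have closed forms in terms of the Pearson and partial correlation, and it uses simulated data with hypothetical variable names.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gaussian_mi(x, y):
    """I(X;Y) = -0.5 * ln(1 - r^2) under a Gaussian assumption."""
    r = pearson(x, y)
    return -0.5 * math.log(1 - r * r)


def gaussian_cmi(x, y, z):
    """I(X;Y|Z) = -0.5 * ln(1 - rho^2), with rho the partial correlation of X and Y given Z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    rho = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return -0.5 * math.log(1 - rho * rho)


# Hypothetical indirect regulation: regulator -> intermediate -> target.
rng = random.Random(1)
regulator = [rng.gauss(0, 1) for _ in range(500)]
intermediate = [g + rng.gauss(0, 0.5) for g in regulator]
target = [g + rng.gauss(0, 0.5) for g in intermediate]

mi = gaussian_mi(regulator, target)
cmi = gaussian_cmi(regulator, target, intermediate)
print(f"MI(regulator; target)                 = {mi:.3f}")
print(f"CMI(regulator; target | intermediate) = {cmi:.3f}")
```

    A PC-style procedure would delete the regulator-target edge when the conditional value falls below a chosen threshold; which genes to condition on is the selection problem the abstract addresses.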

  9. Efficient parsimony-based methods for phylogenetic network reconstruction.

    Science.gov (United States)

    Jin, Guohua; Nakhleh, Luay; Snir, Sagi; Tuller, Tamir

    2007-01-15

    Phylogenies--the evolutionary histories of groups of organisms--play a major role in representing relationships among biological entities. Although many biological processes can be effectively modeled as tree-like relationships, others, such as hybrid speciation and horizontal gene transfer (HGT), result in networks, rather than trees, of relationships. Hybrid speciation is a significant evolutionary mechanism in plants, fish and other groups of species. HGT plays a major role in bacterial genome diversification and is a significant mechanism by which bacteria develop resistance to antibiotics. Maximum parsimony is one of the most commonly used criteria for phylogenetic tree inference. Roughly speaking, inference based on this criterion seeks the tree that minimizes the amount of evolution. In 1990, Jotun Hein proposed using this criterion for inferring the evolution of sequences subject to recombination. Preliminary results on small synthetic datasets in Nakhleh et al. (2005) demonstrated the criterion's application to phylogenetic network reconstruction in general and HGT detection in particular. However, the naive algorithms used by the authors are inapplicable to large datasets due to their demanding computational requirements. Further, no rigorous theoretical analysis of computing the criterion was given, nor was it tested on biological data. In the present work we prove that the problem of scoring the parsimony of a phylogenetic network is NP-hard and provide an improved fixed parameter tractable algorithm for it. Further, we devise efficient heuristics for parsimony-based reconstruction of phylogenetic networks. We test our methods on both synthetic and biological data (rbcL gene in bacteria) and obtain very promising results.
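
    For the tree case, the parsimony criterion described above can be scored with Fitch's algorithm, sketched below. This covers only trees, not the network scoring problem that the abstract proves NP-hard, and the nested-tuple tree encoding and the function name `fitch_score` are assumptions made for illustration.

```python
def fitch_score(tree):
    """Return (candidate_states, substitutions) for one character on a rooted binary tree.

    A leaf is a string holding its observed state; an internal node is a (left, right) tuple.
    """
    if isinstance(tree, str):
        return {tree}, 0
    left, right = tree
    left_states, left_cost = fitch_score(left)
    right_states, right_cost = fitch_score(right)
    shared = left_states & right_states
    if shared:
        # The children agree on at least one state: no extra substitution is needed here.
        return shared, left_cost + right_cost
    # The children disagree: keep both possibilities and count one substitution.
    return left_states | right_states, left_cost + right_cost + 1


# Two hypothetical topologies over the same four leaves with states A, A, C, C.
tree_grouped = (("A", "A"), ("C", "C"))
tree_mixed = (("A", "C"), ("A", "C"))

print("grouped topology:", fitch_score(tree_grouped)[1], "substitution(s)")
print("mixed topology:  ", fitch_score(tree_mixed)[1], "substitution(s)")
```

    Under maximum parsimony the grouped topology is preferred because it explains the same leaf states with fewer substitutions; for a full alignment the per-site scores would be summed.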

  10. Craniofacial Reconstruction Evaluation by Geodesic Network

    Directory of Open Access Journals (Sweden)

    Junli Zhao

    2014-01-01

    Full Text Available Craniofacial reconstruction is to estimate an individual’s face model from its skull. It has a widespread application in forensic medicine, archeology, medical cosmetic surgery, and so forth. However, little attention is paid to the evaluation of craniofacial reconstruction. This paper proposes an objective method to evaluate globally and locally the reconstructed craniofacial faces based on the geodesic network. Firstly, the geodesic networks of the reconstructed craniofacial face and the original face are built, respectively, by geodesics and isogeodesics, whose intersections are network vertices. Then, the absolute value of the correlation coefficient of the features of all corresponding geodesic network vertices between two models is taken as the holistic similarity, where the weighted average of the shape index values in a neighborhood is defined as the feature of each network vertex. Moreover, the geodesic network vertices of each model are divided into six subareas, that is, forehead, eyes, nose, mouth, cheeks, and chin, and the local similarity is measured for each subarea. Experiments using 100 pairs of reconstructed craniofacial faces and their corresponding original faces show that the evaluation by our method is roughly consistent with the subjective evaluation derived from thirty-five persons in five groups.

  11. Multiple Linear Regression for Reconstruction of Gene Regulatory Networks in Solving Cascade Error Problems

    Directory of Open Access Journals (Sweden)

    Faridah Hani Mohamed Salleh

    2017-01-01

    Full Text Available Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods had been inaccurate prediction of cascade motifs. Cascade error is defined as the wrong prediction of cascade motifs, where an indirect interaction is misinterpreted as a direct interaction. Despite the active research on various GRN prediction methods, the discussion on specific methods to solve problems related to cascade errors is still lacking. In fact, the experiments conducted by the past studies were not specifically geared towards proving the ability of GRN prediction methods in avoiding the occurrences of cascade errors. Hence, this research aims to propose Multiple Linear Regression (MLR) to infer GRN from gene expression data and to avoid wrongly inferring an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations of the real experiment datasets was far less than the number of predictors, some predictors were eliminated by extracting the random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade error by using a novel experimental procedure that had been proposed in this work. The experiment revealed that the number of cascade errors had been very minimal. Apart from that, the Belsley collinearity test proved that multicollinearity did affect the datasets used in this experiment greatly. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5.

  12. Sequence-based model of gap gene regulatory network.

    Science.gov (United States)

    Kozlov, Konstantin; Gursky, Vitaly; Kulakovskiy, Ivan; Samsonova, Maria

    2014-01-01

    The detailed analysis of transcriptional regulation is crucially important for understanding biological processes. The gap gene network in Drosophila attracts great interest among researchers studying mechanisms of transcriptional regulation. It implements the most upstream regulatory layer of the segmentation gene network. The knowledge of molecular mechanisms involved in gap gene regulation is far less complete than that of genetics of the system. Mathematical modeling goes beyond insights gained by genetics and molecular approaches. It allows us to reconstruct wild-type gene expression patterns in silico, infer underlying regulatory mechanism and prove its sufficiency. We developed a new model that provides a dynamical description of gap gene regulatory systems, using detailed DNA-based information, as well as spatial transcription factor concentration data at varying time points. We showed that this model correctly reproduces gap gene expression patterns in wild type embryos and is able to predict gap expression patterns in Kr mutants and four reporter constructs. We used a four-fold cross-validation test and fitting to a random dataset to validate the model and prove its sufficiency in data description. The identifiability analysis showed that most model parameters are well identifiable. We reconstructed the gap gene network topology and studied the impact of individual transcription factor binding sites on the model output. We measured this impact by calculating the site regulatory weight as a normalized difference between the residual sum of squares error for the set of all annotated sites and for the set with the site of interest excluded. The reconstructed topology of the gap gene network is in agreement with previous modeling results and data from literature. We showed that 1) the regulatory weights of transcription factor binding sites show very weak correlation with their PWM score; 2) sites with low regulatory weight are important for the model output; 3

  13. MINER: exploratory analysis of gene interaction networks by machine learning from expression data

    Directory of Open Access Journals (Sweden)

    Sivieng Jane

    2009-12-01

    Full Text Available Abstract Background The reconstruction of gene regulatory networks from high-throughput "omics" data has become a major goal in the modelling of living systems. Numerous approaches have been proposed, most of which attempt only "one-shot" reconstruction of the whole network with no intervention from the user, or offer only simple correlation analysis to infer gene dependencies. Results We have developed MINER (Microarray Interactive Network Exploration and Representation), an application that combines multivariate non-linear tree learning of individual gene regulatory dependencies, visualisation of these dependencies as both trees and networks, and representation of known biological relationships based on common Gene Ontology annotations. MINER allows biologists to explore the dependencies influencing the expression of individual genes in a gene expression data set in the form of decision, model or regression trees, using their domain knowledge to guide the exploration and formulate hypotheses. Multiple trees can then be summarised in the form of a gene network diagram. MINER is being adopted by several of our collaborators and has already led to the discovery of a new significant regulatory relationship with subsequent experimental validation. Conclusion Unlike most gene regulatory network inference methods, MINER allows the user to start from genes of interest and build the network gene-by-gene, incorporating domain expertise in the process. This approach has been used successfully with RNA microarray data but is applicable to other quantitative data produced by high-throughput technologies such as proteomics and "next generation" DNA sequencing.

  14. A reconstruction problem for a class of phylogenetic networks with lateral gene transfers.

    Science.gov (United States)

    Cardona, Gabriel; Pons, Joan Carles; Rosselló, Francesc

    2015-01-01

    Lateral, or Horizontal, Gene Transfers are a type of asymmetric evolutionary events where genetic material is transferred from one species to another. In this paper we consider LGT networks, a general model of phylogenetic networks with lateral gene transfers which consist, roughly, of a principal rooted tree with its leaves labelled on a set of taxa, and a set of extra secondary arcs between nodes in this tree representing lateral gene transfers. An LGT network gives rise in a natural way to a principal phylogenetic subtree and a set of secondary phylogenetic subtrees, which, roughly, represent, respectively, the main line of evolution of most genes and the secondary lines of evolution through lateral gene transfers. We introduce a set of simple conditions on an LGT network that guarantee that its principal and secondary phylogenetic subtrees are pairwise different and that these subtrees determine, up to isomorphism, the LGT network. We then give an algorithm that, given a set of pairwise different phylogenetic trees [Formula: see text] on the same set of taxa, outputs, when it exists, the LGT network that satisfies these conditions and such that its principal phylogenetic tree is [Formula: see text] and its secondary phylogenetic trees are [Formula: see text].

  15. Tomographic image reconstruction using Artificial Neural Networks

    International Nuclear Information System (INIS)

    Paschalis, P.; Giokaris, N.D.; Karabarbounis, A.; Loudos, G.K.; Maintas, D.; Papanicolas, C.N.; Spanoudaki, V.; Tsoumpas, Ch.; Stiliaris, E.

    2004-01-01

    A new image reconstruction technique based on the usage of an Artificial Neural Network (ANN) is presented. The most crucial factor in designing such a reconstruction system is the network architecture and the number of the input projections needed to reconstruct the image. Although the training phase requires a large amount of input samples and a considerable CPU time, the trained network is characterized by simplicity and quick response. The performance of this ANN is tested using several image patterns. It is intended to be used together with a phantom rotating table and the γ-camera of IASA for SPECT image reconstruction.

  16. Inference of cancer-specific gene regulatory networks using soft computing rules.

    Science.gov (United States)

    Wang, Xiaosheng; Gotoh, Osamu

    2010-03-24

    Perturbations of gene regulatory networks are essentially responsible for oncogenesis. Therefore, inferring the gene regulatory networks is a key step to overcoming cancer. In this work, we propose a method for inferring directed gene regulatory networks based on soft computing rules, which can identify important cause-effect regulatory relations of gene expression. First, we identify important genes associated with a specific cancer (colon cancer) using a supervised learning approach. Next, we reconstruct the gene regulatory networks by inferring the regulatory relations among the identified genes, and their regulated relations by other genes within the genome. We obtain two meaningful findings. One is that upregulated genes are regulated by more genes than downregulated ones, while downregulated genes regulate more genes than upregulated ones. The other one is that tumor suppressors suppress tumor activators and activate other tumor suppressors strongly, while tumor activators activate other tumor activators and suppress tumor suppressors weakly, indicating the robustness of biological systems. These findings provide valuable insights into the pathogenesis of cancer.

  17. Crowdsourcing the nodulation gene network discovery environment.

    Science.gov (United States)

    Li, Yupeng; Jackson, Scott A

    2016-05-26

    The Legumes (Fabaceae) are an economically and ecologically important group of plant species with the conspicuous capacity for symbiotic nitrogen fixation in root nodules, specialized plant organs containing symbiotic microbes. With the aim of understanding the underlying molecular mechanisms leading to nodulation, many efforts are underway to identify nodulation-related genes and determine how these genes interact with each other. In order to accurately and efficiently reconstruct the nodulation gene network, a crowdsourcing platform, CrowdNodNet, was created. The platform uses the jQuery and vis.js JavaScript libraries, so that users are able to interactively visualize and edit the gene network, and easily access information about the network, e.g. gene lists, gene interactions and gene functional annotations. In addition, all the gene information is written on MediaWiki pages, enabling users to edit and contribute to the network curation. Utilizing the continuously updated, collaboratively written, and community-reviewed Wikipedia model, the platform could, in a short time, become a comprehensive knowledge base of nodulation-related pathways. The platform could also be used for other biological processes, and thus has great potential for integrating and advancing our understanding of the functional genomics and systems biology of any process for any species. The platform is available at http://crowd.bioops.info/, and the source code can be openly accessed at https://github.com/bioops/crowdnodnet under the MIT License.

  18. Novel candidate genes important for asthma and hypertension comorbidity revealed from associative gene networks.

    Science.gov (United States)

    Saik, Olga V; Demenkov, Pavel S; Ivanisenko, Timofey V; Bragina, Elena Yu; Freidin, Maxim B; Goncharova, Irina A; Dosenko, Victor E; Zolotareva, Olga I; Hofestaedt, Ralf; Lavrik, Inna N; Rogaev, Evgeny I; Ivanisenko, Vladimir A

    2018-02-13

    Hypertension and bronchial asthma are major health issues. As of 2014, approximately one billion adults, or ~22% of the world population, have had hypertension. As of 2011, 235-330 million people globally have been affected by asthma and approximately 250,000-345,000 people have died each year from the disease. The development of effective therapies against these diseases is complicated by their comorbidity, which is often a major problem in their diagnosis and treatment. Hence, in this study a bioinformatics methodology for the analysis of the comorbidity of these two diseases has been developed. The search for candidate genes related to the comorbid conditions of asthma and hypertension can help in elucidating the molecular mechanisms underlying the comorbid condition of these two diseases, and can also be useful for genotyping and identifying new drug targets. Using ANDSystem, the reconstruction and analysis of gene networks associated with asthma and hypertension was carried out. The gene network of asthma included 755 genes/proteins and 62,603 interactions, while the gene network of hypertension included 713 genes/proteins and 45,479 interactions. Two hundred and five genes/proteins and 9638 interactions were shared between asthma and hypertension. An approach for ranking genes implicated in the comorbid condition of two diseases was proposed. The approach is based on nine criteria for ranking genes by their importance, including standard methods of gene prioritization (Endeavor, ToppGene) as well as original criteria that take into account the characteristics of an associative gene network and the presence of known polymorphisms in the analysed genes. According to the proposed approach, the genes IL10, TLR4, and CAT had the highest priority in the development of comorbidity of these two diseases. Additionally, it was revealed that the list of top genes is enriched with apoptotic genes and genes involved in

  19. Reconstruction of network topology using status-time-series data

    Science.gov (United States)

    Pandey, Pradumn Kumar; Badarla, Venkataramana

    2018-01-01

    Uncovering the heterogeneous connection pattern of a networked system from the available status-time-series (STS) data of a dynamical process on the network is of great interest in network science and known as a reverse engineering problem. Dynamical processes on a network are affected by the structure of the network. The dependency between the diffusion dynamics and structure of the network can be utilized to retrieve the connection pattern from the diffusion data. Information of the network structure can help to devise the control of dynamics on the network. In this paper, we consider the problem of network reconstruction from the available status-time-series (STS) data using matrix analysis. The proposed method of network reconstruction from the STS data is tested successfully under susceptible-infected-susceptible (SIS) diffusion dynamics on real-world and computer-generated benchmark networks. High accuracy and efficiency of the proposed reconstruction procedure from the status-time-series data define the novelty of the method. Our proposed method outperforms compressed sensing theory (CST) based method of network reconstruction using STS data. Further, the same procedure of network reconstruction is applied to the weighted networks. The ordering of the edges in the weighted networks is identified with high accuracy.

  20. Inference of Cancer-specific Gene Regulatory Networks Using Soft Computing Rules

    Directory of Open Access Journals (Sweden)

    Xiaosheng Wang

    2010-03-01

    Perturbations of gene regulatory networks are essentially responsible for oncogenesis. Therefore, inferring the gene regulatory networks is a key step to overcoming cancer. In this work, we propose a method for inferring directed gene regulatory networks based on soft computing rules, which can identify important cause-effect regulatory relations of gene expression. First, we identify important genes associated with a specific cancer (colon cancer) using a supervised learning approach. Next, we reconstruct the gene regulatory networks by inferring the regulatory relations among the identified genes, and their regulated relations by other genes within the genome. We obtain two meaningful findings. One is that upregulated genes are regulated by more genes than downregulated ones, while downregulated genes regulate more genes than upregulated ones. The other one is that tumor suppressors suppress tumor activators and activate other tumor suppressors strongly, while tumor activators activate other tumor activators and suppress tumor suppressors weakly, indicating the robustness of biological systems. These findings provide valuable insights into the pathogenesis of cancer.

  1. Gap-filling analysis of the iJO1366 Escherichia coli metabolic network reconstruction for discovery of metabolic functions

    Directory of Open Access Journals (Sweden)

    Orth Jeffrey D

    2012-05-01

    Background: The iJO1366 reconstruction of the metabolic network of Escherichia coli is one of the most complete and accurate metabolic reconstructions available for any organism. Still, because our knowledge of even well-studied model organisms such as this one is incomplete, this network reconstruction contains gaps and possible errors. There are a total of 208 blocked metabolites in iJO1366, representing gaps in the network. Results: A new model improvement workflow was developed to compare model-based phenotypic predictions to experimental data to fill gaps and correct errors. A Keio Collection-based dataset of E. coli gene essentiality was obtained from literature data and compared to model predictions. The SMILEY algorithm was then used to predict the most likely missing reactions in the reconstructed network, adding reactions from a KEGG-based universal set of metabolic reactions. The feasibility of these putative reactions was determined by comparing updated versions of the model to the experimental dataset, and genes were predicted for the most feasible reactions. Conclusions: Numerous improvements to the iJO1366 metabolic reconstruction were suggested by these analyses. Experiments were performed to verify several computational predictions, including a new mechanism for growth on myo-inositol. The other predictions made in this study should be experimentally verifiable by similar means. Validating all of the predictions made here represents a substantial but important undertaking.

  2. SCNS: a graphical tool for reconstructing executable regulatory networks from single-cell genomic data.

    Science.gov (United States)

    Woodhouse, Steven; Piterman, Nir; Wintersteiger, Christoph M; Göttgens, Berthold; Fisher, Jasmin

    2018-05-25

    Reconstruction of executable mechanistic models from single-cell gene expression data represents a powerful approach to understanding developmental and disease processes. New ambitious efforts like the Human Cell Atlas will soon lead to an explosion of data with potential for uncovering and understanding the regulatory networks which underlie the behaviour of all human cells. In order to take advantage of this data, however, there is a need for general-purpose, user-friendly and efficient computational tools that can be readily used by biologists who do not have specialist computer science knowledge. The Single Cell Network Synthesis toolkit (SCNS) is a general-purpose computational tool for the reconstruction and analysis of executable models from single-cell gene expression data. Through a graphical user interface, SCNS takes single-cell qPCR or RNA-sequencing data taken across a time course, and searches for logical rules that drive transitions from early cell states towards late cell states. Because the resulting reconstructed models are executable, they can be used to make predictions about the effect of specific gene perturbations on the generation of specific lineages. SCNS should be of broad interest to the growing number of researchers working in single-cell genomics and will help further facilitate the generation of valuable mechanistic insights into developmental, homeostatic and disease processes.

  3. CoryneRegNet 4.0 – A reference database for corynebacterial gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Baumbach Jan

    2007-11-01

    Background: Detailed information on DNA-binding transcription factors (the key players in the regulation of gene expression) and on transcriptional regulatory interactions of microorganisms, deduced from literature-derived knowledge, computer predictions and global DNA microarray hybridization experiments, has opened the way for the genome-wide analysis of transcriptional regulatory networks. The large-scale reconstruction of these networks allows the in silico analysis of cell behavior in response to changing environmental conditions. We previously published CoryneRegNet, an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. Results: Now we introduce CoryneRegNet release 4.0, which integrates data on the gene regulatory networks of 4 corynebacteria, 2 mycobacteria and the model organism Escherichia coli K12. Like the previous versions, CoryneRegNet provides a web-based user interface to access the database content, to allow various queries, and to support the reconstruction, analysis and visualization of regulatory networks at different hierarchical levels. In this article, we present the further improved database content of CoryneRegNet along with novel analysis features. The network visualization feature GraphVis now allows inter-species comparisons of reconstructed gene regulatory networks and the projection of gene expression levels onto those networks. Therefore, we added stimulon data directly into the database, and also provide Web Service access to the DNA microarray analysis platform EMMA. Additionally, CoryneRegNet now provides a SOAP-based Web Service server, which can easily be consumed by other bioinformatics software systems. Stimulons (imported from the database, or uploaded by the user) can be analyzed in the context of known

  4. Integration of steady-state and temporal gene expression data for the inference of gene regulatory networks.

    Science.gov (United States)

    Wang, Yi Kan; Hurley, Daniel G; Schnell, Santiago; Print, Cristin G; Crampin, Edmund J

    2013-01-01

    We develop a new regression algorithm, cMIKANA, for inference of gene regulatory networks from combinations of steady-state and time-series gene expression data. Using simulated gene expression datasets to assess the accuracy of reconstructing gene regulatory networks, we show that steady-state and time-series data sets can successfully be combined to identify gene regulatory interactions using the new algorithm. Inferring gene networks from combined data sets was found to be advantageous when using noisy measurements collected with either lower sampling rates or a limited number of experimental replicates. We illustrate our method by applying it to a microarray gene expression dataset from human umbilical vein endothelial cells (HUVECs) which combines time series data from treatment with growth factor TNF and steady state data from siRNA knockdown treatments. Our results suggest that the combination of steady-state and time-series datasets may provide better prediction of RNA-to-RNA interactions, and may also reveal biological features that cannot be identified from dynamic or steady state information alone. Finally, we consider the experimental design of genomics experiments for gene regulatory network inference and show that network inference can be improved by incorporating steady-state measurements with time-series data.

  5. An approach for reduction of false predictions in reverse engineering of gene regulatory networks.

    Science.gov (United States)

    Khan, Abhinandan; Saha, Goutam; Pal, Rajat Kumar

    2018-05-14

    A gene regulatory network discloses the regulatory interactions amongst genes under a particular condition of the human body. The accurate reconstruction of such networks from time-series genetic expression data using computational tools offers a stiff challenge for contemporary computer scientists. This is crucial to facilitate the understanding of the proper functioning of a living organism. Unfortunately, the computational methods produce many false predictions along with the correct ones, which is undesirable. Investigations in the domain focus on the identification of as many correct regulations as possible in the reverse engineering of gene regulatory networks to make it more reliable and biologically relevant. One way to achieve this is to reduce the number of incorrect predictions in the reconstructed networks. In the present investigation, we have proposed a novel scheme to decrease the number of false predictions by suitably combining several metaheuristic techniques. We have also implemented it using a dataset ensemble approach (i.e. combining multiple datasets). We have employed the proposed methodology on real-world experimental datasets of the SOS DNA Repair network of Escherichia coli and the IRMA network of Saccharomyces cerevisiae. Subsequently, we have experimented upon somewhat larger, in silico networks, namely, DREAM3 and DREAM4 Challenge networks, and 15-gene and 20-gene networks extracted from the GeneNetWeaver database. To study the effect of multiple datasets on the quality of the inferred networks, we have used four datasets in each experiment. The obtained results are encouraging, as the proposed methodology can reduce the number of false predictions significantly, without using any supplementary prior biological information for larger gene regulatory networks. It is also observed that if a small amount of prior biological information is incorporated, the results improve further w.r.t. the prediction of true positives.

  6. Reconstruction of networks from one-step data by matching positions

    Science.gov (United States)

    Wu, Jianshe; Dang, Ni; Jiao, Yang

    2018-05-01

    Estimating the topology of a network from short time series data is a challenge. In this paper, a method based on matching positions is developed to reconstruct the topology of a network from only one-step data. We consider a general network model of coupled agents, in which the phase transformation of each node is determined by its neighbors. From the phase transformation information from one step to the next, the connections of the tail vertices are reconstructed first by matching positions. By removing the already reconstructed vertices and repeatedly reconstructing the connections of tail vertices, the topology of the entire network is reconstructed. For sparse scale-free networks with more than ten thousand nodes, we obtain almost the actual topology using only the one-step data in simulations.

  7. Reconstructing the Hopfield network as an inverse Ising problem

    International Nuclear Information System (INIS)

    Huang Haiping

    2010-01-01

    We test four fast mean-field-type algorithms on Hopfield networks as an inverse Ising problem. The equilibrium behavior of Hopfield networks is simulated through Glauber dynamics. In the low-temperature regime, the simulated annealing technique is adopted. Although the performance of these network reconstruction algorithms on simulated networks of spiking neurons has been extensively studied recently, an analysis of Hopfield networks has so far been lacking. For the Hopfield network, we found that, in the retrieval phase favored when the network is to memorize one of the stored patterns, all the reconstruction algorithms fail to extract interactions within a desired accuracy, and the same failure occurs in the spin-glass phase where spurious minima show up, while in the paramagnetic phase, albeit unfavored during the retrieval dynamics, the algorithms work well to reconstruct the network itself. This implies that, as an inverse problem, the paramagnetic phase is conversely useful for reconstructing the network while the retrieval phase loses all the information about interactions in the network except for the case where only one pattern is stored. The performance of the algorithms is studied with respect to the system size, memory load, and temperature; sample-to-sample fluctuations are also considered.
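
    As a rough illustration of the simulation step mentioned above, the following sketch builds Hopfield couplings from stored patterns and runs Glauber (heat-bath) dynamics on them. The Hebbian coupling rule, the function names, and the parameter values are assumptions chosen for illustration; the record does not specify the paper's actual setup, and the reconstruction algorithms themselves are not shown.

```python
import math
import random


def hebbian_couplings(patterns):
    """Hopfield couplings J[i][j] = (1/N) * sum over patterns of xi_i * xi_j (zero diagonal)."""
    n = len(patterns[0])
    couplings = [[0.0] * n for _ in range(n)]
    for xi in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    couplings[i][j] += xi[i] * xi[j] / n
    return couplings


def glauber_sweep(spins, couplings, beta, rng):
    """Perform N random single-spin heat-bath updates at inverse temperature beta (in place)."""
    n = len(spins)
    for _ in range(n):
        i = rng.randrange(n)
        field = sum(couplings[i][j] * spins[j] for j in range(n))
        # Probability that spin i is set to +1 given its local field.
        p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * field))
        spins[i] = 1 if rng.random() < p_up else -1


def overlap(spins, pattern):
    """Normalized overlap between the current state and a stored pattern."""
    return sum(s * x for s, x in zip(spins, pattern)) / len(spins)


if __name__ == "__main__":
    rng = random.Random(0)
    pattern = [rng.choice((-1, 1)) for _ in range(50)]
    J = hebbian_couplings([pattern])
    state = [-s for s in pattern[:5]] + pattern[5:]  # corrupt five spins
    for _ in range(20):
        glauber_sweep(state, J, beta=10.0, rng=rng)
    print("overlap with stored pattern:", overlap(state, pattern))
```

    Spin configurations sampled this way would be the input to an inverse Ising method, which tries to recover the couplings from measured magnetizations and correlations.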

  8. Inferring gene dependency network specific to phenotypic alteration based on gene expression data and clinical information of breast cancer.

    Science.gov (United States)

    Zhou, Xionghui; Liu, Juan

    2014-01-01

    Although many methods have been proposed to reconstruct gene regulatory networks, most of them, when applied to sample-based data, cannot reveal the gene regulatory relations underlying the phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, whereas previous studies either neglected it or only used it to select the differentially expressed genes as the inputs to construct the gene regulatory network. To be specific, we integrate phenotype information with gene expression data to identify the gene dependency pairs by using the method of conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely the gene dependency network. In this way, we have constructed the gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network to be scale-free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specifically related to breast cancer, which suggests that our gene dependency network is meaningful. The validity has also been justified by literature investigation. From the network, we have selected 43 discriminative hubs as a signature to build the classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms those classification models with published signatures. In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene signature for
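
    The conditional mutual information mentioned in this record can be illustrated with a plug-in estimate on discretized data. This is a minimal sketch under stated assumptions: expression values are taken to be already discretized, the function name is hypothetical, and the exact way the paper scores a dependency pair (A,B) is not given in the record, so the example only computes I(A; phenotype | B).

```python
import math
from collections import Counter


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in bits from three equal-length discrete sequences."""
    if not (len(x) == len(y) == len(z)) or len(x) == 0:
        raise ValueError("x, y and z must be non-empty and of equal length")
    n = len(x)
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    total = 0.0
    for (xi, yi, zi), count in n_xyz.items():
        # p(x,y,z) * log2[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ]; the 1/n factors cancel in the ratio.
        ratio = n_z[zi] * count / (n_xz[(xi, zi)] * n_yz[(yi, zi)])
        total += (count / n) * math.log2(ratio)
    return total


if __name__ == "__main__":
    # Toy data: the phenotype is the XOR of two binarized genes, so gene A alone
    # carries no information about the phenotype, but it does once gene B is known.
    gene_a = [0, 0, 1, 1]
    gene_b = [0, 1, 0, 1]
    phenotype = [0, 1, 1, 0]
    print(conditional_mutual_information(gene_a, phenotype, gene_b))
```

    Passing a constant sequence as the conditioning variable reduces the estimate to ordinary mutual information, which makes it easy to compare the conditional and unconditional values for the same pair.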

  9. Reconstruction of the gene regulatory network involved in the sonic hedgehog pathway with a potential role in early development of the mouse brain.

    Directory of Open Access Journals (Sweden)

    Jinhua Liu

    2014-10-01

    The Sonic hedgehog (Shh) signaling pathway is crucial for pattern formation in early central nervous system development. By systematically analyzing high-throughput in situ hybridization data of E11.5 mouse brain, we found that Shh and its receptor Ptch1 define two adjacent mutually exclusive gene expression domains: Shh+Ptch1- and Shh-Ptch1+. These two domains are associated respectively with Foxa2 and Gata3, two transcription factors that play key roles in specifying them. Gata3 ChIP-seq experiments and RNA-seq assays on Gata3-knockdown cells revealed that Gata3 up-regulates the genes that are enriched in the Shh-Ptch1+ domain. Important Gata3 targets include Slit2 and Slit3, which are involved in the process of axon guidance, as well as Slc18a1, Th and Qdpr, which are associated with neurotransmitter synthesis and release. By contrast, Foxa2 both up-regulates the genes expressed in the Shh+Ptch1- domain and down-regulates the genes characteristic of the Shh-Ptch1+ domain. From these and other data, we were able to reconstruct a gene regulatory network governing both domains. Our work provides the first genome-wide characterization of the gene regulatory network involved in the Shh pathway that underlies pattern formation in the early mouse brain.

  10. Reconstruction and analysis of nutrient-induced phosphorylation networks in Arabidopsis thaliana.

    Directory of Open Access Journals (Sweden)

    Guangyou eDuan

    2013-12-01

    Elucidating the dynamics of molecular processes in living organisms in response to external perturbations is a central goal in modern systems biology. We investigated the dynamics of protein phosphorylation events in Arabidopsis thaliana exposed to changing nutrient conditions. Phosphopeptide expression levels were detected at five consecutive time points over a time interval of 30 minutes after nutrient resupply following prior starvation. The three tested inorganic, ionic nutrients NH4+, NO3- and PO43- elicited similar phosphosignaling responses that were distinguishable from those invoked by the sugars mannitol and sucrose. When embedded in the protein-protein interaction network of Arabidopsis thaliana, phosphoproteins were found to exhibit a higher degree compared to average proteins. Based on the time-series data, we reconstructed a network of regulatory interactions mediated by phosphorylation. The performance of different network inference methods was evaluated by the observed likelihood of physical interactions within and across different subcellular compartments and based on gene ontology semantic similarity. The dynamic phosphorylation network was then reconstructed using a Pearson correlation method with added directionality based on partial variance differences. The topology of the inferred integrated network corresponds to an information dissemination architecture, in which the phosphorylation signal is passed on to an increasing number of phosphoproteins stratified into an initiation, processing, and effector layer. Specific phosphorylation peptide motifs associated with the distinct layers were identified, indicating the action of layer-specific kinases. Despite the limited temporal resolution, combined with information on subcellular location, the available time-series data proved useful for reconstructing the dynamics of the molecular signaling cascade in response to nutrient stress conditions in the plant Arabidopsis thaliana.

  11. Reconstruction of periodic signals using neural networks

    Directory of Open Access Journals (Sweden)

    José Danilo Rairán Antolines

    2014-01-01

    In this paper, we reconstruct a periodic signal by using two neural networks. The first network is trained to approximate the period of a signal, and the second network estimates the corresponding coefficients of the signal's Fourier expansion. The reconstruction strategy consists of minimizing the mean-square error via backpropagation algorithms over a single neuron with a sine transfer function. Additionally, this paper presents a mathematical proof of the quality of the approximation as well as a first modification of the algorithm, which requires less data to reach the same estimation, thus making the algorithm suitable for real-time implementations.
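
    The coefficient-estimation step can be sketched as gradient descent on the mean-square error. This is a simplified illustration, not the paper's method: the angular frequency is assumed to be known already (the paper estimates the period with a separate network), the sine neuron is replaced by the equivalent linear form a*sin(wt) + b*cos(wt) for a single harmonic, and the function name and default parameters are hypothetical.

```python
import math


def fit_harmonic(times, values, omega, learning_rate=0.5, steps=200):
    """Fit values ~ a*sin(omega*t) + b*cos(omega*t) by gradient descent on the mean-square error."""
    s = [math.sin(omega * t) for t in times]
    c = [math.cos(omega * t) for t in times]
    n = len(times)
    a = b = 0.0
    for _ in range(steps):
        residuals = [a * si + b * ci - v for si, ci, v in zip(s, c, values)]
        # Gradients of mean(residual^2) with respect to a and b.
        grad_a = 2.0 * sum(r * si for r, si in zip(residuals, s)) / n
        grad_b = 2.0 * sum(r * ci for r, ci in zip(residuals, c)) / n
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


if __name__ == "__main__":
    omega = 3.0
    period = 2.0 * math.pi / omega
    times = [period * k / 100 for k in range(100)]  # one full period
    values = [2.0 * math.sin(omega * t + 0.5) for t in times]
    a, b = fit_harmonic(times, values, omega)
    print("amplitude:", math.hypot(a, b), "phase:", math.atan2(b, a))
```

    The amplitude and phase of the harmonic follow from the fitted pair as hypot(a, b) and atan2(b, a); further harmonics could be fitted the same way with integer multiples of omega.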

  12. Inferring nonlinear gene regulatory networks from gene expression data based on distance correlation.

    Directory of Open Access Journals (Sweden)

    Xiaobo Guo

    Nonlinear dependence is common in the regulation mechanisms of gene regulatory networks (GRNs). It is vital to properly measure or test nonlinear dependence from real data for reconstructing GRNs and understanding the complex regulatory mechanisms within the cellular system. A recently developed measure called the distance correlation (DC) has been shown to be powerful and computationally efficient for detecting nonlinear dependence in many situations. In this work, we incorporate the DC into inferring GRNs from gene expression data without any underlying distribution assumptions. We propose three DC-based GRN inference algorithms: CLR-DC, MRNET-DC and REL-DC, and then compare them with the mutual information (MI)-based algorithms by analyzing two simulated datasets: benchmark GRNs from the DREAM challenge and GRNs generated by the SynTReN network generator, and an experimentally determined SOS DNA repair network in Escherichia coli. According to both the receiver operator characteristic (ROC) curve and the precision-recall (PR) curve, our proposed algorithms significantly outperform the MI-based algorithms in GRN inference.
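
    For readers unfamiliar with the measure, the sample distance correlation between two expression profiles can be computed as in the sketch below. It follows the standard double-centering definition for one-dimensional samples; the function names are illustrative, and the CLR-DC, MRNET-DC and REL-DC algorithms named in the record are not reproduced here.

```python
import math


def _centered_distances(values):
    """Pairwise absolute distances, double-centered by row, column and grand means."""
    n = len(values)
    d = [[abs(values[i] - values[j]) for j in range(n)] for i in range(n)]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length 1-D samples (value in [0, 1])."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)
    denom = math.sqrt(mean_product(a, a) * mean_product(b, b))
    if denom == 0.0:
        return 0.0  # at least one sample is constant
    return math.sqrt(max(dcov2, 0.0) / denom)


if __name__ == "__main__":
    x = [i / 50 - 1 for i in range(101)]
    y = [v * v for v in x]  # purely nonlinear (quadratic) dependence
    print("distance correlation:", distance_correlation(x, y))
```

    On a symmetric quadratic relationship like the one in the demo, the Pearson correlation is essentially zero while the distance correlation is clearly positive, which is the property that motivates its use for nonlinear regulatory relationships. This pure-Python version costs O(n^2) time and memory per gene pair.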

  13. Yeast 5 – an expanded reconstruction of the Saccharomyces cerevisiae metabolic network

    Directory of Open Access Journals (Sweden)

    Heavner Benjamin D

    2012-06-01

    Background: Efforts to improve the computational reconstruction of the Saccharomyces cerevisiae biochemical reaction network and to refine the stoichiometrically constrained metabolic models that can be derived from such a reconstruction have continued since the first stoichiometrically constrained yeast genome scale metabolic model was published in 2003. Continuing this ongoing process, we have constructed an update to the Yeast Consensus Reconstruction, Yeast 5. The Yeast Consensus Reconstruction is a product of efforts to forge a community-based reconstruction emphasizing standards compliance and biochemical accuracy via evidence-based selection of reactions. It draws upon models published by a variety of independent research groups as well as information obtained from biochemical databases and primary literature. Results: Yeast 5 refines the biochemical reactions included in the reconstruction, particularly reactions involved in sphingolipid metabolism; updates gene-reaction annotations; and emphasizes the distinction between reconstruction and stoichiometrically constrained model. Although it was not a primary goal, this update also improves the accuracy of model prediction of viability and auxotrophy phenotypes and increases the number of epistatic interactions. This update maintains an emphasis on standards compliance, unambiguous metabolite naming, and computer-readable annotations available through a structured document format. Additionally, we have developed MATLAB scripts to evaluate the model's predictive accuracy and to demonstrate basic model applications such as simulating aerobic and anaerobic growth. These scripts, which provide an independent tool for evaluating the performance of various stoichiometrically constrained yeast metabolic models using flux balance analysis, are included as Additional files 1, 2 and 3.

  14. P-Finder: Reconstruction of Signaling Networks from Protein-Protein Interactions and GO Annotations.

    Science.gov (United States)

    Young-Rae Cho; Yanan Xin; Speegle, Greg

    2015-01-01

    Because most complex genetic diseases are caused by defects of cell signaling, illuminating a signaling cascade is essential for understanding their mechanisms. We present three novel computational algorithms to reconstruct signaling networks between a starting protein and an ending protein using genome-wide protein-protein interaction (PPI) networks and gene ontology (GO) annotation data. A signaling network is represented as a directed acyclic graph in a merged form of multiple linear pathways. An advanced semantic similarity metric is applied for weighting PPIs as the preprocessing of all three methods. The first algorithm repeatedly extends the list of nodes based on path frequency towards an ending protein. The second algorithm repeatedly appends edges based on the occurrence of network motifs which indicate the link patterns more frequently appearing in a PPI network than in a random graph. The last algorithm uses the information propagation technique which iteratively updates edge orientations based on the path strength and merges the selected directed edges. Our experimental results demonstrate that the proposed algorithms achieve higher accuracy than previous methods when they are tested on well-studied pathways of S. cerevisiae. Furthermore, we introduce an interactive web application tool, called P-Finder, to visualize reconstructed signaling networks.

  15. Reconstructing gene regulatory networks from knock-out data using Gaussian Noise Model and Pearson Correlation Coefficient.

    Science.gov (United States)

    Mohamed Salleh, Faridah Hani; Arif, Shereena Mohd; Zainudin, Suhaila; Firdaus-Raih, Mohd

    2015-12-01

    A gene regulatory network (GRN) is a large and complex network consisting of interacting elements that, over time, affect each other's state. The dynamics of complex gene regulatory processes are difficult to understand using intuitive approaches alone. To overcome this problem, we propose an algorithm for inferring the regulatory interactions from knock-out data using a Gaussian model combined with the Pearson Correlation Coefficient (PCC). There are several problems relating to GRN construction that have been outlined in this paper. We demonstrated the ability of our proposed method to predict (1) the presence of regulatory interactions between genes, (2) their directionality and (3) their states (activation or suppression). The algorithm was applied to network sizes of 10 and 50 genes from DREAM3 datasets and network sizes of 10 from DREAM4 datasets. The predicted networks were evaluated based on AUROC and AUPR. We discovered that high false positive values were generated by our GRN prediction methods because the indirect regulations have been wrongly predicted as true relationships. We achieved satisfactory results as the majority of sub-networks achieved AUROC values above 0.5. Copyright © 2015 Elsevier Ltd. All rights reserved.
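
    The AUROC evaluation referred to above can be illustrated with a small rank-based computation. This is a generic sketch, not the DREAM scoring code: edge scores and gold-standard labels are hypothetical inputs, and ties between a true and a false edge are counted as half a win.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    positives = [s for s, is_edge in zip(scores, labels) if is_edge]
    negatives = [s for s, is_edge in zip(scores, labels) if not is_edge]
    if not positives or not negatives:
        raise ValueError("need at least one true edge and one non-edge")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0   # true edge ranked above non-edge
            elif p == q:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical confidence scores for four candidate edges and their gold-standard status.
    edge_scores = [0.9, 0.8, 0.3, 0.1]
    is_true_edge = [1, 0, 1, 0]
    print("AUROC:", auroc(edge_scores, is_true_edge))
```

    A value of 0.5 corresponds to random ranking of candidate edges, which is why the record treats AUROC values above 0.5 as a minimum sign of useful predictions.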

  16. Reconstruction of neutron spectra through neural networks

    International Nuclear Information System (INIS)

    Vega C, H.R.; Hernandez D, V.M.; Manzanares A, E.

    2003-01-01

    A neural network has been used to reconstruct neutron spectra from the counting rates of the detectors of a Bonner sphere spectrometric system. A group of 56 neutron spectra was selected to calculate the counting rates they would produce in a Bonner sphere system; the neural network was trained with these data and the spectra. To test the performance of the net, 12 spectra were used: 6 were taken from the group used for training, 3 were obtained from mathematical functions and the other 3 correspond to real spectra. Comparing the original spectra with those reconstructed by the net, we find that the net performs poorly when reconstructing monoenergetic spectra, which we attribute to the characteristics of the spectra used for training; for the other groups of spectra, however, the results of the net agree with the expected ones. (Author)

  17. Dynamic Regulatory Network Reconstruction for Alzheimer’s Disease Based on Matrix Decomposition Techniques

    Directory of Open Access Journals (Sweden)

    Wei Kong

    2014-01-01

    Alzheimer’s disease (AD) is the most common form of dementia and leads to irreversible neurodegenerative damage of the brain. Finding the dynamic responses of genes, signaling proteins, transcription factor (TF) activities, and regulatory networks during the progressive deterioration of AD would represent a significant advance in discovering the pathogenesis of AD. However, high-throughput technologies for measuring TF activities are not yet available on a genome-wide scale. In this study, based on DNA microarray gene expression data and a priori information on TFs, a network component analysis (NCA) algorithm is applied to determine the TF activities and regulatory influences on target genes (TGs) in incipient, moderate, and severe AD. Based on that, the dynamical gene regulatory networks of the deteriorative courses of AD were reconstructed. To select significant genes that are differentially expressed in different courses of AD, independent component analysis (ICA), which is better than the traditional clustering methods and can successfully group one gene in different meaningful biological processes, was used. The molecular biological analysis showed that the changes of TF activities and interactions of signaling proteins in mitosis, cell cycle, immune response, and inflammation play an important role in the deterioration of AD.

  18. Enhanced reconstruction of weighted networks from strengths and degrees

    International Nuclear Information System (INIS)

    Mastrandrea, Rossana; Fagiolo, Giorgio; Squartini, Tiziano; Garlaschelli, Diego

    2014-01-01

    Network topology plays a key role in many phenomena, from the spreading of diseases to that of financial crises. Whenever the whole structure of a network is unknown, one must resort to reconstruction methods that identify the least biased ensemble of networks consistent with the partial information available. A challenging case, frequently encountered due to privacy issues in the analysis of interbank flows and Big Data, is when there is only local (node-specific) aggregate information available. For binary networks, the relevant ensemble is one where the degree (number of links) of each node is constrained to its observed value. However, for weighted networks the problem is much more complicated. While the naïve approach prescribes to constrain the strengths (total link weights) of all nodes, recent counter-intuitive results suggest that in weighted networks the degrees are often more informative than the strengths. This implies that the reconstruction of weighted networks would be significantly enhanced by the specification of both strengths and degrees, a computationally hard and bias-prone procedure. Here we solve this problem by introducing an analytical and unbiased maximum-entropy method that works in the shortest possible time and does not require the explicit generation of reconstructed samples. We consider several real-world examples and show that, while the strengths alone give poor results, the additional knowledge of the degrees yields accurately reconstructed networks. Information-theoretic criteria rigorously confirm that the degree sequence, as soon as it is non-trivial, is irreducible to the strength sequence. Our results have strong implications for the analysis of motifs and communities and whenever the reconstructed ensemble is required as a null model to detect higher-order patterns.

  19. Singular Perturbation Analysis and Gene Regulatory Networks with Delay

    Science.gov (United States)

    Shlykova, Irina; Ponosov, Arcady

    2009-09-01

    There are different ways to model gene regulatory networks. Differential equations allow for a detailed description of the network's dynamics and provide an explicit model of the gene concentration changes over time. Production and relative degradation rate functions used in such models depend on the vector of steeply sloped threshold functions which characterize the activity of genes. The most popular example of the threshold functions comes from the Boolean network approach, where the threshold functions are given by step functions. The system of differential equations then becomes piecewise linear. The dynamics of this system can be described very easily between the thresholds, but not in the switching domains. For instance, this approach fails to analyze stationary points of the system and to define continuous solutions in the switching domains. These problems were studied in [2], [3], but the proposed model did not take into account a time delay in cellular systems. However, analysis of real gene expression data shows a considerable number of time-delayed interactions, suggesting that time delay is essential in gene regulation. Therefore, delays may have a great effect on the dynamics of the system and are one of the critical factors that should be considered in the reconstruction of gene regulatory networks. The goal of this work is to apply singular perturbation analysis to certain systems with delay and to obtain an analog of Tikhonov's theorem, which provides sufficient conditions for constructing the limit system in the delay case.

  20. On the Complexity of Reconstructing Chemical Reaction Networks

    DEFF Research Database (Denmark)

    Fagerberg, Rolf; Flamm, Christoph; Merkle, Daniel

    2013-01-01

    The analysis of the structure of chemical reaction networks is crucial for a better understanding of chemical processes. Such networks are well described as hypergraphs. However, due to the available methods, analyses regarding network properties are typically made on standard graphs derived from the full hypergraph description, e.g. on the so-called species and reaction graphs. However, a reconstruction of the underlying hypergraph from these graphs is not necessarily unique. In this paper, we address the problem of reconstructing a hypergraph from its species and reaction graph and show NP...

  1. ARACNe-AP: Gene Network Reverse Engineering through Adaptive Partitioning inference of Mutual Information. | Office of Cancer Genomics

    Science.gov (United States)

    The accurate reconstruction of gene regulatory networks from large scale molecular profile datasets represents one of the grand challenges of Systems Biology. The Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) represents one of the most effective tools to accomplish this goal. However, the initial Fixed Bandwidth (FB) implementation is both inefficient and unable to deal with sample sets providing largely uneven coverage of the probability density space.

  2. Machine Learning-Assisted Network Inference Approach to Identify a New Class of Genes that Coordinate the Functionality of Cancer Networks.

    Science.gov (United States)

    Ghanat Bari, Mehrab; Ung, Choong Yong; Zhang, Cheng; Zhu, Shizhen; Li, Hu

    2017-08-01

    Emerging evidence indicates the existence of a new class of cancer genes that act as "signal linkers" coordinating oncogenic signals between mutated and differentially expressed genes. While frequently mutated oncogenes and differentially expressed genes, which we term Class I cancer genes, are readily detected by most analytical tools, the new class of cancer-related genes, i.e., Class II, escape detection because they are neither mutated nor differentially expressed. Given this hypothesis, we developed a Machine Learning-Assisted Network Inference (MALANI) algorithm, which assesses all genes regardless of expression or mutational status in the context of cancer etiology. We used 8807 expression arrays, corresponding to 9 cancer types, to build more than 2 × 10^8 Support Vector Machine (SVM) models for reconstructing a cancer network. We found that ~3% of ~19,000 not differentially expressed genes are Class II cancer gene candidates. Some Class II genes that we found, such as SLC19A1 and ATAD3B, have been recently reported to associate with cancer outcomes. To our knowledge, this is the first study that utilizes both machine learning and network biology approaches to uncover Class II cancer genes that coordinate functionality in cancer networks, and it will illuminate our understanding of how genes modulated in a tissue-specific network contribute to tumorigenesis and therapy development.

  3. Reconstruction and Analysis of Human Kidney-Specific Metabolic Network Based on Omics Data

    Directory of Open Access Journals (Sweden)

    Ai-Di Zhang

    2013-01-01

    With the advent of high-throughput data production, recent studies of tissue-specific metabolic networks have largely advanced our understanding of the metabolic basis of various physiological and pathological processes. However, for the kidney, which plays an essential role in the body, the available kidney-specific model remains incomplete. This paper reports the reconstruction and characterization of the human kidney metabolic network based on transcriptome and proteome data. In silico simulations revealed that house-keeping genes were more essential than kidney-specific genes in maintaining kidney metabolism. Importantly, a total of 267 potential metabolic biomarkers for kidney-related diseases were successfully explored using this model. Furthermore, we found that the discrepancies in metabolic processes of different tissues correspond directly to the tissues' functions. Finally, the phenotypes of the differentially expressed genes in diabetic kidney disease were characterized, suggesting that these genes may affect disease development through altering kidney metabolism. Thus, the human kidney-specific model constructed in this study may provide valuable information for the metabolism of the kidney and offer excellent insights into complex kidney diseases.

  4. Reconstruction of ribosomal RNA genes from metagenomic data.

    Directory of Open Access Journals (Sweden)

    Lu Fan

    Direct sequencing of environmental DNA (metagenomics) has a great potential for describing the 16S rRNA gene diversity of microbial communities. However, current approaches using this 16S rRNA gene information to describe community diversity suffer from low taxonomic resolution or chimera problems. Here we describe a new strategy that involves stringent assembly and data filtering to reconstruct full-length 16S rRNA genes from metagenomic pyrosequencing data. Simulations showed that reconstructed 16S rRNA genes provided a true picture of the community diversity, had minimal rates of chimera formation and gave taxonomic resolution down to genus level. The strategy was furthermore compared to PCR-based methods to determine the microbial diversity in two marine sponges. This showed that about 30% of the abundant phylotypes reconstructed from metagenomic data failed to be amplified by PCR. Our approach is readily applicable to existing metagenomic datasets and is expected to lead to the discovery of new microbial phylotypes.

  5. Neural Network for Sparse Reconstruction

    Directory of Open Access Journals (Sweden)

    Qingfa Li

    2014-01-01

    We construct a neural network based on smoothing approximation techniques and the projected gradient method to solve a class of sparse reconstruction problems. Neural networks can be implemented by circuits and can be seen as an important method for solving optimization problems, especially large-scale problems. Smoothing approximation is an efficient technique for solving nonsmooth optimization problems. We combine these two techniques to overcome the difficulties of choosing the step size in discrete algorithms and the item in the set-valued map of the differential inclusion. In theory, the proposed network can converge to the optimal solution set of the given problem. Furthermore, some numerical experiments show the effectiveness of the proposed network.

  6. Genome-scale reconstruction of the Streptococcus pyogenes M49 metabolic network reveals growth requirements and indicates potential drug targets

    NARCIS (Netherlands)

    Levering, J.; Fiedler, T.; Sieg, A.; van Grinsven, K.W.A.; Hering, S.; Veith, N.; Olivier, B.G.; Klett, L.; Hugenholtz, J.; Teusink, B.; Kreikemeyer, B.; Kummer, U.

    2016-01-01

    Genome-scale metabolic models comprise stoichiometric relations between metabolites, as well as associations between genes and metabolic reactions and facilitate the analysis of metabolism. We computationally reconstructed the metabolic network of the lactic acid bacterium Streptococcus pyogenes

  7. Differential reconstructed gene interaction networks for deriving toxicity threshold in chemical risk assessment

    OpenAIRE

    Yang, Yi; Maxwell, Andrew; Zhang, Xiaowei; Wang, Nan; Perkins, Edward J; Zhang, Chaoyang; Gong, Ping

    2013-01-01

    Background: Pathway alterations reflected as changes in gene expression regulation and gene interaction can result from cellular exposure to toxicants. Such information is often used to elucidate toxicological modes of action. From a risk assessment perspective, alterations in biological pathways are a rich resource for setting toxicant thresholds, which may be more sensitive and mechanism-informed than traditional toxicity endpoints. Here we developed a novel differential networks (DNs) appro...

  8. Reconstruction of certain phylogenetic networks from their tree-average distances.

    Science.gov (United States)

    Willson, Stephen J

    2013-10-01

    Trees are commonly utilized to describe the evolutionary history of a collection of biological species, in which case the trees are called phylogenetic trees. Often these are reconstructed from data by making use of distances between extant species corresponding to the leaves of the tree. Because of increased recognition of the possibility of hybridization events, more attention is being given to the use of phylogenetic networks that are not necessarily trees. This paper describes the reconstruction of certain such networks from the tree-average distances between the leaves. For a certain class of phylogenetic networks, a polynomial-time method is presented to reconstruct the network from the tree-average distances. The method is proved to work if there is a single reticulation cycle.

  9. Hopfield neural network in HEP track reconstruction

    International Nuclear Information System (INIS)

    Muresan, R.; Pentia, M.

    1997-01-01

    In experimental particle physics, pattern recognition problems, specifically for neural network methods, occur frequently in track finding or feature extraction. Track finding is a combinatorial optimization problem. Given a set of points in Euclidean space, one tries to reconstruct particle trajectories, subject to smoothness constraints. The basic ingredients in a neural network are the N binary neurons and the synaptic strengths connecting them. In our case the neurons are the segments connecting all possible point pairs. The dynamics of the neural network is given by a local updating rule which evaluates for each neuron the sign of the 'upstream activity'. An updating rule in the form of a sigmoid function is given. The synaptic strengths are defined in terms of the angle between the segments and the lengths of the segments involved in the track reconstruction. An algorithm based on the Hopfield neural network has been developed and tested on the track coordinates measured by a silicon microstrip tracking system.

  10. Dynamic sporulation gene co-expression networks for Bacillus subtilis 168 and the food-borne isolate Bacillus amyloliquefaciens: a transcriptomic model.

    Science.gov (United States)

    Omony, Jimmy; de Jong, Anne; Krawczyk, Antonina O; Eijlander, Robyn T; Kuipers, Oscar P

    2018-02-09

    Sporulation is a survival strategy, adopted by bacterial cells in response to harsh environmental adversities. The adaptation potential differs between strains and the variations may arise from differences in gene regulation. Gene networks are a valuable way of studying such regulation processes and establishing associations between genes. We reconstructed and compared sporulation gene co-expression networks (GCNs) of the model laboratory strain Bacillus subtilis 168 and the food-borne industrial isolate Bacillus amyloliquefaciens. Transcriptome data obtained from samples of six stages during the sporulation process were used for network inference. Subsequently, a gene set enrichment analysis was performed to compare the reconstructed GCNs of B. subtilis 168 and B. amyloliquefaciens with respect to biological functions, which showed the enriched modules with coherent functional groups associated with sporulation. On the basis of the GCNs and the time-evolution of differentially expressed genes, we could identify novel candidate genes strongly associated with sporulation in B. subtilis 168 and B. amyloliquefaciens. The GCNs offer a framework for exploring transcription factors, their targets, and co-expressed genes during sporulation. Furthermore, the methodology described here can conveniently be applied to other species or biological processes.

  11. Network Reconstruction of Dynamic Biological Systems

    OpenAIRE

    Asadi, Behrang

    2013-01-01

    Inference of network topology from experimental data is a central endeavor in biology, since knowledge of the underlying signaling mechanisms is a requirement for understanding biological phenomena. As one of the most important tools in the bioinformatics area, the development of methods to reconstruct biological networks has attracted remarkable attention in the current decade. Integration of different data types can lead to remarkable improvements in our ability to identify the connectivity of differe...

  12. Mass reconstruction with a neural network

    International Nuclear Information System (INIS)

    Loennblad, L.; Peterson, C.; Roegnvaldsson, T.

    1992-01-01

    A feed-forward neural network method is developed for reconstructing the invariant mass of hadronic jets appearing in a calorimeter. The approach is illustrated in W→qq̄, where W-bosons are produced in pp̄ reactions at SPS collider energies. The neural network method yields results that are superior to conventional methods. This neural network application differs from the classification ones in the sense that an analog number (the mass) is computed by the network, rather than a binary decision being made. As a by-product our application clearly demonstrates the need for using 'intelligent' variables in instances when the amount of training instances is limited. (orig.)

  13. Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling

    Directory of Open Access Journals (Sweden)

    Guo Zheng

    2006-01-01

    Background: It is one of the ultimate goals of modern biological research to fully elucidate the intricate interplays and the regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, to name a few, cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. The vast amount of large-scale and genome-wide time-resolved data is becoming increasingly available, which provides a golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results: In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox has provided a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for the cell cyclings. Conclusion: We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex
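
    The delayed-correlation idea in this abstract can be sketched as follows: shift one expression profile against the other by a number of time points and keep the shift with the strongest correlation. This is only a minimal illustration and not the TdGRN toolbox; it assumes Pearson correlation as the overlap measure, and the names `delayed_correlation` and `best_lag` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: correlation undefined, treated as none
    return sxy / math.sqrt(sxx * syy)


def delayed_correlation(x, y, lag):
    """Correlation between x[t] and y[t + lag] for a non-negative lag."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of time points")
    if lag < 0 or len(x) - lag < 2:
        raise ValueError("lag leaves fewer than two overlapping time points")
    return pearson(x[:len(x) - lag], y[lag:])


def best_lag(x, y, max_lag):
    """Return (lag, correlation) with the largest absolute correlation."""
    scores = {lag: delayed_correlation(x, y, lag) for lag in range(max_lag + 1)}
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores[lag]
```

    For a candidate regulator profile `x` and target profile `y`, `best_lag(x, y, max_lag)` suggests how many time points the target trails the regulator; the sign of the returned correlation hints at activation or repression. Larger lags leave fewer overlapping points, so their scores are noisier.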

  14. A practical algorithm for reconstructing level-1 phylogenetic networks

    NARCIS (Netherlands)

    Huber, K.T.; Iersel, van L.J.J.; Kelk, S.M.; Suchecki, R.

    2011-01-01

    Recently, much attention has been devoted to the construction of phylogenetic networks which generalize phylogenetic trees in order to accommodate complex evolutionary processes. Here, we present an efficient, practical algorithm for reconstructing level-1 phylogenetic networks, a type of network

  15. Neural network algorithm for image reconstruction using the grid friendly projections

    International Nuclear Information System (INIS)

    Cierniak, R.

    2011-01-01

    Full text: This paper describes the development of an original approach to the reconstruction problem using a recurrent neural network. In particular, the 'grid-friendly' angles of the projections are selected according to the discrete Radon transform (DRT) concept to decrease the number of projections required. The methodology is consistent with analytical reconstruction algorithms. In our approach, the reconstruction problem is reformulated as an optimization problem, which is solved using a method based on the maximum likelihood methodology. The proposed reconstruction algorithm is then adapted for more practical discrete fan-beam projections. Computer simulation results show that the neural network reconstruction algorithm designed in this way outperforms conventional methods in reconstructed image quality. (author)

  16. Discovering implicit entity relation with the gene-citation-gene network.

    Directory of Open Access Journals (Sweden)

    Min Song

    In this paper, we apply the entitymetrics model to our constructed Gene-Citation-Gene (GCG) network. Based on the premise that there is a hidden, but plausible, relationship between an entity in one article and an entity in its citing article, we constructed a GCG network of gene pairs implicitly connected through citation. We compare the performance of this GCG network to a gene-gene (GG) network constructed over the same corpus but which uses gene pairs explicitly connected through traditional co-occurrence. Using 331,411 MEDLINE abstracts collected from 18,323 seed articles and their references, we identify 25 gene pairs. A comparison of these pairs with interactions found in BioGRID reveals that 96% of the gene pairs in the GCG network have known interactions. We measure network performance using degree, weighted degree, closeness, betweenness centrality and PageRank. Combining all measures, we find the GCG network has more gene pairs, but a lower matching rate than the GG network. However, combining top ranked genes in both networks produces a matching rate of 35.53%. By visualizing both the GG and GCG networks, we find that cancer is the most dominant disease associated with the genes in both networks. Overall, the study indicates that the GCG network can be useful for detecting gene interaction in an implicit manner.

  17. Reconstruction and in silico analysis of metabolic network for an oleaginous yeast, Yarrowia lipolytica.

    Directory of Open Access Journals (Sweden)

    Pengcheng Pan

    With the emergence of energy scarcity, the use of renewable energy sources such as biodiesel is becoming increasingly necessary. Recently, many researchers have focused on Yarrowia lipolytica, a model oleaginous yeast, which can be employed to accumulate large amounts of lipids that could be further converted to biodiesel. In order to understand the metabolic characteristics of Y. lipolytica at a systems level and to examine the potential for enhanced lipid production, a genome-scale compartmentalized metabolic network was reconstructed based on a combination of genome annotation and the detailed biochemical knowledge from multiple databases such as KEGG, ENZYME and BIGG. The information about protein and reaction associations of all the organisms in the KEGG and Expasy-ENZYME databases was arranged into an EXCEL file that can then be regarded as a new useful database to generate other reconstructions. The generated model iYL619_PCP accounts for 619 genes, 843 metabolites and 1,142 reactions including 236 transport reactions, 125 exchange reactions and 13 spontaneous reactions. The in silico model successfully predicted the minimal media and the growing abilities on different substrates. With flux balance analysis, single gene knockouts were also simulated to predict the essential genes and partially essential genes. In addition, flux variability analysis was applied to design new mutant strains that will redirect fluxes through the network and may enhance the production of lipid. This genome-scale metabolic model of Y. lipolytica can facilitate system-level metabolic analysis as well as strain development for improving the production of biodiesels and other valuable products by Y. lipolytica and other closely related oleaginous yeasts.

  18. Strategy on energy saving reconstruction of distribution networks based on life cycle cost

    Science.gov (United States)

    Chen, Xiaofei; Qiu, Zejing; Xu, Zhaoyang; Xiao, Chupeng

    2017-08-01

    Because the funds for actual distribution network reconstruction projects are often limited, the cost-benefit model and the decision-making method are crucial for distribution network energy saving reconstruction projects. From the perspective of life cycle cost (LCC), the research life cycle is first determined for the energy saving reconstruction of distribution networks with multiple devices. Then, a new life cycle cost-benefit model for energy-saving reconstruction of distribution networks is developed, in which the modification schemes include distribution transformer replacement, line replacement and reactive power compensation. For the operation loss cost and maintenance cost, an operation cost model considering the influence of seasonal load characteristics and a segmental maintenance cost model of transformers are proposed. Finally, aiming at the highest energy saving profit per LCC, a decision-making method is developed that also considers financial and technical constraints. The model and method are applied to a real distribution network reconstruction, and the results show that the model and method are effective.

  19. Iterative reconstruction of transcriptional regulatory networks: an algorithmic approach.

    Directory of Open Access Journals (Sweden)

    Christian L Barrett

    2006-05-01

    The number of complete, publicly available genome sequences is now greater than 200, and this number is expected to rapidly grow in the near future as metagenomic and environmental sequencing efforts escalate and the cost of sequencing drops. In order to make use of this data for understanding particular organisms and for discerning general principles about how organisms function, it will be necessary to reconstruct their various biochemical reaction networks. Principal among these will be transcriptional regulatory networks. Given the physical and logical complexity of these networks, the various sources of (often noisy) data that can be utilized for their elucidation, the monetary costs involved, and the huge number of potential experiments (approximately 10^12) that can be performed, experiment design algorithms will be necessary for synthesizing the various computational and experimental data to maximize the efficiency of regulatory network reconstruction. This paper presents an algorithm for experimental design to systematically and efficiently reconstruct transcriptional regulatory networks. It is meant to be applied iteratively in conjunction with an experimental laboratory component. The algorithm is presented here in the context of reconstructing transcriptional regulation for metabolism in Escherichia coli, and, through a retrospective analysis with previously performed experiments, we show that the produced experiment designs conform to how a human would design experiments. The algorithm is able to utilize probability estimates based on a wide range of computational and experimental sources to suggest experiments with the highest potential of discovering the greatest amount of new regulatory knowledge.

  20. Inferring dynamic gene regulatory networks in cardiac differentiation through the integration of multi-dimensional data.

    Science.gov (United States)

    Gong, Wuming; Koyano-Nakagawa, Naoko; Li, Tongbin; Garry, Daniel J

    2015-03-07

    Decoding the temporal control of gene expression patterns is key to the understanding of the complex mechanisms that govern developmental decisions during heart development. High-throughput methods have been employed to systematically study the dynamic and coordinated nature of cardiac differentiation at the global level with multiple dimensions. Therefore, there is a pressing need to develop a systems approach to integrate these data from individual studies and infer the dynamic regulatory networks in an unbiased fashion. We developed a two-step strategy to integrate data from (1) temporal RNA-seq, (2) temporal histone modification ChIP-seq, (3) transcription factor (TF) ChIP-seq and (4) gene perturbation experiments to reconstruct the dynamic network during heart development. First, we trained a logistic regression model to predict the probability (LR score) of any base being bound by 543 TFs with known positional weight matrices. Second, four dimensions of data were combined using a time-varying dynamic Bayesian network model to infer the dynamic networks at four developmental stages in the mouse [mouse embryonic stem cells (ESCs), mesoderm (MES), cardiac progenitors (CP) and cardiomyocytes (CM)]. Our method not only infers the time-varying networks between different stages of heart development, but it also identifies the TF binding sites associated with promoter or enhancers of downstream genes. The LR scores of experimentally verified ESCs and heart enhancers were significantly higher than random regions (p network inference model identified a region with an elevated LR score approximately -9400 bp upstream of the transcriptional start site of Nkx2-5, which overlapped with a previously reported enhancer region (-9435 to -8922 bp). TFs such as Tead1, Gata4, Msx2, and Tgif1 were predicted to bind to this region and participate in the regulation of Nkx2-5 gene expression. Our model also predicted the key regulatory networks for the ESC-MES, MES-CP and CP

  1. ARACNe-AP: gene network reverse engineering through adaptive partitioning inference of mutual information.

    Science.gov (United States)

    Lachmann, Alexander; Giorgi, Federico M; Lopez, Gonzalo; Califano, Andrea

    2016-07-15

    The accurate reconstruction of gene regulatory networks from large-scale molecular profile datasets represents one of the grand challenges of Systems Biology. The Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) represents one of the most effective tools to accomplish this goal. However, the initial Fixed Bandwidth (FB) implementation is both inefficient and unable to deal with sample sets providing largely uneven coverage of the probability density space. Here, we present a completely new implementation of the algorithm, based on an Adaptive Partitioning strategy (AP) for estimating the Mutual Information. The new AP implementation (ARACNe-AP) achieves a dramatic improvement in computational performance (200× on average) over the previous methodology, while preserving the Mutual Information estimator and the Network inference accuracy of the original algorithm. Given that the previous version of ARACNe is extremely demanding, the new version of the algorithm will allow even researchers with modest computational resources to build complex regulatory networks from hundreds of gene expression profiles. A JAVA cross-platform command line executable of ARACNe, together with all source code and a detailed usage guide, is freely available on Sourceforge (http://sourceforge.net/projects/aracne-ap). JAVA version 8 or higher is required. Contact: califano@c2b2.columbia.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
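
The abstract above describes estimating mutual information with an adaptive partitioning strategy. The following is a minimal, illustrative sketch of that general idea and is not the ARACNe-AP code: the function names, the chi-square cutoff (7.815, the 95% critical value for 3 degrees of freedom) and the `min_points` stopping rule are assumptions made for this example. Both variables are rank-transformed, the unit square is split recursively into quadrants, and a cell is split further only while its points look non-uniformly distributed.

```python
import math
import random


def rank_transform(values):
    """Map values to evenly spaced ranks in (0, 1); ties are broken by order."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for r, i in enumerate(order):
        ranks[i] = (r + 0.5) / n
    return ranks


def adaptive_mi(x, y, chi2_threshold=7.815, min_points=8):
    """Estimate mutual information (in nats) by adaptive partitioning.

    chi2_threshold and min_points are illustrative assumptions.
    """
    n = len(x)
    pts = list(zip(rank_transform(x), rank_transform(y)))

    def recurse(points, x0, x1, y0, y1, force_split):
        count = len(points)
        if count == 0:
            return 0.0
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        # Quadrant index: bit 0 = upper half in x, bit 1 = upper half in y.
        quads = [[], [], [], []]
        for px, py in points:
            quads[(px >= xm) + 2 * (py >= ym)].append((px, py))
        expected = count / 4
        chi2 = sum((len(q) - expected) ** 2 / expected for q in quads)
        if count >= min_points and (force_split or chi2 > chi2_threshold):
            return (recurse(quads[0], x0, xm, y0, ym, False)
                    + recurse(quads[1], xm, x1, y0, ym, False)
                    + recurse(quads[2], x0, xm, ym, y1, False)
                    + recurse(quads[3], xm, x1, ym, y1, False))
        # Leaf cell: after rank transformation the marginals are roughly
        # uniform, so p(x) and p(y) are approximated by the cell's sides.
        p_xy = count / n
        return p_xy * math.log(p_xy / ((x1 - x0) * (y1 - y0)))

    return recurse(pts, 0.0, 1.0, 0.0, 1.0, True)


# Usage: independent samples should score near zero, identical ones high.
random.seed(0)
a = [random.random() for _ in range(512)]
b = [random.random() for _ in range(512)]
print("independent:", adaptive_mi(a, b))
print("identical:  ", adaptive_mi(a, a))
```

Because the partition adapts to where the samples fall, no kernel bandwidth has to be chosen, which is the property the abstract contrasts with the Fixed Bandwidth implementation.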

  2. Construction and comparison of gene co-expression networks shows complex plant immune responses

    Directory of Open Access Journals (Sweden)

    Luis Guillermo Leal

    2014-10-01

    Gene co-expression networks (GCNs) are graphic representations that depict the coordinated transcription of genes in response to certain stimuli. GCNs provide functional annotations of genes whose function is unknown and are further used in studies of translational functional genomics among species. In this work, a methodology for the reconstruction and comparison of GCNs is presented. This approach was applied using gene expression data that were obtained from immunity experiments in Arabidopsis thaliana, rice, soybean, tomato and cassava. After the evaluation of diverse similarity metrics for the GCN reconstruction, we recommended the mutual information coefficient measurement and a clustering coefficient-based method for similarity threshold selection. To compare GCNs, we proposed a multivariate approach based on Principal Component Analysis (PCA). Branches of plant immunity that were exemplified by each experiment were analyzed in conjunction with the PCA results, suggesting both the robustness and the dynamic nature of the cellular responses. The dynamics of molecular plant responses produced networks with different characteristics that are differentiable using our methodology. The comparison of GCNs from plant pathosystems showed that, in response to similar pathogens, plants could activate conserved signaling pathways. The results confirmed that the closeness of GCNs projected on the principal component space is indicative of similarity among GCNs. This can also be used to understand global patterns of events triggered during plant immune responses.
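
As a rough illustration of the reconstruction step described above, the sketch below builds a co-expression network by thresholding a pairwise similarity and reports the average clustering coefficient of the result. It is not the authors' pipeline: Pearson correlation is used here only as a simple stand-in for the mutual information coefficient the abstract recommends, and the gene names, expression values and threshold are hypothetical.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def coexpression_network(expr, threshold):
    """expr maps gene -> expression values; returns gene -> set of neighbours."""
    adj = {g: set() for g in expr}
    for g, h in combinations(expr, 2):
        if abs(pearson(expr[g], expr[h])) >= threshold:
            adj[g].add(h)
            adj[h].add(g)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


# Hypothetical toy data: g1, g2 and g3 rise together, g4 does not.
toy_expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.1],
    "g3": [1.0, 2.1, 2.9, 4.0],
    "g4": [4.0, 1.0, 3.0, 2.0],
}
toy_net = coexpression_network(toy_expr, threshold=0.9)
print(toy_net)
print(average_clustering(toy_net))
```

A threshold-selection procedure in this spirit would scan candidate thresholds and inspect how the clustering coefficient changes, though the exact criterion used in the paper is not given in the abstract.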

  3. A neural network image reconstruction technique for electrical impedance tomography

    International Nuclear Information System (INIS)

    Adler, A.; Guardo, R.

    1994-01-01

    Reconstruction of images in Electrical Impedance Tomography requires the solution of a nonlinear inverse problem on noisy data. This problem is typically ill-conditioned and requires either simplifying assumptions or regularization based on a priori knowledge. This paper presents a reconstruction algorithm using neural network techniques which calculates a linear approximation of the inverse problem directly from finite element simulations of the forward problem. This inverse is adapted to the geometry of the medium and the signal-to-noise ratio (SNR) used during network training. Results show good conductivity reconstruction where measurement SNR is similar to the training conditions. The advantages of this method are its conceptual simplicity and ease of implementation, and the ability to control the compromise between the noise performance and resolution of the image reconstruction.

  4. Genome-scale reconstruction of the Streptococcus pyogenes M49 metabolic network reveals growth requirements and indicates potential drug targets.

    Science.gov (United States)

    Levering, Jennifer; Fiedler, Tomas; Sieg, Antje; van Grinsven, Koen W A; Hering, Silvio; Veith, Nadine; Olivier, Brett G; Klett, Lara; Hugenholtz, Jeroen; Teusink, Bas; Kreikemeyer, Bernd; Kummer, Ursula

    2016-08-20

    Genome-scale metabolic models comprise stoichiometric relations between metabolites, as well as associations between genes and metabolic reactions and facilitate the analysis of metabolism. We computationally reconstructed the metabolic network of the lactic acid bacterium Streptococcus pyogenes M49. Initially, we based the reconstruction on genome annotations and already existing and curated metabolic networks of Bacillus subtilis, Escherichia coli, Lactobacillus plantarum and Lactococcus lactis. This initial draft was manually curated with the final reconstruction accounting for 480 genes associated with 576 reactions and 558 metabolites. In order to constrain the model further, we performed growth experiments of wild type and arcA deletion strains of S. pyogenes M49 in a chemically defined medium and calculated nutrient uptake and production fluxes. We additionally performed amino acid auxotrophy experiments to test the consistency of the model. The established genome-scale model can be used to understand the growth requirements of the human pathogen S. pyogenes and define optimal and suboptimal conditions, but also to describe differences and similarities between S. pyogenes and related lactic acid bacteria such as L. lactis in order to find strategies to reduce the growth of the pathogen and propose drug targets. Copyright © 2016 Elsevier B.V. All rights reserved.

  5. Mouse obesity network reconstruction with a variational Bayes algorithm to employ aggressive false positive control

    Directory of Open Access Journals (Sweden)

    Logsdon Benjamin A

    2012-04-01

    Background: We propose a novel variational Bayes network reconstruction algorithm to extract the most relevant disease factors from high-throughput genomic data-sets. Our algorithm is the only scalable method for regularized network recovery that employs Bayesian model averaging and that can internally estimate an appropriate level of sparsity to ensure few false positives enter the model without the need for cross-validation or a model selection criterion. We use our algorithm to characterize the effect of genetic markers and liver gene expression traits on mouse obesity-related phenotypes, including weight, cholesterol, glucose, and free fatty acid levels, in an experiment previously used for discovery and validation of network connections: an F2 intercross between the C57BL/6J and C3H/HeJ mouse strains, where apolipoprotein E is null on the background. Results: We identified eleven genes, Gch1, Zfp69, Dlgap1, Gna14, Yy1, Gabarapl1, Folr2, Fdft1, Cnr2, Slc24a3, and Ccl19, and a quantitative trait locus directly connected to weight, glucose, cholesterol, or free fatty acid levels in our network. None of these genes were identified by other network analyses of this mouse intercross data-set, but all have been previously associated with obesity or related pathologies in independent studies. In addition, through both simulations and data analysis we demonstrate that our algorithm achieves superior performance in terms of power and type I error control compared with other network recovery algorithms that use the lasso and have bounds on type I error control. Conclusions: Our final network contains 118 previously associated and novel genes affecting weight, cholesterol, glucose, and free fatty acid levels that are excellent obesity risk candidates.

  6. A Practical Algorithm for Reconstructing Level-1 Phylogenetic Networks

    NARCIS (Netherlands)

    K.T. Huber; L.J.J. van Iersel (Leo); S.M. Kelk (Steven); R. Suchecki

    2010-01-01

    Recently much attention has been devoted to the construction of phylogenetic networks which generalize phylogenetic trees in order to accommodate complex evolutionary processes. Here we present an efficient, practical algorithm for reconstructing level-1 phylogenetic networks - a type of

  7. Predicting gene regulatory networks of soybean nodulation from RNA-Seq transcriptome data.

    Science.gov (United States)

    Zhu, Mingzhu; Dahmen, Jeremy L; Stacey, Gary; Cheng, Jianlin

    2013-09-22

    High-throughput RNA sequencing (RNA-Seq) is a revolutionary technique to study the transcriptome of a cell under various conditions at a systems level. Despite the wide application of RNA-Seq techniques to generate experimental data in the last few years, few computational methods are available to analyze this huge amount of transcription data. The computational methods for constructing gene regulatory networks from RNA-Seq expression data of hundreds or even thousands of genes are particularly lacking and urgently needed. We developed an automated bioinformatics method to predict gene regulatory networks from the quantitative expression values of differentially expressed genes based on RNA-Seq transcriptome data of a cell in different stages and conditions, integrating transcriptional, genomic and gene function data. We applied the method to the RNA-Seq transcriptome data generated for soybean root hair cells in three different development stages of nodulation after rhizobium infection. The method predicted a soybean nodulation-related gene regulatory network consisting of 10 regulatory modules common to all three stages, and 24, 49 and 70 modules separately for the first, second and third stage, each containing both a group of co-expressed genes and several transcription factors collaboratively controlling their expression under different conditions. Eight of the 10 common regulatory modules were validated by at least two kinds of validation, such as independent DNA binding motif analysis, gene function enrichment test, and previous experimental data in the literature. We developed a computational method to reliably reconstruct gene regulatory networks from RNA-Seq transcriptome data. The method can generate valuable hypotheses for interpreting biological data and designing biological experiments such as ChIP-Seq, RNA interference, and yeast two-hybrid experiments.

  8. Reverse-engineering of gene networks for regulating early blood development from single-cell measurements.

    Science.gov (United States)

    Wei, Jiangyong; Hu, Xiaohua; Zou, Xiufen; Tian, Tianhai

    2017-12-28

    Recent advances in omics technologies have raised great opportunities to study large-scale regulatory networks inside the cell. In addition, single-cell experiments have measured the gene and protein activities in a large number of cells under the same experimental conditions. However, a significant challenge in computational biology and bioinformatics is how to derive quantitative information from the single-cell observations and how to develop sophisticated mathematical models to describe the dynamic properties of regulatory networks using the derived quantitative information. This work designs an integrated approach to reverse-engineer gene networks for regulating early blood development based on single-cell experimental observations. The wanderlust algorithm is initially used to develop the pseudo-trajectory for the activities of a number of genes. Since the gene expression data in the developed pseudo-trajectory show large fluctuations, we then use Gaussian process regression methods to smooth the gene expression data in order to obtain pseudo-trajectories with far fewer fluctuations. The proposed integrated framework consists of both bioinformatics algorithms to reconstruct the regulatory network and mathematical models using differential equations to describe the dynamics of gene expression. The developed approach is applied to study the network regulating early blood cell development. A graphic model is constructed for a regulatory network with forty genes and a dynamic model using differential equations is developed for a network of nine genes. Numerical results suggest that the proposed model is able to match experimental data very well. We also examine the networks with more regulatory relations and numerical results show that more regulations may exist. We test the possibility of auto-regulation but numerical simulations do not support the positive auto-regulation. In addition, robustness is used as an important additional criterion to select candidate
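
The smoothing step mentioned above can be illustrated with a small Gaussian process regression sketch. This is a generic textbook formulation (posterior mean with a squared-exponential kernel), not the authors' implementation: the kernel choice, the hyperparameter values and the synthetic "expression" curve are all assumptions for this example, and a plain Gaussian elimination solver is used only to keep the block dependency-free.

```python
import math


def rbf(a, b, length_scale, signal_var):
    """Squared-exponential (RBF) covariance between two time points."""
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_smooth(t, y, t_new, length_scale=1.0, signal_var=1.0, noise_var=0.1):
    """GP posterior mean at t_new, after centring y on its sample mean."""
    mean_y = sum(y) / len(y)
    K = [[rbf(ti, tj, length_scale, signal_var) + (noise_var if i == j else 0.0)
          for j, tj in enumerate(t)] for i, ti in enumerate(t)]
    alpha = solve(K, [yi - mean_y for yi in y])
    return [mean_y + sum(rbf(ts, ti, length_scale, signal_var) * a
                         for ti, a in zip(t, alpha)) for ts in t_new]


# Usage on a synthetic pseudo-time series with a strongly fluctuating signal.
pseudo_time = [0.1 * i for i in range(60)]
noisy_expr = [math.sin(ti) + 0.3 * (-1) ** i for i, ti in enumerate(pseudo_time)]
smoothed = gp_smooth(pseudo_time, noisy_expr, pseudo_time, noise_var=0.09)
print(smoothed[:5])
```

The `noise_var` term controls how aggressively fluctuations are attributed to noise, so in practice it and `length_scale` would be fitted or tuned rather than fixed by hand as here.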

  9. Enhanced capital-asset pricing model for the reconstruction of bipartite financial networks

    Science.gov (United States)

    Squartini, Tiziano; Almog, Assaf; Caldarelli, Guido; van Lelyveld, Iman; Garlaschelli, Diego; Cimini, Giulio

    2017-09-01

    Reconstructing patterns of interconnections from partial information is one of the most important issues in the statistical physics of complex networks. A paramount example is provided by financial networks. In fact, the spreading and amplification of financial distress in capital markets are strongly affected by the interconnections among financial institutions. Yet, while the aggregate balance sheets of institutions are publicly disclosed, information on single positions is mostly confidential and, as such, unavailable. Standard approaches to reconstruct the network of financial interconnection produce unrealistically dense topologies, leading to a biased estimation of systemic risk. Moreover, reconstruction techniques are generally designed for monopartite networks of bilateral exposures between financial institutions, and thus fail to reproduce bipartite networks of security holdings (e.g., investment portfolios). Here we propose a reconstruction method based on constrained entropy maximization, tailored for bipartite financial networks. Such a procedure enhances the traditional capital-asset pricing model (CAPM) and allows us to reproduce the correct topology of the network. We test this enhanced CAPM (ECAPM) method on a dataset, collected by the European Central Bank, of detailed security holdings of European institutional sectors over a period of six years (2009-2015). Our approach outperforms the traditional CAPM and the recently proposed maximum-entropy CAPM both in reproducing the network topology and in estimating systemic risk due to fire sales spillovers. In general, ECAPM can be applied to the whole class of weighted bipartite networks described by the fitness model.

  10. Identifying Tmem59 related gene regulatory network of mouse neural stem cell from a compendium of expression profiles

    Directory of Open Access Journals (Sweden)

    Guo Xiuyun

    2011-09-01

    Background: Neural stem cells offer potential treatment for neurodegenerative disorders, such as Alzheimer's disease (AD). While much progress has been made in understanding neural stem cell function, a precise description of the molecular mechanisms regulating neural stem cells is not yet established. This lack of knowledge is a major barrier holding back the discovery of therapeutic uses of neural stem cells. In this paper, the regulatory mechanism of mouse neural stem cell (NSC) differentiation by tmem59 is explored at the genome level. Results: We identified regulators of tmem59 during the differentiation of mouse NSCs from a compendium of expression profiles. Based on the microarray experiment, we developed the parallelized SWNI algorithm to reconstruct gene regulatory networks of mouse neural stem cells. From the inferred tmem59 related gene network including 36 genes, pou6f1 was identified to regulate tmem59 significantly and might play an important role in the differentiation of NSCs in mouse brain. There are four pathways shown in the gene network, indicating that tmem59 is located downstream in the signalling pathway. The real-time RT-PCR results showed that the over-expression of pou6f1 could significantly up-regulate tmem59 expression in the C17.2 NSC line. 16 out of 36 predicted genes in our constructed network have been reported to be AD-related, including Ace, aqp1, arrdc3, cd14, cd59a, cds1, cldn1, cox8b, defb11, folr1, gdi2, mmp3, mgp, myrip, Ripk4, rnd3, and sncg. The localization of tmem59 related genes and functional-related gene groups based on the Gene Ontology (GO) annotation was also identified. Conclusions: Our findings suggest that the expression of tmem59 is an important factor contributing to AD. The parallelized SWNI algorithm increased the efficiency of network reconstruction significantly. This study enables us to highlight novel genes that may be involved in NSC differentiation and provides a shortcut to

  11. NetBenchmark: a bioconductor package for reproducible benchmarks of gene regulatory network inference.

    Science.gov (United States)

    Bellot, Pau; Olsen, Catharina; Salembier, Philippe; Oliveras-Vergés, Albert; Meyer, Patrick E

    2015-09-29

    In the last decade, a great number of methods for reconstructing gene regulatory networks from expression data have been proposed. However, very few tools and datasets allow those methods to be evaluated accurately and reproducibly. Hence, we propose here a new tool able to perform a systematic, yet fully reproducible, evaluation of transcriptional network inference methods. Our open-source and freely available Bioconductor package aggregates a large set of tools to assess the robustness of network inference algorithms against different simulators, topologies, sample sizes and noise intensities. The benchmarking framework, which uses various datasets, highlights the specialization of some methods toward network types and data. As a result, it is possible to identify the techniques that have broad overall performance.
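
To make the idea of benchmarking an inference method against a known network concrete, the sketch below scores a ranked list of predicted edges against a ground-truth edge set using precision and recall. It is a generic illustration in Python and does not use or reproduce the NetBenchmark package's API (which is an R/Bioconductor package); the function names, gene labels and confidence scores are hypothetical.

```python
def top_k_edges(scores, k):
    """scores maps (regulator, target) -> confidence; return the k best edges."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


def precision_recall(predicted_edges, true_edges):
    """Precision and recall of predicted directed edges against the truth."""
    predicted, truth = set(predicted_edges), set(true_edges)
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical ground truth and inferred confidences for a 3-gene network.
true_network = {("A", "B"), ("B", "C"), ("A", "C")}
inferred = {("A", "B"): 0.9, ("C", "A"): 0.8, ("B", "C"): 0.7, ("C", "B"): 0.2}

for k in (2, 3):
    p, r = precision_recall(top_k_edges(inferred, k), true_network)
    print(k, round(p, 3), round(r, 3))
# k=2 keeps (A,B) and (C,A): precision 0.5, recall 0.333
# k=3 adds (B,C):            precision 0.667, recall 0.667
```

Sweeping `k` over the full ranking traces out a precision-recall curve, which is a common way to compare methods across topologies, sample sizes and noise levels.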

  12. An algebra-based method for inferring gene regulatory networks.

    Science.gov (United States)

    Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard

    2014-03-26

    The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating prior knowledge about the network into the inference process, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the
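
For readers unfamiliar with the representation named above, the sketch below shows how a Boolean network can be written as polynomials over the two-element field (AND as a product, NOT x as x + 1, x OR y as x + y + xy, all modulo 2) and iterated synchronously. It illustrates only the encoding, not the paper's evolutionary inference algorithm; the three-gene network and all names are hypothetical.

```python
def evaluate(poly, state):
    """Evaluate a polynomial over F2.

    poly is a list of monomials; each monomial is a tuple of variable
    indices, and the empty tuple () stands for the constant 1.
    """
    return sum(all(state[i] for i in mono) for mono in poly) % 2


def step(network, state):
    """Synchronous update: every variable is replaced by its polynomial's value."""
    return tuple(evaluate(poly, state) for poly in network)


def trajectory(network, state, n_steps):
    """Return the initial state followed by n_steps successive states."""
    states = [tuple(state)]
    for _ in range(n_steps):
        states.append(step(network, states[-1]))
    return states


# Hypothetical 3-gene network (indices 0, 1, 2):
#   x0' = x1 AND x2  ->  x1*x2
#   x1' = NOT x0     ->  x0 + 1
#   x2' = x0 OR x1   ->  x0 + x1 + x0*x1
toy_bpds = [
    [(1, 2)],
    [(0,), ()],
    [(0,), (1,), (0, 1)],
]

print(trajectory(toy_bpds, (0, 0, 0), 3))
# [(0, 0, 0), (0, 1, 0), (0, 1, 1), (1, 1, 1)]
```

Encoding candidate models as lists of monomials like this gives a finite, easily mutated search space, which is one plausible reason such a representation pairs naturally with an evolutionary search.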

  13. Exploring Normalization and Network Reconstruction Methods using In Silico and In Vivo Models

    Science.gov (United States)

    Abstract: Lessons learned from the recent DREAM competitions include: The search for the best network reconstruction method continues, and we need more complete datasets with ground truth from more complex organisms. It has become obvious that the network reconstruction methods t...

  14. Reconstruction of coupling architecture of neural field networks from vector time series

    Science.gov (United States)

    Sysoev, Ilya V.; Ponomarenko, Vladimir I.; Pikovsky, Arkady

    2018-04-01

    We propose a method of reconstruction of the network coupling matrix for a basic voltage-model of the neural field dynamics. Assuming that the multivariate time series of observations from all nodes are available, we describe a technique to find coupling constants which is unbiased in the limit of long observations. Furthermore, the method is generalized for reconstruction of networks with time-delayed coupling, including the reconstruction of unknown time delays. The approach is compared with other recently proposed techniques.

  15. Reconstructing Causal Biological Networks through Active Learning.

    Directory of Open Access Journals (Sweden)

    Hyunghoon Cho

    Reverse-engineering of biological networks is a central problem in systems biology. Intervention data, such as gene knockouts or knockdowns, are typically used for teasing apart causal relationships among genes. Under time or resource constraints, one needs to carefully choose which intervention experiments to carry out. Previous approaches for selecting the most informative interventions have largely been focused on discrete Bayesian networks. However, continuous Bayesian networks are of great practical interest, especially in the study of complex biological systems and their quantitative properties. In this work, we present an efficient, information-theoretic active learning algorithm for Gaussian Bayesian networks (GBNs), which serve as important models for gene regulatory networks. In addition to providing linear-algebraic insights unique to GBNs, leading to significant runtime improvements, we demonstrate the effectiveness of our method on data simulated with GBNs and the DREAM4 network inference challenge data sets. Our method generally leads to faster recovery of underlying network structure and faster convergence to final distribution of confidence scores over candidate graph structures using the full data, in comparison to random selection of intervention experiments.

  16. Reconstructing transcriptional regulatory networks through genomics data

    OpenAIRE

    Sun, Ning; Zhao, Hongyu

    2009-01-01

    One central problem in biology is to understand how gene expression is regulated under different conditions. Microarray gene expression data and other high throughput data have made it possible to dissect transcriptional regulatory networks at the genomics level. Owing to the very large number of genes that need to be studied, the relatively small number of data sets available, the noise in the data and the different natures of the distinct data types, network inference presents great challen...

  17. MR fingerprinting Deep RecOnstruction NEtwork (DRONE).

    Science.gov (United States)

    Cohen, Ouri; Zhu, Bo; Rosen, Matthew S

    2018-09-01

    Demonstrate a novel fast method for reconstruction of multi-dimensional MR fingerprinting (MRF) data using deep learning methods. A neural network (NN) is defined using the TensorFlow framework and trained on simulated MRF data computed with the extended phase graph formalism. The NN reconstruction accuracy for noiseless and noisy data is compared to conventional MRF template matching as a function of training data size and is quantified in simulated numerical brain phantom data and International Society for Magnetic Resonance in Medicine/National Institute of Standards and Technology phantom data measured on 1.5T and 3T scanners with an optimized MRF EPI and MRF fast imaging with steady state precession (FISP) sequences with spiral readout. The utility of the method is demonstrated in a healthy subject in vivo at 1.5T. Network training required 10 to 74 minutes; once trained, data reconstruction required approximately 10 ms for the MRF EPI and 76 ms for the MRF FISP sequence. Reconstruction of simulated, noiseless brain data using the NN resulted in a RMS error (RMSE) of 2.6 ms for T1 and 1.9 ms for T2. The reconstruction error in the presence of noise was less than 10% for both T1 and T2 for SNR greater than 25 dB. Phantom measurements yielded good agreement (R² = 0.99/0.99 for MRF EPI T1/T2 and 0.94/0.98 for MRF FISP T1/T2) between the T1 and T2 estimated by the NN and reference values from the International Society for Magnetic Resonance in Medicine/National Institute of Standards and Technology phantom. Reconstruction of MRF data with a NN is accurate, 300- to 5000-fold faster, and more robust to noise and dictionary undersampling than conventional MRF dictionary-matching. © 2018 International Society for Magnetic Resonance in Medicine.

  18. Automatic reconstruction of fault networks from seismicity catalogs including location uncertainty

    International Nuclear Information System (INIS)

    Wang, Y.

    2013-01-01

    Within the framework of plate tectonics, the deformation that arises from the relative movement of two plates occurs across discontinuities in the earth's crust, known as fault zones. Active fault zones are the causal locations of most earthquakes, which suddenly release tectonic stresses within a very short time. In return, fault zones slowly grow by accumulating slip due to such earthquakes by cumulated damage at their tips, and by branching or linking between pre-existing faults of various sizes. Over the last decades, a large amount of knowledge has been acquired concerning the overall phenomenology and mechanics of individual faults and earthquakes. A deep physical and mechanical understanding of the links and interactions between and among them is still missing, however. One of the main issues lies in our failure to always succeed in assigning an earthquake to its causative fault. Using approaches based on pattern-recognition theory, more insight into the relationship between earthquakes and fault structure can be gained by developing an automatic fault network reconstruction approach using high resolution earthquake data sets at largely different scales and by considering individual event uncertainties. This thesis introduces the Anisotropic Clustering of Location Uncertainty Distributions (ACLUD) method to reconstruct active fault networks on the basis of both earthquake locations and their estimated individual uncertainties. This method consists in fitting a given set of hypocenters with an increasing number of finite planes until the residuals of the fit compare with location uncertainties. After a massive search through the large solution space of possible reconstructed fault networks, six different validation procedures are applied in order to select the corresponding best fault network. Two of the validation steps (cross-validation and Bayesian Information Criterion (BIC)) process the fit residuals, while the four others look for solutions that

  19. Automatic reconstruction of fault networks from seismicity catalogs including location uncertainty

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Y.

    2013-07-01

    Within the framework of plate tectonics, the deformation that arises from the relative movement of two plates occurs across discontinuities in the earth's crust, known as fault zones. Active fault zones are the causal locations of most earthquakes, which suddenly release tectonic stresses within a very short time. In return, fault zones slowly grow by accumulating slip due to such earthquakes by cumulated damage at their tips, and by branching or linking between pre-existing faults of various sizes. Over the last decades, a large amount of knowledge has been acquired concerning the overall phenomenology and mechanics of individual faults and earthquakes. A deep physical and mechanical understanding of the links and interactions between and among them is still missing, however. One of the main issues lies in our failure to always succeed in assigning an earthquake to its causative fault. Using approaches based on pattern-recognition theory, more insight into the relationship between earthquakes and fault structure can be gained by developing an automatic fault network reconstruction approach using high resolution earthquake data sets at largely different scales and by considering individual event uncertainties. This thesis introduces the Anisotropic Clustering of Location Uncertainty Distributions (ACLUD) method to reconstruct active fault networks on the basis of both earthquake locations and their estimated individual uncertainties. This method consists in fitting a given set of hypocenters with an increasing number of finite planes until the residuals of the fit compare with location uncertainties. After a massive search through the large solution space of possible reconstructed fault networks, six different validation procedures are applied in order to select the corresponding best fault network. Two of the validation steps (cross-validation and Bayesian Information Criterion (BIC)) process the fit residuals, while the four others look for solutions that

  20. Reconstruction of Ancestral Genomes in Presence of Gene Gain and Loss.

    Science.gov (United States)

    Avdeyev, Pavel; Jiang, Shuai; Aganezov, Sergey; Hu, Fei; Alekseyev, Max A

    2016-03-01

    Since most dramatic genomic changes are caused by genome rearrangements as well as gene duplications and gain/loss events, it becomes crucial to understand their mechanisms and reconstruct ancestral genomes of the given genomes. This problem was shown to be NP-complete even in the "simplest" case of three genomes, thus calling for heuristic rather than exact algorithmic solutions. At the same time, a larger number of input genomes may actually simplify the problem in practice, as was earlier illustrated with MGRA, a state-of-the-art software tool for reconstruction of ancestral genomes of multiple genomes. One of the key obstacles for MGRA and other similar tools is the presence of breakpoint reuse, in which the same breakpoint region is broken by several different genome rearrangements in the course of evolution. Furthermore, such tools are often limited to genomes composed of the same genes with each gene present in a single copy in every genome. This limitation makes these tools inapplicable for many biological datasets and degrades the resolution of ancestral reconstructions in diverse datasets. We address these deficiencies by extending the MGRA algorithm to genomes with unequal gene contents. The developed next-generation tool MGRA2 can handle gene gain/loss events and shares the ability of MGRA to reconstruct ancestral genomes uniquely in the case of limited breakpoint reuse. Furthermore, MGRA2 employs a number of novel heuristics to cope with higher breakpoint reuse and process datasets inaccessible for MGRA. In practical experiments, MGRA2 shows superior performance for simulated and real genomes as compared to other ancestral genome reconstruction tools.

  1. Genome-wide identification of regulatory elements and reconstruction of gene regulatory networks of the green alga Chlamydomonas reinhardtii under carbon deprivation.

    Directory of Open Access Journals (Sweden)

    Flavia Vischi Winck

    Full Text Available The unicellular green alga Chlamydomonas reinhardtii is a long-established model organism for studies on photosynthesis and carbon metabolism-related physiology. Under conditions of air-level carbon dioxide concentration [CO2], a carbon concentrating mechanism (CCM) is induced to facilitate cellular carbon uptake. CCM increases the availability of carbon dioxide at the site of cellular carbon fixation. To improve our understanding of the transcriptional control of the CCM, we employed FAIRE-seq (Formaldehyde-Assisted Isolation of Regulatory Elements, followed by deep sequencing) to determine nucleosome-depleted chromatin regions of algal cells subjected to carbon deprivation. Our FAIRE data recapitulated the positions of known regulatory elements in the promoter of the periplasmic carbonic anhydrase (Cah1) gene, which is upregulated during CCM induction, and revealed new candidate regulatory elements at a genome-wide scale. In addition, time series expression patterns of 130 transcription factor (TF) and transcription regulator (TR) genes were obtained for cells cultured under photoautotrophic condition and subjected to a shift from high to low [CO2]. Groups of co-expressed genes were identified and a putative directed gene-regulatory network underlying the CCM was reconstructed from the gene expression data using the recently developed IOTA (inner composition alignment) method. Among the candidate regulatory genes, two members of the MYB-related TF family, Lcr1 (Low-CO2 response regulator 1) and Lcr2 (Low-CO2 response regulator 2), may play an important role in down-regulating the expression of a particular set of TF and TR genes in response to low [CO2]. The results obtained provide new insights into the transcriptional control of the CCM and revealed more than 60 new candidate regulatory genes. Deep sequencing of nucleosome-depleted genomic regions indicated the presence of new, previously unknown regulatory elements in the C. reinhardtii genome

  2. Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment.

    Science.gov (United States)

    Lee, Wei-Po; Hsiao, Yu-Ting; Hwang, Wei-Che

    2014-01-16

    To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it has become a trend to adopt an automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: the most popular is the MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use the well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. By coupling the parallel model population-based optimization method and the parallel computational framework, high

  3. Reconstruction of financial networks for robust estimation of systemic risk

    Science.gov (United States)

    Mastromatteo, Iacopo; Zarinelli, Elia; Marsili, Matteo

    2012-03-01

    In this paper we estimate the propagation of liquidity shocks through interbank markets when the information about the underlying credit network is incomplete. We show that techniques such as maximum entropy currently used to reconstruct credit networks severely underestimate the risk of contagion by assuming a trivial (fully connected) topology, a type of network structure which can be very different from the one empirically observed. We propose an efficient message-passing algorithm to explore the space of possible network structures and show that a correct estimation of the network degree of connectedness leads to more reliable estimations for systemic risk. Such an algorithm is also able to produce maximally fragile structures, providing a practical upper bound for the risk of contagion when the actual network structure is unknown. We test our algorithm on ensembles of synthetic data encoding some features of real financial networks (sparsity and heterogeneity), finding that more accurate estimations of risk can be achieved. Finally we find that this algorithm can be used to control the amount of information that regulators need to require from banks in order to sufficiently constrain the reconstruction of financial networks.
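
    The maximum-entropy baseline criticized in this abstract can be illustrated with a short sketch. It assumes that only each bank's total interbank assets and liabilities are known and spreads them as evenly as possible by iterative proportional fitting from a uniform prior with an empty diagonal. The function name, the four-bank figures and the iteration count are hypothetical; this is not the authors' message-passing algorithm, only the dense reconstruction it is compared against.

```python
def max_entropy_matrix(assets, liabilities, iterations=200):
    """Iterative proportional fitting from a uniform prior with no self-lending.

    assets[i] is the (assumed known) total lent by bank i, liabilities[j]
    the total borrowed by bank j. Returns an exposure matrix whose row and
    column sums approximate these totals.
    """
    n = len(assets)
    # Uniform prior: every bank may lend to every other bank, never to itself.
    x = [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        # Rescale each row to match the lender's total assets.
        for i in range(n):
            row_sum = sum(x[i])
            if row_sum > 0:
                factor = assets[i] / row_sum
                x[i] = [value * factor for value in x[i]]
        # Rescale each column to match the borrower's total liabilities.
        for j in range(n):
            col_sum = sum(x[i][j] for i in range(n))
            if col_sum > 0:
                factor = liabilities[j] / col_sum
                for i in range(n):
                    x[i][j] *= factor
    return x


# Hypothetical marginals for four banks (both lists sum to 100).
assets = [10.0, 20.0, 30.0, 40.0]
liabilities = [40.0, 30.0, 20.0, 10.0]
exposures = max_entropy_matrix(assets, liabilities)

# Every off-diagonal entry is positive: the reconstruction is fully connected.
links = sum(1 for i in range(4) for j in range(4) if exposures[i][j] > 0)
print(links, "links out of", 4 * 3, "possible")
```

    Because the procedure never sets an off-diagonal entry to zero, the reconstructed network is complete regardless of how sparse the real one is, which is the property the abstract identifies as the source of underestimated contagion risk.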

  4. Reconstruction of financial networks for robust estimation of systemic risk

    International Nuclear Information System (INIS)

    Mastromatteo, Iacopo; Zarinelli, Elia; Marsili, Matteo

    2012-01-01

    In this paper we estimate the propagation of liquidity shocks through interbank markets when the information about the underlying credit network is incomplete. We show that techniques such as maximum entropy currently used to reconstruct credit networks severely underestimate the risk of contagion by assuming a trivial (fully connected) topology, a type of network structure which can be very different from the one empirically observed. We propose an efficient message-passing algorithm to explore the space of possible network structures and show that a correct estimation of the network degree of connectedness leads to more reliable estimations for systemic risk. Such an algorithm is also able to produce maximally fragile structures, providing a practical upper bound for the risk of contagion when the actual network structure is unknown. We test our algorithm on ensembles of synthetic data encoding some features of real financial networks (sparsity and heterogeneity), finding that more accurate estimations of risk can be achieved. Finally we find that this algorithm can be used to control the amount of information that regulators need to require from banks in order to sufficiently constrain the reconstruction of financial networks

  5. Dynamic Network Reconstruction from Gene Expression Data Describing the Effect of LiCl Stimulation on Hepatocytes

    Directory of Open Access Journals (Sweden)

    Zellmer Sebastian

    2005-12-01

    Full Text Available Wnt/β-catenin signalling plays an important role in zonation of liver parenchyma and in patterning of hepatocyte heterogeneity. A characteristic marker of this heterogeneity is glutamine synthetase (GS), which is expressed only in a subset of pericentrally located hepatocytes. To investigate whether and how the Wnt/β-catenin signalling pathway is involved, a culture of hepatocytes was stimulated by LiCl. This resulted in an increase in the specific GS activity, indicating that the Wnt/β-catenin pathway may participate in regulating GS levels. Affymetrix GeneChip oligonucleotide arrays were used to monitor the gene expression changes during a period from 2 to 24 hours after stimulation by LiCl. Samples from a cultivation without stimulation were used as controls. Based on the gene expression profiles, a hypothetical signal transduction network was constructed by a reverse engineering algorithm. The network robustness was tested and the most stable structure was identified.

  6. Reconstruction of Micropattern Detector Signals using Convolutional Neural Networks

    Science.gov (United States)

    Flekova, L.; Schott, M.

    2017-10-01

    Micropattern gaseous detector (MPGD) technologies, such as GEMs or MicroMegas, are particularly suitable for precision tracking and triggering in high rate environments. Given their relatively low production costs, MPGDs are an exemplary candidate for the next generation of particle detectors. Having acknowledged these advantages, both the ATLAS and CMS collaborations at the LHC are exploiting these new technologies for their detector upgrade programs in the coming years. When MPGDs are utilized for triggering purposes, the measured signals need to be precisely reconstructed within less than 200 ns, which can be achieved by the usage of FPGAs. In this work, we present a novel approach to identify reconstructed signals, their timing and the corresponding spatial position on the detector. In particular, we study the effect of noise and dead readout strips on the reconstruction performance. Our approach leverages the potential of convolutional neural networks (CNNs), which have recently manifested an outstanding performance in a range of modeling tasks. The proposed neural network architecture of our CNN is designed simply enough that it can be modeled directly by an FPGA and thus provide precise information on reconstructed signals already at trigger level.

  7. A consensus yeast metabolic network reconstruction obtained from a community approach to systems biology

    NARCIS (Netherlands)

    Herrgård, Markus J.; Swainston, Neil; Dobson, Paul; Dunn, Warwick B.; Arga, K. Yalçin; Arvas, Mikko; Blüthgen, Nils; Borger, Simon; Costenoble, Roeland; Heinemann, Matthias; Hucka, Michael; Novère, Nicolas Le; Li, Peter; Liebermeister, Wolfram; Mo, Monica L.; Oliveira, Ana Paula; Petranovic, Dina; Pettifer, Stephen; Simeonidis, Evangelos; Smallbone, Kieran; Spasić, Irena; Weichart, Dieter; Brent, Roger; Broomhead, David S.; Westerhoff, Hans V.; Kırdar, Betül; Penttilä, Merja; Klipp, Edda; Palsson, Bernhard Ø.; Sauer, Uwe; Oliver, Stephen G.; Mendes, Pedro; Nielsen, Jens; Kell, Douglas B.

    2008-01-01

    Genomic data allow the large-scale manual or semi-automated assembly of metabolic network reconstructions, which provide highly curated organism-specific knowledge bases. Although several genome-scale network reconstructions describe Saccharomyces cerevisiae metabolism, they differ in scope and

  8. Virtual resistive network and conductivity reconstruction with Faraday's law

    International Nuclear Information System (INIS)

    Lee, Min Gi; Ko, Min-Su; Kim, Yong-Jung

    2014-01-01

    A network-based conductivity reconstruction method is introduced using the third Maxwell equation, or Faraday's law, for a static case. The usual choice in electrical impedance tomography is the divergence-free equation for the electrical current density. However, if the electrical current density is given, the curl-free equation for the electrical field gives a direct relation between the current and the conductivity and this relation is used in this paper. Mimetic discretization is applied to the equation, which gives the virtual resistive network system. Properties of the numerical schemes introduced are investigated and their advantages over other conductivity reconstruction methods are discussed. Numerically simulated results, with an analysis of noise propagation, are presented. (paper)

  9. Statistical inference approach to structural reconstruction of complex networks from binary time series

    Science.gov (United States)

    Ma, Chuang; Chen, Han-Shuang; Lai, Ying-Cheng; Zhang, Hai-Feng

    2018-02-01

    Complex networks hosting binary-state dynamics arise in a variety of contexts. In spite of previous works, to fully reconstruct the network structure from observed binary data remains challenging. We articulate a statistical inference based approach to this problem. In particular, exploiting the expectation-maximization (EM) algorithm, we develop a method to ascertain the neighbors of any node in the network based solely on binary data, thereby recovering the full topology of the network. A key ingredient of our method is the maximum-likelihood estimation of the probabilities associated with actual or nonexistent links, and we show that the EM algorithm can distinguish the two kinds of probability values without any ambiguity, insofar as the length of the available binary time series is reasonably long. Our method does not require any a priori knowledge of the detailed dynamical processes, is parameter-free, and is capable of accurate reconstruction even in the presence of noise. We demonstrate the method using combinations of distinct types of binary dynamical processes and network topologies, and provide a physical understanding of the underlying reconstruction mechanism. Our statistical inference based reconstruction method contributes an additional piece to the rapidly expanding "toolbox" of data based reverse engineering of complex networked systems.

  10. Network reconstruction via graph blending

    Science.gov (United States)

    Estrada, Rolando

    2016-05-01

    Graphs estimated from empirical data are often noisy and incomplete due to the difficulty of faithfully observing all the components (nodes and edges) of the true graph. This problem is particularly acute for large networks where the number of components may far exceed available surveillance capabilities. Errors in the observed graph can render subsequent analyses invalid, so it is vital to develop robust methods that can minimize these observational errors. Errors in the observed graph may include missing and spurious components, as well as fused (multiple nodes are merged into one) and split (a single node is misinterpreted as many) nodes. Traditional graph reconstruction methods are only able to identify missing or spurious components (primarily edges, and to a lesser degree nodes), so we developed a novel graph blending framework that allows us to cast the full estimation problem as a simple edge addition/deletion problem. Armed with this framework, we systematically investigate the viability of various topological graph features, such as the degree distribution or the clustering coefficients, and existing graph reconstruction methods for tackling the full estimation problem. Our experimental results suggest that incorporating any topological feature as a source of information actually hinders reconstruction accuracy. We provide a theoretical analysis of this phenomenon and suggest several avenues for improving this estimation problem.

  11. Orthotropic conductivity reconstruction with virtual-resistive network and Faraday's law

    KAUST Repository

    Lee, Min-Gi

    2015-06-01

    We obtain the existence and the uniqueness at the same time in the reconstruction of orthotropic conductivity in two-space dimensions by using two sets of internal current densities and boundary conductivity. The curl-free equation of Faraday's law is taken instead of the elliptic equation in a divergence form that is typically used in electrical impedance tomography. A reconstruction method based on layered bricks-type virtual-resistive network is developed to reconstruct orthotropic conductivity with up to 40% multiplicative noise.

  12. Ekofisk chalk: core measurements, stochastic reconstruction, network modeling and simulation

    Energy Technology Data Exchange (ETDEWEB)

    Talukdar, Saifullah

    2002-07-01

    This dissertation deals with (1) experimental measurements on petrophysical, reservoir engineering and morphological properties of Ekofisk chalk, (2) numerical simulation of core flood experiments to analyze and improve relative permeability data, (3) stochastic reconstruction of chalk samples from limited morphological information, (4) extraction of pore space parameters from the reconstructed samples, development of a network model using pore space information, and computation of petrophysical and reservoir engineering properties from the network model, and (5) development of 2D and 3D idealized fractured reservoir models and verification of the applicability of several widely used conventional upscaling techniques in fractured reservoir simulation. Experiments have been conducted on eight Ekofisk chalk samples and porosity, absolute permeability, formation factor, and oil-water relative permeability, capillary pressure and resistivity index are measured at laboratory conditions. Mercury porosimetry data and backscatter scanning electron microscope images have also been acquired for the samples. A numerical simulation technique involving history matching of the production profiles is employed to improve the relative permeability curves and to analyze hysteresis of the Ekofisk chalk samples. The technique was found to be a powerful tool to supplement the uncertainties in experimental measurements. Porosity and correlation statistics obtained from backscatter scanning electron microscope images are used to reconstruct microstructures of chalk and particulate media. The reconstruction technique involves a simulated annealing algorithm, which can be constrained by an arbitrary number of morphological parameters. This flexibility of the algorithm is exploited to successfully reconstruct particulate media and chalk samples using more than one correlation function. A technique based on conditional simulated annealing has been introduced for exact reproduction of vuggy

  13. Maximum-entropy networks: pattern detection, network reconstruction and graph combinatorics

    CERN Document Server

    Squartini, Tiziano

    2017-01-01

    This book is an introduction to maximum-entropy models of random graphs with given topological properties and their applications. Its original contribution is the reformulation of many seemingly different problems in the study of both real networks and graph theory within the unified framework of maximum entropy. Particular emphasis is put on the detection of structural patterns in real networks, on the reconstruction of the properties of networks from partial information, and on the enumeration and sampling of graphs with given properties.  After a first introductory chapter explaining the motivation, focus, aim and message of the book, chapter 2 introduces the formal construction of maximum-entropy ensembles of graphs with local topological constraints. Chapter 3 focuses on the problem of pattern detection in real networks and provides a powerful way to disentangle nontrivial higher-order structural features from those that can be traced back to simpler local constraints. Chapter 4 focuses on the problem o...

  14. Robust Learning of High-dimensional Biological Networks with Bayesian Networks

    Science.gov (United States)

    Nägele, Andreas; Dejori, Mathäus; Stetter, Martin

    Structure learning of Bayesian networks applied to gene expression data has become a potentially useful method to estimate interactions between genes. However, the NP-hardness of Bayesian network structure learning renders the reconstruction of the full genetic network with thousands of genes unfeasible. Consequently, the maximal network size is usually restricted dramatically to a small set of genes (corresponding with variables in the Bayesian network). Although this feature reduction step makes structure learning computationally tractable, on the downside, the learned structure might be adversely affected due to the introduction of missing genes. Additionally, gene expression data are usually very sparse with respect to the number of samples, i.e., the number of genes is much greater than the number of different observations. Given these problems, learning robust network features from microarray data is a challenging task. This chapter presents several approaches tackling the robustness issue in order to obtain a more reliable estimation of learned network features.

  15. A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks

    Energy Technology Data Exchange (ETDEWEB)

    Santra, Tapesh, E-mail: tapesh.santra@ucd.ie [Systems Biology Ireland, University College Dublin, Dublin (Ireland)]

    2014-05-20

    Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based methods in some circumstances.

  16. A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks

    International Nuclear Information System (INIS)

    Santra, Tapesh

    2014-01-01

    Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based methods in some circumstances.

  17. IdentiCS – Identification of coding sequence and in silico reconstruction of the metabolic network directly from unannotated low-coverage bacterial genome sequence

    Directory of Open Access Journals (Sweden)

    Zeng An-Ping

    2004-08-01

    Full Text Available Abstract Background A necessary step for a genome level analysis of the cellular metabolism is the in silico reconstruction of the metabolic network from genome sequences. The available methods are mainly based on the annotation of genome sequences including two successive steps, the prediction of coding sequences (CDS) and their function assignment. The annotation process takes time. The available methods often encounter difficulties when dealing with unfinished error-containing genomic sequence. Results In this work a fast method is proposed to use unannotated genome sequence for predicting CDSs and for an in silico reconstruction of metabolic networks. Instead of using predicted genes or CDSs to query public databases, entries from public DNA or protein databases are used as queries to search a local database of the unannotated genome sequence to predict CDSs. Functions are assigned to the predicted CDSs simultaneously. The well-annotated genome of Salmonella typhimurium LT2 is used as an example to demonstrate the applicability of the method. 97.7% of the CDSs in the original annotation are correctly identified. The use of SWISS-PROT-TrEMBL databases resulted in an identification of 98.9% of CDSs that have EC-numbers in the published annotation. Furthermore, two versions of sequences of the bacterium Klebsiella pneumoniae with different genome coverage (3.9- and 7.9-fold, respectively) are examined. The results suggest that a 3.9-fold coverage of the bacterial genome could be sufficient for the in silico reconstruction of the metabolic network. Compared to other gene finding methods such as CRITICA, our method is more suitable for exploiting sequences of low genome coverage. Based on the new method, a program called IdentiCS (Identification of Coding Sequences from Unfinished Genome Sequences) is delivered that combines the identification of CDSs with the reconstruction, comparison and visualization of metabolic networks (free to download

  18. Cellular neural networks, the Navier-Stokes equation, and microarray image reconstruction.

    Science.gov (United States)

    Zineddin, Bachar; Wang, Zidong; Liu, Xiaohui

    2011-11-01

    Although the last decade has witnessed a great deal of improvements achieved for the microarray technology, many major developments in all the main stages of this technology, including image processing, are still needed. Some hardware implementations of microarray image processing have been proposed in the literature and proved to be promising alternatives to the currently available software systems. However, the main drawback of those proposed approaches is the unsuitable addressing of the quantification of the gene spot in a realistic way without any assumption about the image surface. Our aim in this paper is to present a new image-reconstruction algorithm using the cellular neural network that solves the Navier-Stokes equation. This algorithm offers a robust method for estimating the background signal within the gene-spot region. The MATCNN toolbox for Matlab is used to test the proposed method. Quantitative comparisons are carried out, i.e., in terms of objective criteria, between our approach and some other available methods. It is shown that the proposed algorithm gives highly accurate and realistic measurements in a fully automated manner within a remarkably efficient time.

  19. Quantitative utilization of prior biological knowledge in the Bayesian network modeling of gene expression data

    Directory of Open Access Journals (Sweden)

    Gao Shouguo

    2011-08-01

    Full Text Available Abstract Background Bayesian Network (BN) is a powerful approach to reconstructing genetic regulatory networks from gene expression data. However, expression data by itself suffers from high noise and lack of power. Incorporating prior biological knowledge can improve the performance. As each type of prior knowledge on its own may be incomplete or limited by quality issues, integrating multiple sources of prior knowledge to utilize their consensus is desirable. Results We introduce a new method to incorporate the quantitative information from multiple sources of prior knowledge. It first uses the Naïve Bayesian classifier to assess the likelihood of functional linkage between gene pairs based on prior knowledge. In this study we included cocitation in PubMed and schematic similarity in Gene Ontology annotation. A candidate network edge reservoir is then created in which the copy number of each edge is proportional to the estimated likelihood of linkage between the two corresponding genes. In network simulation the Markov Chain Monte Carlo sampling algorithm is adopted, and samples from this reservoir at each iteration to generate new candidate networks. We evaluated the new algorithm using both simulated and real gene expression data including that from a yeast cell cycle and a mouse pancreas development/growth study. Incorporating prior knowledge led to a ~2 fold increase in the number of known transcription regulations recovered, without significant change in false positive rate. In contrast, without the prior knowledge BN modeling is not always better than a random selection, demonstrating the necessity in network modeling to supplement the gene expression data with additional information. Conclusion Our new development provides a statistical means to utilize the quantitative information in prior biological knowledge in the BN modeling of gene expression data, which significantly improves the performance.
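
    The edge reservoir described in this abstract can be pictured with a minimal sketch. It assumes that linkage likelihoods are already available as numbers between 0 and 1 and that copy numbers are obtained by simple rounding; the gene names, the likelihood values and the `copies_per_unit` scale are hypothetical, and only the proposal draw is shown, not the full Markov Chain Monte Carlo sampler.

```python
import random


def build_edge_reservoir(linkage_likelihood, copies_per_unit=10):
    """Return a list in which each edge appears in proportion to its likelihood.

    linkage_likelihood maps a gene pair to an estimated likelihood of
    functional linkage; copies_per_unit is an assumed scaling constant.
    """
    reservoir = []
    for edge, likelihood in linkage_likelihood.items():
        reservoir.extend([edge] * round(likelihood * copies_per_unit))
    return reservoir


# Hypothetical likelihoods derived from prior knowledge.
likelihood = {("A", "B"): 0.9, ("A", "C"): 0.1, ("B", "C"): 0.5}
reservoir = build_edge_reservoir(likelihood)

# Drawing uniformly from the reservoir proposes well-supported edges more often.
rng = random.Random(0)
proposal = rng.choice(reservoir)
print(len(reservoir), proposal)
```

    Sampling uniformly from a list with repeated entries is a simple way to obtain weighted proposals without computing normalized probabilities at every iteration.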

  20. Gene coexpression network analysis as a source of functional annotation for rice genes.

    Directory of Open Access Journals (Sweden)

    Kevin L Childs

    Full Text Available With the existence of large publicly available plant gene expression data sets, many groups have undertaken data analyses to construct gene coexpression networks and functionally annotate genes. Often, a large compendium of unrelated or condition-independent expression data is used to construct gene networks. Condition-dependent expression experiments consisting of well-defined conditions/treatments have also been used to create coexpression networks to help examine particular biological processes. Gene networks derived from either condition-dependent or condition-independent data can be difficult to interpret if a large number of genes and connections are present. However, algorithms exist to identify modules of highly connected and biologically relevant genes within coexpression networks. In this study, we have used publicly available rice (Oryza sativa) gene expression data to create gene coexpression networks using both condition-dependent and condition-independent data and have identified gene modules within these networks using the Weighted Gene Coexpression Network Analysis method. We compared the number of genes assigned to modules and the biological interpretability of gene coexpression modules to assess the utility of condition-dependent and condition-independent gene coexpression networks. For the purpose of providing functional annotation to rice genes, we found gene modules identified by coexpression analysis of condition-dependent gene expression experiments to be more useful than gene modules identified by analysis of a condition-independent data set. We have incorporated our results into the MSU Rice Genome Annotation Project database as additional expression-based annotation for 13,537 genes, 2,980 of which lack a functional annotation description. These results provide two new types of functional annotation for our database. Genes in modules are now associated with groups of genes that constitute a collective functional
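
    As a rough illustration of how a weighted coexpression network is built before modules are detected, the sketch below computes an unsigned soft-threshold adjacency, that is, the absolute Pearson correlation raised to a power. The expression values, gene names and the exponent `beta=6` are assumptions made for the example; the abstract does not state the settings used in the study, and module detection itself is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long lists of expression values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency |cor|**beta for every ordered gene pair."""
    genes = list(expression)
    return {
        (g, h): abs(pearson(expression[g], expression[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Hypothetical expression profiles across four conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with geneA
    "geneC": [4.0, 1.0, 3.0, 2.0],  # weakly anti-correlated with geneA
}
adjacency = soft_threshold_adjacency(expression)
print(adjacency[("geneA", "geneB")], adjacency[("geneA", "geneC")])
```

    Raising the correlation to a power keeps strong coexpression close to 1 while pushing weak correlations toward 0, so no hard cutoff is needed to decide which gene pairs are connected.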

  1. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network

    DEFF Research Database (Denmark)

    Förster, Jochen; Famili, I.; Fu, P.

    2003-01-01

    The metabolic network in the yeast Saccharomyces cerevisiae was reconstructed using currently available genomic, biochemical, and physiological information. The metabolic reactions were compartmentalized between the cytosol and the mitochondria, and transport steps between the compartments...

  2. Interactive visualization of gene regulatory networks with associated gene expression time series data

    NARCIS (Netherlands)

    Westenberg, M.A.; Hijum, van S.A.F.T.; Lulko, A.T.; Kuipers, O.P.; Roerdink, J.B.T.M.; Linsen, L.; Hagen, H.; Hamann, B.

    2008-01-01

    We present GENeVis, an application to visualize gene expression time series data in a gene regulatory network context. This is a network of regulator proteins that regulate the expression of their respective target genes. The networks are represented as graphs, in which the nodes represent genes,

  3. Inferring gene networks from discrete expression data

    KAUST Repository

    Zhang, L.

    2013-07-18

    The modeling of gene networks from transcriptional expression data is an important tool in biomedical research to reveal signaling pathways and to identify treatment targets. Current gene network modeling is primarily based on the use of Gaussian graphical models applied to continuous data, which give a closed-form marginal likelihood. In this paper, we extend network modeling to discrete data, specifically data from serial analysis of gene expression, and RNA-sequencing experiments, both of which generate counts of mRNA transcripts in cell samples. We propose a generalized linear model to fit the discrete gene expression data and assume that the log ratios of the mean expression levels follow a Gaussian distribution. We restrict the gene network structures to decomposable graphs and derive the graphs by selecting the covariance matrix of the Gaussian distribution with the hyper-inverse Wishart priors. Furthermore, we incorporate prior network models based on gene ontology information, which avails existing biological information on the genes of interest. We conduct simulation studies to examine the performance of our discrete graphical model and apply the method to two real datasets for gene network inference. © The Author 2013. Published by Oxford University Press. All rights reserved.

  4. Current approaches to gene regulatory network modelling

    Directory of Open Access Journals (Sweden)

    Brazma Alvis

    2007-09-01

    Many different approaches have been developed to model and simulate gene regulatory networks. We proposed the following categories for gene regulatory network models: network parts lists, network topology models, network control logic models, and dynamic models. Here we will describe some examples for each of these categories. We will study the topology of gene regulatory networks in yeast in more detail, comparing a direct network derived from transcription factor binding data and an indirect network derived from genome-wide expression data in mutants. Regarding the network dynamics we briefly describe discrete and continuous approaches to network modelling, then describe a hybrid model called Finite State Linear Model and demonstrate that some simple network dynamics can be simulated in this model.

  5. Empirical Bayes conditional independence graphs for regulatory network recovery

    Science.gov (United States)

    Mahdi, Rami; Madduri, Abishek S.; Wang, Guoqing; Strulovici-Barel, Yael; Salit, Jacqueline; Hackett, Neil R.; Crystal, Ronald G.; Mezey, Jason G.

    2012-01-01

    Motivation: Computational inference methods that make use of graphical models to extract regulatory networks from gene expression data can have difficulty reconstructing dense regions of a network, a consequence of both computational complexity and unreliable parameter estimation when sample size is small. As a result, identification of hub genes is of special difficulty for these methods. Methods: We present a new algorithm, Empirical Light Mutual Min (ELMM), for large network reconstruction that has properties well suited for recovery of graphs with high-degree nodes. ELMM reconstructs the undirected graph of a regulatory network using empirical Bayes conditional independence testing with a heuristic relaxation of independence constraints in dense areas of the graph. This relaxation allows only one gene of a pair with a putative relation to be aware of the network connection, an approach that is aimed at easing multiple testing problems associated with recovering densely connected structures. Results: Using in silico data, we show that ELMM has better performance than commonly used network inference algorithms including GeneNet, ARACNE, FOCI, GENIE3 and GLASSO. We also apply ELMM to reconstruct a network among 5492 genes expressed in human lung airway epithelium of healthy non-smokers, healthy smokers and individuals with chronic obstructive pulmonary disease assayed using microarrays. The analysis identifies dense sub-networks that are consistent with known regulatory relationships in the lung airway and also suggests novel hub regulatory relationships among a number of genes that play roles in oxidative stress and secretion. Availability and implementation: Software for running ELMM is made available at http://mezeylab.cb.bscb.cornell.edu/Software.aspx. Contact: ramimahdi@yahoo.com or jgm45@cornell.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22685074

  6. Genes2FANs: connecting genes through functional association networks

    Science.gov (United States)

    2012-01-01

    Background: Protein-protein, cell signaling, metabolic, and transcriptional interaction networks are useful for identifying connections between lists of experimentally identified genes/proteins. However, besides physical or co-expression interactions there are many ways in which pairs of genes, or their protein products, can be associated. By systematically incorporating knowledge on shared properties of genes from diverse sources to build functional association networks (FANs), researchers may be able to identify additional functional interactions between groups of genes that are not readily apparent. Results: Genes2FANs is a web-based tool and a database that utilizes 14 carefully constructed FANs and a large-scale protein-protein interaction (PPI) network to build subnetworks that connect lists of human and mouse genes. The FANs are created from mammalian gene set libraries where mouse genes are converted to their human orthologs. The tool takes as input a list of human or mouse Entrez gene symbols to produce a subnetwork and a ranked list of intermediate genes that are used to connect the query input list. In addition, users can enter any PubMed search term and then the system automatically converts the returned results to gene lists using GeneRIF. This gene list is then used as input to generate a subnetwork from the user's PubMed query. As a case study, we applied Genes2FANs to connect disease genes from 90 well-studied disorders. We find an inverse correlation between the counts of links connecting disease genes through PPI and links connecting disease genes through FANs, separating diseases into two categories. Conclusions: Genes2FANs is a useful tool for interpreting the relationships between gene/protein lists in the context of their various functions and networks. Combining functional association interactions with physical PPIs can be useful for revealing new biology and help form hypotheses for further experimentation. Our finding that disease genes in

  7. A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Xiangyun Xiao

    The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM), experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.

  8. A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.

    Science.gov (United States)

    Xiao, Xiangyun; Zhang, Wei; Zou, Xiufen

    2015-01-01

    The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM), experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.
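
    The split-then-optimize idea described in the abstract above can be pictured with a deliberately simplified sketch. It is not the authors' algorithm: connected components stand in for their splitting step, `fit_module` is a hypothetical placeholder for the per-subnetwork ODE-based optimization, and the asynchronous communication between subnetworks is omitted entirely. The gene names and edges are invented.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_into_modules(edges):
    """Return the connected components of an undirected gene graph."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, modules = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        modules.append(sorted(component))
    return modules


def fit_module(genes):
    """Hypothetical stand-in for fitting one subnetwork.

    A real implementation would estimate ODE parameters here; this
    placeholder only records the size of the module each gene is in.
    """
    return {gene: len(genes) for gene in genes}


# Invented sparse network with two independent modules.
edges = [("g1", "g2"), ("g2", "g3"), ("g4", "g5")]
modules = split_into_modules(edges)

# Each module is handled by its own worker.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(fit_module, modules))

print(modules)
print(results)
```

    Because the subnetworks are small and independent in this toy case, the per-module work can run concurrently; a real GRN would need some exchange of information between modules that share regulators.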

  9. The Convolutional Visual Network for Identification and Reconstruction of NOvA Events

    Energy Technology Data Exchange (ETDEWEB)

    Psihas, Fernanda [Indiana U.

    2017-11-22

    In 2016 the NOvA experiment released results for the observation of oscillations in the νμ and νe channels as well as νe cross section measurements using neutrinos from Fermilab's NuMI beam. These and other measurements in progress rely on the accurate identification and reconstruction of the neutrino flavor and energy recorded by our detectors. This presentation describes the first application of convolutional neural network technology for event identification and reconstruction in particle detectors like NOvA. The Convolutional Visual Network (CVN) Algorithm was developed for identification, categorization, and reconstruction of NOvA events. It increased the selection efficiency of the νe appearance signal by 40% and studies show potential impact to the νμ disappearance analysis.

  10. Integrated Approach to Reconstruction of Microbial Regulatory Networks

    Energy Technology Data Exchange (ETDEWEB)

    Rodionov, Dmitry A [Sanford-Burnham Medical Research Institute; Novichkov, Pavel S [Lawrence Berkeley National Laboratory

    2013-11-04

    This project aimed to develop an integrated bioinformatics platform for genome-scale inference and visualization of transcriptional regulatory networks (TRNs) in bacterial genomes. The work was done at Sanford-Burnham Medical Research Institute (SBMRI, P.I. D.A. Rodionov) and Lawrence Berkeley National Laboratory (LBNL, co-P.I. P.S. Novichkov). The developed computational resources include: (1) the RegPredict web platform for TRN inference and regulon reconstruction in microbial genomes, and (2) the RegPrecise database for collection, visualization and comparative analysis of transcriptional regulons reconstructed by comparative genomics. These analytical resources were selected as key components in the DOE Systems Biology KnowledgeBase (SBKB). The high-quality data accumulated in RegPrecise will provide essential datasets of reference regulons in diverse microbes to enable automatic reconstruction of draft TRNs in newly sequenced genomes. We outline our progress toward the three aims of this grant proposal, which were: develop an integrated platform for genome-scale regulon reconstruction; infer regulatory annotations in several groups of bacteria and build reference collections of microbial regulons; and develop a KnowledgeBase on microbial transcriptional regulation.

  11. Efficient network reconstruction from dynamical cascades identifies small-world topology of neuronal avalanches.

    Directory of Open Access Journals (Sweden)

    Sinisa Pajevic

    2009-01-01

    Cascading activity is commonly found in complex systems with directed interactions such as metabolic networks, neuronal networks, or disease spreading in social networks. Substantial insight into a system's organization can be obtained by reconstructing the underlying functional network architecture from the observed activity cascades. Here we focus on Bayesian approaches and reduce their computational demands by introducing the Iterative Bayesian (IB) and Posterior Weighted Averaging (PWA) methods. We introduce a special case of PWA, cast in nonparametric form, which we call the normalized count (NC) algorithm. NC efficiently reconstructs random and small-world functional network topologies and architectures from subcritical, critical, and supercritical cascading dynamics and yields significant improvements over commonly used correlation methods. With experimental data, NC identified a functional and structural small-world topology and its corresponding traffic in cortical networks with neuronal avalanche dynamics.

  12. Integration of expression data in genome-scale metabolic network reconstructions

    Directory of Open Access Journals (Sweden)

    Anna S. Blazier

    2012-08-01

    With the advent of high-throughput technologies, the field of systems biology has amassed an abundance of omics data, quantifying thousands of cellular components across a variety of scales, ranging from mRNA transcript levels to metabolite quantities. Methods are needed to not only integrate this omics data but to also use this data to heighten the predictive capabilities of computational models. Several recent studies have successfully demonstrated how flux balance analysis (FBA), a constraint-based modeling approach, can be used to integrate transcriptomic data into genome-scale metabolic network reconstructions to generate predictive computational models. In this review, we summarize such FBA-based methods for integrating expression data into genome-scale metabolic network reconstructions, highlighting their advantages as well as their limitations.
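
    For orientation, the standard FBA problem that such methods build on can be written as a linear program. The symbols below are assumptions introduced here, not notation from the review: S is the stoichiometric matrix, v the flux vector, c the objective weights (for example a biomass reaction), and l, u the lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& l_j \le v_j \le u_j \quad \text{for every reaction } j .
\end{aligned}
```

    One way to bring in expression data, given here only as a hypothetical example of the general idea, is to tighten the upper bound u_j of a reaction when the genes encoding its enzyme are weakly expressed; the specific rules differ between the methods the review covers.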

  13. Reconstructing Late Holocene North Atlantic atmospheric circulation changes using functional paleoclimate networks

    Science.gov (United States)

    Franke, Jasper G.; Werner, Johannes P.; Donner, Reik V.

    2017-11-01

    Obtaining reliable reconstructions of long-term atmospheric circulation changes in the North Atlantic region presents a persistent challenge to contemporary paleoclimate research, which has been addressed by a multitude of recent studies. In order to contribute a novel methodological aspect to this active field, we apply here evolving functional network analysis, a recently developed tool for studying temporal changes of the spatial co-variability structure of the Earth's climate system, to a set of Late Holocene paleoclimate proxy records covering the last two millennia. The emerging patterns obtained by our analysis are related to long-term changes in the dominant mode of atmospheric circulation in the region, the North Atlantic Oscillation (NAO). By comparing the time-dependent inter-regional linkage structures of the obtained functional paleoclimate network representations to a recent multi-centennial NAO reconstruction, we identify co-variability between southern Greenland, Svalbard, and Fennoscandia as being indicative of a positive NAO phase, while connections from Greenland and Fennoscandia to central Europe are more pronounced during negative NAO phases. By drawing upon this correspondence, we use some key parameters of the evolving network structure to obtain a qualitative reconstruction of the NAO long-term variability over the entire Common Era (last 2000 years) using a linear regression model trained upon the existing shorter reconstruction.

  14. Data-Driven Neural Network Model for Robust Reconstruction of Automobile Casting

    Science.gov (United States)

    Lin, Jinhua; Wang, Yanjie; Li, Xin; Wang, Lu

    2017-09-01

    In computer vision systems, it is a challenging task to robustly reconstruct complex 3D geometries of automobile castings. 3D scanning data are usually corrupted by noise and the scanning resolution is low, which normally leads to incomplete matching and drift. In order to solve these problems, a data-driven local geometric learning model is proposed to achieve robust reconstruction of automobile castings. In order to relieve the interference of sensor noise and to be compatible with incomplete scanning data, a 3D convolutional neural network is established to match the local geometric features of automobile castings. The proposed neural network combines the geometric feature representation with the correlation metric function to robustly match the local correspondence. We use the truncated distance field (TDF) around the key point to represent the 3D surface of the casting geometry, so that the model can be directly embedded into the 3D space to learn the geometric feature representation. Finally, the training labels are automatically generated for deep learning based on the existing RGB-D reconstruction algorithm, which accesses the same global key matching descriptor. The experimental results show that the matching accuracy of our network is 92.2% for automobile castings, and the closed loop rate is about 74.0% when the matching tolerance threshold τ is 0.2. The matching descriptors performed well and retained 81.6% matching accuracy at 95% closed loop. For the sparse geometric castings with initial matching failure, the 3D matching object can be reconstructed robustly by training the key descriptors. Our method performs 3D reconstruction robustly for complex automobile castings.

  15. Reconstructing a Network of Stress-Response Regulators via Dynamic System Modeling of Gene Regulation

    Directory of Open Access Journals (Sweden)

    Wei-Sheng Wu

    2008-01-01

    Unicellular organisms such as yeasts have evolved mechanisms to respond to environmental stresses by rapidly reorganizing the gene expression program. Although many stress-response genes in yeast have been discovered by DNA microarrays, the stress-response transcription factors (TFs) that regulate these stress-response genes remain to be investigated. In this study, we use a dynamic system model of gene regulation to describe the mechanism of how TFs may control a gene's expression. Then, based on the dynamic system model, we develop the Stress Regulator Identification Algorithm (SRIA) to identify stress-response TFs for six kinds of stresses. We identified some general stress-response TFs that respond to various stresses and some specific stress-response TFs that respond to one specific stress. The biological significance of our findings is validated by the literature. We found that a small number of TFs is probably sufficient to control a wide variety of expression patterns in yeast under different stresses. Two implications can be inferred from this observation. First, the response mechanisms to different stresses may have a bow-tie structure. Second, there may be regulatory cross-talks among different stress responses. In conclusion, this study proposes a network of stress-response regulators and the details of their actions.

  16. [Reconstruction of Leptospira interrogans lipL21 gene and characteristics of its expression product].

    Science.gov (United States)

    Luo, Dong-jiao; Hu, Ye; Dennin, R H; Yan, Jie

    2007-09-01

    To reconstruct the nucleotide sequence of the Leptospira interrogans lipL21 gene to increase the output of prokaryotic expression, to understand the changes in immunogenicity of the expression products before and after reconstruction, and to determine the position of the envelope lipoprotein LipL21 on the surface of the leptospiral body. According to the preferred codons of E. coli, the nucleotide sequence of the lipL21 gene was designed and synthesized, and then its prokaryotic expression system was constructed. By using SDS-PAGE plus a BioRad agarose image analyzer, the expression level changes of lipL21 genes before and after reconstruction were measured. A Western blot assay using rabbit anti-TR/Patoc I serum as the first antibody was performed to identify the immunoreactivity of the two target recombinant proteins (rLipL21s) before and after reconstruction. The changes of cross agglutination titers of antisera against the two rLipL21s before and after reconstruction to the different leptospiral serogroups were demonstrated using the microscopic agglutination test (MAT). Immuno-electron microscopy was applied to confirm the location of LipL21. The expression outputs of the original and reconstructed lipL21 genes were 8.5% and 46.5% of the total bacterial proteins, respectively. Both rLipL21s reacted immunologically with TR/Patoc I antiserum. After immunization with each of the two rLipL21s, rabbits produced specific antibody. The two antisera against rLipL21s showed similar MAT titers of 1:80 - 1:320. LipL21 was confirmed to be located on the surface of the leptospiral envelope. LipL21 is a superficial antigen of Leptospira interrogans. The expression output of the reconstructed lipL21 gene is remarkably increased. The expressed rLipL21 maintains fine antigenicity and immunoreactivity and its antibody still shows an extensive cross immunoagglutination activity. The high expression of the reconstructed lipL21 gene will offer a

  17. Global Metabolic Reconstruction and Metabolic Gene Evolution in the Cattle Genome

    Science.gov (United States)

    Kim, Woonsu; Park, Hyesun; Seo, Seongwon

    2016-01-01

    The sequence of the cattle genome provided a valuable opportunity to systematically link genetic and metabolic traits of cattle. The objectives of this study were 1) to reconstruct genome-scale cattle-specific metabolic pathways based on the most recent and updated cattle genome build and 2) to identify duplicated metabolic genes in the cattle genome for better understanding of metabolic adaptations in cattle. A bioinformatic pipeline of an organism for amalgamating genomic annotations from multiple sources was updated. Using this, an amalgamated cattle genome database based on UMD_3.1 was created. The amalgamated cattle genome database is composed of a total of 33,292 genes: 19,123 consensus genes between NCBI and Ensembl databases, 8,410 and 5,493 genes only found in NCBI or Ensembl, respectively, and 266 genes from NCBI scaffolds. A metabolic reconstruction of the cattle genome and cattle pathway genome database (PGDB) was also developed using Pathway Tools, followed by an intensive manual curation. The manual curation filled or revised 68 pathway holes, deleted 36 metabolic pathways, and added 23 metabolic pathways. Consequently, the curated cattle PGDB contains 304 metabolic pathways, 2,460 reactions including 2,371 enzymatic reactions, and 4,012 enzymes. Furthermore, this study identified eight duplicated genes in 12 metabolic pathways in the cattle genome compared to human and mouse. Some of these duplicated genes are related to specific hormone biosynthesis and detoxification. The updated genome-scale metabolic reconstruction is a useful tool for understanding biology and metabolic characteristics in cattle. There have been significant improvements in the quality of cattle genome annotations and the MetaCyc database. The duplicated metabolic genes in the cattle genome compared to human and mouse imply evolutionary changes in the cattle genome and provide useful information for further research on understanding metabolic adaptations of cattle. PMID

  18. Reconstruction of three-dimensional porous media using generative adversarial neural networks

    Science.gov (United States)

    Mosser, Lukas; Dubrule, Olivier; Blunt, Martin J.

    2017-10-01

    To evaluate the variability of multiphase flow properties of porous media at the pore scale, it is necessary to acquire a number of representative samples of the void-solid structure. While modern x-ray computer tomography has made it possible to extract three-dimensional images of the pore space, assessment of the variability in the inherent material properties is often experimentally not feasible. We present a method to reconstruct the solid-void structure of porous media by applying a generative neural network that allows an implicit description of the probability distribution represented by three-dimensional image data sets. We show, by using an adversarial learning approach for neural networks, that this method of unsupervised learning is able to generate representative samples of porous media that honor their statistics. We successfully compare measures of pore morphology, such as the Euler characteristic, two-point statistics, and directional single-phase permeability of synthetic realizations with the calculated properties of a bead pack, Berea sandstone, and Ketton limestone. Results show that generative adversarial networks can be used to reconstruct high-resolution three-dimensional images of porous media at different scales that are representative of the morphology of the images used to train the neural network. The fully convolutional nature of the trained neural network allows the generation of large samples while maintaining computational efficiency. Compared to classical stochastic methods of image reconstruction, the implicit representation of the learned data distribution can be stored and reused to generate multiple realizations of the pore structure very rapidly.

  19. A homologous mapping method for three-dimensional reconstruction of protein networks reveals disease-associated mutations.

    Science.gov (United States)

    Huang, Sing-Han; Lo, Yu-Shu; Luo, Yong-Chun; Tseng, Yu-Yao; Yang, Jinn-Moon

    2018-03-19

    One of the crucial steps toward understanding the associations among molecular interactions, pathways, and diseases in a cell is to investigate detailed atomic protein-protein interactions (PPIs) in the structural interactome. Despite the availability of large-scale methods for analyzing PPI networks, these methods often focused on PPI networks using genome-scale data and/or known experimental PPIs. However, these methods are unable to provide structurally resolved interaction residues and their conservations in PPI networks. Here, we reconstructed a human three-dimensional (3D) structural PPI network (hDiSNet) with the detailed atomic binding models and disease-associated mutations by enhancing our PPI families and 3D-domain interologs from 60,618 structural complexes and complete genome database with 6,352,363 protein sequences across 2274 species. hDiSNet is a scale-free network (γ = 2.05), which consists of 5177 proteins and 19,239 PPIs with 5843 mutations. These 19,239 structurally resolved PPIs not only expanded the number of PPIs compared to present structural PPI network, but also achieved higher agreement with gene ontology similarities and higher co-expression correlation than the ones of 181,868 experimental PPIs recorded in public databases. Among 5843 mutations, 1653 and 790 mutations involved in interacting domains and contacting residues, respectively, are highly related to diseases. Our hDiSNet can provide detailed atomic interactions of human disease and their associated proteins with mutations. Our results show that the disease-related mutations are often located at the contacting residues forming the hydrogen bonds or conserved in the PPI family. In addition, hDiSNet provides the insights of the FGFR (EGFR)-MAPK pathway for interpreting the mechanisms of breast cancer and ErbB signaling pathway in brain cancer. Our results demonstrate that hDiSNet can explore structural-based interactions insights for understanding the mechanisms of disease

  20. Computing autocatalytic sets to unravel inconsistencies in metabolic network reconstructions

    DEFF Research Database (Denmark)

    Schmidt, R.; Waschina, S.; Boettger-Schmidt, D.

    2015-01-01

    , the method we report represents a powerful tool to identify inconsistencies in large-scale metabolic networks. AVAILABILITY AND IMPLEMENTATION: The method is available as source code on http://users.minet.uni-jena.de/~m3kach/ASBIG/ASBIG.zip. CONTACT: christoph.kaleta@uni-jena.de SUPPLEMENTARY...... by inherent inconsistencies and gaps. RESULTS: Here we present a novel method to validate metabolic network reconstructions based on the concept of autocatalytic sets. Autocatalytic sets correspond to collections of metabolites that, besides enzymes and a growth medium, are required to produce all biomass...... components in a metabolic model. These autocatalytic sets are well-conserved across all domains of life, and their identification in specific genome-scale reconstructions allows us to draw conclusions about potential inconsistencies in these models. The method is capable of detecting inconsistencies, which...

  1. Enhanced capital-asset pricing model for the reconstruction of bipartite financial networks

    NARCIS (Netherlands)

    Squartini, Tiziano; Almog, Assaf; Caldarelli, Guido; Van Lelyveld, Iman; Garlaschelli, Diego; Cimini, Giulio

    2017-01-01

    Reconstructing patterns of interconnections from partial information is one of the most important issues in the statistical physics of complex networks. A paramount example is provided by financial networks. In fact, the spreading and amplification of financial distress in capital markets are

  2. Insights gained from the reverse engineering of gene networks in keloid fibroblasts

    Directory of Open Access Journals (Sweden)

    Phan Toan

    2011-05-01

    Full Text Available Abstract Background Keloids are protrusive claw-like scars that have a propensity to recur even after surgery, and its molecular etiology remains elusive. The goal of reverse engineering is to infer gene networks from observational data, thus providing insight into the inner workings of a cell. However, most attempts at modeling biological networks have been done using simulated data. This study aims to highlight some of the issues involved in working with experimental data, and at the same time gain some insights into the transcriptional regulatory mechanism present in keloid fibroblasts. Methods Microarray data from our previous study was combined with microarray data obtained from the literature as well as new microarray data generated by our group. For the physical approach, we used the fREDUCE algorithm for correlating expression values to binding motifs. For the influence approach, we compared the Bayesian algorithm BANJO with the information theoretic method ARACNE in terms of performance in recovering known influence networks obtained from the KEGG database. In addition, we also compared the performance of different normalization methods as well as different types of gene networks. Results Using the physical approach, we found consensus sequences that were active in the keloid condition, as well as some sequences that were responsive to steroids, a commonly used treatment for keloids. From the influence approach, we found that BANJO was better at recovering the gene networks compared to ARACNE and that transcriptional networks were better suited for network recovery compared to cytokine-receptor interaction networks and intracellular signaling networks. We also found that the NFKB transcriptional network that was inferred from normal fibroblast data was more accurate compared to that inferred from keloid data, suggesting a more robust network in the keloid condition. Conclusions Consensus sequences that were found from this study are

  3. Reconstruction, visualization and explorative analysis of human pluripotency network

    Directory of Open Access Journals (Sweden)

    Priyanka Narad

    2017-09-01

    Full Text Available Identification of genes/proteins involved in pluripotency and their inter-relationships is important for understanding the induction/loss and maintenance of pluripotency. With a large volume of data on the interaction/regulation of pluripotency scattered across many biological databases and hundreds of scientific journals, a systematic integration of these data is required to create a complete view of the pluripotency network. Describing and interpreting such a network of interaction and regulation (i.e., stimulation and inhibition links) are essential tasks of computational biology and an important first step in a systems-level understanding of the underlying mechanisms of pluripotency. To address this, we have assembled a network of 166 molecular interactions, stimulations and inhibitions, based on a collection of research data from 147 publications, involving 122 human genes/proteins, all in a standard electronic format, enabling analyses by readily available software such as Cytoscape and its Apps (formerly called "Plugins"). The network includes the core circuit of OCT4 (POU5F1), SOX2 and NANOG, its periphery (such as STAT3, KLF4, UTF1, ZIC3, and c-MYC), connections to upstream signaling pathways (such as ACTIVIN, WNT, FGF, and BMP), and epigenetic regulators (such as L1TD1, LSD1 and PRC2). We describe the general properties of the network and compare it with other literature-based networks. Gene Ontology (GO) analysis is performed to find the over-represented GO terms in the network. We use several expression datasets to condense the network to a set of network links that identify the key players (genes/proteins) and the pathways involved in the transition from one state of pluripotency to another (i.e., native to primed state, primed to non-pluripotent state and pluripotent to non-pluripotent state).

  4. SCENERY: a web application for (causal) network reconstruction from cytometry data

    KAUST Repository

    Papoutsoglou, Georgios

    2017-05-08

    Flow and mass cytometry technologies can probe proteins as biological markers in thousands of individual cells simultaneously, providing unprecedented opportunities for reconstructing networks of protein interactions through machine learning algorithms. The network reconstruction (NR) problem has been well-studied by the machine learning community. However, the potentials of available methods remain largely unknown to the cytometry community, mainly due to their intrinsic complexity and the lack of comprehensive, powerful and easy-to-use NR software implementations specific for cytometry data. To bridge this gap, we present Single CEll NEtwork Reconstruction sYstem (SCENERY), a web server featuring several standard and advanced cytometry data analysis methods coupled with NR algorithms in a user-friendly, on-line environment. In SCENERY, users may upload their data and set their own study design. The server offers several data analysis options categorized into three classes of methods: data (pre)processing, statistical analysis and NR. The server also provides interactive visualization and download of results as ready-to-publish images or multimedia reports. Its core is modular and based on the widely-used and robust R platform allowing power users to extend its functionalities by submitting their own NR methods. SCENERY is available at scenery.csd.uoc.gr or http://mensxmachina.org/en/software/.

  5. Finding gene regulatory network candidates using the gene expression knowledge base.

    Science.gov (United States)

    Venkatesan, Aravind; Tripathi, Sushil; Sanz de Galdeano, Alejandro; Blondé, Ward; Lægreid, Astrid; Mironov, Vladimir; Kuiper, Martin

    2014-12-10

    Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of 'omics' data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis. We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions. Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.

  6. Sparsity in Model Gene Regulatory Networks

    International Nuclear Information System (INIS)

    Zagorski, M.

    2011-01-01

    We propose a gene regulatory network model which incorporates the microscopic interactions between genes and transcription factors. In particular, the gene's expression level is determined by deterministic synchronous dynamics with a contribution from excitatory interactions. We study the structure of networks that have a particular "function" and are subject to natural selection pressure. The question of network robustness against point mutations is addressed, and we conclude that only a small part of the connections, defined as "essential" for the cell's existence, is fragile. Additionally, the obtained networks are sparse with narrow in-degree and broad out-degree distributions, properties well known from experimental studies of biological regulatory networks. Furthermore, during the sampling procedure we observe that significantly different genotypes can emerge under mutation-selection balance. All the preceding features hold for model parameters which lie in the experimentally relevant range. (author)

  7. Stochastic Boolean networks: An efficient approach to modeling gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Liang Jinghang

    2012-08-01

    Full Text Available Abstract Background Various computational models have been of interest due to their use in the modelling of gene regulatory networks (GRNs). As a logical model, probabilistic Boolean networks (PBNs) consider molecular and genetic noise, so the study of PBNs provides significant insights into the understanding of the dynamics of GRNs. This will ultimately lead to advances in developing therapeutic methods that intervene in the process of disease development and progression. The applications of PBNs, however, are hindered by the complexities involved in the computation of the state transition matrix and the steady-state distribution of a PBN. For a PBN with n genes and N Boolean networks, the complexity to compute the state transition matrix is O(nN2^{2n}), or O(nN2^n) for a sparse matrix. Results This paper presents a novel implementation of PBNs based on the notions of stochastic logic and stochastic computation. This stochastic implementation of a PBN is referred to as a stochastic Boolean network (SBN). An SBN provides an accurate and efficient simulation of a PBN without and with random gene perturbation. The state transition matrix is computed in an SBN with a complexity of O(nL2^n), where L is a factor related to the stochastic sequence length. Since the minimum sequence length required for obtaining an evaluation accuracy approximately increases in a polynomial order with the number of genes, n, and the number of Boolean networks, N, usually increases exponentially with n, L is typically smaller than N, especially in a network with a large number of genes. Hence, the computational efficiency of an SBN is primarily limited by the number of genes, but not directly by the total possible number of Boolean networks. Furthermore, a time-frame expanded SBN enables an efficient analysis of the steady-state distribution of a PBN. These findings are supported by the simulation results of a simplified p53 network, several randomly generated networks and a
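
The cost quoted above for the state transition matrix of a PBN can be made concrete with a small sketch. The Python below is not the stochastic (SBN) implementation described in the record; it is a plain, direct construction for a toy PBN, and the two update rules `net_a` and `net_b` and their selection probabilities are hypothetical. The loops visit N networks and 2^n states and evaluate n gene updates for each, which is where a cost of order n·N·2^n comes from when only the nonzero entries are touched.

```python
from itertools import product

def pbn_transition_matrix(networks, probs, n):
    """Dense 2^n x 2^n state transition matrix of a toy PBN.

    networks: list of N update rules; each maps a tuple of n bits to the
              next state tuple (one Boolean function per gene).
    probs:    selection probability of each network (should sum to 1).
    """
    states = list(product((0, 1), repeat=n))
    index = {s: i for i, s in enumerate(states)}
    size = len(states)
    matrix = [[0.0] * size for _ in range(size)]
    for rule, p in zip(networks, probs):   # N constituent networks
        for s in states:                   # 2^n states
            t = rule(s)                    # n gene updates per state
            matrix[index[s]][index[t]] += p
    return matrix

# Hypothetical two-gene example (not taken from the record above).
def net_a(s):
    x1, x2 = s
    return (x2, x1)            # each gene copies the other

def net_b(s):
    x1, x2 = s
    return (x1 & x2, x1 | x2)  # AND for gene 1, OR for gene 2

A = pbn_transition_matrix([net_a, net_b], [0.7, 0.3], n=2)
for row in A:
    print(row)
```

Each row of `A` is a probability distribution over next states, so every row sums to 1. Allocating the dense matrix alone takes 2^{2n} entries, which is why the dense and sparse costs differ.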

  8. The Evolution of Gene Regulatory Networks that Define Arthropod Body Plans.

    Science.gov (United States)

    Auman, Tzach; Chipman, Ariel D

    2017-09-01

    Our understanding of the genetics of arthropod body plan development originally stems from work on Drosophila melanogaster from the late 1970s and onward. In Drosophila, there is a relatively detailed model for the network of gene interactions that proceeds in a sequential-hierarchical fashion to define the main features of the body plan. Over the years, we have a growing understanding of the networks involved in defining the body plan in an increasing number of arthropod species. It is now becoming possible to tease out the conserved aspects of these networks and to try to reconstruct their evolution. In this contribution, we focus on several key nodes of these networks, starting from early patterning in which the main axes are determined and the broad morphological domains of the embryo are defined, and on to later stage wherein the growth zone network is active in sequential addition of posterior segments. The pattern of conservation of networks is very patchy, with some key aspects being highly conserved in all arthropods and others being very labile. Many aspects of early axis patterning are highly conserved, as are some aspects of sequential segment generation. In contrast, regional patterning varies among different taxa, and some networks, such as the terminal patterning network, are only found in a limited range of taxa. The growth zone segmentation network is ancient and is probably plesiomorphic to all arthropods. In some insects, it has undergone significant modification to give rise to a more hardwired network that generates individual segments separately. In other insects and in most arthropods, the sequential segmentation network has undergone a significant amount of systems drift, wherein many of the genes have changed. However, it maintains a conserved underlying logic and function. © The Author 2017. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please

  9. FunGeneNet: a web tool to estimate enrichment of functional interactions in experimental gene sets.

    Science.gov (United States)

    Tiys, Evgeny S; Ivanisenko, Timofey V; Demenkov, Pavel S; Ivanisenko, Vladimir A

    2018-02-09

    Estimation of functional connectivity in gene sets derived from genome-wide or other biological experiments is one of the essential tasks of bioinformatics. A promising approach for solving this problem is to compare gene networks built using experimental gene sets with random networks. One of the resources that make such an analysis possible is CrossTalkZ, which uses the FunCoup database. However, existing methods, including CrossTalkZ, do not take into account individual types of interactions, such as protein/protein interactions, expression regulation, transport regulation, catalytic reactions, etc., but rather work with generalized types characterizing the existence of any connection between network members. We developed the online tool FunGeneNet, which utilizes the ANDSystem and STRING to reconstruct gene networks using experimental gene sets and to estimate their difference from random networks. To compare the reconstructed networks with random ones, the node permutation algorithm implemented in CrossTalkZ was taken as a basis. To study the FunGeneNet applicability, the functional connectivity analysis of networks constructed for gene sets involved in the Gene Ontology biological processes was conducted. We showed that the method sensitivity exceeds 0.8 at a specificity of 0.95. We found that the significance level of the difference between gene networks of biological processes and random networks is determined by the type of connections considered between objects. At the same time, the highest reliability is achieved for the generalized form of connections that takes into account all the individual types of connections. By taking examples of the thyroid cancer networks and the apoptosis network, it is demonstrated that key participants in these processes are involved in the interactions of those types by which these networks differ from random ones. FunGeneNet is a web tool aimed at proving the functionality of networks in a wide range of sizes of

  10. Neural network CT image reconstruction method for small amount of projection data

    International Nuclear Information System (INIS)

    Ma, X.F.; Fukuhara, M.; Takeda, T.

    2000-01-01

    This paper presents a new method for two-dimensional image reconstruction by using a multi-layer neural network. Though a conventionally used object function of such a neural network is composed of a sum of squared errors of the output data, we define an object function composed of a sum of squared residuals of an integral equation. By employing an appropriate numerical line integral for this integral equation, we can construct a neural network which can be used for CT image reconstruction for cases with small amount of projection data. We applied this method to some model problems and obtained satisfactory results. This method is especially useful for analyses of laboratory experiments or field observations where only a small amount of projection data is available in comparison with the well-developed medical applications

  11. Neural network CT image reconstruction method for small amount of projection data

    CERN Document Server

    Ma, X F; Takeda, T

    2000-01-01

    This paper presents a new method for two-dimensional image reconstruction by using a multi-layer neural network. Though a conventionally used object function of such a neural network is composed of a sum of squared errors of the output data, we define an object function composed of a sum of squared residuals of an integral equation. By employing an appropriate numerical line integral for this integral equation, we can construct a neural network which can be used for CT image reconstruction for cases with small amount of projection data. We applied this method to some model problems and obtained satisfactory results. This method is especially useful for analyses of laboratory experiments or field observations where only a small amount of projection data is available in comparison with the well-developed medical applications.

  12. Hopfield neural network in HEP track reconstruction

    International Nuclear Information System (INIS)

    Muresan, Raluca; Pentia, Mircea

    1996-01-01

    This work uses a neural network technique (the Hopfield method) to reconstruct particle tracks starting from a data set obtained with a coordinate detector system placed around a high energy accelerated particle interaction region. A learning algorithm for finding the optimal connection of the signal points has been developed and tested. We used a single-layer neural network with constraints in order to obtain the particle tracks drawn through the detected signal points. The dynamics of the system are given by the MFT equations, which drive the system toward a minimum of the energy function. We wrote a computer program and tested it on a large set of Monte Carlo simulated data. With this program we obtained good results even for a noise/signal ratio of 200. (authors)

  13. Convergent evolution of gene networks by single-gene duplications in higher eukaryotes.

    Science.gov (United States)

    Amoutzias, Gregory D; Robertson, David L; Oliver, Stephen G; Bornberg-Bauer, Erich

    2004-03-01

    By combining phylogenetic, proteomic and structural information, we have elucidated the evolutionary driving forces for the gene-regulatory interaction networks of basic helix-loop-helix transcription factors. We infer that recurrent events of single-gene duplication and domain rearrangement repeatedly gave rise to distinct networks with almost identical hub-based topologies, and multiple activators and repressors. We thus provide the first empirical evidence for scale-free protein networks emerging through single-gene duplications, the dominant importance of molecular modularity in the bottom-up construction of complex biological entities, and the convergent evolution of networks.

  14. Reconstruction of source location in a network of gravitational wave interferometric detectors

    International Nuclear Information System (INIS)

    Cavalier, Fabien; Barsuglia, Matteo; Bizouard, Marie-Anne; Brisson, Violette; Clapson, Andre-Claude; Davier, Michel; Hello, Patrice; Kreckelbergh, Stephane; Leroy, Nicolas; Varvella, Monica

    2006-01-01

    This paper deals with the reconstruction of the direction of a gravitational wave source using the detection made by a network of interferometric detectors, mainly the LIGO and Virgo detectors. We suppose that an event has been seen in coincidence using a filter applied on the three detector data streams. Using the arrival time (and its associated error) of the gravitational signal in each detector, the direction of the source in the sky is computed using a χ² minimization technique. For reasonably large signals (SNR > 4.5 in all detectors), the mean angular error between the real location and the reconstructed one is about 1 deg. We also investigate the effect of the network geometry assuming the same angular response for all interferometric detectors. It appears that the reconstruction quality is not uniform over the sky and is degraded when the source approaches the plane defined by the three detectors. Adding at least one other detector to the LIGO-Virgo network reduces the blind regions and in the case of 6 detectors, a precision better than 1 deg. on the source direction can be reached for 99% of the sky.
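
The timing-based χ² fit described above can be illustrated with a short Python sketch. It is a simplified stand-in, not the authors' code: the detector coordinates are illustrative values in metres rather than surveyed positions, the minimization is a coarse grid search over the sky, and the reference time t0 is eliminated analytically for each trial direction. All function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s

# Illustrative detector positions in metres (assumed, not surveyed values).
DETECTORS = [
    (-2.16e6, -3.83e6, 4.60e6),
    (-7.4e4, -5.50e6, 3.21e6),
    (4.55e6, 8.4e5, 4.38e6),
]

def unit_vector(theta_deg, phi_deg):
    """Unit vector toward the source (polar angle theta, azimuth phi)."""
    th, ph = math.radians(theta_deg), math.radians(phi_deg)
    return (math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph), math.cos(th))

def delays(detectors, direction):
    """Plane-wave arrival-time offsets relative to the origin, in seconds."""
    return [-sum(r * n for r, n in zip(pos, direction)) / C for pos in detectors]

def chi2(times, sigmas, detectors, direction):
    """Chi-square of the timing residuals with the reference time t0 profiled out."""
    model = delays(detectors, direction)
    weights = [1.0 / s ** 2 for s in sigmas]
    # For a fixed direction, the best t0 is the weighted mean of (t_i - model_i).
    t0 = sum(w * (t - m) for w, t, m in zip(weights, times, model)) / sum(weights)
    return sum(w * (t - t0 - m) ** 2 for w, t, m in zip(weights, times, model))

def locate(times, sigmas, detectors, step_deg=2):
    """Grid search over the sky; returns (chi2_min, theta_deg, phi_deg)."""
    best = None
    for theta in range(0, 181, step_deg):
        for phi in range(0, 360, step_deg):
            value = chi2(times, sigmas, detectors, unit_vector(theta, phi))
            if best is None or value < best[0]:
                best = (value, theta, phi)
    return best

# Noise-free arrival times for a source placed on the grid at (60, 40) degrees.
true_direction = unit_vector(60, 40)
times = [0.5 + d for d in delays(DETECTORS, true_direction)]
sigmas = [1e-4] * len(DETECTORS)
best = locate(times, sigmas, DETECTORS)
print(best)
```

With only three detectors, the direction mirrored through the plane of the detectors reproduces the same delays, so a finer search would find two equally good minima. This is consistent with the record's remark that additional detectors reduce the blind regions.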

  15. Genes2Networks: connecting lists of gene symbols using mammalian protein interactions databases

    Directory of Open Access Journals (Sweden)

    Ma'ayan Avi

    2007-10-01

    Full Text Available Abstract Background In recent years, mammalian protein-protein interaction network databases have been developed. The interactions in these databases are either extracted manually from low-throughput experimental biomedical research literature, extracted automatically from literature using techniques such as natural language processing (NLP), generated experimentally using high-throughput methods such as yeast-2-hybrid screens, or predicted using an assortment of computational approaches. Genes or proteins identified as significantly changing in proteomic experiments, or identified as susceptibility disease genes in genomic studies, can be placed in the context of protein interaction networks in order to assign these genes and proteins to pathways and protein complexes. Results Genes2Networks is a software system that integrates the content of ten mammalian interaction network datasets. Filtering techniques to prune low-confidence interactions were implemented. Genes2Networks is delivered as a web-based service using AJAX. The system can be used to extract relevant subnetworks created from "seed" lists of human Entrez gene symbols. The output includes a dynamic linkable three color web-based network map, with a statistical analysis report that identifies significant intermediate nodes used to connect the seed list. Conclusion Genes2Networks is powerful web-based software that can help experimental biologists to interpret lists of genes and proteins such as those commonly produced through genomic and proteomic experiments, as well as lists of genes and proteins associated with disease processes. This system can be used to find relationships between genes and proteins from seed lists, and predict additional genes or proteins that may play key roles in common pathways or protein complexes.

  16. Learning Gene Regulatory Networks Computationally from Gene Expression Data Using Weighted Consensus

    KAUST Repository

    Fujii, Chisato

    2015-04-16

    Gene regulatory networks analyze the relationships between genes, allowing us to understand the gene regulatory interactions in systems biology. Gene expression data from microarray experiments is used to obtain the gene regulatory networks. However, the microarray data is discrete, noisy and non-linear, which makes learning the networks a challenging problem, and existing gene network inference methods do not give consistent results. A current state-of-the-art study uses the average-ranking-based consensus method to combine and average the ranked predictions from individual methods. However, each individual method has an equal contribution to the consensus prediction. We have developed a linear programming-based consensus approach which uses weights learned by linear programming among individual methods, such that the methods have different weights depending on their performance. Our result reveals that assigning different weights to individual methods rather than giving them equal weights improves the performance of the consensus. The linear programming-based consensus method was evaluated: it had the best performance on the in silico and Saccharomyces cerevisiae networks and the second best on the Escherichia coli network, where it was outperformed by the Inferelator Pipeline method, which gives inconsistent results across a wide range of microarray data sets.

  17. Combinatorial explosion in model gene networks

    Science.gov (United States)

    Edwards, R.; Glass, L.

    2000-09-01

    The explosive growth in knowledge of the genome of humans and other organisms leaves open the question of how the functioning of genes in interacting networks is coordinated for orderly activity. One approach to this problem is to study mathematical properties of abstract network models that capture the logical structures of gene networks. The principal issue is to understand how particular patterns of activity can result from particular network structures, and what types of behavior are possible. We study idealized models in which the logical structure of the network is explicitly represented by Boolean functions that can be represented by directed graphs on n-cubes, but which are continuous in time and described by differential equations, rather than being updated synchronously via a discrete clock. The equations are piecewise linear, which allows significant analysis and facilitates rapid integration along trajectories. We first give a combinatorial solution to the question of how many distinct logical structures exist for n-dimensional networks, showing that the number increases very rapidly with n. We then outline analytic methods that can be used to establish the existence, stability and periods of periodic orbits corresponding to particular cycles on the n-cube. We use these methods to confirm the existence of limit cycles discovered in a sample of a million randomly generated structures of networks of 4 genes. Even with only 4 genes, at least several hundred different patterns of stable periodic behavior are possible, many of them surprisingly complex. We discuss ways of further classifying these periodic behaviors, showing that small mutations (reversal of one or a few edges on the n-cube) need not destroy the stability of a limit cycle. Although these networks are very simple as models of gene networks, their mathematical transparency reveals relationships between structure and behavior, they suggest that the possibilities for orderly dynamics in such

  18. Networks in biological systems: An investigation of the Gene Ontology as an evolving network

    International Nuclear Information System (INIS)

    Coronnello, C; Tumminello, M; Micciche, S; Mantegna, R.N.

    2009-01-01

    Many biological systems can be described as networks where different elements interact, in order to perform biological processes. We introduce a network associated with the Gene Ontology. Specifically, we construct a correlation-based network where the vertices are the terms of the Gene Ontology and the link between each two terms is weighted on the basis of the number of genes that they have in common. We analyze a filtered network obtained from the correlation-based network and we characterize its evolution over different releases of the Gene Ontology.
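
As a rough illustration of the construction described above, the Python sketch below links two Gene Ontology terms whenever they annotate at least one gene in common and uses the number of shared genes as the link weight. This is a simplification: the record describes a correlation-based weighting and a subsequent filtering step, neither of which is reproduced here, and the term identifiers and gene names in the example are made up.

```python
from itertools import combinations

def term_network(annotations):
    """Build a weighted term-term network.

    annotations: dict mapping a term identifier to the set of genes it annotates.
    Returns a dict mapping (term_a, term_b) to the number of shared genes,
    keeping only pairs that share at least one gene.
    """
    edges = {}
    for a, b in combinations(sorted(annotations), 2):
        shared = len(annotations[a] & annotations[b])
        if shared:
            edges[(a, b)] = shared
    return edges

# Hypothetical annotations (identifiers and genes are placeholders).
annotations = {
    "GO:A": {"g1", "g2", "g3"},
    "GO:B": {"g2", "g3", "g4"},
    "GO:C": {"g5"},
}
edges = term_network(annotations)
print(edges)
```

Running the same function on the annotations of successive Gene Ontology releases and comparing the resulting edge sets is one simple way to follow how such a network changes over time.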

  19. The parallel implementation of a backpropagation neural network and its applicability to SPECT image reconstruction

    Energy Technology Data Exchange (ETDEWEB)

    Kerr, John Patrick [Iowa State Univ., Ames, IA (United States)]

    1992-01-01

    The objective of this study was to determine the feasibility of using an Artificial Neural Network (ANN), in particular a backpropagation ANN, to improve the speed and quality of the reconstruction of three-dimensional SPECT (single photon emission computed tomography) images. In addition, since the processing elements (PEs) in each layer of an ANN are independent of each other, the speed and efficiency of the neural network architecture could be better optimized by implementing the ANN on a massively parallel computer. The specific goals of this research were: to implement a fully interconnected backpropagation neural network on a serial computer and a SIMD parallel computer, to identify any reduction in the time required to train these networks on the parallel machine versus the serial machine, to determine if these neural networks can learn to recognize SPECT data by training them on a section of an actual SPECT image, and to determine from the knowledge obtained in this research if full SPECT image reconstruction by an ANN implemented on a parallel computer is feasible both in time required to train the network, and in quality of the images reconstructed.

  20. Learning gene regulatory networks from gene expression data using weighted consensus

    KAUST Repository

    Fujii, Chisato; Kuwahara, Hiroyuki; Yu, Ge; Guo, Lili; Gao, Xin

    2016-01-01

    An accurate determination of the network structure of gene regulatory systems from high-throughput gene expression data is an essential yet challenging step in studying how the expression of endogenous genes is controlled through a complex interaction of gene products and DNA. While numerous methods have been proposed to infer the structure of gene regulatory networks, none of them seem to work consistently over different data sets with high accuracy. A recent study to compare gene network inference methods showed that an average-ranking-based consensus method consistently performs well under various settings. Here, we propose a linear programming-based consensus method for the inference of gene regulatory networks. Unlike the average-ranking-based one, which treats the contribution of each individual method equally, our new consensus method assigns a weight to each method based on its credibility. As a case study, we applied the proposed consensus method on synthetic and real microarray data sets, and compared its performance to that of the average-ranking-based consensus and individual inference methods. Our results show that our weighted consensus method achieves superior performance over the unweighted one, suggesting that assigning weights to different individual methods rather than giving them equal weights improves the accuracy. © 2016 Elsevier B.V.
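
The difference between the average-ranking-based consensus and a weighted consensus can be sketched in a few lines of Python. In the record the weights are obtained by linear programming; that step is not reproduced here, and the weights, method names and edge scores below are hypothetical values chosen by hand. Each method ranks the candidate edges by its own confidence score, and the consensus orders edges by their (weighted) mean rank.

```python
def ranks(scores):
    """Map each edge to its rank under one method (1 = highest score)."""
    ordered = sorted(scores, key=lambda edge: (-scores[edge], edge))
    return {edge: position + 1 for position, edge in enumerate(ordered)}

def consensus(method_scores, weights=None):
    """Order edges by weighted mean rank; equal weights give the average-rank consensus."""
    names = sorted(method_scores)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    per_method = {name: ranks(method_scores[name]) for name in names}
    edges = sorted(per_method[names[0]])
    mean_rank = {
        edge: sum(weights[name] * per_method[name][edge] for name in names) / total
        for edge in edges
    }
    return sorted(edges, key=lambda edge: (mean_rank[edge], edge))

# Hypothetical confidence scores from three inference methods.
method_scores = {
    "m1": {"A->B": 0.9, "A->C": 0.2, "B->C": 0.5},
    "m2": {"A->B": 0.1, "A->C": 0.8, "B->C": 0.6},
    "m3": {"A->B": 0.3, "A->C": 0.7, "B->C": 0.4},
}
unweighted = consensus(method_scores)
weighted = consensus(method_scores, weights={"m1": 4.0, "m2": 1.0, "m3": 1.0})
print(unweighted)
print(weighted)
```

In this toy case the equal-weight consensus follows the two methods that agree with each other, while giving `m1` a larger weight moves its top edge to the front. The sketch assumes every method scores the same set of edges.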

  1. Learning gene regulatory networks from gene expression data using weighted consensus

    KAUST Repository

    Fujii, Chisato

    2016-08-25

    An accurate determination of the network structure of gene regulatory systems from high-throughput gene expression data is an essential yet challenging step in studying how the expression of endogenous genes is controlled through a complex interaction of gene products and DNA. While numerous methods have been proposed to infer the structure of gene regulatory networks, none of them seem to work consistently over different data sets with high accuracy. A recent study to compare gene network inference methods showed that an average-ranking-based consensus method consistently performs well under various settings. Here, we propose a linear programming-based consensus method for the inference of gene regulatory networks. Unlike the average-ranking-based one, which treats the contribution of each individual method equally, our new consensus method assigns a weight to each method based on its credibility. As a case study, we applied the proposed consensus method on synthetic and real microarray data sets, and compared its performance to that of the average-ranking-based consensus and individual inference methods. Our results show that our weighted consensus method achieves superior performance over the unweighted one, suggesting that assigning weights to different individual methods rather than giving them equal weights improves the accuracy. © 2016 Elsevier B.V.

  2. Learning a Markov Logic network for supervised gene regulatory network inference.

    Science.gov (United States)

    Brouard, Céline; Vrain, Christel; Dubois, Julie; Castel, David; Debily, Marie-Anne; d'Alché-Buc, Florence

    2013-09-12

    Gene regulatory network inference remains a challenging problem in systems biology despite the numerous approaches that have been proposed. When substantial knowledge on a gene regulatory network is already available, supervised network inference is appropriate. Such a method builds a binary classifier able to assign a class (Regulation/No regulation) to an ordered pair of genes. Once learnt, the pairwise classifier can be used to predict new regulations. In this work, we explore the framework of Markov Logic Networks (MLN) that combine features of probabilistic graphical models with the expressivity of first-order logic rules. We propose to learn a Markov Logic network, i.e. a set of weighted rules that conclude on the predicate "regulates", starting from a known gene regulatory network involved in the switch proliferation/differentiation of keratinocyte cells, a set of experimental transcriptomic data and various descriptions of genes all encoded into first-order logic. As training data are unbalanced, we use asymmetric bagging to learn a set of MLNs. The prediction of a new regulation can then be obtained by averaging predictions of individual MLNs. As a side contribution, we propose three in silico tests to assess the performance of any pairwise classifier in various network inference tasks on real datasets. A first test consists of measuring the average performance on a balanced edge prediction problem; a second one deals with the ability of the classifier, once enhanced by asymmetric bagging, to update a given network. Finally, our main result concerns a third test that measures the ability of the method to predict regulations with a new set of genes. As expected, MLN, when provided with only numerical discretized gene expression data, does not perform as well as a pairwise SVM in terms of AUPR. However, when a more complete description of gene properties is provided by heterogeneous sources, MLN achieves the same performance as a black-box model such as a

  3. Mutated Genes in Schizophrenia Map to Brain Networks

    Science.gov (United States)

    NIH Research Matters, August 12, 2013 ... have a high number of spontaneous mutations in genes that form a network in the front region ...

  4. Resistance Genes in Global Crop Breeding Networks.

    Science.gov (United States)

    Garrett, K A; Andersen, K F; Asche, F; Bowden, R L; Forbes, G A; Kulakow, P A; Zhou, B

    2017-10-01

    Resistance genes are a major tool for managing crop diseases. The networks of crop breeders who exchange resistance genes and deploy them in varieties help to determine the global landscape of resistance and epidemics, an important system for maintaining food security. These networks function as a complex adaptive system, with associated strengths and vulnerabilities, and implications for policies to support resistance gene deployment strategies. Extensions of epidemic network analysis can be used to evaluate the multilayer agricultural networks that support and influence crop breeding networks. Here, we evaluate the general structure of crop breeding networks for cassava, potato, rice, and wheat. All four are clustered due to phytosanitary and intellectual property regulations, and linked through CGIAR hubs. Cassava networks primarily include public breeding groups, whereas others are more mixed. These systems must adapt to global change in climate and land use, the emergence of new diseases, and disruptive breeding technologies. Research priorities to support policy include how best to maintain both diversity and redundancy in the roles played by individual crop breeding groups (public versus private and global versus local), and how best to manage connectivity to optimize resistance gene deployment while avoiding risks to the useful life of resistance genes. Copyright © 2017 The Author(s). This is an open access article distributed under the CC BY 4.0 International license.

  5. BRAIN NETWORKS. Correlated gene expression supports synchronous activity in brain networks.

    Science.gov (United States)

    Richiardi, Jonas; Altmann, Andre; Milazzo, Anna-Clare; Chang, Catie; Chakravarty, M Mallar; Banaschewski, Tobias; Barker, Gareth J; Bokde, Arun L W; Bromberg, Uli; Büchel, Christian; Conrod, Patricia; Fauth-Bühler, Mira; Flor, Herta; Frouin, Vincent; Gallinat, Jürgen; Garavan, Hugh; Gowland, Penny; Heinz, Andreas; Lemaître, Hervé; Mann, Karl F; Martinot, Jean-Luc; Nees, Frauke; Paus, Tomáš; Pausova, Zdenka; Rietschel, Marcella; Robbins, Trevor W; Smolka, Michael N; Spanagel, Rainer; Ströhle, Andreas; Schumann, Gunter; Hawrylycz, Mike; Poline, Jean-Baptiste; Greicius, Michael D

    2015-06-12

    During rest, brain activity is synchronized between different regions widely distributed throughout the brain, forming functional networks. However, the molecular mechanisms supporting functional connectivity remain undefined. We show that functional brain networks defined with resting-state functional magnetic resonance imaging can be recapitulated by using measures of correlated gene expression in a post mortem brain tissue data set. The set of 136 genes we identify is significantly enriched for ion channels. Polymorphisms in this set of genes significantly affect resting-state functional connectivity in a large sample of healthy adolescents. Expression levels of these genes are also significantly associated with axonal connectivity in the mouse. The results provide convergent, multimodal evidence that resting-state functional networks correlate with the orchestrated activity of dozens of genes linked to ion channel activity and synaptic function. Copyright © 2015, American Association for the Advancement of Science.

  6. Multiscale Embedded Gene Co-expression Network Analysis.

    Directory of Open Access Journals (Sweden)

    Won-Min Song

    2015-11-01

    Full Text Available Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.

  7. Multiscale Embedded Gene Co-expression Network Analysis.

    Science.gov (United States)

    Song, Won-Min; Zhang, Bin

    2015-11-01

    Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.

  8. Engine cylinder pressure reconstruction using crank kinematics and recurrently-trained neural networks

    Science.gov (United States)

    Bennett, C.; Dunne, J. F.; Trimby, S.; Richardson, D.

    2017-02-01

    A recurrent non-linear autoregressive with exogenous input (NARX) neural network is proposed, and a suitable fully-recurrent training methodology is adapted and tuned, for reconstructing cylinder pressure in multi-cylinder IC engines using measured crank kinematics. This type of indirect sensing is important for cost effective closed-loop combustion control and for On-Board Diagnostics. The challenge addressed is to accurately predict cylinder pressure traces within the cycle under generalisation conditions: i.e. using data not previously seen by the network during training. This involves direct construction and calibration of a suitable inverse crank dynamic model, which owing to singular behaviour at top-dead-centre (TDC), has proved difficult via physical model construction, calibration, and inversion. The NARX architecture is specialised and adapted to cylinder pressure reconstruction, using a fully-recurrent training methodology which is needed because the alternatives are too slow and unreliable for practical network training on production engines. The fully-recurrent Robust Adaptive Gradient Descent (RAGD) algorithm is tuned initially using synthesised crank kinematics, and then tested on real engine data to assess the reconstruction capability. Real data is obtained from a 1.125 l, 3-cylinder, in-line, direct injection spark ignition (DISI) engine involving synchronised measurements of crank kinematics and cylinder pressure across a range of steady-state speed and load conditions. The paper shows that a RAGD-trained NARX network using both crank velocity and crank acceleration as input information provides fast and robust training. By using the optimum epoch identified during RAGD training, acceptably accurate cylinder pressures, and especially accurate location-of-peak-pressure, can be reconstructed robustly under generalisation conditions, making it the most practical NARX configuration and recurrent training methodology for use on production engines.

  9. A copula method for modeling directional dependence of genes

    Directory of Open Access Journals (Sweden)

    Park Changyi

    2008-05-01

    Full Text Available Abstract Background: Genes interact with each other as basic building blocks of life, forming a complicated network. The relationship between groups of genes with different functions can be represented as gene networks. With the deposition of huge microarray data sets in public domains, study on gene networking is now possible. In recent years, there has been an increasing interest in the reconstruction of gene networks from gene expression data. Recent work includes linear models, Boolean network models, and Bayesian networks. Among them, Bayesian networks seem to be the most effective in constructing gene networks. A major problem with the Bayesian network approach is the excessive computational time. This problem is due to the interactive feature of the method that requires large search space. Since fitting a model by using the copulas does not require iterations, elicitation of the priors, and complicated calculations of posterior distributions, the need for reference to extensive search spaces can be eliminated, leading to manageable computational effort. The Bayesian network approach produces a discrete expression of conditional probabilities. Discreteness of the characteristics is not required in the copula approach, which involves use of uniform representation of the continuous random variables. Our method is able to overcome the limitation of the Bayesian network method for gene-gene interaction, i.e. information loss due to binary transformation. Results: We analyzed the gene interactions for two gene data sets (one group is eight histone genes and the other group is 19 genes which include DNA polymerases, DNA helicase, type B cyclin genes, DNA primases, radiation sensitive genes, repair-related genes, replication protein A encoding gene, DNA replication initiation factor, securin gene, nucleosome assembly factor, and a subunit of the cohesin complex) by adopting a measure of directional dependence based on a copula function. We have compared

  10. Bio-crude transcriptomics: Gene discovery and metabolic network reconstruction for the biosynthesis of the terpenome of the hydrocarbon oil-producing green alga, Botryococcus braunii race B (Showa)

    Directory of Open Access Journals (Sweden)

    Molnár István

    2012-10-01

    Full Text Available Abstract Background: Microalgae hold promise for yielding a biofuel feedstock that is sustainable, carbon-neutral, distributed, and only minimally disruptive for the production of food and feed by traditional agriculture. Amongst oleaginous eukaryotic algae, the B race of Botryococcus braunii is unique in that it produces large amounts of liquid hydrocarbons of terpenoid origin. These are comparable to fossil crude oil, and are sequestered outside the cells in a communal extracellular polymeric matrix material. Biosynthetic engineering of terpenoid bio-crude production requires identification of genes and reconstruction of metabolic pathways responsible for production of both hydrocarbons and other metabolites of the alga that compete for photosynthetic carbon and energy. Results: A de novo assembly of 1,334,609 next-generation pyrosequencing reads from the Showa strain of the B race of B. braunii yielded a transcriptomic database of 46,422 contigs with an average length of 756 bp. Contigs were annotated with pathway, ontology, and protein domain identifiers. Manual curation allowed the reconstruction of pathways that produce terpenoid liquid hydrocarbons from primary metabolites, and pathways that divert photosynthetic carbon into tetraterpenoid carotenoids, diterpenoids, and the prenyl chains of meroterpenoid quinones and chlorophyll. Inventories of machine-assembled contigs are also presented for reconstructed pathways for the biosynthesis of competing storage compounds including triacylglycerol and starch. Regeneration of S-adenosylmethionine, and the extracellular localization of the hydrocarbon oils by active transport and possibly autophagy are also investigated. Conclusions: The construction of an annotated transcriptomic database, publicly available in a web-based data depository and annotation tool, provides a foundation for metabolic pathway and network reconstruction, and facilitates further omics studies in the absence of a genome

  11. Characterization of Genes for Beef Marbling Based on Applying Gene Coexpression Network

    Directory of Open Access Journals (Sweden)

    Dajeong Lim

    2014-01-01

    Full Text Available Marbling is an important trait in characterizing beef quality and a major factor for determining the price of beef in the Korean beef market. In particular, marbling is a complex trait and needs a system-level approach for identifying candidate genes related to the trait. To find the candidate gene associated with marbling, we used a weighted gene coexpression network analysis from the expression value of bovine genes. Hub genes were identified; they were topologically centered with large degree and BC values in the global network. We performed gene expression analysis to detect candidate genes in M. longissimus with divergent marbling phenotype (marbling scores 2 to 7) using qRT-PCR. The results demonstrate that transmembrane protein 60 (TMEM60) and dihydropyrimidine dehydrogenase (DPYD) are associated with increasing marbling fat. We suggest that the network-based approach in livestock may be an important method for analyzing the complex effects of candidate genes associated with complex traits like marbling or tenderness.

  12. Functional Module Analysis for Gene Coexpression Networks with Network Integration.

    Science.gov (United States)

    Zhang, Shuqin; Zhao, Hongyu; Ng, Michael K

    2015-01-01

    Network has been a general tool for studying the complex interactions between different genes, proteins, and other small molecules. Module as a fundamental property of many biological networks has been widely studied and many computational methods have been proposed to identify the modules in an individual network. However, in many cases, a single network is insufficient for module analysis due to the noise in the data or the tuning of parameters when building the biological network. The availability of a large amount of biological networks makes network integration study possible. By integrating such networks, more informative modules for some specific disease can be derived from the networks constructed from different tissues, and consistent factors for different diseases can be inferred. In this paper, we have developed an effective method for module identification from multiple networks under different conditions. The problem is formulated as an optimization model, which combines the module identification in each individual network and alignment of the modules from different networks together. An approximation algorithm based on eigenvector computation is proposed. Our method outperforms the existing methods, especially when the underlying modules in multiple networks are different in simulation studies. We also applied our method to two groups of gene coexpression networks for humans, which include one for three different cancers, and one for three tissues from the morbidly obese patients. We identified 13 modules with three complete subgraphs, and 11 modules with two complete subgraphs, respectively. The modules were validated through Gene Ontology enrichment and KEGG pathway enrichment analysis. We also showed that the main functions of most modules for the corresponding disease have been addressed by other researchers, which may provide the theoretical basis for further studying the modules experimentally.

  13. Automated Identification of Core Regulatory Genes in Human Gene Regulatory Networks.

    Directory of Open Access Journals (Sweden)

    Vipin Narang

    Full Text Available Human gene regulatory networks (GRN) can be difficult to interpret due to a tangle of edges interconnecting thousands of genes. We constructed a general human GRN from extensive transcription factor and microRNA target data obtained from public databases. In a subnetwork of this GRN that is active during estrogen stimulation of MCF-7 breast cancer cells, we benchmarked automated algorithms for identifying core regulatory genes (transcription factors and microRNAs). Among these algorithms, we identified K-core decomposition, PageRank and betweenness centrality algorithms as the most effective for discovering core regulatory genes in the network, evaluated based on previously known roles of these genes in MCF-7 biology as well as on their ability to explain the up or down expression status of up to 70% of the remaining genes. Finally, we validated the use of the K-core algorithm for organizing the GRN in an easier to interpret layered hierarchy where more influential regulatory genes percolate towards the inner layers. The integrated human gene and miRNA network and software used in this study are provided as supplementary materials (S1 Data) accompanying this manuscript.
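    The layered hierarchy produced by K-core decomposition can be sketched with the standard peeling procedure: repeatedly remove the node of smallest remaining degree, and record the largest such degree seen so far as that node's core number. The sketch below is not the authors' software. It treats the network as undirected (an assumption, since a GRN is directed), and the node names and toy graph are hypothetical.

```python
from typing import Dict, Set


def core_numbers(adj: Dict[str, Set[str]]) -> Dict[str, int]:
    """Return the core number of every node in an undirected graph.

    adj maps each node to the set of its neighbours (symmetric).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core: Dict[str, int] = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree (ties by name).
        v = min(remaining, key=lambda n: (degree[n], n))
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        # Removing v lowers the degree of its surviving neighbours.
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# Hypothetical toy network: a triangle A-B-C with a chain A-D-E attached.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}
cores = core_numbers(adj)
```

    Nodes with higher core numbers sit in the inner layers. Here the triangle forms the 2-core while the attached chain stays in the outer 1-shell.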

  14. Transcriptional regulation of the carbohydrate utilization network in Thermotoga maritima

    Directory of Open Access Journals (Sweden)

    Dmitry A Rodionov

    2013-08-01

    Full Text Available Hyperthermophilic bacteria from the Thermotogales lineage can produce hydrogen by fermenting a wide range of carbohydrates. Previous experimental studies identified a large fraction of genes committed to carbohydrate degradation and utilization in the model bacterium Thermotoga maritima. Knowledge of these genes enabled comprehensive reconstruction of biochemical pathways comprising the carbohydrate utilization network. However, transcriptional factors (TFs) and regulatory mechanisms driving this network remained largely unknown. Here, we used an integrated approach based on comparative analysis of genomic and transcriptomic data for the reconstruction of the carbohydrate utilization regulatory networks in 11 Thermotogales genomes. We identified DNA-binding motifs and regulons for 19 orthologous TFs in the Thermotogales. The inferred regulatory network in T. maritima contains 181 genes encoding TFs, sugar catabolic enzymes and ABC-family transporters. In contrast to many previously described bacteria, a transcriptional regulation strategy of Thermotoga does not employ global regulatory factors. The reconstructed regulatory network in T. maritima was validated by gene expression profiling on a panel of mono- and disaccharides and by in vitro DNA-binding assays. The observed upregulation of genes involved in catabolism of pectin, trehalose, cellobiose, arabinose, rhamnose, xylose, glucose, galactose, and ribose showed a strong correlation with the UxaR, TreR, BglR, CelR, AraR, RhaR, XylR, GluR, GalR, and RbsR regulons. Ultimately, this study elucidated the transcriptional regulatory network and mechanisms controlling expression of carbohydrate utilization genes in T. maritima. In addition to improving the functional annotations of associated transporters and catabolic enzymes, this research provides novel insights into the evolution of regulatory networks in Thermotogales.

  15. Efficient Reverse-Engineering of a Developmental Gene Regulatory Network

    Science.gov (United States)

    Cicin-Sain, Damjan; Ashyraliyev, Maksat; Jaeger, Johannes

    2012-01-01

    Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to

  16. A method of reconstructing the spatial measurement network by mobile measurement transmitter for shipbuilding

    International Nuclear Information System (INIS)

    Guo, Siyang; Lin, Jiarui; Yang, Linghui; Ren, Yongjie; Guo, Yin

    2017-01-01

    The workshop Measurement Position System (wMPS) is a distributed measurement system which is suitable for the large-scale metrology. However, there are some inevitable measurement problems in the shipbuilding industry, such as the restriction by obstacles and limited measurement range. To deal with these factors, this paper presents a method of reconstructing the spatial measurement network by mobile transmitter. A high-precision coordinate control network with more than six target points is established. The mobile measuring transmitter can be added into the measurement network using this coordinate control network with the spatial resection method. This method reconstructs the measurement network and broadens the measurement scope efficiently. To verify this method, two comparison experiments are designed with the laser tracker as the reference. The results demonstrate that the accuracy of point-to-point length is better than 0.4 mm and the accuracy of coordinate measurement is better than 0.6 mm.

  17. Genetic Network Programming with Reconstructed Individuals

    Science.gov (United States)

    Ye, Fengming; Mabu, Shingo; Wang, Lutao; Eto, Shinji; Hirasawa, Kotaro

    A lot of research on evolutionary computation has been done and some significant classical methods such as Genetic Algorithm (GA), Genetic Programming (GP), Evolutionary Programming (EP), and Evolution Strategies (ES) have been studied. Recently, a new approach named Genetic Network Programming (GNP) has been proposed. GNP can evolve itself and find the optimal solution. It is based on the idea of Genetic Algorithm and uses the data structure of directed graphs. Many papers have demonstrated that GNP can deal with complex problems in dynamic environments very efficiently and effectively. As a result, GNP has recently been attracting more and more attention and is used in many different areas such as data mining, extracting trading rules of stock markets, elevator supervised control systems, etc., and GNP has obtained some outstanding results. Focusing on GNP's distinguished expression ability of the graph structure, this paper proposes a method named Genetic Network Programming with Reconstructed Individuals (GNP-RI). The aim of GNP-RI is to balance the exploitation and exploration of GNP, that is, to strengthen the exploitation ability by using the exploited information extensively during the evolution process of GNP and finally obtain better performance than GNP. In the proposed method, the worse individuals are reconstructed and enhanced by the elite information before undergoing genetic operations (mutation and crossover). The enhancement of worse individuals mimics the maturing phenomenon in nature, where bad individuals can become smarter after receiving a good education. In this paper, GNP-RI is applied to the tile-world problem, which is an excellent benchmark for evaluating the proposed architecture. The performance of GNP-RI is compared with that of the conventional GNP. The simulation results show some advantages of GNP-RI, demonstrating its superiority over the conventional GNPs.

  18. Mutational robustness of gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Aalt D J van Dijk

    Full Text Available Mutational robustness of gene regulatory networks refers to their ability to generate constant biological output upon mutations that change network structure. Such networks contain regulatory interactions (transcription factor-target gene interactions) but often also protein-protein interactions between transcription factors. Using computational modeling, we study factors that influence robustness and we infer several network properties governing it. These include the type of mutation, i.e. whether a regulatory interaction or a protein-protein interaction is mutated, and in the case of mutation of a regulatory interaction, the sign of the interaction (activating vs. repressive). In addition, we analyze the effect of combinations of mutations and we compare networks containing monomeric with those containing dimeric transcription factors. Our results are consistent with available data on biological networks, for example based on evolutionary conservation of network features. As a novel and remarkable property, we predict that networks are more robust against mutations in monomer than in dimer transcription factors, a prediction for which analysis of conservation of DNA binding residues in monomeric vs. dimeric transcription factors provides indirect evidence.

  19. Reconstruction of gastric slow wave from finger photoplethysmographic signal using radial basis function neural network.

    Science.gov (United States)

    Mohamed Yacin, S; Srinivasa Chakravarthy, V; Manivannan, M

    2011-11-01

    Extraction of extra-cardiac information from photoplethysmography (PPG) signal is a challenging research problem with significant clinical applications. In this study, radial basis function neural network (RBFNN) is used to reconstruct the gastric myoelectric activity (GMA) slow wave from finger PPG signal. Finger PPG and GMA (measured using Electrogastrogram, EGG) signals were acquired simultaneously at the sampling rate of 100 Hz from ten healthy subjects. Discrete wavelet transform (DWT) was used to extract slow wave (0-0.1953 Hz) component from the finger PPG signal; this slow wave PPG was used to reconstruct EGG. A RBFNN is trained on signals obtained from six subjects in both fasting and postprandial conditions. The trained network is tested on data obtained from the remaining four subjects. In the earlier study, we have shown the presence of GMA information in finger PPG signal using DWT and cross-correlation method. In this study, we explicitly reconstruct gastric slow wave from finger PPG signal by the proposed RBFNN-based method. It was found that the network-reconstructed slow wave provided significantly higher (P […]) correlation than the correlation obtained ([…]0.7) between the PPG slow wave from DWT and the EGG slow wave. Our results showed that a simple finger PPG signal can be used to reconstruct gastric slow wave using RBFNN method.
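    The 0-0.1953 Hz band quoted above is consistent with simple dyadic DWT arithmetic. Each decomposition level halves the approximation band, so at sampling rate fs the level-L approximation nominally spans 0 to fs / 2^(L+1). The sketch below only checks that arithmetic. The abstract does not state the decomposition level or the wavelet, so level 8 is an inference, and the function name is hypothetical.

```python
def approximation_band_hz(sampling_rate_hz: float, level: int) -> float:
    """Nominal upper edge (Hz) of the level-`level` approximation band.

    Assumes an ideal dyadic DWT: the Nyquist band fs/2 is halved once per level.
    """
    return sampling_rate_hz / 2 ** (level + 1)


# 100 Hz sampling rate, as stated in the abstract; level 8 is an assumption.
upper_edge = approximation_band_hz(100.0, 8)   # 100 / 512
upper_edge_rounded = round(upper_edge, 4)
```

    Under these assumptions the upper edge is 100 / 512 = 0.1953125 Hz, which rounds to the 0.1953 Hz reported.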

  20. Learning gene networks under SNP perturbations using eQTL datasets.

    Directory of Open Access Journals (Sweden)

    Lingxue Zhang

    2014-02-01

    Full Text Available The standard approach for identifying gene networks is based on experimental perturbations of gene regulatory systems such as gene knock-out experiments, followed by a genome-wide profiling of differential gene expressions. However, this approach is significantly limited in that it is not possible to perturb more than one or two genes simultaneously to discover complex gene interactions or to distinguish between direct and indirect downstream regulations of the differentially-expressed genes. As an alternative, genetical genomics studies have been proposed to treat naturally-occurring genetic variants as potential perturbants of the gene regulatory system and to recover gene networks via analysis of population gene-expression and genotype data. Despite many advantages of genetical genomics data analysis, the computational challenge that the effects of multifactorial genetic perturbations should be decoded simultaneously from data has prevented a widespread application of genetical genomics analysis. In this article, we propose a statistical framework for learning gene networks that overcomes the limitations of experimental perturbation methods and addresses the challenges of genetical genomics analysis. We introduce a new statistical model, called a sparse conditional Gaussian graphical model, and describe an efficient learning algorithm that simultaneously decodes the perturbations of the gene regulatory system by a large number of SNPs to identify a gene network along with expression quantitative trait loci (eQTLs) that perturb this network. While our statistical model captures direct genetic perturbations of the gene network, by performing inference on the probabilistic graphical model, we obtain detailed characterizations of how the direct SNP perturbation effects propagate through the gene network to perturb other genes indirectly. We demonstrate our statistical method using HapMap-simulated and yeast eQTL datasets. In particular, the yeast gene network

  1. A network of genes, genetic disorders, and brain areas.

    Directory of Open Access Journals (Sweden)

    Satoru Hayasaka

    Full Text Available The network-based approach has been used to describe the relationship among genes and various phenotypes, producing a network describing complex biological relationships. Such networks can be constructed by aggregating previously reported associations in the literature from various databases. In this work, we applied the network-based approach to investigate how different brain areas are associated with genetic disorders and genes. In particular, a tripartite network with genes, genetic diseases, and brain areas was constructed based on the associations among them reported in the literature through text mining. In the resulting network, a disproportionately large number of gene-disease and disease-brain associations were attributed to a small subset of genes, diseases, and brain areas. Furthermore, a small number of brain areas were found to be associated with a large number of the same genes and diseases. These core brain regions encompassed the areas identified by previous genome-wide association studies, and suggest potential areas of focus in future imaging genetics research. The approach outlined in this work demonstrates the utility of the network-based approach in studying genetic effects on the brain.

  2. Convergent evolution of gene networks by single-gene duplications in higher eukaryotes

    OpenAIRE

    Amoutzias, Gregory D; Robertson, David L; Oliver, Stephen G; Bornberg-Bauer, Erich

    2004-01-01

    By combining phylogenetic, proteomic and structural information, we have elucidated the evolutionary driving forces for the gene-regulatory interaction networks of basic helix–loop–helix transcription factors. We infer that recurrent events of single-gene duplication and domain rearrangement repeatedly gave rise to distinct networks with almost identical hub-based topologies, and multiple activators and repressors. We thus provide the first empirical evidence for scale-free protein networks e...

  3. Discovering disease-associated genes in weighted protein-protein interaction networks

    Science.gov (United States)

    Cui, Ying; Cai, Meng; Stanley, H. Eugene

    2018-04-01

    Although there have been many network-based attempts to discover disease-associated genes, most of them have not taken edge weight - which quantifies their relative strength - into consideration. We use connection weights in a protein-protein interaction (PPI) network to locate disease-related genes. We analyze the topological properties of both weighted and unweighted PPI networks and design an improved random forest classifier to distinguish disease genes from non-disease genes. We use a cross-validation test to confirm that weighted networks are better able to discover disease-associated genes than unweighted networks, which indicates that including link weight in the analysis of network properties provides a better model of complex genotype-phenotype associations.
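
    The contrast between weighted and unweighted networks in the record above can be illustrated with the simplest topological feature, node degree versus node strength (the sum of edge weights). The sketch below is an illustration only: the edge list is made up, the function name is hypothetical, and the study's actual feature set and random forest classifier are not reproduced.

```python
from collections import defaultdict

def degree_and_strength(weighted_edges):
    """Return (degree, strength) per node for an undirected weighted edge list."""
    degree = defaultdict(int)
    strength = defaultdict(float)
    for u, v, w in weighted_edges:
        for node in (u, v):
            degree[node] += 1        # unweighted view: count of interactions
            strength[node] += w      # weighted view: total interaction weight
    return dict(degree), dict(strength)

# Hypothetical PPI edges: (protein, protein, confidence weight).
edges = [
    ("A", "B", 0.9), ("A", "C", 0.8),
    ("D", "B", 0.1), ("D", "C", 0.1), ("D", "E", 0.1),
]
degree, strength = degree_and_strength(edges)
```

    Here protein D has more partners than A, but A has the larger total weight, so the two views rank the nodes differently; this is the kind of information an unweighted analysis discards.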

  4. Reconstruction of neutron spectra through neural networks; Reconstruccion de espectros de neutrones mediante redes neuronales

    Energy Technology Data Exchange (ETDEWEB)

    Vega C, H.R.; Hernandez D, V.M.; Manzanares A, E. [Cuerpo Academico de Radiobiologia, Estudios Nucleares, Universidad Autonoma de Zacatecas, A.P. 336, 98000 Zacatecas (Mexico)] e-mail: rvega@cantera.reduaz.mx [and others]

    2003-07-01

    A neural network has been used to reconstruct neutron spectra from the count rates of the detectors of a Bonner sphere spectrometric system. A group of 56 neutron spectra was selected to calculate the count rates they would produce in a Bonner sphere system; the neural network was trained with these data and the spectra. To test the performance of the net, 12 spectra were used: 6 were taken from the group used for training, 3 were obtained from mathematical functions and the other 3 correspond to real spectra. Comparing the original spectra with those reconstructed by the net, we find that the net performs poorly when reconstructing monoenergetic spectra, which we attribute to the characteristics of the spectra used to train the neural network; for the other groups of spectra, however, the results of the net agree with the expected ones. (Author)

  5. Network Reconstruction From High-Dimensional Ordinary Differential Equations.

    Science.gov (United States)

    Chen, Shizhe; Shojaie, Ali; Witten, Daniela M

    2017-01-01

    We consider the task of learning a dynamical system from high-dimensional time-course data. For instance, we might wish to estimate a gene regulatory network from gene expression data measured at discrete time points. We model the dynamical system nonparametrically as a system of additive ordinary differential equations. Most existing methods for parameter estimation in ordinary differential equations estimate the derivatives from noisy observations. This is known to be challenging and inefficient. We propose a novel approach that does not involve derivative estimation. We show that the proposed method can consistently recover the true network structure even in high dimensions, and we demonstrate empirical improvement over competing approaches. Supplementary materials for this article are available online.
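
    One way to write the additive model described in the record above is given below. The notation is an assumption made here for illustration (p variables X_1, ..., X_p, intercepts theta_{j0}, smooth component functions f_{jk}) and is not quoted from the article. Integrating both sides gives a form that involves only the observed trajectories, which suggests how estimation of derivatives can be avoided.

```latex
% Additive ODE for variable j (assumed notation):
\frac{dX_j(t)}{dt} = \theta_{j0} + \sum_{k=1}^{p} f_{jk}\bigl(X_k(t)\bigr),
\qquad j = 1, \dots, p .

% Integrated form, free of derivatives of the observations:
X_j(t) = X_j(0) + \theta_{j0}\, t + \sum_{k=1}^{p} \int_{0}^{t} f_{jk}\bigl(X_k(s)\bigr)\, ds .

% Network reading: an edge k -> j is present if and only if f_{jk} \not\equiv 0 .
```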

  6. Mining Gene Regulatory Networks by Neural Modeling of Expression Time-Series.

    Science.gov (United States)

    Rubiolo, Mariano; Milone, Diego H; Stegmayer, Georgina

    2015-01-01

    Discovering gene regulatory networks from data is one of the most studied topics in recent years. Neural networks can be successfully used to infer an underlying gene network by modeling expression profiles as time series. This work proposes a novel method based on a pool of neural networks for obtaining a gene regulatory network from a gene expression dataset. The networks are used for modeling each possible interaction between pairs of genes in the dataset, and a set of mining rules is applied to accurately detect the underlying relations among genes. The results obtained on artificial and real datasets confirm the method's effectiveness for discovering regulatory networks from a proper modeling of the temporal dynamics of gene expression profiles.

  7. A Kalman-filter based approach to identification of time-varying gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Jie Xiong

    Full Text Available MOTIVATION: Conventional identification methods for gene regulatory networks (GRNs) have overwhelmingly adopted static topology models, which remain unchanged over time, to represent the underlying molecular interactions of a biological system. However, GRNs are dynamic in response to physiological and environmental changes. Although there is a rich literature on modeling static or temporally invariant networks, how to systematically recover these temporally changing networks remains a major and pressing challenge. The purpose of this study is to suggest a two-step strategy that recovers time-varying GRNs. RESULTS: This paper suggests using a switching auto-regressive model to describe the dynamics of time-varying GRNs, and proposes a two-step strategy to recover their structure. In the first step, the change points are detected by a Kalman-filter based method. The observed time series are divided into several segments using these detection results, and each time series segment lying between two successive change points is associated with an individual static regulatory network. In the second step, conditional network structure identification methods are used to reconstruct the topology for each time interval. This two-step strategy efficiently decouples the change point detection problem and the topology inference problem. Simulation results show that the proposed strategy can detect the change points precisely and recover each individual topology structure effectively. Moreover, computation results with the developmental data of Drosophila melanogaster show that the proposed change point detection procedure also works effectively in real-world applications and that its change point estimation accuracy exceeds that of other existing approaches, which means the suggested strategy may also be helpful in solving actual GRN reconstruction problems.
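
    The first step of the two-step strategy above (detect change points, then split the series into segments) can be sketched as follows. This is a simplified stand-in, assuming a scalar random-walk Kalman filter and a fixed threshold on the normalized innovation; the paper's switching auto-regressive model is not reproduced, and the noise variances, threshold and toy series are illustrative assumptions.

```python
def detect_change_points(series, q=1e-4, r=0.01, threshold=4.0):
    """Flag time indices whose normalized Kalman innovation exceeds `threshold`."""
    x, p = series[0], 1.0                  # state estimate and its variance
    changes = []
    for t in range(1, len(series)):
        p_pred = p + q                     # predict under a random-walk state model
        s = p_pred + r                     # innovation variance
        innovation = series[t] - x
        if abs(innovation) / s ** 0.5 > threshold:
            changes.append(t)
            x, p = series[t], 1.0          # restart the filter after a change
            continue
        k = p_pred / s                     # Kalman gain
        x = x + k * innovation             # measurement update
        p = (1 - k) * p_pred
    return changes

def split_segments(series, changes):
    """Cut the series at the detected change points."""
    bounds = [0] + changes + [len(series)]
    return [series[a:b] for a, b in zip(bounds, bounds[1:])]

# Toy expression level with one abrupt shift (hypothetical data).
series = [0.0, 0.05, -0.05, 0.02, 0.0, 2.0, 2.05, 1.95, 2.02, 2.0]
changes = detect_change_points(series)
segments = split_segments(series, changes)
```

    In the second step, each segment would be passed separately to a static network inference method, which is what decouples change point detection from topology inference.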

  8. PROVIDING OF SAFETY AT WORKS IMPLEMENTATION ON RECONSTRUCTION OF PLUMBINGS NETWORKS IN THE STRAITENED TERMS

    Directory of Open Access Journals (Sweden)

    DIDENKO L. M.

    2016-07-01

    Full Text Available Summary. Problem statement. In all regions of our country, water supply (plumbing) networks show considerable physical wear and obsolescence, because most of them were laid in the middle of the last century. More than 50 % of trunk pipelines are known to be made of steel, while the average service life of metal pipes in water supply networks is 30 years [1]. Statistical data show that more than 34 % of water supply and sewage networks are in an emergency condition. A large share of the work in Ukraine's building industry therefore falls on the reconstruction of this type of engineering network. Because complete replacement of all pipes requires heavy material costs, work is mainly limited to reconstruction and major repair of individual emergency sections. Ensuring safe execution of this type of work is complicated by harmful and dangerous production factors that arise from the complex factor of constrained conditions. This factor stems from the fact that the networks are laid within established urban development and on the territory of operating industrial enterprises. The danger of reconstruction work is evidenced by the high injury rate during its execution. According to Article 13 of the Law of Ukraine "On Labour Protection", an employer is obliged to create working conditions at the workplace that comply with regulatory legal acts and with the requirements of legislation on workers' rights in the field of labour protection [2]. Safety during reconstruction of water supply networks can be ensured only through a comprehensive approach to the problem, which includes: research into the influence of the constraint factors; identification of the technological features of the construction, assembly, dismantling, earthwork and other types of work performed on site during reconstruction; improvement of existing

  9. MyGeneFriends: A Social Network Linking Genes, Genetic Diseases, and Researchers.

    Science.gov (United States)

    Allot, Alexis; Chennen, Kirsley; Nevers, Yannis; Poidevin, Laetitia; Kress, Arnaud; Ripp, Raymond; Thompson, Julie Dawn; Poch, Olivier; Lecompte, Odile

    2017-06-16

    The constant and massive increase of biological data offers unprecedented opportunities to decipher the function and evolution of genes and their roles in human diseases. However, the multiplicity of sources and flow of data mean that efficient access to useful information and knowledge production has become a major challenge. This challenge can be addressed by taking inspiration from Web 2.0 and particularly social networks, which are at the forefront of big data exploration and human-data interaction. MyGeneFriends is a Web platform inspired by social networks, devoted to genetic disease analysis, and organized around three types of proactive agents: genes, humans, and genetic diseases. The aim of this study was to improve exploration and exploitation of biological, postgenomic era big data. MyGeneFriends leverages conventions popularized by top social networks (Facebook, LinkedIn, etc), such as networks of friends, profile pages, friendship recommendations, affinity scores, news feeds, content recommendation, and data visualization. MyGeneFriends provides simple and intuitive interactions with data through evaluation and visualization of connections (friendships) between genes, humans, and diseases. The platform suggests new friends and publications and allows agents to follow the activity of their friends. It dynamically personalizes information depending on the user's specific interests and provides an efficient way to share information with collaborators. Furthermore, the user's behavior itself generates new information that constitutes an added value integrated in the network, which can be used to discover new connections between biological agents. We have developed MyGeneFriends, a Web platform leveraging conventions from popular social networks to redefine the relationship between humans and biological big data and improve human processing of biomedical data. MyGeneFriends is available at lbgi.fr/mygenefriends. ©Alexis Allot, Kirsley Chennen, Yannis

  10. Network Diffusion-Based Prioritization of Autism Risk Genes Identifies Significantly Connected Gene Modules

    Directory of Open Access Journals (Sweden)

    Ettore Mosca

    2017-09-01

    Full Text Available Autism spectrum disorder (ASD) is marked by a strong genetic heterogeneity, which is underlined by the low overlap between ASD risk gene lists proposed in different studies. In this context, molecular networks can be used to analyze the results of several genome-wide studies in order to underline those network regions harboring genetic variations associated with ASD, the so-called “disease modules.” In this work, we used a recent network diffusion-based approach to jointly analyze multiple ASD risk gene lists. We defined genome-scale prioritizations of human genes in relation to ASD genes from multiple studies, found significantly connected gene modules associated with ASD and predicted genes functionally related to ASD risk genes. Most of them play a role in synapsis and neuronal development and function; many are related to syndromes that can be in comorbidity with ASD and the remaining are involved in epigenetics, cell cycle, cell adhesion and cancer.
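
    Network diffusion for gene prioritization is often implemented as a random walk with restart from a set of seed (risk) genes. The sketch below shows that generic technique, not the specific diffusion method used in the study above; the toy graph, the gene labels, the restart probability and the function name are all illustrative assumptions, and every node is assumed to have at least one neighbor.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seeds."""
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}      # restart mass
        for u in nodes:
            share = (1 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share                         # spread the rest to neighbors
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Hypothetical undirected network: S1 and S2 are known risk genes.
adjacency = {
    "S1": ["G1"],
    "S2": ["G1"],
    "G1": ["S1", "S2", "G2"],
    "G2": ["G1", "G3"],
    "G3": ["G2"],
}
seeds = {"S1", "S2"}
scores = random_walk_with_restart(adjacency, seeds)
ranked = sorted((n for n in adjacency if n not in seeds), key=scores.get, reverse=True)
```

    Candidate genes are ranked by their score, so genes close to many seeds come first; combining several seed lists amounts to changing the restart distribution.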

  11. Gene Network Construction from Microarray Data Identifies a Key Network Module and Several Candidate Hub Genes in Age-Associated Spatial Learning Impairment.

    Science.gov (United States)

    Uddin, Raihan; Singh, Shiva M

    2017-01-01

    As humans age many suffer from a decrease in normal brain functions including spatial learning impairments. This study aimed to better understand the molecular mechanisms in age-associated spatial learning impairment (ASLI). We used a mathematical modeling approach implemented in Weighted Gene Co-expression Network Analysis (WGCNA) to create and compare gene network models of young (learning unimpaired) and aged (predominantly learning impaired) brains from a set of exploratory datasets in rats in the context of ASLI. The major goal was to overcome some of the limitations previously observed in the traditional meta- and pathway analysis using these data, and identify novel ASLI related genes and their networks based on co-expression relationship of genes. This analysis identified a set of network modules in the young, each of which is highly enriched with genes functioning in broad but distinct GO functional categories or biological pathways. Interestingly, the analysis pointed to a single module that was highly enriched with genes functioning in "learning and memory" related functions and pathways. Subsequent differential network analysis of this "learning and memory" module in the aged (predominantly learning impaired) rats compared to the young learning unimpaired rats allowed us to identify a set of novel ASLI candidate hub genes. Some of these genes show significant repeatability in networks generated from independent young and aged validation datasets. These hub genes are highly co-expressed with other genes in the network, which not only show differential expression but also differential co-expression and differential connectivity across age and learning impairment. The known function of these hub genes indicate that they play key roles in critical pathways, including kinase and phosphatase signaling, in functions related to various ion channels, and in maintaining neuronal integrity relating to synaptic plasticity and memory formation. 
Taken together, they
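
    The co-expression networks in the record above are built with WGCNA, whose core idea is a soft-thresholded correlation: the adjacency between two genes is the absolute Pearson correlation raised to a power beta. The sketch below illustrates only that idea under stated assumptions (an unsigned network, beta = 6, made-up expression values, hypothetical function names); it does not reproduce the WGCNA R package, its module detection or the study's data.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned adjacency a_ij = |cor(i, j)| ** beta for every gene pair."""
    genes = list(expr)
    return {g: {h: abs(pearson(expr[g], expr[h])) ** beta for h in genes if h != g}
            for g in genes}

def connectivity(adj):
    """Connectivity of a gene: the sum of its adjacencies to all other genes."""
    return {g: sum(nbrs.values()) for g, nbrs in adj.items()}

# Hypothetical expression profiles across five samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.0, 2.9, 4.2, 5.1],
    "geneC": [2.0, 4.1, 6.0, 8.1, 9.9],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}
adj = soft_threshold_adjacency(expr)
k = connectivity(adj)
hub = max(k, key=k.get)   # the most connected gene is a candidate hub
```

    Raising correlations to a power suppresses weak, noisy correlations while keeping strong ones, so the poorly correlated geneD ends up with near-zero connectivity.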

  12. Sieve-based relation extraction of gene regulatory networks from biological literature.

    Science.gov (United States)

    Žitnik, Slavko; Žitnik, Marinka; Zupan, Blaž; Bajec, Marko

    2015-01-01

    Relation extraction is an essential procedure in literature mining. It focuses on extracting semantic relations between parts of text, called mentions. Biomedical literature includes an enormous amount of textual descriptions of biological entities, their interactions and results of related experiments. To extract them in an explicit, computer readable format, these relations were at first extracted manually from databases. Manual curation was later replaced with automatic or semi-automatic tools with natural language processing capabilities. The current challenge is the development of information extraction procedures that can directly infer more complex relational structures, such as gene regulatory networks. We develop a computational approach for extraction of gene regulatory networks from textual data. Our method is designed as a sieve-based system and uses linear-chain conditional random fields and rules for relation extraction. With this method we successfully extracted the sporulation gene regulation network in the bacterium Bacillus subtilis for the information extraction challenge at the BioNLP 2013 conference. To enable extraction of distant relations using first-order models, we transform the data into skip-mention sequences. We infer multiple models, each of which is able to extract different relationship types. Following the shared task, we conducted additional analysis using different system settings that resulted in reducing the reconstruction error of bacterial sporulation network from 0.73 to 0.68, measured as the slot error rate between the predicted and the reference network. We observe that all relation extraction sieves contribute to the predictive performance of the proposed approach. Also, features constructed by considering mention words and their prefixes and suffixes are the most important features for higher accuracy of extraction. Analysis of distances between different mention types in the text shows that our choice of transforming

  13. Fused Regression for Multi-source Gene Regulatory Network Inference.

    Directory of Open Access Journals (Sweden)

    Kari Y Lam

    2016-12-01

    Full Text Available Understanding gene regulatory networks is critical to understanding cellular differentiation and response to external stimuli. Methods for global network inference have been developed and applied to a variety of species. Most approaches consider the problem of network inference independently in each species, despite evidence that gene regulation can be conserved even in distantly related species. Further, network inference is often confined to single data-types (single platforms and single cell types. We introduce a method for multi-source network inference that allows simultaneous estimation of gene regulatory networks in multiple species or biological processes through the introduction of priors based on known gene relationships such as orthology incorporated using fused regression. This approach improves network inference performance even when orthology mapping and conservation are incomplete. We refine this method by presenting an algorithm that extracts the true conserved subnetwork from a larger set of potentially conserved interactions and demonstrate the utility of our method in cross species network inference. Last, we demonstrate our method's utility in learning from data collected on different experimental platforms.

  14. An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods

    Science.gov (United States)

    Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E.; Re, Matteo

    2014-01-01

    Objective: In the context of “network medicine”, gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. Materials and methods: We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. Results: The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different “informativeness” embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Conclusions: Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further

  15. Applying Bayesian neural networks to event reconstruction in reactor neutrino experiments

    International Nuclear Information System (INIS)

    Xu Ye; Xu Weiwei; Meng Yixiong; Zhu Kaien; Xu Wei

    2008-01-01

    In this paper, a toy detector has been designed to simulate central detectors in reactor neutrino experiments. The electron samples from the Monte-Carlo simulation of the toy detector have been reconstructed both by the method of Bayesian neural networks (BNNs) and by the standard algorithm, a maximum likelihood method (MLD). The result of the event reconstruction using BNN has been compared with the one using MLD. Compared to MLD, the uncertainties of the electron vertex are not improved, but the energy resolutions are significantly improved using BNN. The improvement is more obvious for high energy electrons than for low energy ones.

  16. Reveal genes functionally associated with ACADS by a network study.

    Science.gov (United States)

    Chen, Yulong; Su, Zhiguang

    2015-09-15

    Establishing a systematic network is aimed at finding essential human gene-gene/gene-disease pathways by means of network inter-connecting patterns and functional annotation analysis. In the present study, we have analyzed functional gene interactions of the short-chain acyl-coenzyme A dehydrogenase gene (ACADS). ACADS plays a vital role in free fatty acid β-oxidation and regulates energy homeostasis. Modules of highly inter-connected genes in the disease-specific ACADS network are derived by integrating gene function and protein interaction data. Among the 8 genes in the ACADS web retrieved from both STRING and GeneMANIA, ACADS is effectively conjoined with 4 genes: HADHA, HADHB, ECHS1 and ACAT1. The functional analysis is done via ontological briefing and candidate disease identification. Interestingly, the ontological aspect of genes in the ACADS network reveals that ACADS, HADHA and HADHB play equally vital roles in fatty acid metabolism. The gene ACAT1, together with ACADS, is involved in ketone metabolism. Our computational gene web analysis also predicts potential candidate diseases, indicating the involvement of ACADS, HADHA, HADHB, ECHS1 and ACAT1 not only in lipid metabolism but also in infant death syndrome, skeletal myopathy, acute hepatic encephalopathy, Reye-like syndrome, episodic ketosis, and metabolic acidosis. The current study presents a comprehensible layout of the ACADS network, its functional strategies and the candidate disease approach associated with the ACADS network. Copyright © 2015 Elsevier B.V. All rights reserved.

  17. Review of Biological Network Data and Its Applications

    Directory of Open Access Journals (Sweden)

    Donghyeon Yu

    2013-12-01

    Full Text Available Studying biological networks, such as protein-protein interactions, is key to understanding complex biological activities. Various types of large-scale biological datasets have been collected and analyzed with high-throughput technologies, including DNA microarray, next-generation sequencing, and the two-hybrid screening system, for this purpose. In this review, we focus on network-based approaches that help in understanding biological systems and identifying biological functions. Accordingly, this paper covers two major topics in network biology: reconstruction of gene regulatory networks and network-based applications, including protein function prediction, disease gene prioritization, and network-based genome-wide association study.

  18. Integration of biological networks and gene expression data using Cytoscape

    DEFF Research Database (Denmark)

    Cline, M.S.; Smoot, M.; Cerami, E.

    2007-01-01

    Cytoscape is a free software package for visualizing, modeling and analyzing molecular and genetic interaction networks. This protocol explains how to use Cytoscape to analyze the results of mRNA expression profiling, and other functional genomics and proteomics experiments, in the context of an interaction network obtained for genes of interest. Five major steps are described: (i) obtaining a gene or protein network, (ii) displaying the network using layout algorithms, (iii) integrating with gene expression and other functional attributes, (iv) identifying putative complexes and functional modules and (v) identifying enriched Gene Ontology annotations in the network. These steps provide a broad sample of the types of analyses performed by Cytoscape.

  19. Transcriptional delay stabilizes bistable gene networks.

    Science.gov (United States)

    Gupta, Chinmaya; López, José Manuel; Ott, William; Josić, Krešimir; Bennett, Matthew R

    2013-08-02

    Transcriptional delay can significantly impact the dynamics of gene networks. Here we examine how such delay affects bistable systems. We investigate several stochastic models of bistable gene networks and find that increasing delay dramatically increases the mean residence times near stable states. To explain this, we introduce a non-Markovian, analytically tractable reduced model. The model shows that stabilization is the consequence of an increased number of failed transitions between stable states. Each of the bistable systems that we simulate behaves in this manner.

  20. Gene expression patterns combined with network analysis identify hub genes associated with bladder cancer.

    Science.gov (United States)

    Bi, Dongbin; Ning, Hao; Liu, Shuai; Que, Xinxiang; Ding, Kejia

    2015-06-01

    To explore molecular mechanisms of bladder cancer (BC), network strategy was used to find biomarkers for early detection and diagnosis. The differentially expressed genes (DEGs) between bladder carcinoma patients and normal subjects were screened using empirical Bayes method of the linear models for microarray data package. Co-expression networks were constructed by differentially co-expressed genes and links. Regulatory impact factors (RIF) metric was used to identify critical transcription factors (TFs). The protein-protein interaction (PPI) networks were constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and clusters were obtained through molecular complex detection (MCODE) algorithm. Centralities analyses for complex networks were performed based on degree, stress and betweenness. Enrichment analyses were performed based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Co-expression networks and TFs (based on expression data of global DEGs and DEGs in different stages and grades) were identified. Hub genes of complex networks, such as UBE2C, ACTA2, FABP4, CKS2, FN1 and TOP2A, were also obtained according to analysis of degree. In gene enrichment analyses of global DEGs, cell adhesion, proteinaceous extracellular matrix and extracellular matrix structural constituent were top three GO terms. ECM-receptor interaction, focal adhesion, and cell cycle were significant pathways. Our results provide some potential underlying biomarkers of BC. However, further validation is required and deep studies are needed to elucidate the pathogenesis of BC. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. Gene regulation is governed by a core network in hepatocellular carcinoma.

    Science.gov (United States)

    Gu, Zuguang; Zhang, Chenyu; Wang, Jin

    2012-05-01

    Hepatocellular carcinoma (HCC) is one of the most lethal cancers worldwide, and the mechanisms that lead to the disease are still relatively unclear. However, with the development of high-throughput technologies it is possible to gain a systematic view of biological systems to enhance the understanding of the roles of genes associated with HCC. Thus, analysis of the mechanism of molecule interactions in the context of gene regulatory networks can reveal specific sub-networks that lead to the development of HCC. In this study, we aimed to identify the most important gene regulations that are dysfunctional in HCC generation. Our method for constructing gene regulatory network is based on predicted target interactions, experimentally-supported interactions, and co-expression model. Regulators in the network included both transcription factors and microRNAs to provide a complete view of gene regulation. Analysis of gene regulatory network revealed that gene regulation in HCC is highly modular, in which different sets of regulators take charge of specific biological processes. We found that microRNAs mainly control biological functions related to mitochondria and oxidative reduction, while transcription factors control immune responses, extracellular activity and the cell cycle. On the higher level of gene regulation, there exists a core network that organizes regulations between different modules and maintains the robustness of the whole network. There is direct experimental evidence for most of the regulators in the core gene regulatory network relating to HCC. We infer it is the central controller of gene regulation. Finally, we explored the influence of the core gene regulatory network on biological pathways. Our analysis provides insights into the mechanism of transcriptional and post-transcriptional control in HCC. In particular, we highlight the importance of the core gene regulatory network; we propose that it is highly related to HCC and we believe further

  2. Prioritizing chronic obstructive pulmonary disease (COPD) candidate genes in COPD-related networks.

    Science.gov (United States)

    Zhang, Yihua; Li, Wan; Feng, Yuyan; Guo, Shanshan; Zhao, Xilei; Wang, Yahui; He, Yuehan; He, Weiming; Chen, Lina

    2017-11-28

    Chronic obstructive pulmonary disease (COPD) is a multi-factor disease that can be caused by many factors, including disturbances of metabolism and protein-protein interactions (PPIs). In this paper, a weighted COPD-related metabolic network and a weighted COPD-related PPI network were constructed based on COPD disease genes and functional information. Candidate genes in each of these weighted COPD-related networks were prioritized using a gene prioritization method. Literature review and functional enrichment analysis of the top 100 genes in these two networks suggested a correlation between COPD and these genes. When assessed by leave-one-out cross-validation, literature validation and functional enrichment analysis, the performance of our gene prioritization method was superior to that of ToppGene and ToppNet for genes from the COPD-related metabolic network or the COPD-related PPI network. The top-ranked genes prioritized from the COPD-related metabolic and PPI networks could promote a better understanding of the molecular mechanism of this disease from different perspectives. The top 100 genes in the COPD-related metabolic network or the COPD-related PPI network might be potential markers for the diagnosis and treatment of COPD.

  3. Predictive networks: a flexible, open source, web application for integration and analysis of human gene networks.

    Science.gov (United States)

    Haibe-Kains, Benjamin; Olsen, Catharina; Djebbari, Amira; Bontempi, Gianluca; Correll, Mick; Bouton, Christopher; Quackenbush, John

    2012-01-01

    Genomics provided us with an unprecedented quantity of data on the genes that are activated or repressed in a wide range of phenotypes. We have increasingly come to recognize that defining the networks and pathways underlying these phenotypes requires both the integration of multiple data types and the development of advanced computational methods to infer relationships between the genes and to estimate the predictive power of the networks through which they interact. To address these issues we have developed Predictive Networks (PN), a flexible, open-source, web-based application and data services framework that enables the integration, navigation, visualization and analysis of gene interaction networks. The primary goal of PN is to allow biomedical researchers to evaluate experimentally derived gene lists in the context of large-scale gene interaction networks. The PN analytical pipeline involves two key steps. The first is the collection of a comprehensive set of known gene interactions derived from a variety of publicly available sources. The second is to use these 'known' interactions together with gene expression data to infer robust gene networks. The PN web application is accessible from http://predictivenetworks.org. The PN code base is freely available at https://sourceforge.net/projects/predictivenets/.

  4. Combining many interaction networks to predict gene function and analyze gene lists.

    Science.gov (United States)

    Mostafavi, Sara; Morris, Quaid

    2012-05-01

    In this article, we review how interaction networks can be used alone or in combination in an automated fashion to provide insight into gene and protein function. We describe the concept of a "gene-recommender system" that can be applied to any large collection of interaction networks to make predictions about gene or protein function based on a query list of proteins that share a function of interest. We discuss these systems in general and focus on one specific system, GeneMANIA, that has unique features and uses different algorithms from the majority of other systems. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods.

    Science.gov (United States)

    Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E; Re, Matteo

    2014-06-01

    In the context of "network medicine", gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different "informativeness" embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Network integration is necessary to boost the performances of gene prioritization methods. Moreover the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both
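
    The random walk with restart named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy network, the gene names G1 to G5, the restart probability of 0.3 and the function name are all assumptions made for illustration. Starting from known disease genes (the seeds), probability mass is repeatedly spread along edges and partly returned to the seeds until the scores stop changing; the remaining genes are then ranked by score.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score nodes of an undirected graph by proximity to the seed nodes.

    adj: dict mapping node -> set of neighbour nodes (must be symmetric).
    seeds: set of known disease genes (hypothetical in this sketch).
    restart: probability of jumping back to a seed at each step.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Mass flowing into n: each neighbour m splits its mass
            # evenly over its own edges.
            inflow = sum(p[m] / len(adj[m]) for m in adj[n])
            nxt[n] = (1 - restart) * inflow + restart * p0[n]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy network with hypothetical gene names; G1 is the only seed.
toy = {
    "G1": {"G2", "G3"},
    "G2": {"G1", "G3"},
    "G3": {"G1", "G2", "G4"},
    "G4": {"G3", "G5"},
    "G5": {"G4"},
}
scores = random_walk_with_restart(toy, seeds={"G1"})
ranking = sorted((g for g in toy if g != "G1"), key=scores.get, reverse=True)
```

    In this toy case the genes farthest from the seed (G4, then G5) end up at the bottom of the ranking, which is the guilt-by-association behaviour the abstract relies on.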

  6. The integration of weighted human gene association networks based on link prediction.

    Science.gov (United States)

    Yang, Jian; Yang, Tinghong; Wu, Duzhi; Lin, Limei; Yang, Fan; Zhao, Jing

    2017-01-31

    Physical and functional interplays between genes or proteins have important biological meaning for cellular functions. Some efforts have been made to construct weighted gene association meta-networks by integrating multiple biological resources, where the weight indicates the confidence of the interaction. However, these existing human gene association networks share only a limited number of overlapping interactions, suggesting that they are incomplete and noisy. Here we propose a workflow to construct a weighted human gene association network using information from six existing networks, including two weighted specific PPI networks and four gene association meta-networks. We applied a link prediction algorithm to predict possible missing links of the networks, used a cross-validation approach to refine each network, and finally integrated the refined networks to obtain the final integrated network. The common information among the refined networks increases notably, suggesting their higher reliability. Our final integrated network has many more links than most of the original networks, while its links still keep high functional relevance. When used as the background network in a case study of disease gene prediction, the final integrated network performs well, implying its reliability and application significance. Our workflow could be insightful for integrating and refining existing gene association data.
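
    The abstract above does not say which link prediction algorithm was applied. As a generic illustration only, the sketch below scores each unconnected pair of nodes by the Jaccard overlap of their neighbourhoods, a common baseline; the toy network, node names and function name are assumptions.

```python
from itertools import combinations


def jaccard_link_scores(adj):
    """Score every unconnected node pair by neighbourhood overlap.

    adj: dict mapping node -> set of neighbour nodes (symmetric).
    Returns {(u, v): score} with u < v; higher means a more likely missing link.
    """
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, nothing to predict
        union = adj[u] | adj[v]
        if union:
            scores[(u, v)] = len(adj[u] & adj[v]) / len(union)
    return scores


# Hypothetical toy network: A and D share both of their neighbours.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
}
predicted = jaccard_link_scores(toy)
```

    Here the only unconnected pair is A and D, and because their neighbourhoods are identical the predicted score is 1.0.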

  7. EGFR Signal-Network Reconstruction Demonstrates Metabolic Crosstalk in EMT.

    Directory of Open Access Journals (Sweden)

    Kumari Sonal Choudhary

    2016-06-01

    Epithelial to mesenchymal transition (EMT) is an important event during development and cancer metastasis. There is limited understanding of the metabolic alterations that give rise to and take place during EMT. Dysregulation of signalling pathways that impact metabolism, including epidermal growth factor receptor (EGFR), are however a hallmark of EMT and metastasis. In this study, we report the investigation into EGFR signalling and metabolic crosstalk of EMT through constraint-based modelling and analysis of the breast epithelial EMT cell model D492 and its mesenchymal counterpart D492M. We built an EGFR signalling network for EMT based on stoichiometric coefficients and constrained the network with gene expression data to build epithelial (EGFR_E) and mesenchymal (EGFR_M) networks. Metabolic alterations arising from differential expression of EGFR genes was derived from a literature review of AKT regulated metabolic genes. Signaling flux differences between EGFR_E and EGFR_M models subsequently allowed metabolism in D492 and D492M cells to be assessed. Higher flux within AKT pathway in the D492 cells compared to D492M suggested higher glycolytic activity in D492 that we confirmed experimentally through measurements of glucose uptake and lactate secretion rates. The signaling genes from the AKT, RAS/MAPK and CaM pathways were predicted to revert D492M to D492 phenotype. Follow-up analysis of EGFR signaling metabolic crosstalk in three additional breast epithelial cell lines highlighted variability in in vitro cell models of EMT. This study shows that the metabolic phenotype may be predicted by in silico analyses of gene expression data of EGFR signaling genes, but this phenomenon is cell-specific and does not follow a simple trend.

  8. EGFR Signal-Network Reconstruction Demonstrates Metabolic Crosstalk in EMT.

    Science.gov (United States)

    Choudhary, Kumari Sonal; Rohatgi, Neha; Halldorsson, Skarphedinn; Briem, Eirikur; Gudjonsson, Thorarinn; Gudmundsson, Steinn; Rolfsson, Ottar

    2016-06-01

    Epithelial to mesenchymal transition (EMT) is an important event during development and cancer metastasis. There is limited understanding of the metabolic alterations that give rise to and take place during EMT. Dysregulation of signalling pathways that impact metabolism, including epidermal growth factor receptor (EGFR), are however a hallmark of EMT and metastasis. In this study, we report the investigation into EGFR signalling and metabolic crosstalk of EMT through constraint-based modelling and analysis of the breast epithelial EMT cell model D492 and its mesenchymal counterpart D492M. We built an EGFR signalling network for EMT based on stoichiometric coefficients and constrained the network with gene expression data to build epithelial (EGFR_E) and mesenchymal (EGFR_M) networks. Metabolic alterations arising from differential expression of EGFR genes was derived from a literature review of AKT regulated metabolic genes. Signaling flux differences between EGFR_E and EGFR_M models subsequently allowed metabolism in D492 and D492M cells to be assessed. Higher flux within AKT pathway in the D492 cells compared to D492M suggested higher glycolytic activity in D492 that we confirmed experimentally through measurements of glucose uptake and lactate secretion rates. The signaling genes from the AKT, RAS/MAPK and CaM pathways were predicted to revert D492M to D492 phenotype. Follow-up analysis of EGFR signaling metabolic crosstalk in three additional breast epithelial cell lines highlighted variability in in vitro cell models of EMT. This study shows that the metabolic phenotype may be predicted by in silico analyses of gene expression data of EGFR signaling genes, but this phenomenon is cell-specific and does not follow a simple trend.

  9. Intrinsic challenges in ancient microbiome reconstruction using 16S rRNA gene amplification.

    Science.gov (United States)

    Ziesemer, Kirsten A; Mann, Allison E; Sankaranarayanan, Krithivasan; Schroeder, Hannes; Ozga, Andrew T; Brandt, Bernd W; Zaura, Egija; Waters-Rist, Andrea; Hoogland, Menno; Salazar-García, Domingo C; Aldenderfer, Mark; Speller, Camilla; Hendy, Jessica; Weston, Darlene A; MacDonald, Sandy J; Thomas, Gavin H; Collins, Matthew J; Lewis, Cecil M; Hofman, Corinne; Warinner, Christina

    2015-11-13

    To date, characterization of ancient oral (dental calculus) and gut (coprolite) microbiota has been primarily accomplished through a metataxonomic approach involving targeted amplification of one or more variable regions in the 16S rRNA gene. Specifically, the V3 region (E. coli 341-534) of this gene has been suggested as an excellent candidate for ancient DNA amplification and microbial community reconstruction. However, in practice this metataxonomic approach often produces highly skewed taxonomic frequency data. In this study, we use non-targeted (shotgun metagenomics) sequencing methods to better understand skewed microbial profiles observed in four ancient dental calculus specimens previously analyzed by amplicon sequencing. Through comparisons of microbial taxonomic counts from paired amplicon (V3 U341F/534R) and shotgun sequencing datasets, we demonstrate that extensive length polymorphisms in the V3 region are a consistent and major cause of differential amplification leading to taxonomic bias in ancient microbiome reconstructions based on amplicon sequencing. We conclude that systematic amplification bias confounds attempts to accurately reconstruct microbiome taxonomic profiles from 16S rRNA V3 amplicon data generated using universal primers. Because in silico analysis indicates that alternative 16S rRNA hypervariable regions will present similar challenges, we advocate for the use of a shotgun metagenomics approach in ancient microbiome reconstructions.

  10. Speech reconstruction using a deep partially supervised neural network.

    Science.gov (United States)

    McLoughlin, Ian; Li, Jingjie; Song, Yan; Sharifzadeh, Hamid R

    2017-08-01

    Statistical speech reconstruction for larynx-related dysphonia has achieved good performance using Gaussian mixture models and, more recently, restricted Boltzmann machine arrays; however, deep neural network (DNN)-based systems have been hampered by the limited amount of training data available from individual voice-loss patients. The authors propose a novel DNN structure that allows a partially supervised training approach on spectral features from smaller data sets, yielding very good results compared with the current state-of-the-art.

  11. A Self-Reconstructing Algorithm for Single and Multiple-Sensor Fault Isolation Based on Auto-Associative Neural Networks

    Directory of Open Access Journals (Sweden)

    Hamidreza Mousavi

    2017-01-01

    Recently, different approaches have been developed in the field of sensor fault diagnostics based on the Auto-Associative Neural Network (AANN). In this paper we present a novel algorithm called Self-reconstructing Auto-Associative Neural Network (S-AANN), which is able to detect and isolate a single faulty sensor via reconstruction. We have also extended the algorithm to be applicable in multiple-fault conditions. The algorithm uses a calibration model based on AANN. The AANN can reconstruct the faulty sensor using non-faulty sensors thanks to the correlation between the process variables, and the mean of the difference between reconstructed and original data determines which sensors are faulty. The algorithms are tested on a dimerization process. The simulation results show that the S-AANN can isolate multiple faulty sensors with low computational time, which makes the algorithm an appropriate candidate for online applications.

  12. Paper-based synthetic gene networks.

    Science.gov (United States)

    Pardee, Keith; Green, Alexander A; Ferrante, Tom; Cameron, D Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J

    2014-11-06

    Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides an alternate, versatile venue for synthetic biologists to operate and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze dried onto paper, enabling the inexpensive, sterile, and abiotic distribution of synthetic-biology-based technologies for the clinic, global health, industry, research, and education. For field use, we create circuits with colorimetric outputs for detection by eye and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small-molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors.

  13. Paper-based Synthetic Gene Networks

    Science.gov (United States)

    Pardee, Keith; Green, Alexander A.; Ferrante, Tom; Cameron, D. Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J.

    2014-01-01

    Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides a new venue for synthetic biologists to operate, and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze-dried onto paper, enabling the inexpensive, sterile and abiotic distribution of synthetic biology-based technologies for the clinic, global health, industry, research and education. For field use, we create circuits with colorimetric outputs for detection by eye, and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors. PMID:25417167

  14. Inferring network topology from complex dynamics

    International Nuclear Information System (INIS)

    Shandilya, Srinivas Gorur; Timme, Marc

    2011-01-01

    Inferring the network topology from dynamical observations is a fundamental problem pervading research on complex systems. Here, we present a simple, direct method for inferring the structural connection topology of a network, given an observation of one collective dynamical trajectory. The general theoretical framework is applicable to arbitrary network dynamical systems described by ordinary differential equations. No interference (external driving) is required and the type of dynamics is hardly restricted in any way. In particular, the observed dynamics may be arbitrarily complex; stationary, invariant or transient; synchronous or asynchronous and chaotic or periodic. Presupposing a knowledge of the functional form of the dynamical units and of the coupling functions between them, we present an analytical solution to the inverse problem of finding the network topology from observing a time series of state variables only. Robust reconstruction is achieved in any sufficiently long generic observation of the system. We extend our method to simultaneously reconstructing both the entire network topology and all parameters appearing linear in the system's equations of motion. Reconstruction of network topology and system parameters is viable even in the presence of external noise that distorts the original dynamics substantially. The method provides a conceptually new step towards reconstructing a variety of real-world networks, including gene and protein interaction networks and neuronal circuits.
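
    The inverse problem described above can be written out as a short worked sketch. The notation is an assumption (the abstract gives no symbols): unit i has a known local dynamics f_i and known coupling functions g_ij, and only the coupling strengths J_ij are unknown. Sampling one trajectory at M time points turns the search for the topology into a linear least-squares problem for each unit.

```latex
% Assumed model: N coupled units with known f_i and g_{ij}, unknown J_{ij}
\dot{x}_i(t) = f_i\bigl(x_i(t)\bigr) + \sum_{j=1}^{N} J_{ij}\, g_{ij}\bigl(x_i(t), x_j(t)\bigr)

% Evaluate at sample times t_1, \dots, t_M and collect the known quantities
X_{i,m} = \dot{x}_i(t_m) - f_i\bigl(x_i(t_m)\bigr), \qquad
G^{(i)}_{j,m} = g_{ij}\bigl(x_i(t_m), x_j(t_m)\bigr)

% The unknown row J_i = (J_{i1}, \dots, J_{iN}) then satisfies a linear system
X_i = J_i\, G^{(i)}, \qquad
\hat{J}_i = X_i\, {G^{(i)}}^{\top} \bigl( G^{(i)} {G^{(i)}}^{\top} \bigr)^{-1} \quad (M \ge N)
```

    Entries of the estimated row that are close to zero indicate absent links, which is why a sufficiently long generic trajectory (so that the matrix being inverted is non-singular) is needed for reconstruction.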

  15. A Maximum Parsimony Model to Reconstruct Phylogenetic Network in Honey Bee Evolution

    OpenAIRE

    Usha Chouhan; K. R. Pardasani

    2007-01-01

    Phylogenies, the evolutionary histories of groups of species, are one of the most widely used tools throughout the life sciences, as well as objects of research within systematics and evolutionary biology. In every phylogenetic analysis reconstruction produces trees. These trees represent the evolutionary histories of many groups of organisms, bacteria due to horizontal gene transfer and plants due to process of hybridization. The process of gene transfer in bacteria and hyb...

  16. Reconstructing the Evolutionary History of Paralogous APETALA1/FRUITFULL-Like Genes in Grasses (Poaceae)

    Science.gov (United States)

    Preston, Jill C.; Kellogg, Elizabeth A.

    2006-01-01

    Gene duplication is an important mechanism for the generation of evolutionary novelty. Paralogous genes that are not silenced may evolve new functions (neofunctionalization) that will alter the developmental outcome of preexisting genetic pathways, partition ancestral functions (subfunctionalization) into divergent developmental modules, or function redundantly. Functional divergence can occur by changes in the spatio-temporal patterns of gene expression and/or by changes in the activities of their protein products. We reconstructed the evolutionary history of two paralogous monocot MADS-box transcription factors, FUL1 and FUL2, and determined the evolution of sequence and gene expression in grass AP1/FUL-like genes. Monocot AP1/FUL-like genes duplicated at the base of Poaceae and codon substitutions occurred under relaxed selection mostly along the branch leading to FUL2. Following the duplication, FUL1 was apparently lost from early diverging taxa, a pattern consistent with major changes in grass floral morphology. Overlapping gene expression patterns in leaves and spikelets indicate that FUL1 and FUL2 probably share some redundant functions, but that FUL2 may have become temporally restricted under partial subfunctionalization to particular stages of floret development. These data have allowed us to reconstruct the history of AP1/FUL-like genes in Poaceae and to hypothesize a role for this gene duplication in the evolution of the grass spikelet. PMID:16816429

  17. Reconstructing missing daily precipitation data using regression trees and artificial neural networks

    Science.gov (United States)

    Incomplete meteorological data has been a problem in environmental modeling studies. The objective of this work was to develop a technique to reconstruct missing daily precipitation data in the central part of Chesapeake Bay Watershed using regression trees (RT) and artificial neural networks (ANN)....

  18. Inferring time-varying network topologies from gene expression data.

    Science.gov (United States)

    Rao, Arvind; Hero, Alfred O; States, David J; Engel, James Douglas

    2007-01-01

    Most current methods for gene regulatory network identification lead to the inference of steady-state networks, that is, networks prevalent over all times, a hypothesis which has been challenged. There has been a need to infer and represent networks in a dynamic, that is, time-varying fashion, in order to account for different cellular states affecting the interactions amongst genes. In this work, we present an approach, regime-SSM, to understand gene regulatory networks within such a dynamic setting. The approach uses a clustering method based on these underlying dynamics, followed by system identification using a state-space model for each learnt cluster--to infer a network adjacency matrix. We finally indicate our results on the mouse embryonic kidney dataset as well as the T-cell activation-based expression dataset and demonstrate conformity with reported experimental evidence.

  19. Identification of human disease genes from interactome network using graphlet interaction.

    Directory of Open Access Journals (Sweden)

    Xiao-Dong Wang

    Identifying genes related to human diseases, such as cancer and cardiovascular disease, is an important task in biomedical research because of its applications in disease diagnosis and treatment. Interactome networks, especially protein-protein interaction networks, have been used for disease gene identification based on the hypothesis that strong candidate genes tend to be closely related to each other under some measure on the network. We proposed a new measure, called graphlet interaction, to analyze the relationship between network nodes. The graphlet interaction contained 28 different isomers. The results showed that the numbers of graphlet interaction isomers between disease genes in interactome networks were significantly larger than those between randomly picked genes, while graphlet signatures were not. Then, we designed a new type of score, based on the network properties, to identify disease genes using graphlet interaction. The genes with higher scores were more likely to be disease genes, and all candidate genes were ranked according to their scores. The approach was then evaluated by leave-one-out cross-validation. The precision of the current approach reached 90% at about 10% recall, which was apparently higher than that of three previous predominant algorithms: random walk, Endeavour and the neighborhood-based method. Finally, the approach was applied to predict new disease genes related to 4 common diseases, most of which were identified by other independent experimental studies. In conclusion, we demonstrate that the graphlet interaction is an effective tool to analyze the network properties of disease genes, and that the scores calculated by graphlet interaction are more precise in identifying disease genes.

  20. Identification of Human Disease Genes from Interactome Network Using Graphlet Interaction

    Science.gov (United States)

    Yang, Lun; Wei, Dong-Qing; Qi, Ying-Xin; Jiang, Zong-Lai

    2014-01-01

    Identifying genes related to human diseases, such as cancer and cardiovascular disease, is an important task in biomedical research because of its applications in disease diagnosis and treatment. Interactome networks, especially protein-protein interaction networks, have been used for disease gene identification based on the hypothesis that strong candidate genes tend to be closely related to each other under some measure on the network. We proposed a new measure, called graphlet interaction, to analyze the relationship between network nodes. The graphlet interaction contained 28 different isomers. The results showed that the numbers of graphlet interaction isomers between disease genes in interactome networks were significantly larger than those between randomly picked genes, while graphlet signatures were not. Then, we designed a new type of score, based on the network properties, to identify disease genes using graphlet interaction. The genes with higher scores were more likely to be disease genes, and all candidate genes were ranked according to their scores. The approach was then evaluated by leave-one-out cross-validation. The precision of the current approach reached 90% at about 10% recall, which was apparently higher than that of three previous predominant algorithms: random walk, Endeavour and the neighborhood-based method. Finally, the approach was applied to predict new disease genes related to 4 common diseases, most of which were identified by other independent experimental studies. In conclusion, we demonstrate that the graphlet interaction is an effective tool to analyze the network properties of disease genes, and that the scores calculated by graphlet interaction are more precise in identifying disease genes. PMID:24465923

  1. Relaxation rates of gene expression kinetics reveal the feedback signs of autoregulatory gene networks

    Science.gov (United States)

    Jia, Chen; Qian, Hong; Chen, Min; Zhang, Michael Q.

    2018-03-01

    The transient response to a stimulus and subsequent recovery to a steady state are the fundamental characteristics of a living organism. Here we study the relaxation kinetics of autoregulatory gene networks based on the chemical master equation model of single-cell stochastic gene expression with nonlinear feedback regulation. We report a novel relation between the rate of relaxation, characterized by the spectral gap of the Markov model, and the feedback sign of the underlying gene circuit. When a network has no feedback, the relaxation rate is exactly the decaying rate of the protein. We further show that positive feedback always slows down the relaxation kinetics while negative feedback always speeds it up. Numerical simulations demonstrate that this relation provides a possible method to infer the feedback topology of autoregulatory gene networks by using time-series data of gene expression.
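
    The no-feedback statement above can be checked on the simplest possible case. The sketch assumes a plain birth-death description that is not taken from the paper: protein is synthesized at a constant rate s and degraded at rate d per molecule, and the symbols s, d and the mean copy number are introduced here for illustration.

```latex
% Assumed no-feedback model: constant synthesis rate s, first-order decay rate d
\frac{d\langle n \rangle}{dt} = s - d\,\langle n \rangle

% Solution: exponential approach to the steady state s/d
\langle n \rangle(t) = \frac{s}{d} + \Bigl( \langle n \rangle(0) - \frac{s}{d} \Bigr) e^{-d t}
```

    The deviation from the steady state decays as e^{-dt}, so the relaxation rate equals the protein decay rate d, consistent with the abstract's claim for networks without feedback. By the abstract's result, positive feedback would give a relaxation rate below d and negative feedback a rate above d.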

  2. A flood-based information flow analysis and network minimization method for gene regulatory networks.

    Science.gov (United States)

    Pavlogiannis, Andreas; Mozhayskiy, Vadim; Tagkopoulos, Ilias

    2013-04-24

    Biological networks tend to have high interconnectivity, complex topologies and multiple types of interactions. This makes it difficult to identify sub-networks that are involved in condition-specific responses. In addition, we generally lack scalable methods that can reveal the information flow in gene regulatory and biochemical pathways. Doing so will help us to identify key participants and paths under specific environmental and cellular contexts. This paper introduces the theory of network flooding, which aims to address the problem of network minimization and regulatory information flow in gene regulatory networks. Given a regulatory biological network, a set of source (input) nodes and optionally a set of sink (output) nodes, our task is to find (a) the minimal sub-network that encodes the regulatory program involving all input and output nodes and (b) the information flow from the source to the sink nodes of the network. Here, we describe a novel, scalable network traversal algorithm and we assess its potential to achieve significant network size reduction in both synthetic and E. coli networks. Scalability and sensitivity analysis show that the proposed method scales well with the size of the network, and is robust to noise and missing data. The method of network flooding proves to be a useful, practical approach towards information flow analysis in gene regulatory networks. Further extension of the proposed theory has the potential to lead to a unifying framework for simultaneous network minimization and information flow analysis across various "omics" levels.
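
    The abstract above does not spell out the flooding algorithm, so the sketch below is a much simpler stand-in, not the authors' method. It only illustrates the source-to-sink pruning idea with two breadth-first traversals: an edge is kept when its tail can be reached from a source and its head can reach a sink. The function names and the toy edge list are assumptions.

```python
from collections import deque


def reachable(graph, starts):
    """Breadth-first search: all nodes reachable from `starts` in `graph`."""
    seen = set(starts)
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def source_sink_subnetwork(edges, sources, sinks):
    """Keep only edges that lie on some directed route from a source to a sink."""
    forward, backward = {}, {}
    for u, v in edges:
        forward.setdefault(u, []).append(v)   # follow regulation downstream
        backward.setdefault(v, []).append(u)  # follow regulation upstream
    keep = reachable(forward, sources) & reachable(backward, sinks)
    return [(u, v) for u, v in edges if u in keep and v in keep]


# Hypothetical toy network: "x" is a dead end, "y" is not driven by the source.
edges = [("s", "a"), ("a", "t"), ("a", "x"), ("y", "a"), ("s", "b"), ("b", "t")]
pruned = source_sink_subnetwork(edges, sources=["s"], sinks=["t"])
```

    On the toy input the dead-end edge to "x" and the edge from "y" are dropped, leaving the four edges that connect "s" to "t". Each traversal visits every node and edge at most once, which is the kind of linear scaling a network-wide traversal method needs.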

  3. System Biology Approach: Gene Network Analysis for Muscular Dystrophy.

    Science.gov (United States)

    Censi, Federica; Calcagnini, Giovanni; Mattei, Eugenio; Giuliani, Alessandro

    2018-01-01

    Phenotypic changes at different organization levels, from cell to entire organism, are associated with changes in the pattern of gene expression. These changes involve the entire genome expression pattern and heavily rely upon correlation patterns among genes. The classical approach used to analyze gene expression data builds upon the application of supervised statistical techniques to detect genes differentially expressed among two or more phenotypes (e.g., normal vs. disease). The use of an a posteriori, unsupervised approach based on principal component analysis (PCA) and the subsequent construction of gene correlation networks can shed light on unexpected behaviour of the gene regulation system while maintaining a more naturalistic view of the studied system. In this chapter we applied an unsupervised method to discriminate DMD patients and controls. The genes having the highest absolute scores in the discrimination between the groups were then analyzed in terms of gene expression networks, on the basis of their mutual correlation in the two groups. The correlation network structures suggest two different modes of gene regulation in the two groups, reminiscent of important aspects of DMD pathogenesis.

  4. A hybrid network-based method for the detection of disease-related genes

    Science.gov (United States)

    Cui, Ying; Cai, Meng; Dai, Yang; Stanley, H. Eugene

    2018-02-01

    Detecting disease-related genes is crucial in disease diagnosis and drug design. The accepted view is that neighbors of a disease-causing gene in a molecular network tend to cause the same or similar diseases, and network-based methods have been recently developed to identify novel hereditary disease-genes in available biomedical networks. Despite the steady increase in the discovery of disease-associated genes, there is still a large fraction of disease genes that remains under the tip of the iceberg. In this paper we exploit the topological properties of the protein-protein interaction (PPI) network to detect disease-related genes. We compute, analyze, and compare the topological properties of disease genes with non-disease genes in PPI networks. We also design an improved random forest classifier based on these network topological features, and a cross-validation test confirms that our method performs better than previous similar studies.
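
    As a rough illustration of per-gene topological features of the kind that could feed such a classifier, the sketch below computes degree and local clustering coefficient on an undirected interaction network. The abstract does not list the features actually used, so these two are assumptions, and the protein labels are hypothetical.

```python
from itertools import combinations

def build_adjacency(edges):
    """Build an undirected adjacency map from a list of interaction pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def topological_features(adj, node):
    """Return degree and local clustering coefficient for one node."""
    neighbours = adj[node]
    k = len(neighbours)
    # Count interactions among the node's own neighbours.
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    clustering = 2 * links / (k * (k - 1)) if k > 1 else 0.0
    return {"degree": k, "clustering": clustering}

# Hypothetical toy PPI network.
ppi = build_adjacency([("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P1", "P4")])
features = {protein: topological_features(ppi, protein) for protein in ppi}
```

    Each feature dictionary would become one row of the feature matrix passed to a classifier such as a random forest.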

  5. Dense Matching Comparison Between Census and a Convolutional Neural Network Algorithm for Plant Reconstruction

    Science.gov (United States)

    Xia, Y.; Tian, J.; d'Angelo, P.; Reinartz, P.

    2018-05-01

    3D reconstruction of plants is hard to implement, as the complex leaf distribution highly increases the difficulty level in dense matching. Semi-Global Matching has been successfully applied to recover the depth information of a scene, but may perform variably when different matching cost algorithms are used. In this paper two matching cost computation algorithms, Census transform and an algorithm using a convolutional neural network, are tested for plant reconstruction based on Semi-Global Matching. High resolution close-range photogrammetric images from a handheld camera are used for the experiment. The disparity maps generated based on the two selected matching cost methods are comparable with acceptable quality, which shows the good performance of Census and the potential of neural networks to improve the dense matching.
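
    For readers unfamiliar with the Census matching cost, a minimal pure-Python sketch follows. It assumes a 3x3 window and the common convention that a bit is set when the neighbour is darker than the centre pixel; the paper's window size and conventions are not given in the abstract.

```python
def census_transform(image, radius=1):
    """Encode each interior pixel as a bit string comparing it with its neighbours."""
    height, width = len(image), len(image[0])
    codes = [[0] * width for _ in range(height)]
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            code = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue
                    # Bit is 1 when the neighbour is darker than the centre pixel.
                    code = (code << 1) | int(image[y + dy][x + dx] < image[y][x])
            codes[y][x] = code
    return codes

def hamming_cost(code_a, code_b):
    """Matching cost between two Census codes: the number of differing bits."""
    return bin(code_a ^ code_b).count("1")

# Toy 3x3 image: only the centre pixel has a full window.
codes = census_transform([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

    In a stereo pipeline, the Hamming cost between codes in the left and right images would be the per-pixel matching cost aggregated by Semi-Global Matching.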

  6. Diurnal Transcriptome and Gene Network Represented through Sparse Modeling in Brachypodium distachyon

    Directory of Open Access Journals (Sweden)

    Satoru Koda

    2017-11-01

    Full Text Available We report the comprehensive identification of periodic genes and their network inference, based on a gene co-expression analysis and an Auto-Regressive eXogenous (ARX) model with a group smoothly clipped absolute deviation (SCAD) method using a time-series transcriptome dataset in a model grass, Brachypodium distachyon. To reveal the diurnal changes in the transcriptome in B. distachyon, we performed RNA-seq analysis of its leaves sampled through a diurnal cycle of over 48 h at 4 h intervals using three biological replications, and identified 3,621 periodic genes through our wavelet analysis. The expression data are suitable for inferring network sparsity based on ARX models. We found that genes involved in biological processes such as transcriptional regulation, protein degradation, post-transcriptional modification and photosynthesis are significantly enriched in the periodic genes, suggesting that these processes might be regulated by circadian rhythm in B. distachyon. On the basis of the time-series expression patterns of the periodic genes, we constructed a chronological gene co-expression network and identified putative transcription factor-encoding genes that might be involved in the time-specific regulatory transcriptional network. Moreover, we inferred a transcriptional network composed of the periodic genes in B. distachyon, aiming to identify genes associated with other genes through variable selection by grouping time points for each gene. Based on the ARX model with the group SCAD regularization using our time-series expression datasets of the periodic genes, we constructed gene networks and found that the networks represent a typical scale-free structure. Our findings demonstrate that the diurnal changes in the transcriptome in B. distachyon leaves have a sparse network structure, demonstrating the spatiotemporal gene regulatory network over the cyclic phase transitions in B. distachyon diurnal growth.

  7. The integration of weighted gene association networks based on information entropy.

    Science.gov (United States)

    Yang, Fan; Wu, Duzhi; Lin, Limei; Yang, Jian; Yang, Tinghong; Zhao, Jing

    2017-01-01

    Constructing genome-scale weighted gene association networks (WGAN) from multiple data sources is one of the research hot spots in systems biology. In this paper, we employ information entropy to describe the degree of uncertainty of gene-gene links and propose a strategy for data integration of weighted networks. We use this method to integrate four existing human weighted gene association networks and construct a much larger WGAN, which includes richer biological information while still keeping high functional relevance between linked gene pairs. The new WGAN shows satisfactory performance in disease gene prediction, which suggests the reliability of our integration strategy. Compared with existing integration methods, our method takes advantage of the inherent characteristics of the component networks and pays less attention to the biological background of the data. It can make full use of existing biological networks with low computational effort.

  8. Spatiotemporal network motif reveals the biological traits of developmental gene regulatory networks in Drosophila melanogaster

    Directory of Open Access Journals (Sweden)

    Kim Man-Sun

    2012-05-01

    Full Text Available Abstract Background: Network motifs provided a “conceptual tool” for understanding the functional principles of biological networks, but such motifs have primarily been used to consider static network structures. Static networks, however, cannot be used to reveal time- and region-specific traits of biological systems. To overcome this limitation, we proposed the concept of a “spatiotemporal network motif,” a spatiotemporal sequence of network motifs of sub-networks which are active only at specific time points and body parts. Results: On the basis of this concept, we analyzed the developmental gene regulatory network of the Drosophila melanogaster embryo. We identified spatiotemporal network motifs and investigated their distribution pattern in time and space. As a result, we found how key developmental processes are temporally and spatially regulated by the gene network. In particular, we found that nested feedback loops appeared frequently throughout the entire developmental process. From mathematical simulations, we found that mutual inhibition in the nested feedback loops contributes to the formation of spatial expression patterns. Conclusions: Taken together, the proposed concept and the simulations can be used to unravel the design principle of developmental gene regulatory networks.

  9. Identifying key genes in rheumatoid arthritis by weighted gene co-expression network analysis.

    Science.gov (United States)

    Ma, Chunhui; Lv, Qi; Teng, Songsong; Yu, Yinxian; Niu, Kerun; Yi, Chengqin

    2017-08-01

    This study aimed to identify rheumatoid arthritis (RA) related genes based on microarray data using the WGCNA (weighted gene co-expression network analysis) method. Two gene expression profile datasets, GSE55235 (10 RA samples and 10 healthy controls) and GSE77298 (16 RA samples and seven healthy controls), were downloaded from the Gene Expression Omnibus database. Characteristic genes were identified using the metaDE package. WGCNA was used to find disease-related networks based on gene expression correlation coefficients, and module significance, defined as the average gene significance of all genes in a module, was used to assess the correlation between the module and RA status. Genes in the disease-related gene co-expression network were subjected to functional annotation and pathway enrichment analysis using the Database for Annotation, Visualization and Integrated Discovery. Characteristic genes were also mapped to the Connectivity Map to screen small molecules. A total of 599 characteristic genes were identified. For each dataset, characteristic genes in the green, red and turquoise modules were most closely associated with RA, with gene numbers of 54, 43 and 79, respectively. These genes were enriched in a total of 17 Gene Ontology terms, mainly related to immune response (CD97, FYB, CXCL1, IKBKE, CCR1, etc.), inflammatory response (CD97, CXCL1, C3AR1, CCR1, LYZ, etc.) and homeostasis (C3AR1, CCR1, PLN, CCL19, PPT1, etc.). Two small-molecule drugs, sanguinarine and papaverine, were predicted to have a therapeutic effect against RA. Genes related to immune response, inflammatory response and homeostasis presumably have critical roles in RA pathogenesis. Sanguinarine and papaverine have a potential therapeutic effect against RA. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
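
    The definition of module significance as an average of gene significance can be sketched as follows. This is not the WGCNA R package; it assumes gene significance is the absolute Pearson correlation between a gene's expression and a 0/1 disease status vector, which the abstract does not state, and the gene and module names are hypothetical.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def module_significance(expression, modules, trait):
    """Average, per module, the absolute gene-trait correlation of its genes."""
    gene_sig = {g: abs(pearson(values, trait)) for g, values in expression.items()}
    return {m: mean(gene_sig[g] for g in genes) for m, genes in modules.items()}

# Hypothetical toy data: four samples, status 0 = control, 1 = RA.
status = [0, 0, 1, 1]
expression = {"g1": [1, 1, 2, 2], "g2": [2, 2, 1, 1], "g3": [1, 2, 1, 2]}
modules = {"green": ["g1", "g2"], "red": ["g3"]}
significance = module_significance(expression, modules, status)
```

    Modules with the highest values would be the ones reported as most closely associated with disease status.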

  10. Novel Plasmodium falciparum metabolic network reconstruction identifies shifts associated with clinical antimalarial resistance.

    Science.gov (United States)

    Carey, Maureen A; Papin, Jason A; Guler, Jennifer L

    2017-07-19

    Malaria remains a major public health burden and resistance has emerged to every antimalarial on the market, including the frontline drug, artemisinin. Our limited understanding of Plasmodium biology hinders the elucidation of resistance mechanisms. In this regard, systems biology approaches can facilitate the integration of existing experimental knowledge and further understanding of these mechanisms. Here, we developed a novel genome-scale metabolic network reconstruction, iPfal17, of the asexual blood-stage P. falciparum parasite to expand our understanding of metabolic changes that support resistance. We identified 11 metabolic tasks to evaluate iPfal17 performance. Flux balance analysis and simulation of gene knockouts and enzyme inhibition predict candidate drug targets unique to resistant parasites. Moreover, integration of clinical parasite transcriptomes into the iPfal17 reconstruction reveals patterns associated with antimalarial resistance. These results predict that artemisinin sensitive and resistant parasites differentially utilize scavenging and biosynthetic pathways for multiple essential metabolites, including folate and polyamines. Our findings are consistent with experimental literature, while generating novel hypotheses about artemisinin resistance and parasite biology. We detect evidence that resistant parasites maintain greater metabolic flexibility, perhaps representing an incomplete transition to the metabolic state most appropriate for nutrient-rich blood. Using this systems biology approach, we identify metabolic shifts that arise with or in support of the resistant phenotype. This perspective allows us to more productively analyze and interpret clinical expression data for the identification of candidate drug targets for the treatment of resistant parasites.

  11. Inferring Phylogenetic Networks from Gene Order Data

    Directory of Open Access Journals (Sweden)

    Alexey Anatolievich Morozov

    2013-01-01

    Full Text Available Existing algorithms allow us to infer phylogenetic networks from sequences (DNA, protein or binary), sets of trees, and distance matrices, but there are no methods to build them using gene order data as an input. Here we describe several methods to build split networks from gene order data, perform simulation studies, and use our methods for analyzing and interpreting different real gene order datasets. All proposed methods are based on intermediate data, which can be generated from the genome structures under study and used as an input for network construction algorithms. Three intermediates are used: a set of jackknife trees, a distance matrix, and a binary encoding. According to simulations and case studies, the best intermediates are jackknife trees and the distance matrix (when used with the Neighbor-Net algorithm). Binary encoding can also be useful, but only when the methods mentioned above cannot be used.
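
    One way to obtain a distance-matrix intermediate from gene orders is sketched below. It assumes unsigned, linear genomes with identical gene content and uses the breakpoint distance; the abstract does not say which distance the authors used, so this choice and the genome names are illustrative.

```python
def adjacencies(order):
    """Set of unordered neighbouring gene pairs in a linear gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}

def breakpoint_distance(a, b):
    """Number of adjacencies present in genome `a` but absent from genome `b`."""
    return len(adjacencies(a) - adjacencies(b))

def distance_matrix(genomes):
    """Pairwise breakpoint distances, with rows and columns in sorted name order."""
    names = sorted(genomes)
    matrix = [[breakpoint_distance(genomes[p], genomes[q]) for q in names]
              for p in names]
    return names, matrix

# Hypothetical gene orders over the same five genes.
genomes = {
    "g_a": [1, 2, 3, 4, 5],
    "g_b": [1, 2, 4, 3, 5],
    "g_c": [5, 4, 3, 2, 1],
}
names, matrix = distance_matrix(genomes)
```

    The resulting matrix could then be handed to a split-network method such as Neighbor-Net in external software.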

  12. A relative variation-based method to unraveling gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Yali Wang

    Full Text Available Gene regulatory network (GRN) reconstruction is essential in understanding the functioning and pathology of a biological system. Extensive models and algorithms have been developed to unravel a GRN. The DREAM project aims to clarify both advantages and disadvantages of these methods from an application viewpoint. An interesting yet surprising observation is that compared with complicated methods like those based on nonlinear differential equations, etc., methods based on simple statistics, such as the so-called Z-score, usually perform better. A fundamental problem with the Z-score, however, is that direct and indirect regulations cannot be easily distinguished. To overcome this drawback, a relative expression level variation (RELV) based GRN inference algorithm is suggested in this paper, which consists of three major steps. Firstly, on the basis of wild type and single gene knockout/knockdown experimental data, the magnitude of RELV of a gene is estimated. Secondly, the probability for the existence of a direct regulation from a perturbed gene to a measured gene is estimated, which is further utilized to estimate whether a gene can be regulated by other genes. Finally, the normalized RELVs are modified to make genes with an estimated zero in-degree have smaller RELVs in magnitude than the other genes, which is used afterwards in queuing possibilities of the existence of direct regulations among genes and therefore leads to an estimate on the GRN topology. This method can in principle avoid the so-called cascade errors under certain situations. Computational results with the Size 100 sub-challenges of DREAM3 and DREAM4 show that, compared with the Z-score based method, prediction performances can be substantially improved, especially the AUPR specification. Moreover, it can even outperform the best team of both DREAM3 and DREAM4. Furthermore, the high precision of the obtained most reliable predictions shows that the suggested algorithm may be
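
    For context, the Z-score baseline mentioned in this abstract can be sketched as below. This is the simple baseline, not the RELV algorithm. It assumes a square matrix whose entry [i][j] is the steady-state expression of gene j after knocking out gene i, and it standardizes each gene against its values in the other knockouts; the exact normalization used in the DREAM evaluations may differ.

```python
from statistics import mean, pstdev

def knockout_z_scores(knockout):
    """Score edge i -> j by how far gene j deviates when gene i is knocked out."""
    n = len(knockout)
    scores = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Expression of gene j across all knockouts except its own.
        column = [knockout[i][j] for i in range(n) if i != j]
        mu, sigma = mean(column), pstdev(column)
        if sigma == 0:
            continue  # gene j never changes, so no edge into it is supported
        for i in range(n):
            if i != j:
                scores[i][j] = abs(knockout[i][j] - mu) / sigma
    return scores

def ranked_edges(scores):
    """Candidate edges (regulator, target), sorted from most to least supported."""
    n = len(scores)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    return sorted(pairs, key=lambda p: scores[p[0]][p[1]], reverse=True)

# Hypothetical toy data: knocking out gene 0 strongly lowers gene 1.
knockout = [
    [0.0, 0.2, 1.0, 1.0],
    [1.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 1.0],
    [1.0, 0.9, 1.0, 0.0],
]
scores = knockout_z_scores(knockout)
```

    Because the score only measures how strongly a target responds, a gene two steps downstream of the knockout can score as highly as a direct target, which is the limitation the abstract raises.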

  13. Reconstruction of biological networks based on life science data integration.

    Science.gov (United States)

    Kormeier, Benjamin; Hippe, Klaus; Arrigo, Patrizio; Töpel, Thoralf; Janowski, Sebastian; Hofestädt, Ralf

    2010-10-27

    For the implementation of the virtual cell, the fundamental question is how to model and simulate complex biological networks. Therefore, based on relevant molecular databases and information systems, biological data integration is an essential step in constructing biological networks. In this paper, we will motivate the applications BioDWH--an integration toolkit for building life science data warehouses, CardioVINEdb--an information system for biological data in cardiovascular disease and VANESA--a network editor for modeling and simulation of biological networks. Based on this integration process, the system supports the generation of biological network models. A case study of a cardiovascular-disease related gene-regulated biological network is also presented.

  14. EnzDP: improved enzyme annotation for metabolic network reconstruction based on domain composition profiles.

    Science.gov (United States)

    Nguyen, Nam-Ninh; Srihari, Sriganesh; Leong, Hon Wai; Chong, Ket-Fah

    2015-10-01

    Determining the entire complement of enzymes and their enzymatic functions is a fundamental step for reconstructing the metabolic network of cells. High quality enzyme annotation helps in enhancing metabolic networks reconstructed from the genome, especially by reducing gaps and increasing the enzyme coverage. Currently, structure-based and network-based approaches can only cover a limited number of enzyme families, and the accuracy of homology-based approaches can be further improved. Bottom-up homology-based approach improves the coverage by rebuilding Hidden Markov Model (HMM) profiles for all known enzymes. However, its clustering procedure relies firmly on BLAST similarity score, ignoring protein domains/patterns, and is sensitive to changes in cut-off thresholds. Here, we use functional domain architecture to score the association between domain families and enzyme families (Domain-Enzyme Association Scoring, DEAS). The DEAS score is used to calculate the similarity between proteins, which is then used in clustering procedure, instead of using sequence similarity score. We improve the enzyme annotation protocol using a stringent classification procedure, and by choosing optimal threshold settings and checking for active sites. Our analysis shows that our stringent protocol EnzDP can cover up to 90% of enzyme families available in Swiss-Prot. It achieves a high accuracy of 94.5% based on five-fold cross-validation. EnzDP outperforms existing methods across several testing scenarios. Thus, EnzDP serves as a reliable automated tool for enzyme annotation and metabolic network reconstruction. Available at: www.comp.nus.edu.sg/~nguyennn/EnzDP .

  15. REGEN: Ancestral Genome Reconstruction for Bacteria

    OpenAIRE

    Yang, Kuan; Heath, Lenwood S.; Setubal, João C.

    2012-01-01

    Ancestral genome reconstruction can be understood as a phylogenetic study with more details than a traditional phylogenetic tree reconstruction. We present a new computational system called REGEN for ancestral bacterial genome reconstruction at both the gene and replicon levels. REGEN reconstructs gene content, contiguous gene runs, and replicon structure for each ancestral genome. Along each branch of the phylogenetic tree, REGEN infers evolutionary events, including gene creation and deleti...

  16. Tissue-specific functional networks for prioritizing phenotype and disease genes.

    Directory of Open Access Journals (Sweden)

    Yuanfang Guan

    Full Text Available Integrated analyses of functional genomics data have enormous potential for identifying phenotype-associated genes. Tissue-specificity is an important aspect of many genetic diseases, reflecting the potentially different roles of proteins and pathways in diverse cell lineages. Accounting for tissue specificity in global integration of functional genomics data is challenging, as "functionality" and "functional relationships" are often not resolved for specific tissue types. We address this challenge by generating tissue-specific functional networks, which can effectively represent the diversity of protein function for more accurate identification of phenotype-associated genes in the laboratory mouse. Specifically, we created 107 tissue-specific functional relationship networks through integration of genomic data utilizing knowledge of tissue-specific gene expression patterns. Cross-network comparison revealed significantly changed genes enriched for functions related to specific tissue development. We then utilized these tissue-specific networks to predict genes associated with different phenotypes. Our results demonstrate that prediction performance is significantly improved through using the tissue-specific networks as compared to the global functional network. We used a testis-specific functional relationship network to predict genes associated with male fertility and spermatogenesis phenotypes, and experimentally confirmed one top prediction, Mbyl1. We then focused on a less-common genetic disease, ataxia, and identified candidates uniquely predicted by the cerebellum network, which are supported by both literature and experimental evidence. Our systems-level, tissue-specific scheme advances over traditional global integration and analyses and establishes a prototype to address the tissue-specific effects of genetic perturbations, diseases and drugs.

  17. Image reconstruction using Monte Carlo simulation and artificial neural networks

    International Nuclear Information System (INIS)

    Emert, F.; Missimner, J.; Blass, W.; Rodriguez, A.

    1997-01-01

    PET data sets are subject to two types of distortions during acquisition: the imperfect response of the scanner and attenuation and scattering in the active distribution. In addition, the reconstruction of voxel images from the line projections composing a data set can introduce artifacts. Monte Carlo simulation provides a means for modeling the distortions and artificial neural networks a method for correcting for them as well as minimizing artifacts. (author) figs., tab., refs

  18. Chromosome Gene Orientation Inversion Networks (GOINs) of Plasmodium Proteome.

    Science.gov (United States)

    Quevedo-Tumailli, Viviana F; Ortega-Tenezaca, Bernabé; González-Díaz, Humbert

    2018-03-02

    The spatial distribution of genes in chromosomes seems not to be random. For instance, only 10% of genes are transcribed from bidirectional promoters in humans, and many more are organized into larger clusters. This raises intriguing questions previously asked by different authors. We would like to add a few more questions in this context, related to gene orientation inversions. Does gene orientation (inversion) follow a random pattern? Is it relevant to biological activity somehow? We define a new kind of network, termed the gene orientation inversion network (GOIN). The GOIN complex network encodes short- and long-range patterns of inversion of the orientation of pairs of genes in the chromosome. We selected Plasmodium falciparum as a case study due to the high relevance of this parasite to public health (causal agent of malaria). We constructed here for the first time all of the GOINs for the genome of this parasite. These networks have an average of 383 nodes (genes in one chromosome) and 1314 links (pairs of genes with inverse orientation). We calculated node centralities and other parameters of these networks. These numerical parameters were used to study different properties of gene inversion patterns, for example, distribution, local communities, similarity to Erdös-Rényi random networks, randomness, and so on. We find clues that seem to indicate that gene orientation inversion does not follow a random pattern. We noted that some gene communities in the GOINs tend to group genes encoding for RIFIN-related proteins in the proteome of the parasite. RIFIN-like proteins are a second family of clonally variant proteins expressed on the surface of red cells infected with Plasmodium falciparum. Consequently, we used these centralities as input of machine learning (ML) models to predict the RIFIN-like activity of 5365 proteins in the proteome of Plasmodium sp. The best linear ML model found discriminates RIFIN-like from other proteins with sensitivity and

  19. REGEN: Ancestral Genome Reconstruction for Bacteria.

    Science.gov (United States)

    Yang, Kuan; Heath, Lenwood S; Setubal, João C

    2012-07-18

    Ancestral genome reconstruction can be understood as a phylogenetic study with more details than a traditional phylogenetic tree reconstruction. We present a new computational system called REGEN for ancestral bacterial genome reconstruction at both the gene and replicon levels. REGEN reconstructs gene content, contiguous gene runs, and replicon structure for each ancestral genome. Along each branch of the phylogenetic tree, REGEN infers evolutionary events, including gene creation and deletion and replicon fission and fusion. The reconstruction can be performed by either a maximum parsimony or a maximum likelihood method. Gene content reconstruction is based on the concept of neighboring gene pairs. REGEN was designed to be used with any set of genomes that are sufficiently related, which will usually be the case for bacteria within the same taxonomic order. We evaluated REGEN using simulated genomes and genomes in the Rhizobiales order.

  20. A pathway-based network analysis of hypertension-related genes

    Science.gov (United States)

    Wang, Huan; Hu, Jing-Bo; Xu, Chuan-Yun; Zhang, De-Hai; Yan, Qian; Xu, Ming; Cao, Ke-Fei; Zhang, Xu-Sheng

    2016-02-01

    The complex network approach has become an effective way to describe interrelationships among large amounts of biological data, which is especially useful in finding core functions and global behavior of biological systems. Hypertension is a complex disease caused by many factors, including genetic, physiological, psychological and even social ones. In this paper, based on the information of biological pathways, we construct a network model of hypertension-related genes of the salt-sensitive rat to explore the interrelationship between genes. Statistical and topological characteristics show that the network has the small-world but not the scale-free property, and exhibits a modular structure, revealing compact and complex connections among these genes. Using a threshold of integrated centrality larger than 0.71, seven key hub genes are found: Jun, Rps6kb1, Cycs, Creb312, Cdk4, Actg1 and RT1-Da. These genes should play an important role in hypertension, suggesting that the treatment of hypertension should focus on the combination of drugs on multiple genes.

  1. Uncovering co-expression gene network modules regulating fruit acidity in diverse apples.

    Science.gov (United States)

    Bai, Yang; Dougherty, Laura; Cheng, Lailiang; Zhong, Gan-Yuan; Xu, Kenong

    2015-08-16

    Acidity is a major contributor to fruit quality. Several organic acids are present in apple fruit, but malic acid is predominant and determines fruit acidity. The trait is largely controlled by the Malic acid (Ma) locus, for which Ma1, which putatively encodes a vacuolar aluminum-activated malate transporter1 (ALMT1)-like protein, is a strong candidate gene. We hypothesize that fruit acidity is governed by a gene network in which Ma1 is a key member. The goal of this study is to identify the gene network and the potential mechanisms through which the network operates. Guided by Ma1, we analyzed the transcriptomes of mature fruit of contrasting acidity from six apple accessions of genotype Ma_ (MaMa or Mama) and four of mama using RNA-seq and identified 1301 fruit acidity associated genes, among which 18 were most significant acidity genes (MSAGs). Network inference using weighted gene co-expression network analysis (WGCNA) revealed five co-expression gene network modules of significant (P acidity. Overall, this study provides important insight into the Ma1-mediated gene network controlling acidity in mature apple fruit of diverse genetic background.

  2. DENSE MATCHING COMPARISON BETWEEN CENSUS AND A CONVOLUTIONAL NEURAL NETWORK ALGORITHM FOR PLANT RECONSTRUCTION

    Directory of Open Access Journals (Sweden)

    Y. Xia

    2018-05-01

    Full Text Available 3D reconstruction of plants is hard to implement, as the complex leaf distribution highly increases the difficulty level in dense matching. Semi-Global Matching has been successfully applied to recover the depth information of a scene, but may perform variably when different matching cost algorithms are used. In this paper two matching cost computation algorithms, Census transform and an algorithm using a convolutional neural network, are tested for plant reconstruction based on Semi-Global Matching. High resolution close-range photogrammetric images from a handheld camera are used for the experiment. The disparity maps generated based on the two selected matching cost methods are comparable with acceptable quality, which shows the good performance of Census and the potential of neural networks to improve the dense matching.

  3. RegnANN: Reverse Engineering Gene Networks using Artificial Neural Networks.

    Directory of Open Access Journals (Sweden)

    Marco Grimaldi

    Full Text Available RegnANN is a novel method for reverse engineering gene networks based on an ensemble of multilayer perceptrons. The algorithm builds a regressor for each gene in the network, estimating its neighborhood independently. The overall network is obtained by joining all the neighborhoods. RegnANN makes no assumptions about the nature of the relationships between the variables, potentially capturing high-order and nonlinear dependencies between expression patterns. The evaluation focuses on synthetic data mimicking plausible submodules of larger networks and on biological data consisting of submodules of Escherichia coli. We consider Barabasi and Erdös-Rényi topologies together with two methods for data generation. We verify the effect of factors such as network size and amount of data on the accuracy of the inference algorithm. The accuracy scores obtained with RegnANN are methodically compared with the performance of three reference algorithms: ARACNE, CLR and KELLER. Our evaluation indicates that RegnANN compares favorably with the inference methods tested. The robustness of RegnANN, its ability to discover second-order correlations and the agreement between results obtained with this new method on both synthetic and biological data are promising, and they stimulate its application to a wider range of problems.

  4. Reconstruction of biological networks based on life science data integration

    Directory of Open Access Journals (Sweden)

    Kormeier Benjamin

    2010-06-01

    Full Text Available For the implementation of the virtual cell, the fundamental question is how to model and simulate complex biological networks. Therefore, based on relevant molecular databases and information systems, biological data integration is an essential step in constructing biological networks. In this paper, we will motivate the applications BioDWH - an integration toolkit for building life science data warehouses, CardioVINEdb - an information system for biological data in cardiovascular disease and VANESA - a network editor for modeling and simulation of biological networks. Based on this integration process, the system supports the generation of biological network models. A case study of a cardiovascular-disease related gene-regulated biological network is also presented.

  5. Identification of putative orthologous genes for the phylogenetic reconstruction of temperate woody bamboos (Poaceae: Bambusoideae).

    Science.gov (United States)

    Zhang, Li-Na; Zhang, Xian-Zhi; Zhang, Yu-Xiao; Zeng, Chun-Xia; Ma, Peng-Fei; Zhao, Lei; Guo, Zhen-Hua; Li, De-Zhu

    2014-09-01

    The temperate woody bamboos (Arundinarieae) are highly diverse in morphology but lack a substantial amount of genetic variation. The taxonomy of this lineage is intractable, and the relationships within the tribe have not been well resolved. Recent studies indicated that this tribe could have a complex evolutionary history. Although phylogenetic studies of the tribe have been carried out, most of these phylogenetic reconstructions were based on plastid data, which provide lower phylogenetic resolution compared with nuclear data. In this study, we intended to identify a set of desirable nuclear genes for resolving the phylogeny of the temperate woody bamboos. Using two different methodologies, we identified 209 and 916 genes, respectively, as putative single copy orthologous genes. A total of 112 genes was successfully amplified and sequenced by next-generation sequencing technologies in five species sampled from the tribe. As most of the genes exhibited intra-individual allele heterozygotes, we investigated phylogenetic utility by reconstructing the phylogeny based on individual genes. Discordance among gene trees was observed and, to resolve the conflict, we performed a range of analyses using BUCKy and HybTree. While caution should be taken when inferring a phylogeny from multiple conflicting genes, our analysis indicated that 74 of the 112 investigated genes are potential markers for resolving the phylogeny of the temperate woody bamboos. © 2014 John Wiley & Sons Ltd.

  6. REGEN: Ancestral Genome Reconstruction for Bacteria

    Directory of Open Access Journals (Sweden)

    João C. Setubal

    2012-07-01

    Full Text Available Ancestral genome reconstruction can be understood as a phylogenetic study with more details than a traditional phylogenetic tree reconstruction. We present a new computational system called REGEN for ancestral bacterial genome reconstruction at both the gene and replicon levels. REGEN reconstructs gene content, contiguous gene runs, and replicon structure for each ancestral genome. Along each branch of the phylogenetic tree, REGEN infers evolutionary events, including gene creation and deletion and replicon fission and fusion. The reconstruction can be performed by either a maximum parsimony or a maximum likelihood method. Gene content reconstruction is based on the concept of neighboring gene pairs. REGEN was designed to be used with any set of genomes that are sufficiently related, which will usually be the case for bacteria within the same taxonomic order. We evaluated REGEN using simulated genomes and genomes in the Rhizobiales order.

  7. Hybrid stochastic simplifications for multiscale gene networks

    Directory of Open Access Journals (Sweden)

    Debussche Arnaud

    2009-09-01

    Full Text Available Background: Stochastic simulation of gene networks by Markov processes has important applications in molecular biology. The complexity of exact simulation algorithms scales with the number of discrete jumps to be performed. Approximate schemes reduce the computational time by reducing the number of simulated discrete events. Also, answering important questions about the relation between network topology and intrinsic noise generation and propagation should be based on general mathematical results. These general results are difficult to obtain for exact models. Results: We propose a unified framework for hybrid simplifications of Markov models of multiscale stochastic gene network dynamics. We discuss several possible hybrid simplifications, and provide algorithms to obtain them from pure jump processes. In hybrid simplifications, some components are discrete and evolve by jumps, while other components are continuous. Hybrid simplifications are obtained by partial Kramers-Moyal expansion [1-3], which is equivalent to the application of the central limit theorem to a sub-model. By averaging and variable aggregation we drastically reduce simulation time and eliminate non-critical reactions. Hybrid and averaged simplifications can be used for more effective simulation algorithms and for obtaining general design principles relating noise to topology and time scales. The simplified models reproduce with good accuracy the stochastic properties of the gene networks, including waiting times in intermittence phenomena, fluctuation amplitudes and stationary distributions. The methods are illustrated on several gene network examples. Conclusion: Hybrid simplifications can be used for onion-like (multi-layered) approaches to multi-scale biochemical systems, in which various descriptions are used at various scales. Sets of discrete and continuous variables are treated with different methods and are coupled together in a physically justified approach.
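
    The cost argument in this abstract can be illustrated with a minimal sketch of an exact (Gillespie-type) jump simulation. This is not the hybrid scheme of the paper: the one-gene birth-death model, the rate names `k_prod` and `k_deg`, and the function name are illustrative assumptions. Each loop iteration simulates exactly one discrete jump, so the running time grows with the number of jumps.

```python
import math
import random


def gillespie_birth_death(k_prod, k_deg, x0, t_end, seed=0):
    """Exact jump simulation of 0 -> X (rate k_prod) and X -> 0 (rate k_deg * x).

    Hypothetical toy model; returns the jump times and the copy numbers.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod           # propensity of production
        a_deg = k_deg * x         # propensity of degradation
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break                 # no reaction can fire any more
        # The waiting time to the next jump is exponential with rate a_total.
        t += -math.log(1.0 - rng.random()) / a_total
        if t > t_end:
            break
        # Pick the reaction that fires, proportionally to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


times, counts = gillespie_birth_death(k_prod=10.0, k_deg=0.1, x0=0, t_end=50.0)
print(len(times) - 1, "jumps simulated; final copy number:", counts[-1])
```

    Raising `k_prod` in this sketch increases the number of jumps per unit time, which is the regime where approximate or hybrid descriptions become attractive.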

  8. Robust gene network analysis reveals alteration of the STAT5a network as a hallmark of prostate cancer.

    Science.gov (United States)

    Reddy, Anupama; Huang, C Chris; Liu, Huiqing; Delisi, Charles; Nevalainen, Marja T; Szalma, Sandor; Bhanot, Gyan

    2010-01-01

    We develop a general method to identify gene networks from pair-wise correlations between genes in a microarray data set and apply it to a public prostate cancer gene expression data set from 69 primary prostate tumors. We define the degree of a node as the number of genes significantly associated with the node and identify hub genes as those with the highest degree. The correlation network was pruned using transcription factor binding information in VisANT (http://visant.bu.edu/) as a biological filter. The reliability of hub genes was determined using a strict permutation test. Separate networks for normal prostate samples, and prostate cancer samples from African Americans (AA) and European Americans (EA) were generated and compared. We found that the same hubs control disease progression in AA and EA networks. Combining AA and EA samples, we generated networks for low [...] cancer (e.g. possible turning on of oncogenes). (ii) Some hubs reduced their degree in the tumor network compared to their degree in the normal network, suggesting that these genes are associated with loss of regulatory control in cancer (e.g. possible loss of tumor suppressor genes). A striking result was that for both AA and EA tumor samples, STAT5a, CEBPB and EGR1 are major hubs that gain neighbors compared to the normal prostate network. Conversely, HIF-1α is a major hub that loses connections in the prostate cancer network compared to the normal prostate network. We also find that the degree of these hubs changes progressively from normal to low grade to high grade disease, suggesting that these hubs are master regulators of prostate cancer and mark disease progression. STAT5a was identified as a central hub, with ~120 neighbors in the prostate cancer network and only 81 neighbors in the normal prostate network. Of the 120 neighbors of STAT5a, 57 are known cancer related genes, known to be involved in functional pathways associated with tumorigenesis. Our method is general and can easily

  9. Signal reconstruction in wireless sensor networks based on a cubature Kalman particle filter

    International Nuclear Information System (INIS)

    Huang Jin-Wang; Feng Jiu-Chao

    2014-01-01

    To address the reconstruction of nonlinear non-Gaussian signals in wireless sensor networks (WSNs), a new signal reconstruction algorithm based on a cubature Kalman particle filter (CKPF) is proposed in this paper. We model the reconstruction signal first and then use the CKPF to estimate the signal. The CKPF uses a cubature Kalman filter (CKF) to generate the importance proposal distribution of the particle filter and integrates the latest observation, which can approximate the true posterior distribution better and improve the estimation accuracy. The CKPF uses fewer cubature points than the unscented Kalman particle filter (UKPF) and has lower computational overhead. Meanwhile, the CKPF uses the square root of the error covariance for iterating and is more stable and accurate than its UKPF counterpart. Simulation results show that the algorithm can reconstruct the observed signals quickly and effectively, while consuming less computational time and achieving higher accuracy than the method based on the UKPF. (general)
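
    As a hedged illustration of the point-count remark in this abstract, the sketch below generates the 2n equally weighted points of the standard third-degree spherical-radial cubature rule from a mean and a covariance matrix; the standard unscented transform uses 2n + 1 sigma points. The function names and the pure-Python Cholesky helper are assumptions made for the example, and the code shows the point set only, not the filter of the paper.

```python
import math


def cholesky(P):
    """Lower-triangular factor L with L * L^T = P (P symmetric positive definite)."""
    n = len(P)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(P[i][i] - s)
            else:
                L[i][j] = (P[i][j] - s) / L[j][j]
    return L


def cubature_points(mean, cov):
    """Return the 2n cubature points and their common weight 1 / (2n)."""
    n = len(mean)
    L = cholesky(cov)
    scale = math.sqrt(n)
    points = []
    for sign in (1.0, -1.0):
        for j in range(n):
            # Shift the mean along the j-th column of L, scaled by sqrt(n).
            points.append([mean[i] + sign * scale * L[i][j] for i in range(n)])
    return points, 1.0 / (2 * n)


points, weight = cubature_points([1.0, 2.0], [[2.0, 1.0], [1.0, 2.0]])
print(len(points), "points, each with weight", weight)
```

    The weighted sample mean and covariance of these points reproduce the input mean and covariance, which is the property the filter relies on when it propagates them through a nonlinear model.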

  10. Deconstructing the pluripotency gene regulatory network

    KAUST Repository

    Li, Mo

    2018-04-04

    Pluripotent stem cells can be isolated from embryos or derived by reprogramming. Pluripotency is stabilized by an interconnected network of pluripotency genes that cooperatively regulate gene expression. Here we describe the molecular principles of pluripotency gene function and highlight post-transcriptional controls, particularly those induced by RNA-binding proteins and alternative splicing, as an important regulatory layer of pluripotency. We also discuss heterogeneity in pluripotency regulation, alternative pluripotency states and future directions of pluripotent stem cell research.

  11. Deconstructing the pluripotency gene regulatory network

    KAUST Repository

    Li, Mo; Belmonte, Juan Carlos Izpisua

    2018-01-01

    Pluripotent stem cells can be isolated from embryos or derived by reprogramming. Pluripotency is stabilized by an interconnected network of pluripotency genes that cooperatively regulate gene expression. Here we describe the molecular principles of pluripotency gene function and highlight post-transcriptional controls, particularly those induced by RNA-binding proteins and alternative splicing, as an important regulatory layer of pluripotency. We also discuss heterogeneity in pluripotency regulation, alternative pluripotency states and future directions of pluripotent stem cell research.

  12. Transcriptomic network analysis of micronuclei-related genes: a case study

    DEFF Research Database (Denmark)

    van Leeuwen, D. M.; Pedersen, Marie; Knudsen, Lisbeth E.

    2011-01-01

    Mechanistically relevant information on responses of humans to xenobiotic exposure in relation to chemically induced biological effects, such as micronuclei (MN) formation, can be obtained through large-scale transcriptomics studies. Network analysis may enhance the analysis and visualisation of such data. Therefore, this study aimed to develop a 'MN formation' network based on a priori knowledge, by using the pathway tool MetaCore. The gene network contained 27 genes and three gene complexes that are related to processes involved in MN formation, e.g. spindle assembly checkpoint, cell cycle checkpoint and aneuploidy. The MN-related gene network was tested against a transcriptomics case study associated with MN measurements. In this case study, transcriptomic data from children and adults differentially exposed to ambient air pollution in the Czech Republic were analysed and visualised...

  13. Synchronous versus asynchronous modeling of gene regulatory networks.

    Science.gov (United States)

    Garg, Abhishek; Di Cara, Alessandro; Xenarios, Ioannis; Mendoza, Luis; De Micheli, Giovanni

    2008-09-01

    In silico modeling of gene regulatory networks has gained some momentum recently due to increased interest in analyzing the dynamics of biological systems. This has been further facilitated by the increasing availability of experimental data on gene-gene, protein-protein and gene-protein interactions. The two dynamical properties that are often experimentally testable are perturbations and stable steady states. Although a lot of work has been done on the identification of steady states, not much work has been reported on in silico modeling of cellular differentiation processes. In this manuscript, we provide algorithms based on reduced ordered binary decision diagrams (ROBDDs) for Boolean modeling of gene regulatory networks. Algorithms for synchronous and asynchronous transition models have been proposed and their corresponding computational properties have been analyzed. These algorithms allow users to compute cyclic attractors of large networks that are currently not feasible using existing software. Hereby we provide a framework to analyze the effect of multiple gene perturbation protocols, and their effect on cell differentiation processes. These algorithms were validated on the T-helper model showing the correct steady state identification and Th1-Th2 cellular differentiation process. The software binaries for Windows and Linux platforms can be downloaded from http://si2.epfl.ch/~garg/genysis.html.
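
    The difference between the two transition models named in this abstract can be sketched on a toy example. The code below uses explicit state enumeration rather than ROBDDs, so it only scales to very small networks; the two-gene mutual-repression network and all function names are hypothetical and chosen for illustration.

```python
from itertools import product

# Hypothetical toy network: two genes that repress each other.
RULES = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
}
GENES = sorted(RULES)


def sync_step(state):
    """Synchronous model: every gene is updated at the same time."""
    s = dict(zip(GENES, state))
    return tuple(int(RULES[g](s)) for g in GENES)


def async_successors(state):
    """Asynchronous model: one gene changes per step, so several successors are possible."""
    s = dict(zip(GENES, state))
    successors = set()
    for i, g in enumerate(GENES):
        new_value = int(RULES[g](s))
        if new_value != state[i]:
            successors.add(state[:i] + (new_value,) + state[i + 1:])
    return successors


def sync_attractors():
    """Follow each start state until a state repeats; the repeated part is an attractor."""
    attractors = set()
    for start in product((0, 1), repeat=len(GENES)):
        path = []
        state = start
        while state not in path:
            path.append(state)
            state = sync_step(state)
        attractors.add(frozenset(path[path.index(state):]))
    return attractors


def steady_states():
    """States without any asynchronous successor; they are fixed under both models."""
    return {s for s in product((0, 1), repeat=len(GENES)) if not async_successors(s)}


print("synchronous attractors:", sorted(sorted(a) for a in sync_attractors()))
print("steady states:", sorted(steady_states()))
```

    In this toy network the synchronous model has a two-state cycle between (0, 0) and (1, 1) in addition to the two steady states, whereas under asynchronous updates both of those states lead into a steady state. This is one reason the choice of update scheme matters when attractors are interpreted biologically.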

  14. SELANSI: a toolbox for simulation of stochastic gene regulatory networks.

    Science.gov (United States)

    Pájaro, Manuel; Otero-Muras, Irene; Vázquez, Carlos; Alonso, Antonio A

    2018-03-01

    Gene regulation is inherently stochastic. In many applications concerning Systems and Synthetic Biology such as the reverse engineering and the de novo design of genetic circuits, stochastic effects (yet potentially crucial) are often neglected due to the high computational cost of stochastic simulations. With advances in these fields there is an increasing need of tools providing accurate approximations of the stochastic dynamics of gene regulatory networks (GRNs) with reduced computational effort. This work presents SELANSI (SEmi-LAgrangian SImulation of GRNs), a software toolbox for the simulation of stochastic multidimensional gene regulatory networks. SELANSI exploits intrinsic structural properties of gene regulatory networks to accurately approximate the corresponding Chemical Master Equation with a partial integral differential equation that is solved by a semi-lagrangian method with high efficiency. Networks under consideration might involve multiple genes with self and cross regulations, in which genes can be regulated by different transcription factors. Moreover, the validity of the method is not restricted to a particular type of kinetics. The tool offers total flexibility regarding network topology, kinetics and parameterization, as well as simulation options. SELANSI runs under the MATLAB environment, and is available under GPLv3 license at https://sites.google.com/view/selansi. Contact: antonio@iim.csic.es. © The Author(s) 2017. Published by Oxford University Press.

  15. Network Graph Analysis of Gene-Gene Interactions in Genome-Wide Association Study Data

    Directory of Open Access Journals (Sweden)

    Sungyoung Lee

    2012-12-01

    Full Text Available Most common complex traits, such as obesity, hypertension, diabetes, and cancers, are known to be associated with multiple genes, environmental factors, and their epistasis. Recently, the development of advanced genotyping technologies has allowed us to perform genome-wide association studies (GWASs). For detecting the effects of multiple genes on complex traits, many approaches have been proposed for GWASs. Multifactor dimensionality reduction (MDR) is one of the powerful and efficient methods for detecting high-order gene-gene (GxG) interactions. However, the biological interpretation of GxG interactions identified by MDR analysis is not easy. In order to aid the interpretation of MDR results, we propose a network graph analysis to elucidate the meaning of identified GxG interactions. The proposed network graph analysis consists of three steps. The first step is for performing GxG interaction analysis using MDR analysis. The second step is to draw the network graph using the MDR result. The third step is to provide biological evidence of the identified GxG interaction using external biological databases. The proposed method was applied to Korean Association Resource (KARE) data, containing 8838 individuals with 327,632 single-nucleotide polymorphisms, in order to perform GxG interaction analysis of body mass index (BMI). Our network graph analysis successfully showed that many identified GxG interactions have known biological evidence related to BMI. We expect that our network graph analysis will be helpful to interpret the biological meaning of GxG interactions.

  16. Network graph analysis of gene-gene interactions in genome-wide association study data.

    Science.gov (United States)

    Lee, Sungyoung; Kwon, Min-Seok; Park, Taesung

    2012-12-01

    Most common complex traits, such as obesity, hypertension, diabetes, and cancers, are known to be associated with multiple genes, environmental factors, and their epistasis. Recently, the development of advanced genotyping technologies has allowed us to perform genome-wide association studies (GWASs). For detecting the effects of multiple genes on complex traits, many approaches have been proposed for GWASs. Multifactor dimensionality reduction (MDR) is one of the powerful and efficient methods for detecting high-order gene-gene (GxG) interactions. However, the biological interpretation of GxG interactions identified by MDR analysis is not easy. In order to aid the interpretation of MDR results, we propose a network graph analysis to elucidate the meaning of identified GxG interactions. The proposed network graph analysis consists of three steps. The first step is for performing GxG interaction analysis using MDR analysis. The second step is to draw the network graph using the MDR result. The third step is to provide biological evidence of the identified GxG interaction using external biological databases. The proposed method was applied to Korean Association Resource (KARE) data, containing 8838 individuals with 327,632 single-nucleotide polymorphisms, in order to perform GxG interaction analysis of body mass index (BMI). Our network graph analysis successfully showed that many identified GxG interactions have known biological evidence related to BMI. We expect that our network graph analysis will be helpful to interpret the biological meaning of GxG interactions.

  17. From Gene Trees to a Dated Allopolyploid Network: Insights from the Angiosperm Genus Viola (Violaceae)

    Science.gov (United States)

    Marcussen, Thomas; Heier, Lise; Brysting, Anne K.; Oxelman, Bengt; Jakobsen, Kjetill S.

    2015-01-01

    Allopolyploidization accounts for a significant fraction of speciation events in many eukaryotic lineages. However, existing phylogenetic and dating methods require tree-like topologies and are unable to handle the network-like phylogenetic relationships of lineages containing allopolyploids. No explicit framework has so far been established for evaluating competing network topologies, and few attempts have been made to date phylogenetic networks. We used a four-step approach to generate a dated polyploid species network for the cosmopolitan angiosperm genus Viola L. (Violaceae Batch.). The genus contains ca 600 species and both recent (neo-) and more ancient (meso-) polyploid lineages distributed over 16 sections. First, we obtained DNA sequences of three low-copy nuclear genes and one chloroplast region, from 42 species representing all 16 sections. Second, we obtained fossil-calibrated chronograms for each nuclear gene marker. Third, we determined the most parsimonious multilabeled genome tree and its corresponding network, resolved at the section (not the species) level. Reconstructing the “correct” network for a set of polyploids depends on recovering all homoeologs, i.e., all subgenomes, in these polyploids. Assuming the presence of Viola subgenome lineages that were not detected by the nuclear gene phylogenies (“ghost subgenome lineages”) significantly reduced the number of inferred polyploidization events. We identified the most parsimonious network topology from a set of five competing scenarios differing in the interpretation of homoeolog extinctions and lineage sorting, based on (i) fewest possible ghost subgenome lineages, (ii) fewest possible polyploidization events, and (iii) least possible deviation from expected ploidy as inferred from available chromosome counts of the involved polyploid taxa. Finally, we estimated the homoploid and polyploid speciation times of the most parsimonious network. Homoploid speciation times were estimated by

  18. Exploring candidate biological functions by Boolean Function Networks for Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Maria Simak

    Full Text Available The great amount of gene expression data has brought a big challenge for the discovery of Gene Regulatory Networks (GRNs). For network reconstruction and the investigation of regulatory relations, it is desirable to ensure directness of links between genes on a map, infer their directionality and explore candidate biological functions from high-throughput transcriptomic data. To address these problems, we introduce a Boolean Function Network (BFN) model based on techniques of hidden Markov model (HMM), likelihood ratio test and Boolean logic functions. BFN consists of two consecutive tests to establish links between pairs of genes and check their directness. We evaluate the performance of BFN through the application to S. cerevisiae time course data. BFN produces regulatory relations which show consistency with the succession of cell cycle phases. Furthermore, it also improves sensitivity and specificity when compared with alternative methods of genetic network reverse engineering. Moreover, we demonstrate that BFN can provide proper resolution for GO enrichment of gene sets. Finally, the Boolean functions discovered by BFN can provide useful insights for the identification of control mechanisms of regulatory processes, which is the special advantage of the proposed approach. In combination with low computational complexity, BFN can serve as an efficient screening tool to reconstruct gene relations on the whole genome level. In addition, the BFN approach is also applicable to a wide range of time course datasets.

  19. A Regulatory Network Analysis of Orphan Genes in Arabidopsis Thaliana

    Science.gov (United States)

    Singh, Pramesh; Chen, Tianlong; Arendsee, Zebulun; Wurtele, Eve S.; Bassler, Kevin E.

    Orphan genes, which are genes unique to each particular species, have recently drawn significant attention for their potential usefulness for organismal robustness. Their origin and regulatory interaction patterns remain largely undiscovered. Recently, methods that use the context likelihood of relatedness to infer a network followed by modularity maximizing community detection algorithms on the inferred network to find the functional structure of regulatory networks were shown to be effective. We apply improved versions of these methods to gene expression data from Arabidopsis thaliana, identify groups (clusters) of interacting genes with related patterns of expression and analyze the structure within those groups. Focusing on clusters that contain orphan genes, we compare the identified clusters to gene ontology (GO) terms, regulons, and pathway designations and analyze their hierarchical structure. We predict new regulatory interactions and unravel the structure of the regulatory interaction patterns of orphan genes. Work supported by the NSF through Grants DMR-1507371 and IOS-1546858.

  20. Network statistics of genetically-driven gene co-expression modules in mouse crosses

    Directory of Open Access Journals (Sweden)

    Marie-Pier Scott-Boyer

    2013-12-01

    Full Text Available In biology, networks are used in different contexts as ways to represent relationships between entities, such as for instance interactions between genes, proteins or metabolites. Despite progress in the analysis of such networks and their potential to better understand the collective impact of genes on complex traits, one remaining challenge is to establish the biologic validity of gene co-expression networks and to determine what governs their organization. We used WGCNA to construct and analyze seven gene expression datasets from several tissues of mouse recombinant inbred strains (RIS). For six out of the seven networks, we found that linkage to module QTLs (mQTLs) could be established for 29.3% of gene co-expression modules detected in the several mouse RIS. For about 74.6% of such genetically-linked modules, the mQTL was on the same chromosome as the one contributing most genes to the module, with genes originating from that chromosome showing higher connectivity than other genes in the modules. Such modules (that we considered as genetically-driven) had network statistic properties (density, centralization and heterogeneity) that set them apart from other modules in the network. Altogether, a sizeable portion of gene co-expression modules detected in mouse RIS panels had genetic determinants as their main organizing principle. In addition to providing a biologic interpretation validation for these modules, these genetic determinants imparted on them particular properties that set them apart from other modules in the network, to the point that they can be predicted to a large extent on the basis of their network statistics.
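
    The three network statistics named in this abstract can be computed from a module's adjacency matrix. The sketch below assumes definitions commonly used for weighted co-expression networks (density as the mean off-diagonal adjacency, centralization from the maximum connectivity, heterogeneity as the coefficient of variation of connectivity); whether the study used exactly these formulas, and the use of the population variance, are assumptions. The function name is hypothetical and this is not the WGCNA API.

```python
import math


def network_statistics(adj):
    """Density, centralization and heterogeneity of a symmetric adjacency matrix.

    adj is a list of lists with entries in [0, 1]; the diagonal is ignored.
    """
    n = len(adj)
    if n < 3:
        raise ValueError("need at least three nodes")
    # Connectivity of node i: sum of its adjacencies to all other nodes.
    k = [sum(adj[i][j] for j in range(n) if j != i) for i in range(n)]
    density = sum(k) / (n * (n - 1))
    centralization = n / (n - 2) * (max(k) / (n - 1) - density)
    mean_k = sum(k) / n
    var_k = sum((x - mean_k) ** 2 for x in k) / n   # population variance (assumption)
    heterogeneity = math.sqrt(var_k) / mean_k if mean_k > 0 else float("nan")
    return density, centralization, heterogeneity


# Star-shaped toy module: node 0 is connected to every other node.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
print(network_statistics(star))
```

    A star-shaped module reaches the maximum centralization of 1, while a fully connected module has centralization and heterogeneity of 0, which shows how these statistics separate hub-dominated modules from uniform ones.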

  1. A collaborative computing framework of cloud network and WBSN applied to fall detection and 3-D motion reconstruction.

    Science.gov (United States)

    Lai, Chin-Feng; Chen, Min; Pan, Jeng-Shyang; Youn, Chan-Hyun; Chao, Han-Chieh

    2014-03-01

    As cloud computing and wireless body sensor network technologies gradually mature, ubiquitous healthcare services can prevent accidents instantly and effectively, and provide relevant information to reduce related processing time and cost. This study proposes a co-processing intermediary framework that integrates cloud and wireless body sensor networks and is mainly applied to fall detection and 3-D motion reconstruction. The main focuses of this study include distributed computing and resource allocation for processing sensing data over the computing architecture, network conditions and performance evaluation. Through this framework, the transmission and computing time of sensing data are reduced to enhance overall performance for the services of fall event detection and 3-D motion reconstruction.

  2. Reconstruction of cellular signal transduction networks using perturbation assays and linear programming.

    Science.gov (United States)

    Knapp, Bettina; Kaderali, Lars

    2013-01-01

    Perturbation experiments, for example using RNA interference (RNAi), offer an attractive way to elucidate gene function in a high throughput fashion. The placement of hit genes in their functional context and the inference of underlying networks from such data, however, are challenging tasks. One of the problems in network inference is the exponential number of possible network topologies for a given number of genes. Here, we introduce a novel mathematical approach to address this question. We formulate network inference as a linear optimization problem, which can be solved efficiently even for large-scale systems. We use simulated data to evaluate our approach, and show improved performance in particular on larger networks over state-of-the-art methods. We achieve increased sensitivity and specificity, as well as a significant reduction in computing time. Furthermore, we show superior performance on noisy data. We then apply our approach to study the intracellular signaling of human primary naïve CD4(+) T-cells, as well as ErbB signaling in trastuzumab-resistant breast cancer cells. In both cases, our approach recovers known interactions and points to additional relevant processes. In ErbB signaling, our results predict an important role of negative and positive feedback in controlling the cell cycle progression.

  3. Inferring the gene network underlying the branching of tomato inflorescence.

    Directory of Open Access Journals (Sweden)

    Laura Astola

    Full Text Available The architecture of tomato inflorescence strongly affects flower production and subsequent crop yield. To understand the genetic activities involved, insight into the underlying network of genes that initiate and control the sympodial growth in the tomato is essential. In this paper, we show how the structure of this network can be derived from available data of the expressions of the involved genes. Our approach starts from employing biological expert knowledge to select the most probable gene candidates behind branching behavior. To find how these genes interact, we develop a stepwise procedure for computational inference of the network structure. Our data consists of expression levels from primary shoot meristems, measured at different developmental stages on three different genotypes of tomato. With the network inferred by our algorithm, we can explain the dynamics corresponding to all three genotypes simultaneously, despite their apparent dissimilarities. We also correctly predict the chronological order of expression peaks for the main hubs in the network. Based on the inferred network, using optimal experimental design criteria, we are able to suggest an informative set of experiments for further investigation of the mechanisms underlying branching behavior.

  4. Ontology-based literature mining of E. coli vaccine-associated gene interaction networks.

    Science.gov (United States)

    Hur, Junguk; Özgür, Arzucan; He, Yongqun

    2017-03-14

    Pathogenic Escherichia coli infections cause various diseases in humans and many animal species. However, despite extensive E. coli vaccine research, we are still unable to fully protect ourselves against E. coli infections. To support more rational development of effective and safe E. coli vaccines, it is important to better understand E. coli vaccine-associated gene interaction networks. In this study, we first extended the Vaccine Ontology (VO) to semantically represent various E. coli vaccines and genes used in the vaccine development. We also normalized E. coli gene names compiled from the annotations of various E. coli strains using a pan-genome-based annotation strategy. The Interaction Network Ontology (INO) includes a hierarchy of various interaction-related keywords useful for literature mining. Using VO, INO, and normalized E. coli gene names, we applied an ontology-based SciMiner literature mining strategy to mine all PubMed abstracts and retrieve E. coli vaccine-associated E. coli gene interactions. Four centrality metrics (i.e., degree, eigenvector, closeness, and betweenness) were calculated for identifying highly ranked genes and interaction types. Using vaccine-related PubMed abstracts, our study identified 11,350 sentences that contain 88 unique INO interaction types and 1,781 unique E. coli genes. Each sentence contained at least one interaction type and two unique E. coli genes. An E. coli gene interaction network of genes and INO interaction types was created. From this big network, a sub-network consisting of 5 E. coli vaccine genes, including carA, carB, fimH, fepA, and vat, and 62 other E. coli genes, and 25 INO interaction types was identified. While many interaction types represent direct interactions between two indicated genes, our study has also shown that many of these retrieved interaction types are indirect in that the two genes participated in the specified interaction process in a required but indirect process. Our centrality analysis of

  5. On the dynamics of a gene regulatory network

    International Nuclear Information System (INIS)

    Grammaticos, B; Carstea, A S; Ramani, A

    2006-01-01

    We examine the dynamics of a network of genes focusing on a periodic chain of genes, of arbitrary length. We show that within a given class of sigmoids representing the equilibrium probability of the binding of the RNA polymerase to the core promoter, the system possesses a single stable fixed point. By slightly modifying the sigmoid, introducing 'stiffer' forms, we show that it is possible to find network configurations exhibiting bistable behaviour. Our results do not depend crucially on the length of the chain considered: calculations with finite chains lead to similar results. However, a realistic study of regulatory genetic networks would require the consideration of more complex topologies and interactions.

  6. Functional modules by relating protein interaction networks and gene expression.

    Science.gov (United States)

    Tornow, Sabine; Mewes, H W

    2003-11-01

    Genes and proteins are organized on the basis of their particular mutual relations or according to their interactions in cellular and genetic networks. These include metabolic or signaling pathways and protein interaction, regulatory or co-expression networks. Integrating the information from the different types of networks may lead to the notion of a functional network and functional modules. To find these modules, we propose a new technique which is based on collective, multi-body correlations in a genetic network. We calculated the correlation strength of a group of genes (e.g. in the co-expression network) which were identified as members of a module in a different network (e.g. in the protein interaction network) and estimated the probability that this correlation strength was found by chance. Groups of genes with a significant correlation strength in different networks have a high probability that they perform the same function. Here, we propose evaluating the multi-body correlations by applying the superparamagnetic approach. We compare our method to the presently applied mean Pearson correlations and show that our method is more sensitive in revealing functional relationships.
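
    The baseline mentioned at the end of this abstract, the mean Pearson correlation of a gene group together with the probability of reaching it by chance, can be sketched as follows. This is not the superparamagnetic approach proposed by the authors; the function names, the random-group resampling scheme and the toy expression values are illustrative assumptions.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_pairwise_correlation(expr, genes):
    """Average Pearson correlation over all gene pairs in the group."""
    pairs = list(combinations(genes, 2))
    return sum(pearson(expr[a], expr[b]) for a, b in pairs) / len(pairs)


def chance_probability(expr, module, n_random=1000, seed=0):
    """Empirical probability that a random group of the same size is at least as correlated."""
    rng = random.Random(seed)
    observed = mean_pairwise_correlation(expr, module)
    all_genes = sorted(expr)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(all_genes, len(module))
        if mean_pairwise_correlation(expr, sample) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_random + 1)


# Hypothetical expression profiles (five samples per gene).
expr = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 11],
    "g3": [1, 3, 2, 5, 6],
    "g4": [5, 1, 4, 2, 3],
    "g5": [2, 5, 1, 4, 3],
    "g6": [3, 3, 5, 1, 2],
}
observed, p_value = chance_probability(expr, ["g1", "g2", "g3"], n_random=200)
print("mean correlation:", round(observed, 3), "empirical p-value:", round(p_value, 3))
```

    In the setting of the abstract, `module` would be a group of genes defined in one network (for example the protein interaction network) and `expr` the data behind another (for example co-expression).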

  7. Characterization of differentially expressed genes using high-dimensional co-expression networks

    DEFF Research Database (Denmark)

    Coelho Goncalves de Abreu, Gabriel; Labouriau, Rodrigo S.

    2010-01-01

    We present a technique to characterize differentially expressed genes in terms of their position in a high-dimensional co-expression network. The set-up of Gaussian graphical models is used to construct representations of the co-expression network in such a way that redundancy and the propagation...... that allow to make effective inference in problems with high degree of complexity (e.g. several thousands of genes) and small number of observations (e.g. 10-100) as typically occurs in high throughput gene expression studies. Taking advantage of the internal structure of decomposable graphical models, we...... construct a compact representation of the co-expression network that allows to identify the regions with high concentration of differentially expressed genes. It is argued that differentially expressed genes located in highly interconnected regions of the co-expression network are less informative than...

  8. Integration of omic networks in a developmental atlas of maize.

    Science.gov (United States)

    Walley, Justin W; Sartor, Ryan C; Shen, Zhouxin; Schmitz, Robert J; Wu, Kevin J; Urich, Mark A; Nery, Joseph R; Smith, Laurie G; Schnable, James C; Ecker, Joseph R; Briggs, Steven P

    2016-08-19

    Coexpression networks and gene regulatory networks (GRNs) are emerging as important tools for predicting functional roles of individual genes at a system-wide scale. To enable network reconstructions, we built a large-scale gene expression atlas composed of 62,547 messenger RNAs (mRNAs), 17,862 nonmodified proteins, and 6227 phosphoproteins harboring 31,595 phosphorylation sites quantified across maize development. Networks in which nodes are genes connected on the basis of highly correlated expression patterns of mRNAs were very different from networks that were based on coexpression of proteins. Roughly 85% of highly interconnected hubs were not conserved in expression between RNA and protein networks. However, networks from either data type were enriched in similar ontological categories and were effective in predicting known regulatory relationships. Integration of mRNA, protein, and phosphoprotein data sets greatly improved the predictive power of GRNs. Copyright © 2016, American Association for the Advancement of Science.

  9. Local and global responses in complex gene regulation networks

    Science.gov (United States)

    Tsuchiya, Masa; Selvarajoo, Kumar; Piras, Vincent; Tomita, Masaru; Giuliani, Alessandro

    2009-04-01

    Exacerbated sensitivity to apparently minor stimuli and a general resilience of the entire system coexist in biological systems. This apparent paradox can be explained by considering biological systems as very strongly interconnected network systems. Some nodes of these networks, thanks to their peculiar location in the network architecture, are responsible for the sensitivity aspects, while the large degree of interconnection is at the basis of the resilience properties of the system. One relevant feature of the high degree of connectivity of gene regulation networks is the emergence of collective ordered phenomena influencing the entire genome and not only a specific portion of transcripts. The great majority of existing gene regulation models give the impression of purely local 'hard-wired' mechanisms, disregarding the emergence of global ordered behavior encompassing thousands of genes, while the general, genome-wide aspects are less known. Here we address, from a data analysis perspective, the discrimination between local and global scale regulations; this goal was achieved by examining two biological systems: innate immune response in macrophages and oscillating growth dynamics in yeast. Our aim was to reconcile the 'hard-wired' local view of gene regulation with a global continuous and scalable one borrowed from statistical physics. This reconciliation is based on the network paradigm in which the local 'hard-wired' activities correspond to the activation of specific crucial nodes in the regulation network, while the scalable continuous responses can be equated to the collective oscillations of the network after a perturbation.

  10. lpNet: a linear programming approach to reconstruct signal transduction networks.

    Science.gov (United States)

    Matos, Marta R A; Knapp, Bettina; Kaderali, Lars

    2015-10-01

    With the widespread availability of high-throughput experimental technologies it has become possible to study hundreds to thousands of cellular factors simultaneously, such as coding- or non-coding mRNA or protein concentrations. Still, extracting information about the underlying regulatory or signaling interactions from these data remains a difficult challenge. We present a flexible approach towards network inference based on linear programming. Our method reconstructs the interactions of factors from a combination of perturbation/non-perturbation and steady-state/time-series data. We show both on simulated and real data that our methods are able to reconstruct the underlying networks fast and efficiently, thus shedding new light on biological processes and, in particular, on the mechanisms of action of diseases. We have implemented the approach as an R package available through Bioconductor. This R package is freely available under the GNU General Public License (GPL-3) from bioconductor.org (http://bioconductor.org/packages/release/bioc/html/lpNet.html) and is compatible with most operating systems (Windows, Linux, Mac OS) and hardware architectures. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  11. The impact of gene expression variation on the robustness and evolvability of a developmental gene regulatory network.

    Directory of Open Access Journals (Sweden)

    David A Garfield

    2013-10-01

    Regulatory interactions buffer development against genetic and environmental perturbations, but adaptation requires phenotypes to change. We investigated the relationship between robustness and evolvability within the gene regulatory network underlying development of the larval skeleton in the sea urchin Strongylocentrotus purpuratus. We find extensive variation in gene expression in this network throughout development in a natural population, some of which has a heritable genetic basis. Switch-like regulatory interactions predominate during early development, buffer expression variation, and may promote the accumulation of cryptic genetic variation affecting early stages. Regulatory interactions during later development are typically more sensitive (linear), allowing variation in expression to affect downstream target genes. Variation in skeletal morphology is associated primarily with expression variation of a few, primarily structural, genes at terminal positions within the network. These results indicate that the position and properties of gene interactions within a network can have important evolutionary consequences independent of their immediate regulatory role.

  12. Network-Based Integration of GWAS and Gene Expression Identifies a HOX-Centric Network Associated with Serous Ovarian Cancer Risk.

    Science.gov (United States)

    Kar, Siddhartha P; Tyrer, Jonathan P; Li, Qiyuan; Lawrenson, Kate; Aben, Katja K H; Anton-Culver, Hoda; Antonenkova, Natalia; Chenevix-Trench, Georgia; Baker, Helen; Bandera, Elisa V; Bean, Yukie T; Beckmann, Matthias W; Berchuck, Andrew; Bisogna, Maria; Bjørge, Line; Bogdanova, Natalia; Brinton, Louise; Brooks-Wilson, Angela; Butzow, Ralf; Campbell, Ian; Carty, Karen; Chang-Claude, Jenny; Chen, Yian Ann; Chen, Zhihua; Cook, Linda S; Cramer, Daniel; Cunningham, Julie M; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas F; Edwards, Robert P; Ekici, Arif B; Fasching, Peter A; Fridley, Brooke L; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G; Glasspool, Rosalind; Goode, Ellen L; Goodman, Marc T; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A T; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus K; Hosono, Satoyo; Iversen, Edwin S; Jakubowska, Anna; Paul, James; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kjaer, Susanne K; Kelemen, Linda E; Kellar, Melissa; Kelley, Joseph; Kiemeney, Lambertus A; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D; Lee, Alice W; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R; McNeish, Iain A; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B; Narod, Steven A; Nedergaard, Lotte; Ness, Roberta B; Nevanlinna, Heli; Odunsi, Kunle; Olson, Sara H; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Pearce, Celeste Leigh; Pejovic, Tanja; Pelttari, Liisa M; Permuth-Wey, Jennifer; Phelan, Catherine M; Pike, Malcolm C; Poole, Elizabeth M; Ramus, Susan J; Risch, Harvey A; Rosen, Barry; Rossing, Mary Anne; Rothstein, Joseph H; Rudolph, Anja; Runnebaum, Ingo B; Rzepecka, Iwona K; Salvesen, Helga B; Schildkraut, Joellen M; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C; Sucheston-Campbell, Lara E; Tangen, Ingvild L; Teo, Soo-Hwang; Terry, Kathryn L; Thompson, Pamela J; Timorek, Agnieszka; Tsai, Ya-Yu; Tworoger, Shelley S; van Altena, Anne M; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S; Wicklund, Kristine G; Wilkens, Lynne R; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A; Monteiro, Alvaro N A; Freedman, Matthew L; Gayther, Simon A; Pharoah, Paul D P

    2015-10-01

    Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified by coexpression may also be enriched for additional EOC risk associations. We selected TF genes within 1 Mb of the top signal at the 12 genome-wide significant risk loci. Mutual information, a form of correlation, was used to build networks of genes strongly coexpressed with each selected TF gene in the unified microarray dataset of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this dataset were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). Gene set enrichment analysis identified six networks centered on TF genes (HOXB2, HOXB5, HOXB6, HOXB7 at 17q21.32 and HOXD1, HOXD3 at 2q31) that were significantly enriched for genes from the risk-associated end of the ranked list (P < 0.05 and FDR < 0.05). These results were replicated (P < 0.05) using an independent association study (7,035 cases/21,693 controls). Genes underlying enrichment in the six networks were pooled into a combined network. We identified a HOX-centric network associated with serous EOC risk containing several genes with known or emerging roles in serous EOC development. Network analysis integrating large, context-specific datasets has the potential to offer mechanistic insights into cancer susceptibility and prioritize genes for experimental characterization. ©2015 American Association for Cancer Research.
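
    The network-building step in this record (mutual information between a transcription factor gene and every other gene) can be illustrated with a minimal sketch. The equal-width binning, the threshold and all function names are assumptions made for illustration; the study's actual estimator and cut-off are not given in the abstract.

```python
import math
from collections import Counter


def discretize(values, n_bins=3):
    """Map values to equal-width bin indices 0..n_bins-1 (assumed binning scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y):
    """Mutual information, in bits, of two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi


def coexpression_neighbours(expr, tf, threshold, n_bins=3):
    """Genes whose binned profile shares at least `threshold` bits with `tf`."""
    binned = {g: discretize(v, n_bins) for g, v in expr.items()}
    return sorted(
        g for g in expr
        if g != tf and mutual_information(binned[tf], binned[g]) >= threshold
    )
```

    Unlike Pearson correlation, mutual information also picks up non-linear dependence, which is one reason it is a common choice for coexpression networks; the price is sensitivity to the binning, so the number of bins would need tuning on real data.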

  13. NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.

    Directory of Open Access Journals (Sweden)

    Joeri Ruyssinck

    One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made
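
    The rankwise averaging described in this record can be sketched as follows. This is a minimal illustration, not the NIMEFI implementation: the function names, the tie-breaking rule and the handling of edges missing from a method are assumptions.

```python
def rank_edges(scores):
    """Map each edge to its rank (1 = strongest) under one method's scores."""
    # Ties are broken by edge name so the result is deterministic (assumed rule).
    ordered = sorted(scores, key=lambda edge: (-scores[edge], edge))
    return {edge: i + 1 for i, edge in enumerate(ordered)}


def rankwise_average(score_tables):
    """Average the per-method ranks of every edge; a lower value means more confident."""
    ranks = [rank_edges(t) for t in score_tables]
    edges = set().union(*ranks)
    # An edge missing from a method gets that method's worst rank plus one (assumption).
    worst = [len(r) + 1 for r in ranks]
    combined = {
        e: sum(r.get(e, worst[i]) for i, r in enumerate(ranks)) / len(ranks)
        for e in edges
    }
    return sorted(combined.items(), key=lambda kv: (kv[1], kv[0]))
```

    Each entry of `score_tables` would be a hypothetical dictionary from a (regulator, target) pair to the importance assigned by one method, for example one table from random forest regression and one from the elastic net. Averaging ranks instead of raw scores avoids having to put importance values from different methods on a common scale.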

  14. fabp4 is central to eight obesity associated genes: a functional gene network-based polymorphic study.

    Science.gov (United States)

    Bag, Susmita; Ramaiah, Sudha; Anbarasu, Anand

    2015-01-07

    Network study on genes and proteins offers functional basics of the complexity of gene and protein, and its interacting partners. The gene fatty acid-binding protein 4 (fabp4) is found to be highly expressed in adipose tissue, and is one of the most abundant proteins in mature adipocytes. Our investigations on functional modules of fabp4 provide useful information on the functional genes interacting with fabp4, their biochemical properties and their regulatory functions. The present study shows that there is a set of eight candidate genes (acp1, ext2, insr, lipe, ostf1, sncg, usp15, and vim) that are strongly and functionally linked with fabp4. Gene ontological analysis of network modules of fabp4 provides an explicit idea on the functional aspect of fabp4 and its interacting nodes. The hierarchical mapping on gene ontology indicates gene specific processes and functions as well as their compartmentalization in tissues. The fabp4 along with its interacting genes are involved in lipid metabolic activity and are integrated in multi-cellular processes of tissues and organs. They also have important protein/enzyme binding activity. Our study elucidated disease-associated nsSNP prediction for fabp4 and it is interesting to note that there are four rsIDs (rs1051231, rs3204631, rs140925685 and rs141169989) with disease allelic variation (T104P, T126P, G27D and G90V respectively). On the whole, our gene network analysis presents a clear insight about the interactions and functions associated with fabp4 gene network. Copyright © 2014 Elsevier Ltd. All rights reserved.

  15. Coordinations between gene modules control the operation of plant amino acid metabolic networks

    Directory of Open Access Journals (Sweden)

    Galili Gad

    2009-01-01

    Background: Being sessile organisms, plants should adjust their metabolism to dynamic changes in their environment. Such adjustments need particular coordination in branched metabolic networks in which a given metabolite can be converted into multiple other metabolites via different enzymatic chains. In the present report, we developed a novel "Gene Coordination" bioinformatics approach and used it to elucidate adjustable transcriptional interactions of two branched amino acid metabolic networks in plants in response to environmental stresses, using publicly available microarray results. Results: Using our "Gene Coordination" approach, we have identified in Arabidopsis plants two oppositely regulated groups of "highly coordinated" genes within the branched Asp-family network, which metabolizes the amino acids Lys, Met, Thr, Ile and Gly, as well as a single group of "highly coordinated" genes within the branched aromatic amino acid metabolic network, which metabolizes the amino acids Trp, Phe and Tyr. These genes possess highly coordinated adjustable negative and positive expression responses to various stress cues, which apparently regulate adjustable metabolic shifts between competing branches of these networks. We also provide evidence implying that these highly coordinated genes are central in imposing intra- and inter-network interactions between the Asp-family and aromatic amino acid metabolic networks as well as differential system interactions with other growth promoting and stress-associated genome-wide genes. Conclusion: Our novel Gene Coordination approach shows that branched amino acid metabolic networks in plants are regulated by specific groups of highly coordinated genes that possess adjustable intra-network, inter-network and genome-wide transcriptional interactions. We also hypothesize that such transcriptional interactions enable regulatory metabolic adjustments needed for adaptation to the stresses.

  16. Introduction: Cancer Gene Networks.

    Science.gov (United States)

    Clarke, Robert

    2017-01-01

    Constructing, evaluating, and interpreting gene networks generally sits within the broader field of systems biology, which continues to emerge rapidly, particularly with respect to its application to understanding the complexity of signaling in the context of cancer biology. For the purposes of this volume, we take a broad definition of systems biology. Considering an organism or disease within an organism as a system, systems biology is the study of the integrated and coordinated interactions of the network(s) of genes, their variants both natural and mutated (e.g., polymorphisms, rearrangements, alternate splicing, mutations), their proteins and isoforms, and the organic and inorganic molecules with which they interact, to execute the biochemical reactions (e.g., as enzymes, substrates, products) that reflect the function of that system. Central to systems biology, and perhaps the only approach that can effectively manage the complexity of such systems, is the building of quantitative multiscale predictive models. The predictions of the models can vary substantially depending on the nature of the model and its input-output relationships. For example, a model may predict the outcome of a specific molecular reaction(s), a cellular phenotype (e.g., alive, dead, growth arrest, proliferation, and motility), a change in the respective prevalence of cell or subpopulations, a patient or patient subgroup outcome(s). Such models necessarily require computers. Computational modeling can be thought of as using machine learning and related tools to integrate the very high dimensional data generated from modern, high throughput omics technologies including genomics (next generation sequencing), transcriptomics (gene expression microarrays; RNAseq), metabolomics and proteomics (ultra high performance liquid chromatography, mass spectrometry), and "subomic" technologies to study the kinome, methylome, and others. Mathematical modeling can be thought of as the use of ordinary

  17. Transcriptional control in the segmentation gene network of Drosophila.

    Directory of Open Access Journals (Sweden)

    Mark D Schroeder

    2004-09-01

    The segmentation gene network of Drosophila consists of maternal and zygotic factors that generate, by transcriptional (cross-)regulation, expression patterns of increasing complexity along the anterior-posterior axis of the embryo. Using known binding site information for maternal and zygotic gap transcription factors, the computer algorithm Ahab recovers known segmentation control elements (modules) with excellent success and predicts many novel modules within the network and genome-wide. We show that novel module predictions are highly enriched in the network and typically clustered proximal to the promoter, not only upstream, but also in intronic space and downstream. When placed upstream of a reporter gene, they consistently drive patterned blastoderm expression, in most cases faithfully producing one or more pattern elements of the endogenous gene. Moreover, we demonstrate for the entire set of known and newly validated modules that Ahab's prediction of binding sites correlates well with the expression patterns produced by the modules, revealing basic rules governing their composition. Specifically, we show that maternal factors consistently act as activators and that gap factors act as repressors, except for the bimodal factor Hunchback. Our data suggest a simple context-dependent rule for its switch from repressive to activating function. Overall, the composition of modules appears well fitted to the spatiotemporal distribution of their positive and negative input factors. Finally, by comparing Ahab predictions with different categories of transcription factor input, we confirm the global regulatory structure of the segmentation gene network, but find odd skipped behaving like a primary pair-rule gene. The study expands our knowledge of the segmentation gene network by increasing the number of experimentally tested modules by 50%. For the first time, the entire set of validated modules is analyzed for binding site composition under a

  18. Building gene co-expression networks using transcriptomics data for systems biology investigations

    DEFF Research Database (Denmark)

    Kadarmideen, Haja; Watson-Haigh, Nathan S.

    2012-01-01

    Gene co-expression networks (GCN), built using high-throughput gene expression data are fundamental aspects of systems biology. The main aims of this study were to compare two popular approaches to building and analysing GCN. We use real ovine microarray transcriptomics datasets representing four......) is connected within a network. The two GCN construction methods used were, Weighted Gene Co-expression Network Analysis (WGCNA) and Partial Correlation and Information Theory (PCIT) methods. Nodes were ranked based on their connectivity measures in each of the four different networks created by WGCNA and PCIT...... (with > 20000 genes) access to large computer clusters, particularly those with larger amounts of shared memory is recommended....

  19. Construction of Gene Regulatory Networks Using Recurrent Neural Networks and Swarm Intelligence.

    Science.gov (United States)

    Khan, Abhinandan; Mandal, Sudip; Pal, Rajat Kumar; Saha, Goutam

    2016-01-01

    We have proposed a methodology for the reverse engineering of biologically plausible gene regulatory networks from temporal genetic expression data. We have used established information and the fundamental mathematical theory for this purpose. We have employed the Recurrent Neural Network formalism to extract the underlying dynamics present in the time series expression data accurately. We have introduced a new hybrid swarm intelligence framework for the accurate training of the model parameters. The proposed methodology has been first applied to a small artificial network, and the results obtained suggest that it can produce the best results available in the contemporary literature, to the best of our knowledge. Subsequently, we have implemented our proposed framework on experimental (in vivo) datasets. Finally, we have investigated two medium sized genetic networks (in silico) extracted from GeneNetWeaver, to understand how the proposed algorithm scales up with network size. Additionally, we have implemented our proposed algorithm with half the number of time points. The results indicate that a reduction of 50% in the number of time points does not significantly affect the accuracy of the proposed methodology, with a maximum of just over 15% deterioration in the worst case.

  20. Inference of time-delayed gene regulatory networks based on dynamic Bayesian network hybrid learning method.

    Science.gov (United States)

    Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui

    2017-10-06

    Research on gene regulatory networks (GRNs) reveals complex life phenomena from the perspective of gene interaction and is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. To make up for the shortcomings of these methods, this paper presents a novel hybrid learning method (DBNCS) based on the dynamic Bayesian network (DBN) to construct multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. The DBNCS algorithm first uses the CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profile learning, namely the construction of the search space. The redundant regulations are then removed using the recursive optimization algorithm (RO), thereby reducing the false positive rate. Next, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, the DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of the DBNCS algorithm is evaluated on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in Escherichia coli, and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs.

  1. Global similarity and local divergence in human and mouse gene co-expression networks

    Directory of Open Access Journals (Sweden)

    Koonin Eugene V

    2006-09-01

    Background: A genome-wide comparative analysis of human and mouse gene expression patterns was performed in order to evaluate the evolutionary divergence of mammalian gene expression. Tissue-specific expression profiles were analyzed for 9,105 human-mouse orthologous gene pairs across 28 tissues. Expression profiles were resolved into species-specific coexpression networks, and the topological properties of the networks were compared between species. Results: At the global level, the topological properties of the human and mouse gene coexpression networks are, essentially, identical. For instance, both networks have topologies with small-world and scale-free properties as well as closely similar average node degrees, clustering coefficients, and path lengths. However, the human and mouse coexpression networks are highly divergent at the local level: only a small fraction ( Conclusion: The dissonance between global versus local network divergence suggests that the interspecies similarity of the global network properties is of limited biological significance, at best, and that the biologically relevant aspects of the architectures of gene coexpression are specific and particular, rather than universal. Nevertheless, there is substantial evolutionary conservation of the local network structure which is compatible with the notion that gene coexpression networks are subject to purifying selection.
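
    The contrast drawn in this record between global topology (average node degree, clustering coefficient) and local structure (which gene pairs are actually linked) can be made concrete with a short sketch. The function names are hypothetical, and using the Jaccard overlap of edge sets as the local-level measure is an assumption, not necessarily the statistic used in the study.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from an iterable of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_degree(adj):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean local clustering coefficient over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


def shared_edge_fraction(edges_a, edges_b):
    """Jaccard overlap of two undirected edge sets (assumed local-level measure)."""
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    return len(a & b) / len(a | b) if a | b else 0.0
```

    Two networks can have nearly identical average degree and clustering while sharing few edges, which is the situation the abstract describes for the human and mouse networks when edges are compared between orthologous gene pairs.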

  2. Network Analysis of Human Genes Influencing Susceptibility to Mycobacterial Infections

    Science.gov (United States)

    Lipner, Ettie M.; Garcia, Benjamin J.; Strong, Michael

    2016-01-01

    Tuberculosis and nontuberculous mycobacterial infections constitute a high burden of pulmonary disease in humans, resulting in over 1.5 million deaths per year. Building on the premise that genetic factors influence the instance, progression, and defense of infectious disease, we undertook a systems biology approach to investigate relationships among genetic factors that may play a role in increased susceptibility or control of mycobacterial infections. We combined literature and database mining with network analysis and pathway enrichment analysis to examine genes, pathways, and networks, involved in the human response to Mycobacterium tuberculosis and nontuberculous mycobacterial infections. This approach allowed us to examine functional relationships among reported genes, and to identify novel genes and enriched pathways that may play a role in mycobacterial susceptibility or control. Our findings suggest that the primary pathways and genes influencing mycobacterial infection control involve an interplay between innate and adaptive immune proteins and pathways. Signaling pathways involved in autoimmune disease were significantly enriched as revealed in our networks. Mycobacterial disease susceptibility networks were also examined within the context of gene-chemical relationships, in order to identify putative drugs and nutrients with potential beneficial immunomodulatory or anti-mycobacterial effects. PMID:26751573

  3. Construction of functional linkage gene networks by data integration.

    Science.gov (United States)

    Linghu, Bolan; Franzosa, Eric A; Xia, Yu

    2013-01-01

    Networks of functional associations between genes have recently been successfully used for gene function and disease-related research. A typical approach for constructing such functional linkage gene networks (FLNs) is based on the integration of diverse high-throughput functional genomics datasets. Data integration is a nontrivial task due to the heterogeneous nature of the different data sources and their variable accuracy and completeness. The presence of correlations between data sources also adds another layer of complexity to the integration process. In this chapter we discuss an approach for constructing a human FLN from data integration and a subsequent application of the FLN to novel disease gene discovery. Similar approaches can be applied to nonhuman species and other discovery tasks.

  4. ICan: an integrated co-alteration network to identify ovarian cancer-related genes.

    Science.gov (United States)

    Zhou, Yuanshuai; Liu, Yongjing; Li, Kening; Zhang, Rui; Qiu, Fujun; Zhao, Ning; Xu, Yan

    2015-01-01

    Over the last decade, an increasing number of integrative studies on cancer-related genes have been published. Integrative analyses aim to overcome the limitation of a single data type, and provide a more complete view of carcinogenesis. The vast majority of these studies used sample-matched data of gene expression and copy number to investigate the impact of copy number alteration on gene expression, and to predict and prioritize candidate oncogenes and tumor suppressor genes. However, correlations between genes were neglected in these studies. Our work aimed to evaluate the co-alteration of copy number, methylation and expression, allowing us to identify cancer-related genes and essential functional modules in cancer. We built the Integrated Co-alteration network (ICan) based on multi-omics data, and analyzed the network to uncover cancer-related genes. After comparison with random networks, we identified 155 ovarian cancer-related genes, including well-known (TP53, BRCA1, RB1 and PTEN) and also novel cancer-related genes, such as PDPN and EphA2. We compared the results with a conventional method: CNAmet, and obtained a significantly better area under the curve value (ICan: 0.8179, CNAmet: 0.5183). In this paper, we describe a framework to find cancer-related genes based on an Integrated Co-alteration network. Our results proved that ICan could precisely identify candidate cancer genes and provide increased mechanistic understanding of carcinogenesis. This work suggested a new research direction for biological network analyses involving multi-omics data.

  5. An artificial neural network approach to reconstruct the source term of a nuclear accident

    International Nuclear Information System (INIS)

    Giles, J.; Palma, C. R.; Weller, P.

    1997-01-01

    This work makes use of one of the main features of artificial neural networks, which is their ability to 'learn' from sets of known input and output data. Indeed, a trained artificial neural network can be used to make predictions on the input data when the output is known, and this feedback process enables one to reconstruct the source term from field observations. With this aim, artificial neural networks have been trained using the projections of a segmented plume atmospheric dispersion model at fixed points, simulating a set of gamma detectors located outside the perimeter of a nuclear facility. The resulting set of artificial neural networks was used to determine the release fraction and rate for each of the noble gases, iodines and particulate fission products that could originate from a nuclear accident. Model projections were made using a large data set consisting of effective release height, release fraction of noble gases, iodines and particulate fission products, atmospheric stability, wind speed and wind direction. The model computed nuclide-specific gamma dose rates. The locations of the detectors were chosen taking into account both building shine and wake effects, and varied in distance between 800 and 1200 m from the reactor. The inputs to the artificial neural networks consisted of the measurements from the detector array, atmospheric stability, wind speed and wind direction; the outputs comprised a set of release fractions and heights. Once trained, the artificial neural networks were used to reconstruct the source term from the detector responses for data sets not used in training. The preliminary results are encouraging and show that the noble gases and particulate fission product release fractions are well determined.

  6. Influence of the experimental design of gene expression studies on the inference of gene regulatory networks: environmental factors

    Directory of Open Access Journals (Sweden)

    Frank Emmert-Streib

    2013-02-01

    Full Text Available The inference of gene regulatory networks has gained considerable interest in the biology and biomedical community in recent years. The purpose of this paper is to investigate the influence that environmental conditions can have on the performance of network inference algorithms. Specifically, we study five network inference methods, Aracne, BC3NET, CLR, C3NET and MRNET, and compare the results for three different conditions: (I) observational gene expression data: normal environmental condition, (II) interventional gene expression data: growth in rich media, (III) interventional gene expression data: normal environmental condition interrupted by a positive spike-in stimulation. Overall, we find that different statistical inference methods lead to comparable, but condition-specific results. Further, our results suggest that non-steady-state data enhance the inferability of regulatory networks.

  7. Artificial neural network inference (ANNI): a study on gene-gene interaction for biomarkers in childhood sarcomas.

    Directory of Open Access Journals (Sweden)

    Dong Ling Tong

    Full Text Available OBJECTIVE: To model the potential interaction between previously identified biomarkers in childhood sarcomas using artificial neural network inference (ANNI). METHOD: To concisely demonstrate the biological interactions between correlated genes in an interaction network map, only 2 types of sarcomas in the childhood small round blue cell tumors (SRBCTs) dataset are discussed in this paper. A backpropagation neural network was used to model the potential interaction between genes. The prediction weights and signal directions were used to model the strengths of the interaction signals and the direction of the interaction link between genes. The ANN model was validated using Monte Carlo cross-validation to minimize the risk of over-fitting and to optimize generalization ability of the model. RESULTS: Strong connection links on certain genes (TNNT1 and FNDC5 in rhabdomyosarcoma (RMS); FCGRT and OLFM1 in Ewing's sarcoma (EWS)) suggested their potency as central hubs in the interconnection of genes with different functionalities. The results showed that the RMS patients in this dataset are likely to be congenital and at low risk of cardiomyopathy development. The EWS patients are likely to be complicated by EWS-FLI fusion and deficiency in various signaling pathways, including Wnt, Fas/Rho and intracellular oxygen. CONCLUSIONS: The ANN network inference approach and the examination of identified genes in the published literature within the context of the disease highlight the substantial influence of certain genes in sarcomas.

  8. Trimming of mammalian transcriptional networks using network component analysis

    Directory of Open Access Journals (Sweden)

    Liao James C

    2010-10-01

    Full Text Available Abstract Background Network Component Analysis (NCA) has been used to deduce the activities of transcription factors (TFs) from gene expression data and the TF-gene binding relationship. However, the TF-gene interaction varies in different environmental conditions and tissues, but such information is rarely available and cannot be predicted simply by motif analysis. Thus, it is beneficial to identify key TF-gene interactions under the experimental condition based on transcriptome data. Such information would be useful in identifying key regulatory pathways and gene markers of TFs in further studies. Results We developed an algorithm to trim network connectivity such that the important regulatory interactions between the TFs and the genes were retained and the regulatory signals were deduced. Theoretical studies demonstrated that the regulatory signals were accurately reconstructed even in the case where only three independent transcriptome datasets were available. At least 80% of the main target genes were correctly predicted in the extreme condition of high noise level and small number of datasets. Our algorithm was tested with transcriptome data taken from mice under rapamycin treatment. The initial network topology from the literature contains 70 TFs, 778 genes, and 1423 edges between the TFs and genes. Our method retained 1074 edges (i.e. 75% of the original edge number) and identified 17 TFs as being significantly perturbed under the experimental condition. Twelve of these TFs are involved in MAPK signaling or myeloid leukemia pathways defined in the KEGG database, or are known to physically interact with each other. Additionally, four of these TFs, which are Hif1a, Cebpb, Nfkb1, and Atf1, are known targets of rapamycin. Furthermore, the trimmed network was able to predict Eno1 as an important target of Hif1a; this key interaction could not be detected without trimming the regulatory network. Conclusions The advantage of our new algorithm

  9. The gene regulatory network for breast cancer: Integrated regulatory landscape of cancer hallmarks

    Directory of Open Access Journals (Sweden)

    Frank Emmert-Streib

    2014-02-01

    Full Text Available In this study, we infer the breast cancer gene regulatory network from gene expression data. This network is obtained from the application of the BC3Net inference algorithm to a large-scale gene expression data set consisting of 351 patient samples. In order to elucidate the functional relevance of the inferred network, we perform a Gene Ontology (GO) analysis for its structural components. Our analysis reveals that the most significant GO-terms we find for the breast cancer network represent functional modules of biological processes that are described by known cancer hallmarks, including translation, immune response, cell cycle, organelle fission, mitosis, cell adhesion, RNA processing, RNA splicing and response to wounding. Furthermore, by using a curated list of census cancer genes, we find an enrichment in these functional modules. Finally, we study cooperative effects of chromosomes based on information of interacting genes in the breast cancer network. We find that chromosome 21 is most coactive with other chromosomes. To our knowledge this is the first study investigating the genome-scale breast cancer network.

  10. Annotating gene sets by mining large literature collections with protein networks.

    Science.gov (United States)

    Wang, Sheng; Ma, Jianzhu; Yu, Michael Ku; Zheng, Fan; Huang, Edward W; Han, Jiawei; Peng, Jian; Ideker, Trey

    2018-01-01

    Analysis of patient genomes and transcriptomes routinely recognizes new gene sets associated with human disease. Here we present an integrative natural language processing system which infers common functions for a gene set through automatic mining of the scientific literature with biological networks. This system links genes with associated literature phrases and combines these links with protein interactions in a single heterogeneous network. Multiscale functional annotations are inferred based on network distances between phrases and genes and then visualized as an ontology of biological concepts. To evaluate this system, we predict functions for gene sets representing known pathways and find that our approach achieves substantial improvement over the conventional text-mining baseline method. Moreover, our system discovers novel annotations for gene sets or pathways without previously known functions. Two case studies demonstrate how the system is used in discovery of new cancer-related pathways with ontological annotations.

  11. Inferring the conservative causal core of gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Emmert-Streib Frank

    2010-09-01

    Full Text Available Abstract Background Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. Results In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. Conclusions For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.

  12. Inferring the conservative causal core of gene regulatory networks.

    Science.gov (United States)

    Altay, Gökmen; Emmert-Streib, Frank

    2010-09-28

    Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.
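
    The abstract describes C3NET only as mutual information estimates combined with a maximization step. The following is a minimal sketch of that idea, assuming the maximization step keeps, for each gene, only the edge to its highest-scoring partner among the values above a significance threshold. The function name `max_mi_core`, the threshold and the toy matrix are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, List, Set


def max_mi_core(mi: List[List[float]], threshold: float) -> Set[FrozenSet[int]]:
    """Keep, for every gene, only the edge to its highest-MI partner.

    `mi` is a symmetric matrix of mutual information estimates. Values at or
    below `threshold` are treated as non-significant (an assumption of this
    sketch, standing in for a proper significance test).
    """
    n = len(mi)
    edges: Set[FrozenSet[int]] = set()
    for i in range(n):
        best_j, best_val = None, threshold
        for j in range(n):
            if j != i and mi[i][j] > best_val:
                best_j, best_val = j, mi[i][j]
        if best_j is not None:
            # Undirected edge; a frozenset makes (i, j) and (j, i) identical.
            edges.add(frozenset((i, best_j)))
    return edges


# Toy symmetric MI matrix for four hypothetical genes (indices 0..3).
toy_mi = [
    [0.0, 0.9, 0.2, 0.1],
    [0.9, 0.0, 0.6, 0.1],
    [0.2, 0.6, 0.0, 0.05],
    [0.1, 0.1, 0.05, 0.0],
]
core = max_mi_core(toy_mi, threshold=0.3)
print(sorted(sorted(e) for e in core))
```

    Because each gene contributes at most one edge, the result has at most as many edges as genes, which fits the "conservative core" in the title.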

  13. Inference of gene regulatory networks with sparse structural equation models exploiting genetic perturbations.

    Directory of Open Access Journals (Sweden)

    Xiaodong Cai

    Full Text Available Integrating genetic perturbations with gene expression data not only improves accuracy of regulatory network topology inference, but also enables learning of causal regulatory relations between genes. Although a number of methods have been developed to integrate both types of data, efficient and powerful algorithms are still needed. In this paper, sparse structural equation models (SEMs) are employed to integrate both gene expression data and cis-expression quantitative trait loci (cis-eQTL), for modeling gene regulatory networks in accordance with biological evidence about genes regulating or being regulated by a small number of genes. A systematic inference method named sparsity-aware maximum likelihood (SML) is developed for SEM estimation. Using simulated directed acyclic or cyclic networks, the SML performance is compared with that of two state-of-the-art algorithms: the adaptive Lasso (AL) based scheme, and the QTL-directed dependency graph (QDG) method. Computer simulations demonstrate that the novel SML algorithm offers significantly better performance than the AL-based and QDG algorithms across all sample sizes from 100 to 1,000, in terms of detection power and false discovery rate, in all the cases tested that include acyclic or cyclic networks of 10, 30 and 300 genes. The SML method is further applied to infer a network of 39 human genes that are related to the immune function and are chosen to have a reliable eQTL per gene. The resulting network consists of 9 genes and 13 edges. Most of the edges represent interactions reasonably expected from experimental evidence, while the remaining may just indicate the emergence of new interactions. The sparse SEM and efficient SML algorithm provide an effective means of exploiting both gene expression and perturbation data to infer gene regulatory networks. An open-source computer program implementing the SML algorithm is freely available upon request.

  14. Modeling stochasticity and robustness in gene regulatory networks.

    Science.gov (United States)

    Garg, Abhishek; Mohanram, Kartik; Di Cara, Alessandro; De Micheli, Giovanni; Xenarios, Ioannis

    2009-06-15

    Understanding gene regulation in biological processes and modeling the robustness of underlying regulatory networks is an important problem that is currently being addressed by computational systems biologists. Lately, there has been a renewed interest in Boolean modeling techniques for gene regulatory networks (GRNs). However, due to their deterministic nature, it is often difficult to identify whether these modeling approaches are robust to the addition of stochastic noise that is widespread in gene regulatory processes. Stochasticity in Boolean models of GRNs has been addressed relatively sparingly in the past, mainly by flipping the expression of genes between different expression levels with a predefined probability. This stochasticity in nodes (SIN) model leads to over representation of noise in GRNs and hence non-correspondence with biological observations. In this article, we introduce the stochasticity in functions (SIF) model for simulating stochasticity in Boolean models of GRNs. By providing biological motivation behind the use of the SIF model and applying it to the T-helper and T-cell activation networks, we show that the SIF model provides more biologically robust results than the existing SIN model of stochasticity in GRNs. Algorithms are made available under our Boolean modeling toolbox, GenYsis. The software binaries can be downloaded from http://si2.epfl.ch/~garg/genysis.html.
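
    The SIN model mentioned above, in which each node's expression is flipped with a predefined probability, can be illustrated with a short sketch. It assumes a synchronous Boolean update followed by independent per-node flips. The two-gene mutual-repression rules and all names are hypothetical, and the sketch does not attempt the SIF model, whose details the abstract does not give.

```python
import random
from typing import Callable, Dict

State = Dict[str, bool]
Rule = Callable[[State], bool]


def sin_step(state: State, rules: Dict[str, Rule], flip_prob: float,
             rng: random.Random) -> State:
    """One synchronous Boolean update followed by SIN-style node noise."""
    # Deterministic part: every gene is updated from the current state.
    updated = {gene: rule(state) for gene, rule in rules.items()}
    # Stochastic part: each node is flipped independently with flip_prob.
    return {gene: (not value) if rng.random() < flip_prob else value
            for gene, value in updated.items()}


# Hypothetical toy network: genes "a" and "b" repress each other.
toy_rules: Dict[str, Rule] = {
    "a": lambda s: not s["b"],
    "b": lambda s: not s["a"],
}
start: State = {"a": True, "b": False}

rng = random.Random(0)  # fixed seed so runs are repeatable
noiseless = sin_step(start, toy_rules, flip_prob=0.0, rng=rng)
all_flipped = sin_step(start, toy_rules, flip_prob=1.0, rng=rng)
print(noiseless, all_flipped)
```

    With `flip_prob=0.0` the step reduces to the deterministic Boolean model; intermediate values perturb every node regardless of its update rule, which is the property the abstract criticises as over-representing noise.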

  15. Disease candidate gene identification and prioritization using protein interaction networks

    Directory of Open Access Journals (Sweden)

    Aronow Bruce J

    2009-02-01

    Full Text Available Abstract Background Although most of the current disease candidate gene identification and prioritization methods depend on functional annotations, the coverage of the gene functional annotations is a limiting factor. In the current study, we describe a candidate gene prioritization method that is entirely based on protein-protein interaction network (PPIN) analyses. Results For the first time, extended versions of the PageRank and HITS algorithms, and the K-Step Markov method are applied to prioritize disease candidate genes in a training-test schema. Using a list of known disease-related genes from our earlier study as a training set ("seeds"), and the rest of the known genes as a test list, we perform large-scale cross validation to rank the candidate genes and also evaluate and compare the performance of our approach. Under appropriate settings – for example, a back probability of 0.3 for PageRank with Priors and HITS with Priors, and step size 6 for K-Step Markov method – the three methods achieved a comparable AUC value, suggesting a similar performance. Conclusion Even though network-based methods are generally not as effective as integrated functional annotation-based methods for disease candidate gene prioritization, in a one-to-one comparison, PPIN-based candidate gene prioritization performs better than all other gene features or annotations. Additionally, we demonstrate that methods used for studying both social and Web networks can be successfully used for disease candidate gene prioritization.
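
    As an illustration of seed-based ranking with a back probability, here is a minimal sketch that assumes "PageRank with Priors" can be read as a random walk that, at each step, jumps back to the seed genes with probability 0.3 and otherwise follows a random edge. The power-iteration routine, the function name and the four-node path graph are hypothetical; the authors' extended algorithms may differ.

```python
from typing import Dict, List, Set


def pagerank_with_priors(adj: Dict[str, List[str]], seeds: Set[str],
                         back_prob: float = 0.3,
                         iterations: int = 100) -> Dict[str, float]:
    """Rank nodes by a random walk that restarts at the seed set."""
    nodes = sorted(adj)
    # Prior: uniform over the seeds, zero elsewhere.
    prior = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    rank = dict(prior)
    for _ in range(iterations):
        new_rank = {v: back_prob * prior[v] for v in nodes}
        for u in nodes:
            neighbours = adj[u]
            if not neighbours:
                # Dangling node: hand its mass back to the prior.
                for v in nodes:
                    new_rank[v] += (1.0 - back_prob) * rank[u] * prior[v]
                continue
            share = (1.0 - back_prob) * rank[u] / len(neighbours)
            for v in neighbours:
                new_rank[v] += share
        rank = new_rank
    return rank


# Hypothetical interaction network: a path A - B - C - D, with A as the seed.
toy_ppi = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = pagerank_with_priors(toy_ppi, seeds={"A"}, back_prob=0.3)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

    In this toy graph the scores fall off with distance from the seed, which is the behaviour a candidate-gene ranking of this kind relies on.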

  16. Semi-supervised prediction of gene regulatory networks using ...

    Indian Academy of Sciences (India)

    2015-09-28

    Sep 28, 2015 ... Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging ... two types of methods differ primarily based on whether ..... negligible, allowing us to draw the qualitative conclusions .... research will be conducted to develop additional biologically.

  17. Gene regulatory and signaling networks exhibit distinct topological distributions of motifs

    Science.gov (United States)

    Ferreira, Gustavo Rodrigues; Nakaya, Helder Imoto; Costa, Luciano da Fontoura

    2018-04-01

    The biological processes of cellular decision making and differentiation involve a plethora of signaling pathways and gene regulatory circuits. These networks in turn exhibit a multitude of motifs playing crucial parts in regulating network activity. Here we compare the topological placement of motifs in gene regulatory and signaling networks and observe that it suggests different evolutionary strategies in motif distribution for distinct cellular subnetworks.

  18. Network Completion for Static Gene Expression Data

    Directory of Open Access Journals (Sweden)

    Natsu Nakajima

    2014-01-01

    Full Text Available We tackle the problem of completing and inferring genetic networks under stationary conditions from static data, where network completion is to make the minimum amount of modifications to an initial network so that the completed network is most consistent with the expression data in which addition of edges and deletion of edges are basic modification operations. For this problem, we present a new method for network completion using dynamic programming and least-squares fitting. This method can find an optimal solution in polynomial time if the maximum indegree of the network is bounded by a constant. We evaluate the effectiveness of our method through computational experiments using synthetic data. Furthermore, we demonstrate that our proposed method can distinguish the differences between two types of genetic networks under stationary conditions from lung cancer and normal gene expression data.

  19. Evolution of Cis-Regulatory Elements and Regulatory Networks in Duplicated Genes of Arabidopsis.

    Science.gov (United States)

    Arsovski, Andrej A; Pradinuk, Julian; Guo, Xu Qiu; Wang, Sishuo; Adams, Keith L

    2015-12-01

    Plant genomes contain large numbers of duplicated genes that contribute to the evolution of new functions. Following duplication, genes can exhibit divergence in their coding sequence and their expression patterns. Changes in the cis-regulatory element landscape can result in changes in gene expression patterns. High-throughput methods developed recently can identify potential cis-regulatory elements on a genome-wide scale. Here, we use a recent comprehensive data set of DNase I sequencing-identified cis-regulatory binding sites (footprints) at single-base-pair resolution to compare binding sites and network connectivity in duplicated gene pairs in Arabidopsis (Arabidopsis thaliana). We found that duplicated gene pairs vary greatly in their cis-regulatory element architecture, resulting in changes in regulatory network connectivity. Whole-genome duplicates (WGDs) have approximately twice as many footprints in their promoters left by potential regulatory proteins than do tandem duplicates (TDs). The WGDs have a greater average number of footprint differences between paralogs than TDs. The footprints, in turn, result in more regulatory network connections between WGDs and other genes, forming denser, more complex regulatory networks than shown by TDs. When comparing regulatory connections between duplicates, WGDs had more pairs in which the two genes are either partially or fully diverged in their network connections, but fewer genes with no network connections than the TDs. There is evidence of younger TDs and WGDs having fewer unique connections compared with older duplicates. This study provides insights into cis-regulatory element evolution and network divergence in duplicated genes. © 2015 American Society of Plant Biologists. All Rights Reserved.

  20. A gene regulatory network armature for T-lymphocyte specification

    Energy Technology Data Exchange (ETDEWEB)

    Fung, Elizabeth-sharon [Los Alamos National Laboratory

    2008-01-01

    Choice of a T-lymphoid fate by hematopoietic progenitor cells depends on sustained Notch-Delta signaling combined with tightly-regulated activities of multiple transcription factors. To dissect the regulatory network connections that mediate this process, we have used high-resolution analysis of regulatory gene expression trajectories from the beginning to the end of specification; tests of the short-term Notch-dependence of these gene expression changes; and perturbation analyses of the effects of overexpression of two essential transcription factors, namely PU.1 and GATA-3. Quantitative expression measurements of >50 transcription factor and marker genes have been used to derive the principal components of regulatory change through which T-cell precursors progress from primitive multipotency to T-lineage commitment. Distinct parts of the path reveal separate contributions of Notch signaling, GATA-3 activity, and downregulation of PU.1. Using BioTapestry, the results have been assembled into a draft gene regulatory network for the specification of T-cell precursors and the choice of T as opposed to myeloid dendritic or mast-cell fates. This network also accommodates effects of E proteins and mutual repression circuits of Gfi1 against Egr-2 and of TCF-1 against PU.1 as proposed elsewhere, but requires additional functions that remain unidentified. Distinctive features of this network structure include the intense dose-dependence of GATA-3 effects; the gene-specific modulation of PU.1 activity based on Notch activity; the lack of direct opposition between PU.1 and GATA-3; and the need for a distinct, late-acting repressive function or functions to extinguish stem and progenitor-derived regulatory gene expression.

  1. Trends and barriers to lateral gene transfer in prokaryotes.

    Science.gov (United States)

    Popa, Ovidiu; Dagan, Tal

    2011-10-01

    Gene acquisition by lateral gene transfer (LGT) is an important mechanism for natural variation among prokaryotes. Laboratory experiments show that protein-coding genes can be laterally transferred extremely fast among microbial cells, inherited by most of their descendants, and adapt to a new regulatory regime within a short time. Recent advances in the phylogenetic analysis of microbial genomes using network approaches reveal a substantial impact of LGT during microbial genome evolution. Phylogenomic networks of LGT among prokaryotes reconstructed from completely sequenced genomes uncover barriers to LGT at multiple levels. Here we discuss the kinds of barriers to gene acquisition in nature including physical barriers for gene transfer between cells, genomic barriers for the integration of acquired DNA, and functional barriers for the acquisition of new genes. Copyright © 2011 Elsevier Ltd. All rights reserved.

  2. Reconstruction of sparse connectivity in neural networks from spike train covariances

    International Nuclear Information System (INIS)

    Pernice, Volker; Rotter, Stefan

    2013-01-01

    The inference of causation from correlation is in general highly problematic. Correspondingly, it is difficult to infer the existence of physical synaptic connections between neurons from correlations in their activity. Covariances in neural spike trains and their relation to network structure have been the subject of intense research, both experimentally and theoretically. The influence of recurrent connections on covariances can be characterized directly in linear models, where connectivity in the network is described by a matrix of linear coupling kernels. However, as indirect connections also give rise to covariances, the inverse problem of inferring network structure from covariances can generally not be solved unambiguously. Here we study to what degree this ambiguity can be resolved if the sparseness of neural networks is taken into account. To reconstruct a sparse network, we determine the minimal set of linear couplings consistent with the measured covariances by minimizing the L1 norm of the coupling matrix under appropriate constraints. Contrary to intuition, after stochastic optimization of the coupling matrix, the resulting estimate of the underlying network is directed, despite the fact that a symmetric matrix of count covariances is used for inference. The performance of the new method is best if connections are neither exceedingly sparse, nor too dense, and it is easily applicable for networks of a few hundred nodes. Full coupling kernels can be obtained from the matrix of full covariance functions. We apply our method to networks of leaky integrate-and-fire neurons in an asynchronous–irregular state, where spike train covariances are well described by a linear model. (paper)

  3. On the Interplay between Entropy and Robustness of Gene Regulatory Networks

    Directory of Open Access Journals (Sweden)

    Bor-Sen Chen

    2010-05-01

    Full Text Available The interplay between entropy and robustness of gene networks is a core mechanism of systems biology. The entropy is a measure of randomness or disorder of a physical system due to random parameter fluctuation and environmental noises in gene regulatory networks. The robustness of a gene regulatory network, which can be measured as the ability to tolerate the random parameter fluctuation and to attenuate the effect of environmental noise, will be discussed from the robust H∞ stabilization and filtering perspective. In this review, we will also discuss their balancing roles in evolution and potential applications in systems and synthetic biology.

  4. A novel mutual information-based Boolean network inference method from time-series gene expression data.

    Directory of Open Access Journals (Sweden)

    Shohag Barman

    Full Text Available Inferring a gene regulatory network from time-series gene expression data in systems biology is a challenging problem. Many methods have been suggested, most of which have a scalability limitation due to the combinatorial cost of searching a regulatory set of genes. In addition, they have focused on the accurate inference of a network structure only. Therefore, there is a pressing need to develop a network inference method to search regulatory genes efficiently and to predict the network dynamics accurately. In this study, we employed a Boolean network model with a restricted update rule scheme to capture coarse-grained dynamics, and propose a novel mutual information-based Boolean network inference (MIBNI) method. Given time-series gene expression data as an input, the method first identifies a set of initial regulatory genes using mutual information-based feature selection, and then improves the dynamics prediction accuracy by iteratively swapping a pair of genes between sets of the selected regulatory genes and the other genes. Through extensive simulations with artificial datasets, MIBNI showed consistently better performance than six well-known existing methods, REVEAL, Best-Fit, RelNet, CST, CLR, and BIBN in terms of both structural and dynamics prediction accuracy. We further tested the proposed method with two real gene expression datasets for an Escherichia coli gene regulatory network and a fission yeast cell cycle network, and also observed better results using MIBNI compared to the six other methods. Taken together, MIBNI is a promising tool for predicting both the structure and the dynamics of a gene regulatory network.
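
    The first stage described above, mutual information-based selection of candidate regulators from time-series data, can be sketched as follows. The sketch assumes binarised expression values and scores each candidate by the mutual information between its value at time t and the target's value at time t+1. The function names and toy series are hypothetical, and the iterative swapping stage is not shown.

```python
from collections import Counter
from math import log2
from typing import Dict, List, Sequence


def mutual_information(xs: Sequence[int], ys: Sequence[int]) -> float:
    """Empirical mutual information (in bits) between two discrete series."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())


def rank_regulators(series: Dict[str, List[int]], target: str) -> List[str]:
    """Order candidate regulators by MI between gene(t) and target(t+1)."""
    future = series[target][1:]
    scores = {gene: mutual_information(values[:-1], future)
              for gene, values in series.items() if gene != target}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical binarised time series; "target" copies "g1" with a one-step lag.
toy_series = {
    "g1": [0, 1, 1, 0, 1, 0, 0, 1],
    "g2": [1, 1, 0, 0, 1, 1, 0, 0],
    "target": [0, 0, 1, 1, 0, 1, 0, 0],
}
ranked = rank_regulators(toy_series, "target")
print(ranked)
```

    A full method would select a set of regulators rather than score them one at a time, since Boolean rules such as XOR can have low pairwise mutual information with each individual input.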

  5. Finding novel relationships with integrated gene-gene association network analysis of Synechocystis sp. PCC 6803 using species-independent text-mining.

    Science.gov (United States)

    Kreula, Sanna M; Kaewphan, Suwisa; Ginter, Filip; Jones, Patrik R

    2018-01-01

    The increasing move towards open access full-text scientific literature enhances our ability to utilize advanced text-mining methods to construct information-rich networks that no human will be able to grasp simply from 'reading the literature'. The utility of text-mining for well-studied species is obvious, though the utility for less studied species, or those with no prior track-record at all, is not clear. Here we present a concept for how advanced text-mining can be used to create information-rich networks even for less well studied species and apply it to generate an open-access gene-gene association network resource for Synechocystis sp. PCC 6803, a representative model organism for cyanobacteria and first case-study for the methodology. By merging the text-mining network with networks generated from species-specific experimental data, network integration was used to enhance the accuracy of predicting novel interactions that are biologically relevant. A rule-based algorithm (filter) was constructed in order to automate the search for novel candidate genes with a high degree of likely association to known target genes by (1) ignoring established relationships from the existing literature, as they are already 'known', and (2) demanding multiple independent pieces of evidence for every novel and potentially relevant relationship. Using selected case studies, we demonstrate the utility of the network resource and filter to (i) discover novel candidate associations between different genes or proteins in the network, and (ii) rapidly evaluate the potential role of any one particular gene or protein. The full network is provided as an open-source resource.

  6. An integer optimization algorithm for robust identification of non-linear gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Chemmangattuvalappil Nishanth

    2012-09-01

    Background: Reverse engineering gene networks and identifying regulatory interactions are integral to understanding cellular decision making processes. Advancement in high throughput experimental techniques has initiated innovative data driven analysis of gene regulatory networks. However, inherent noise associated with biological systems requires numerous experimental replicates for reliable conclusions. Furthermore, robust algorithms that directly exploit basic biological traits are few. Such algorithms are expected to be efficient in their performance and robust in their prediction. Results: We have developed a network identification algorithm to accurately infer both the topology and strength of regulatory interactions from time series gene expression data in the presence of significant experimental noise and non-linear behavior. In this novel formalism, we have addressed data variability in biological systems by integrating network identification with the bootstrap resampling technique, hence predicting robust interactions from limited experimental replicates subject to noise. Furthermore, we have incorporated non-linearity in gene dynamics using the S-system formulation. The basic network identification formulation exploits the trait of sparsity of biological interactions. Towards that, the identification algorithm is formulated as an integer-programming problem by introducing binary variables for each network component. The objective function is targeted to minimize the network connections subject to the constraint of maximal agreement between the experimental and predicted gene dynamics. The developed algorithm is validated using both in silico and experimental data-sets. These studies show that the algorithm can accurately predict the topology and connection strength of the in silico networks, as quantified by high precision and recall, and small discrepancy between the actual and predicted kinetic parameters

  7. Gene regulatory networks elucidating huanglongbing disease mechanisms.

    Directory of Open Access Journals (Sweden)

    Federico Martinelli

    Next-generation sequencing was exploited to gain deeper insight into the response to infection by Candidatus Liberibacter asiaticus (CaLas), especially the immune dysregulation and metabolic dysfunction caused by source-sink disruption. Previous fruit transcriptome data were compared with additional RNA-Seq data in three tissues: immature fruit, and young and mature leaves. Four categories of orchard trees were studied: symptomatic, asymptomatic, apparently healthy, and healthy. Principal component analysis found distinct expression patterns between immature and mature fruits and leaf samples for all four categories of trees. A predicted protein-protein interaction network identified HLB-regulated genes for sugar transporters playing key roles in the overall plant responses. Gene set and pathway enrichment analyses highlight the role of sucrose and starch metabolism in disease symptom development in all tissues. HLB-regulated genes (glucose-phosphate-transporter, invertase, starch-related genes) would likely determine the source-sink relationship disruption. In infected leaves, transcriptomic changes were observed for light reactions genes (downregulation), sucrose metabolism (upregulation), and starch biosynthesis (upregulation). In parallel, symptomatic fruits over-expressed genes involved in photosynthesis, sucrose and raffinose metabolism, and downregulated starch biosynthesis. We visualized gene networks between tissues inducing a source-sink shift. CaLas alters the hormone crosstalk, resulting in weak and ineffective tissue-specific plant immune responses necessary for bacterial clearance. Accordingly, expression of WRKYs (including WRKY70) was higher in fruits than in leaves. Systemic acquired responses were inadequately activated in young leaves, generally considered the sites where most new infections occur.

  8. A gene network bioinformatics analysis for pemphigoid autoimmune blistering diseases.

    Science.gov (United States)

    Barone, Antonio; Toti, Paolo; Giuca, Maria Rita; Derchi, Giacomo; Covani, Ugo

    2015-07-01

    In this theoretical study, a text mining search and clustering analysis of data related to genes potentially involved in human pemphigoid autoimmune blistering diseases (PAIBD) was performed using web tools to create a gene/protein interaction network. The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database was employed to identify a final set of PAIBD-involved genes and to calculate the overall significant interactions among genes: for each gene, the weighted number of links, or WNL, was registered and a clustering procedure was performed using the WNL analysis. Genes were ranked in class (leader, B, C, D and so on, up to orphans). An ontological analysis was performed for the set of 'leader' genes. Using the above-mentioned data network, 115 genes represented the final set; leader genes numbered 7 (intercellular adhesion molecule 1 (ICAM-1), interferon gamma (IFNG), interleukin (IL)-2, IL-4, IL-6, IL-8 and tumour necrosis factor (TNF)), class B genes were 13, whereas the orphans were 24. The ontological analysis attested that the molecular action was focused on extracellular space and cell surface, whereas the activation and regulation of the immunity system was widely involved. Despite the limited knowledge of the present pathologic phenomenon, attested by the presence of 24 genes revealing no protein-protein direct or indirect interactions, the network showed significant pathways gathered in several subgroups: cellular components, molecular functions, biological processes and the pathologic phenomenon obtained from the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database. The molecular basis for PAIBD was summarised and expanded, which will perhaps give researchers promising directions for the identification of new therapeutic targets.

  9. FocusHeuristics - expression-data-driven network optimization and disease gene prediction.

    Science.gov (United States)

    Ernst, Mathias; Du, Yang; Warsow, Gregor; Hamed, Mohamed; Endlich, Nicole; Endlich, Karlhans; Murua Escobar, Hugo; Sklarz, Lisa-Madeleine; Sender, Sina; Junghanß, Christian; Möller, Steffen; Fuellen, Georg; Struckmann, Stephan

    2017-02-16

    To identify genes contributing to disease phenotypes remains a challenge for bioinformatics. Static knowledge on biological networks is often combined with the dynamics observed in gene expression levels over disease development, to find markers for diagnostics and therapy, and also putative disease-modulatory drug targets and drugs. The basis of current methods ranges from a focus on expression-levels (Limma) to concentrating on network characteristics (PageRank, HITS/Authority Score), and both (DeMAND, Local Radiality). We present an integrative approach (the FocusHeuristics) that is thoroughly evaluated based on public expression data and molecular disease characteristics provided by DisGeNet. The FocusHeuristics combines three scores, i.e. the log fold change and another two, based on the sum and difference of log fold changes of genes/proteins linked in a network. A gene is kept when one of the scores to which it contributes is above a threshold. Our FocusHeuristics is both, a predictor for gene-disease-association and a bioinformatics method to reduce biological networks to their disease-relevant parts, by highlighting the dynamics observed in expression data. The FocusHeuristics is slightly, but significantly better than other methods by its more successful identification of disease-associated genes measured by AUC, and it delivers mechanistic explanations for its choice of genes.
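
    The three-score filter described in this abstract can be sketched as follows. This is an illustrative reading, not the published FocusHeuristics code: the use of absolute values, the threshold defaults, and all names are assumptions. A gene is kept if its own log fold change exceeds a threshold, or if it sits on a network edge whose summed or differenced log fold changes exceed theirs.

```python
def focus_genes(lfc, edges, t_node=1.0, t_sum=1.5, t_diff=1.5):
    """Return the genes kept by a three-score filter (hypothetical thresholds).

    lfc   : dict mapping gene name -> log fold change
    edges : iterable of (gene_u, gene_v) pairs from an interaction network
    """
    # Score 1: the gene's own log fold change.
    kept = {gene for gene, value in lfc.items() if abs(value) > t_node}
    for u, v in edges:
        # Score 2: sum of log fold changes of the linked genes (joint regulation).
        sum_score = abs(lfc[u] + lfc[v])
        # Score 3: difference of log fold changes (opposing regulation).
        diff_score = abs(lfc[u] - lfc[v])
        if sum_score > t_sum or diff_score > t_diff:
            kept.update((u, v))
    return kept


# Toy data (invented values).
lfc = {"G1": 1.2, "G2": 0.8, "G3": -0.9, "G4": 0.1, "G5": 0.2}
edges = [("G1", "G2"), ("G2", "G3"), ("G4", "G5")]
print(sorted(focus_genes(lfc, edges)))
```

    In this toy case G2 and G3 individually fall below the node threshold but are kept because they are regulated in opposite directions across a shared edge, which is the kind of network-level signal the abstract describes.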

  10. Statistical indicators of collective behavior and functional clusters in gene networks of yeast

    Science.gov (United States)

    Živković, J.; Tadić, B.; Wick, N.; Thurner, S.

    2006-03-01

    We analyze gene expression time-series data of yeast (S. cerevisiae) measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.
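
    The construction of a correlation network and its giant component can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' analysis: edges are drawn where the absolute Pearson correlation meets a threshold, the expression profiles are invented, and the function names are hypothetical. Lowering the threshold adds edges, and the size of the largest connected component grows accordingly.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def giant_component_size(profiles, threshold):
    """Size of the largest connected component when |r| >= threshold defines an edge."""
    parent = {gene: gene for gene in profiles}

    def find(gene):
        # Union-find root lookup with path halving.
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for a, b in combinations(profiles, 2):
        if abs(pearson(profiles[a], profiles[b])) >= threshold:
            parent[find(a)] = find(b)

    sizes = {}
    for gene in profiles:
        root = find(gene)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


# Toy expression time series (invented values).
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8.5],
    "g3": [4, 3, 2, 1],
    "g4": [1, -1, 1, -1],
}
for threshold in (0.9, 0.4):
    print(threshold, giant_component_size(profiles, threshold))
```

    Scanning the threshold over a fine grid and plotting the component size is one simple way to look for the kind of percolation-like transition the abstract mentions.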

  11. Identification and network-enabled characterization of auxin response factor genes in Medicago truncatula

    Directory of Open Access Journals (Sweden)

    David J. Burks

    2016-12-01

    The Auxin Response Factor (ARF) family of transcription factors is an important regulator of environmental response and symbiotic nodulation in the legume Medicago truncatula. While previous studies have identified members of this family, a recent spurt in gene expression data coupled with genome update and reannotation calls for a reassessment of the prevalence of ARF genes and their interaction networks in M. truncatula. We performed a comprehensive analysis of the M. truncatula genome and transcriptome that entailed a search for novel ARF genes and their co-expression networks. Our investigation revealed 8 novel M. truncatula ARF (MtARF) genes, of the total 22 identified, and uncovered novel gene co-expression networks as well. Furthermore, the topological clustering and single enrichment analysis of several network models revealed the roles of individual members of the MtARF family in nitrogen regulation, nodule initiation, and post-embryonic development through a specialized protein packaging and secretory pathway. In summary, this study not only sheds new light on an important gene family, but also provides a guideline for identification of new members of gene families and their functional characterization through network analyses.

  12. Unveiling network-based functional features through integration of gene expression into protein networks.

    Science.gov (United States)

    Jalili, Mahdi; Gebhardt, Tom; Wolkenhauer, Olaf; Salehzadeh-Yazdi, Ali

    2018-06-01

    Decoding health and disease phenotypes is one of the fundamental objectives in biomedicine. Whereas high-throughput omics approaches are available, it is evident that any single omics approach might not be adequate to capture the complexity of phenotypes. Therefore, integrated multi-omics approaches have been used to unravel genotype-phenotype relationships such as global regulatory mechanisms and complex metabolic networks in different eukaryotic organisms. Some of the progress and challenges associated with integrated omics studies have been reviewed previously in comprehensive studies. In this work, we highlight and review the progress, challenges and advantages associated with emerging approaches, integrating gene expression and protein-protein interaction networks to unravel network-based functional features. This includes identifying disease related genes, gene prioritization, clustering protein interactions, developing the modules, extract active subnetworks and static protein complexes or dynamic/temporal protein complexes. We also discuss how these approaches contribute to our understanding of the biology of complex traits and diseases. This article is part of a Special Issue entitled: Cardiac adaptations to obesity, diabetes and insulin resistance, edited by Professors Jan F.C. Glatz, Jason R.B. Dyck and Christine Des Rosiers. Copyright © 2018 Elsevier B.V. All rights reserved.

  13. Digital Signal Processing and Control for the Study of Gene Networks

    Science.gov (United States)

    Shin, Yong-Jun

    2016-04-01

    Thanks to the digital revolution, digital signal processing and control has been widely used in many areas of science and engineering today. It provides practical and powerful tools to model, simulate, analyze, design, measure, and control complex and dynamic systems such as robots and aircrafts. Gene networks are also complex dynamic systems which can be studied via digital signal processing and control. Unlike conventional computational methods, this approach is capable of not only modeling but also controlling gene networks since the experimental environment is mostly digital today. The overall aim of this article is to introduce digital signal processing and control as a useful tool for the study of gene networks.

  14. Continental-Scale Temperature Reconstructions from the PAGES 2k Network

    Science.gov (United States)

    Kaufman, D. S.

    2012-12-01

    We present a major new synthesis of seven regional temperature reconstructions to elucidate the global pattern of variations and their association with climate-forcing mechanisms over the past two millennia. To coordinate the integration of new and existing data of all proxy types, the Past Global Changes (PAGES) project developed the 2k Network. It comprises nine working groups representing eight continental-scale regions and the oceans. The PAGES 2k Consortium, authoring this paper, presently includes 79 representatives from 25 countries. For this synthesis, each of the PAGES 2k working groups identified the proxy climate records for reconstructing past temperature and associated uncertainty using the data and methodologies that they deemed most appropriate for their region. The datasets are from 973 sites where tree rings, pollen, corals, lake and marine sediment, glacier ice, speleothems, and historical documents record changes in biologically and physically mediated processes that are sensitive to temperature change, among other climatic factors. The proxy records used for this synthesis are available through the NOAA World Data Center for Paleoclimatology. On long time scales, the temperature reconstructions display similarities among regions, and a large part of this common behavior can be explained by known climate forcings. Reconstructed temperatures in all regions show an overall long-term cooling trend until around 1900 C.E., followed by strong warming during the 20th century. On the multi-decadal time scale, we assessed the variability among the temperature reconstructions using principal component (PC) analysis of the standardized decadal mean temperatures over the period of overlap among the reconstructions (1200 to 1980 C.E.). PC1 explains 35% of the total variability and is strongly correlated with temperature reconstructions from the four Northern Hemisphere regions, and with the sum of external forcings including solar, volcanic, and greenhouse

  15. Reconstruction of road defects and road roughness classification using vehicle responses with artificial neural networks simulation

    CSIR Research Space (South Africa)

    Ngwangwa, HM

    2010-04-01

    Journal of Terramechanics, Volume 47, Issue 2, April 2010, Pages 97-111. Reconstruction of road defects and road roughness classification using vehicle responses with artificial neural networks simulation. H.M. Ngwangwa, P.S. Heyns, F...

  16. Context-specific metabolic networks are consistent with experiments.

    Directory of Open Access Journals (Sweden)

    Scott A Becker

    2008-05-01

    Reconstructions of cellular metabolism are publicly available for a variety of different microorganisms and some mammalian genomes. To date, these reconstructions are "genome-scale" and strive to include all reactions implied by the genome annotation, as well as those with direct experimental evidence. Clearly, many of the reactions in a genome-scale reconstruction will not be active under particular conditions or in a particular cell type. Methods to tailor these comprehensive genome-scale reconstructions into context-specific networks will aid predictive in silico modeling for a particular situation. We present a method called Gene Inactivity Moderated by Metabolism and Expression (GIMME) to achieve this goal. The GIMME algorithm uses quantitative gene expression data and one or more presupposed metabolic objectives to produce the context-specific reconstruction that is most consistent with the available data. Furthermore, the algorithm provides a quantitative inconsistency score indicating how consistent a set of gene expression data is with a particular metabolic objective. We show that this algorithm produces results consistent with biological experiments and intuition for adaptive evolution of bacteria, rational design of metabolic engineering strains, and human skeletal muscle cells. This work represents progress towards producing constraint-based models of metabolism that are specific to the conditions where the expression profiling data is available.

  17. Gene co-expression networks shed light into diseases of brain iron accumulation.

    Science.gov (United States)

    Bettencourt, Conceição; Forabosco, Paola; Wiethoff, Sarah; Heidari, Moones; Johnstone, Daniel M; Botía, Juan A; Collingwood, Joanna F; Hardy, John; Milward, Elizabeth A; Ryten, Mina; Houlden, Henry

    2016-03-01

    Aberrant brain iron deposition is observed in both common and rare neurodegenerative disorders, including those categorized as Neurodegeneration with Brain Iron Accumulation (NBIA), which are characterized by focal iron accumulation in the basal ganglia. Two NBIA genes are directly involved in iron metabolism, but whether other NBIA-related genes also regulate iron homeostasis in the human brain, and whether aberrant iron deposition contributes to neurodegenerative processes remains largely unknown. This study aims to expand our understanding of these iron overload diseases and identify relationships between known NBIA genes and their main interacting partners by using a systems biology approach. We used whole-transcriptome gene expression data from human brain samples originating from 101 neuropathologically normal individuals (10 brain regions) to generate weighted gene co-expression networks and cluster the 10 known NBIA genes in an unsupervised manner. We investigated NBIA-enriched networks for relevant cell types and pathways, and whether they are disrupted by iron loading in NBIA diseased tissue and in an in vivo mouse model. We identified two basal ganglia gene co-expression modules significantly enriched for NBIA genes, which resemble neuronal and oligodendrocytic signatures. These NBIA gene networks are enriched for iron-related genes, and implicate synapse and lipid metabolism related pathways. Our data also indicates that these networks are disrupted by excessive brain iron loading. We identified multiple cell types in the origin of NBIA disorders. We also found unforeseen links between NBIA networks and iron-related processes, and demonstrate convergent pathways connecting NBIAs and phenotypically overlapping diseases. Our results are of further relevance for these diseases by providing candidates for new causative genes and possible points for therapeutic intervention. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  18. Cell cycle gene expression networks discovered using systems biology: Significance in carcinogenesis

    Science.gov (United States)

    Scott, RE; Ghule, PN; Stein, JL; Stein, GS

    2015-01-01

    The early stages of carcinogenesis are linked to defects in the cell cycle. A series of cell cycle checkpoints are involved in this process. The G1/S checkpoint that serves to integrate the control of cell proliferation and differentiation is linked to carcinogenesis and the mitotic spindle checkpoint with the development of chromosomal instability. This paper presents the outcome of systems biology studies designed to evaluate if networks of covariate cell cycle gene transcripts exist in proliferative mammalian tissues including mice, rats and humans. The GeneNetwork website that contains numerous gene expression datasets from different species, sexes and tissues represents the foundational resource for these studies (www.genenetwork.org). In addition, WebGestalt, a gene ontology tool, facilitated the identification of expression networks of genes that co-vary with key cell cycle targets, especially Cdc20 and Plk1 (www.bioinfo.vanderbilt.edu/webgestalt). Cell cycle expression networks of such covariate mRNAs exist in multiple proliferative tissues including liver, lung, pituitary, adipose and lymphoid tissues among others but not in brain or retina that have low proliferative potential. Sixty-three covariate cell cycle gene transcripts (mRNAs) compose the average cell cycle network with p = e−13 to e−36. Cell cycle expression networks show species, sex and tissue variability and they are enriched in mRNA transcripts associated with mitosis many of which are associated with chromosomal instability. PMID:25808367

  19. A reliability index for assessment of crack profile reconstructed from ECT signals using a neural-network approach

    International Nuclear Information System (INIS)

    Yusa, Noritaka; Chen, Zhenmao; Miya, Kenzo; Cheng, Weiying

    2002-01-01

    This paper proposes a reliability parameter to enhance an inversion scheme developed by the authors. The scheme is based upon an artificial neural network that simulates mapping between eddy current signals and crack profiles. One of the biggest advantages of the scheme is that it can deal with conductive cracks, which is necessary to reconstruct natural cracks. However, it has one significant disadvantage: the reliability of reconstructed profiles was unknown. The parameter provides an index for assessment of the crack profile and overcomes this disadvantage. After the parameter is validated by reconstruction of simulated cracks, it is applied to reconstruction of natural cracks that occurred in steam generator tubes of a pressurized water reactor. It is revealed that the parameter is applicable to not only simulated cracks but also natural ones. (author)

  20. In Vitro Reconstruction of Neuronal Networks Derived from Human iPS Cells Using Microfabricated Devices.

    Directory of Open Access Journals (Sweden)

    Yuzo Takayama

    Morphology and function of the nervous system is maintained via well-coordinated processes both in central and peripheral nervous tissues, which govern the homeostasis of organs/tissues. Impairments of the nervous system induce neuronal disorders such as peripheral neuropathy or cardiac arrhythmia. Although further investigation is warranted to reveal the molecular mechanisms of progression in such diseases, appropriate model systems mimicking the patient-specific communication between neurons and organs are not established yet. In this study, we reconstructed the neuronal network in vitro either between neurons of the human induced pluripotent stem (iPS) cell derived peripheral nervous system (PNS) and central nervous system (CNS), or between PNS neurons and cardiac cells in a morphologically and functionally compartmentalized manner. Networks were constructed in photolithographically microfabricated devices with two culture compartments connected by 20 microtunnels. We confirmed that PNS and CNS neurons connected via synapses and formed a network. Additionally, calcium-imaging experiments showed that the bundles originating from the PNS neurons were functionally active and responded reproducibly to external stimuli. Next, we confirmed that CNS neurons showed an increase in calcium activity during electrical stimulation of networked bundles from PNS neurons in order to demonstrate the formation of functional cell-cell interactions. We also confirmed the formation of synapses between PNS neurons and mature cardiac cells. These results indicate that compartmentalized culture devices are promising tools for reconstructing network-wide connections between PNS neurons and various organs, and might help to understand patient-specific molecular and functional mechanisms under normal and pathological conditions.

  1. Gene regulatory network inference by point-based Gaussian approximation filters incorporating the prior information.

    Science.gov (United States)

    Jia, Bin; Wang, Xiaodong

    2013-12-17

    The extended Kalman filter (EKF) has been applied to inferring gene regulatory networks. However, it is well known that the EKF becomes less accurate when the system exhibits high nonlinearity. In addition, certain prior information about the gene regulatory network exists in practice, and no systematic approach has been developed to incorporate such prior information into the Kalman-type filter for inferring the structure of the gene regulatory network. In this paper, an inference framework based on point-based Gaussian approximation filters that can exploit the prior information is developed to solve the gene regulatory network inference problem. Different point-based Gaussian approximation filters, including the unscented Kalman filter (UKF), the third-degree cubature Kalman filter (CKF3), and the fifth-degree cubature Kalman filter (CKF5) are employed. Several types of network prior information, including the existing network structure information, sparsity assumption, and the range constraint of parameters, are considered, and the corresponding filters incorporating the prior information are developed. Experiments on a synthetic network of eight genes and the yeast protein synthesis network of five genes are carried out to demonstrate the performance of the proposed framework. The results show that the proposed methods provide more accurate inference results than existing methods, such as the EKF and the traditional UKF.

  2. Systems genetics identifies a convergent gene network for cognition and neurodevelopmental disease.

    Science.gov (United States)

    Johnson, Michael R; Shkura, Kirill; Langley, Sarah R; Delahaye-Duriez, Andree; Srivastava, Prashant; Hill, W David; Rackham, Owen J L; Davies, Gail; Harris, Sarah E; Moreno-Moral, Aida; Rotival, Maxime; Speed, Doug; Petrovski, Slavé; Katz, Anaïs; Hayward, Caroline; Porteous, David J; Smith, Blair H; Padmanabhan, Sandosh; Hocking, Lynne J; Starr, John M; Liewald, David C; Visconti, Alessia; Falchi, Mario; Bottolo, Leonardo; Rossetti, Tiziana; Danis, Bénédicte; Mazzuferi, Manuela; Foerch, Patrik; Grote, Alexander; Helmstaedter, Christoph; Becker, Albert J; Kaminski, Rafal M; Deary, Ian J; Petretto, Enrico

    2016-02-01

    Genetic determinants of cognition are poorly characterized, and their relationship to genes that confer risk for neurodevelopmental disease is unclear. Here we performed a systems-level analysis of genome-wide gene expression data to infer gene-regulatory networks conserved across species and brain regions. Two of these networks, M1 and M3, showed replicable enrichment for common genetic variants underlying healthy human cognitive abilities, including memory. Using exome sequence data from 6,871 trios, we found that M3 genes were also enriched for mutations ascertained from patients with neurodevelopmental disease generally, and intellectual disability and epileptic encephalopathy in particular. M3 consists of 150 genes whose expression is tightly developmentally regulated, but which are collectively poorly annotated for known functional pathways. These results illustrate how systems-level analyses can reveal previously unappreciated relationships between neurodevelopmental disease-associated genes in the developed human brain, and provide empirical support for a convergent gene-regulatory network influencing cognition and neurodevelopmental disease.

  3. Transcriptional Regulatory Network Analysis of MYB Transcription Factor Family Genes in Rice

    Directory of Open Access Journals (Sweden)

    Shuchi Smita

    2015-12-01

    MYB transcription factor (TF) is one of the largest TF families and regulates defense responses to various stresses, hormone signaling as well as many metabolic and developmental processes in plants. Understanding these regulatory hierarchies of gene expression networks in response to developmental and environmental cues is a major challenge due to the complex interactions between the genetic elements. Correlation analyses are useful to unravel co-regulated gene pairs governing biological processes as well as to identify new candidate hub genes in response to these complex processes. High throughput expression profiling data are highly useful for construction of co-expression networks. In the present study, we utilized transcriptome data for comprehensive regulatory network studies of MYB TFs by top down and guide gene approaches. More than 50% of OsMYBs were strongly correlated under fifty experimental conditions with 51 hub genes via the top down approach. Further, clusters were identified using Markov Clustering (MCL). To maximize the clustering performance, parameter evaluation of the MCL inflation score (I) was performed in terms of enriched GO categories by measuring F-score. Comparison of co-expressed clusters and clades from phylogenetic analysis signifies their evolutionarily conserved co-regulatory role. We utilized a compendium of known interactions and biological roles with Gene Ontology enrichment analysis to hypothesize the function of coexpressed OsMYBs. In the other part, the transcriptional regulatory network analysis by the guide gene approach revealed 40 putative targets of 26 OsMYB TF hubs with high correlation value utilizing 815 microarray data. The enrichment of MYB-binding cis-elements in the promoter regions of the putative targets, their functional co-occurrence, and their nuclear localization support our finding. In particular, enrichment of MYB binding regions involved in drought-inducibility implying their regulatory role in drought

  4. Pathway reconstruction of airway remodeling in chronic lung diseases: a systems biology approach.

    Directory of Open Access Journals (Sweden)

    Ali Najafi

    Airway remodeling is a pathophysiologic process at the clinical, cellular, and molecular level relating to chronic obstructive airway diseases such as chronic obstructive pulmonary disease (COPD), asthma and mustard lung. These diseases are associated with the dysregulation of multiple molecular pathways in the airway cells. Little progress has so far been made in discovering the molecular causes of complex disease in a holistic systems manner. Therefore, pathway and network reconstruction is an essential part of a systems biology approach to solve this challenging problem. In this paper, multiple data sources were used to construct the molecular process of the airway remodeling pathway in mustard lung as a model of airway disease. We first compiled a master list of genes that change with airway remodeling in the mustard lung disease and then reconstructed the pathway by generating and merging the protein-protein interaction and the gene regulatory networks. Experimental observations and literature mining were used to identify and validate the master list. The outcome of this paper can provide valuable information about closely related chronic obstructive airway diseases which are of great importance for biologists and their future research. Reconstructing the airway remodeling interactome provides a starting point and reference for the future experimental study of mustard lung, and further analysis and development of these maps will be critical to understanding airway diseases in patients.

  5. The capacity for multistability in small gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Grotewold Erich

    2009-09-01

    Full Text Available Abstract. Background: Recent years have seen a dramatic increase in the use of mathematical modeling to gain insight into gene regulatory network behavior across many different organisms. In particular, there has been considerable interest in using mathematical tools to understand how multistable regulatory networks may contribute to developmental processes such as cell fate determination. Indeed, such a network may subserve the formation of unicellular leaf hairs (trichomes) in the model plant Arabidopsis thaliana. Results: In order to investigate the capacity of small gene regulatory networks to generate multiple equilibria, we present a chemical reaction network (CRN)-based modeling formalism and describe a number of methods for CRN analysis in a parameter-free context. These methods are compared and applied to a full set of one-component subnetworks, as well as a large random sample from 40,680 similarly constructed two-component subnetworks. We find that positive feedback and cooperativity mediated by transcription factor (TF) dimerization is a requirement for one-component subnetwork bistability. For subnetworks with two components, the presence of these processes increases the probability that a randomly sampled subnetwork will exhibit multiple equilibria, although we find several examples of bistable two-component subnetworks that do not involve cooperative TF-promoter binding. In the specific case of epidermal differentiation in Arabidopsis, dimerization of the GL3-GL1 complex and cooperative sequential binding of GL3-GL1 to the CPC promoter are each independently sufficient for bistability. Conclusion: Computational methods utilizing CRN-specific theorems to rule out bistability in small gene regulatory networks are far superior to techniques generally applicable to deterministic ODE systems. Using these methods to conduct an unbiased survey of parameter-free deterministic models of small networks, and the Arabidopsis epidermal cell

  6. Systematically characterizing and prioritizing chemosensitivity related gene based on Gene Ontology and protein interaction network

    Directory of Open Access Journals (Sweden)

    Chen Xin

    2012-10-01

    Full Text Available Abstract. Background: The identification of genes that predict in vitro cellular chemosensitivity of cancer cells is of great importance. Chemosensitivity related genes (CRGs) have been widely utilized to guide clinical and cancer chemotherapy decisions. In addition, CRGs potentially share functional characteristics and network features in protein interaction networks (PPINs). Methods: In this study, we proposed a method to identify CRGs based on Gene Ontology (GO) and PPIN. Firstly, we documented 150 pairs of drug-CCRG (curated chemosensitivity related gene) from 492 published papers. Secondly, we characterized CCRGs from the perspective of GO and PPIN. Thirdly, we prioritized CRGs based on CCRGs’ GO and network characteristics. Lastly, we evaluated the performance of the proposed method. Results: We found that CCRG enriched GO terms were most often related to chemosensitivity and exhibited higher similarity scores compared to randomly selected genes. Moreover, CCRGs played key roles in maintaining the connectivity and controlling the information flow of PPINs. We then prioritized CRGs using CCRG enriched GO terms and CCRG network characteristics in order to obtain a database of predicted drug-CRGs that included 53 CRGs, 32 of which have been reported to affect susceptibility to drugs. Our proposed method identifies a greater number of drug-CCRGs, and drug-CCRGs are much more significantly enriched in predicted drug-CRGs, compared to a method based on the correlation of gene expression and drug activity. The mean area under ROC curve (AUC) for our method is 65.2%, whereas that for the traditional method is 55.2%. Conclusions: Our method not only identifies CRGs with expression patterns strongly correlated with drug activity, but also identifies CRGs in which expression is weakly correlated with drug activity. This study provides the framework for the identification of signatures that predict in vitro cellular chemosensitivity and offers a valuable

  7. Integration of metabolic and gene regulatory networks modulates the C. elegans dietary response.

    Science.gov (United States)

    Watson, Emma; MacNeil, Lesley T; Arda, H Efsun; Zhu, Lihua Julie; Walhout, Albertha J M

    2013-03-28

    Expression profiles are tailored according to dietary input. However, the networks that control dietary responses remain largely uncharacterized. Here, we combine forward and reverse genetic screens to delineate a network of 184 genes that affect the C. elegans dietary response to Comamonas DA1877 bacteria. We find that perturbation of a mitochondrial network composed of enzymes involved in amino acid metabolism and the TCA cycle affects the dietary response. In humans, mutations in the corresponding genes cause inborn diseases of amino acid metabolism, most of which are treated by dietary intervention. We identify several transcription factors (TFs) that mediate the changes in gene expression upon metabolic network perturbations. Altogether, our findings unveil a transcriptional response system that is poised to sense dietary cues and metabolic imbalances, illustrating extensive communication between metabolic networks in the mitochondria and gene regulatory networks in the nucleus. Copyright © 2013 Elsevier Inc. All rights reserved.

  8. FastGCN: a GPU accelerated tool for fast gene co-expression networks.

    Directory of Open Access Journals (Sweden)

    Meimei Liang

    Full Text Available Gene co-expression networks are one valuable type of biological network. Many methods and tools have been published to construct gene co-expression networks; however, most of them are inconvenient and time consuming for large datasets. We have developed a user-friendly, accelerated and optimized tool for constructing gene co-expression networks that can fully harness the parallel nature of GPU (Graphics Processing Unit) architectures. Genetic entropies were exploited to filter out genes with no or small expression changes in the raw data preprocessing step. Pearson correlation coefficients were then calculated. After that, we normalized these coefficients and employed the False Discovery Rate to control for multiple testing. Finally, module identification was conducted to construct the co-expression networks. All of these calculations were implemented on a GPU. We also compressed the coefficient matrix to save space. We compared the performance of the GPU implementation with that of a multi-core CPU implementation with 16 CPU threads, a single-thread C/C++ implementation and a single-thread R implementation. Our results show that the GPU implementation largely outperforms the single-thread C/C++ and single-thread R implementations, and that it outperforms the multi-core CPU implementation when the number of genes increases. With the test dataset containing 16,000 genes and 590 individuals, we can achieve greater than 63 times the speed using the GPU implementation compared with the single-thread R implementation when 50 percent of genes were filtered out, and about 80 times the speed when no genes were filtered out.
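
    The correlation and multiple-testing steps described in this abstract can be sketched on a CPU with NumPy. This is an illustrative sketch only, not the FastGCN implementation: the function name `coexpression_edges`, the Fisher z-transform used to obtain p-values, and the Benjamini-Hochberg procedure are assumptions chosen for the example, and the entropy filter, GPU kernels and module identification are omitted.

```python
import math
import numpy as np

def coexpression_edges(expr, alpha=0.05):
    """expr: genes x samples array. Returns (i, j, r) for pairs passing BH FDR control."""
    n_genes, n_samples = expr.shape
    r = np.corrcoef(expr)                       # Pearson correlation between gene rows
    iu, ju = np.triu_indices(n_genes, k=1)      # each unordered gene pair once
    r_pairs = np.clip(r[iu, ju], -0.999999, 0.999999)
    # Assumed normalization: Fisher z-transform, approximately N(0, 1) under no correlation
    z = np.arctanh(r_pairs) * math.sqrt(n_samples - 3)
    p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])  # two-sided p-values
    # Benjamini-Hochberg: keep the k smallest p-values, k = max index with p_(k) <= alpha*k/m
    m = p.size
    order = np.argsort(p)
    passed = p[order] <= alpha * np.arange(1, m + 1) / m
    keep = np.zeros(m, dtype=bool)
    if passed.any():
        k = int(np.max(np.nonzero(passed)[0]))
        keep[order[:k + 1]] = True
    return [(int(i), int(j), float(c))
            for i, j, c, ok in zip(iu, ju, r_pairs, keep) if ok]
```

    Every gene pair is independent of the others in this computation, which is the property that makes the correlation step a natural fit for GPU parallelism.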

  9. Epigenetic Modulation of Brain Gene Networks for Cocaine and Alcohol Abuse

    Directory of Open Access Journals (Sweden)

    Sean P Farris

    2015-05-01

    Full Text Available Cocaine and alcohol are two substances of abuse that prominently affect the central nervous system (CNS). Repeated exposure to cocaine and alcohol leads to longstanding changes in gene expression, and subsequent functional CNS plasticity, throughout multiple brain regions. Epigenetic modifications of histones are one proposed mechanism guiding these enduring changes to the transcriptome. Characterizing the large number of available biological relationships as network models can reveal unexpected biochemical relationships. Clustering analysis of variation from whole-genome sequencing of gene expression (RNA-Seq) and histone H3 lysine 4 trimethylation (H3K4me3) events (ChIP-Seq) revealed the underlying structure of the transcriptional and epigenomic landscape within hippocampal postmortem brain tissue of drug abusers and control cases. Distinct sets of interrelated networks for cocaine and alcohol abuse were determined for each abusive substance. The network approach identified subsets of functionally related genes that are regulated in agreement with H3K4me3 changes, suggesting cause and effect relationships between this epigenetic mark and gene expression. Gene expression networks consisted of recognized substrates for addiction, such as the dopamine- and cAMP-regulated neuronal phosphoprotein PPP1R1B / DARPP-32 and the vesicular glutamate transporter SLC17A7 / VGLUT1 as well as potentially novel molecular targets for substance abuse. Through a systems biology based approach our results illustrate the utility of integrating epigenetic and transcript expression to establish relevant biological networks in the human brain for addiction. Future work with laboratory models may clarify the functional relevance of these gene networks for cocaine and alcohol, and provide a framework for the development of medications for the treatment of addiction.

  10. LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights.

    Science.gov (United States)

    Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

    2016-01-11

    Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.
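
    For reference, the unweighted baseline that this abstract contrasts LEGO against, a one-sided Fisher's exact test on a 2x2 gene-membership table, reduces to a hypergeometric upper tail. The sketch below shows only that baseline; the function name `ora_pvalue` and its argument names are illustrative, and the network-based gene weights that define LEGO are not reproduced here.

```python
from math import comb

def ora_pvalue(n_universe, n_set, n_query, n_overlap):
    """P(overlap >= n_overlap) when n_query genes are drawn without replacement
    from a universe of n_universe genes, n_set of which belong to the gene set."""
    total = comb(n_universe, n_query)
    upper = min(n_set, n_query)
    tail = sum(comb(n_set, k) * comb(n_universe - n_set, n_query - k)
               for k in range(n_overlap, upper + 1))
    return tail / total
```

    Each gene contributes equally to this count, which is the simplification the abstract criticizes: a pathway is reduced to a flat list, with no account of gene roles or gene-gene interactions.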

  11. Photoacoustic image reconstruction via deep learning

    Science.gov (United States)

    Antholzer, Stephan; Haltmeier, Markus; Nuster, Robert; Schwab, Johannes

    2018-02-01

    Applying standard algorithms to sparse data problems in photoacoustic tomography (PAT) yields low-quality images containing severe under-sampling artifacts. To some extent, these artifacts can be reduced by iterative image reconstruction algorithms which allow to include prior knowledge such as smoothness, total variation (TV) or sparsity constraints. These algorithms tend to be time consuming as the forward and adjoint problems have to be solved repeatedly. Further, iterative algorithms have additional drawbacks. For example, the reconstruction quality strongly depends on a-priori model assumptions about the objects to be recovered, which are often not strictly satisfied in practical applications. To overcome these issues, in this paper, we develop direct and efficient reconstruction algorithms based on deep learning. As opposed to iterative algorithms, we apply a convolutional neural network, whose parameters are trained before the reconstruction process based on a set of training data. For actual image reconstruction, a single evaluation of the trained network yields the desired result. Our presented numerical results (using two different network architectures) demonstrate that the proposed deep learning approach reconstructs images with a quality comparable to state of the art iterative reconstruction methods.

  12. Network analysis of S. aureus response to ramoplanin reveals modules for virulence factors and resistance mechanisms and characteristic novel genes.

    Science.gov (United States)

    Subramanian, Devika; Natarajan, Jeyakumar

    2015-12-10

    Staphylococcus aureus is a major human pathogen and ramoplanin is an antimicrobial attributed for effective treatment. The goal of this study was to examine the transcriptomic profiles of ramoplanin sensitive and resistant S. aureus to identify putative modules responsible for virulence and resistance-mechanisms and its characteristic novel genes. The dysregulated genes were used to reconstruct protein functional association networks for virulence-factors and resistance-mechanisms individually. Strong link between metabolic-pathways and development of virulence/resistance is suggested. We identified 15 putative modules of virulence factors. Six hypothetical genes were annotated with novel virulence activity among which SACOL0281 was discovered to be an essential virulence factor EsaD. The roles of MazEF toxin-antitoxin system, SACOL0202/SACOL0201 two-component system and that of amino-sugar and nucleotide-sugar metabolism in virulence are also suggested. In addition, 14 putative modules of resistance mechanisms including modules of ribosomal protein-coding genes and metabolic pathways such as biotin-synthesis, TCA-cycle, riboflavin-biosynthesis, peptidoglycan-biosynthesis etc. are also indicated. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Overview of the neural network based technique for monitoring of road condition via reconstructed road profiles

    CSIR Research Space (South Africa)

    Ngwangwa, HM

    2008-07-01

    Full Text Available on the road and driver to assess the integrity of road and vehicle infrastructure. In this paper, vehicle vibration data are applied to an artificial neural network to reconstruct the corresponding road surface profiles. The results show that the technique...

  14. Estimating immunoregulatory gene networks in human herpesvirus type 6-infected T cells

    International Nuclear Information System (INIS)

    Takaku, Tomoiku; Ohyashiki, Junko H.; Zhang, Yu; Ohyashiki, Kazuma

    2005-01-01

    The immune response to viral infection involves a complex network of dynamic gene and protein interactions. We present here the dynamic gene network of the host immune response during human herpesvirus type 6 (HHV-6) infection in an adult T-cell leukemia cell line. Using a pathway-focused oligonucleotide DNA microarray, we found a possible association between chemokine genes regulating Th1/Th2 balance and genes regulating T-cell proliferation during HHV-6B infection. Gene network analysis using an integrated comprehensive workbench, VoyaGene, revealed that a gene encoding a TEC-family kinase, ITK, might be a putative modulator in the host immune response against HHV-6B infection. We conclude that a Th2-dominated inflammatory reaction in host cells may play an important role in HHV-6B-infected T cells, thereby suggesting the possibility that ITK might be a therapeutic target in diseases related to dysregulation of Th1/Th2 balance. This study describes a novel approach to finding genes related to the complex host-virus interaction using microarray data and the Bayesian statistical framework.

  15. Inferring gene networks from discrete expression data

    KAUST Repository

    Zhang, L.; Mallick, B. K.

    2013-01-01

    graphical models applied to continuous data, which give a closedformmarginal likelihood. In this paper,we extend network modeling to discrete data, specifically data from serial analysis of gene expression, and RNA-sequencing experiments, both of which

  16. Reconstruction of Daily Sea Surface Temperature Based on Radial Basis Function Networks

    Directory of Open Access Journals (Sweden)

    Zhihong Liao

    2017-11-01

    Full Text Available A radial basis function network (RBFN) method is proposed to reconstruct daily sea surface temperatures (SSTs) with limited SST samples. For the purpose of evaluating the SSTs using this method, non-biased SST samples in the Pacific Ocean (10°N–30°N, 115°E–135°E) are selected for when the tropical storm Hagibis arrived in June 2014, and these SST samples are obtained from the Reynolds optimum interpolation (OI) v2 daily 0.25° SST (OISST) products according to the distribution of AVHRR L2p SST and in-situ SST data. Furthermore, an improved nearest neighbor cluster (INNC) algorithm is designed to search for the optimal hidden knots for RBFNs from both the SST samples and the background fields. Then, the reconstructed SSTs from the RBFN method are compared with the results from the OI method. The statistical results show that the RBFN method performs better at reconstructing SST than the OI method in the study, and that the average RMSE is 0.48 °C for the RBFN method, which is considerably smaller than the value of 0.69 °C for the OI method. Additionally, the RBFN methods with different basis functions and clustering algorithms are tested, and we find that the INNC algorithm with the multi-quadric function is well suited for the RBFN method to reconstruct SSTs when the SST samples are sparsely distributed.
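
    A radial basis function fit with a multi-quadric basis, as mentioned in this abstract, can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions: every sample is used as a hidden knot (the abstract's INNC clustering step is not reproduced), the shape parameter `c` is a hypothetical choice, and the function names are made up for the example.

```python
import numpy as np

def multiquadric(r, c=1.0):
    """Multi-quadric basis: phi(r) = sqrt(r^2 + c^2)."""
    return np.sqrt(r ** 2 + c ** 2)

def fit_rbf(centers, values, c=1.0):
    """Solve for weights so the network reproduces `values` at `centers`."""
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    return np.linalg.solve(multiquadric(d, c), values)

def predict_rbf(centers, weights, points, c=1.0):
    """Evaluate the fitted network at arbitrary `points` (n_points x n_dims)."""
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)
    return multiquadric(d, c) @ weights
```

    Here `centers` would hold the (longitude, latitude) positions of sparse SST samples and `values` the temperatures; `predict_rbf` then fills in a regular grid. With many samples, choosing fewer knots by clustering keeps the linear system small and better conditioned.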

  17. Evaluation of gene association methods for coexpression network construction and biological knowledge discovery.

    Directory of Open Access Journals (Sweden)

    Sapna Kumari

    Full Text Available BACKGROUND: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. METHODS AND RESULTS: In this study, we compared eight gene association methods - Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson - and focused on their true knowledge discovery rates in associating pathway genes and constructing coordination networks of regulatory genes. We also examined how the different methods behave on microarray data with different properties, and whether the biological processes affect the efficiency of different methods. CONCLUSIONS: We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method, which can find gene pairs of multiple relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods behave differently from the other six methods. The efficiencies of different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best performing method for gene association and coexpression network construction.
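
    One reason rank-based measures can behave differently from Pearson correlation is that they respond to any monotone relationship, not only a linear one. The toy sketch below illustrates that general point with two of the eight methods; the data are synthetic, the helper names are illustrative, and the rank computation assumes no tied values. It does not reproduce the study's evaluation.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks (assumes no ties)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return pearson(rank_x, rank_y)

# Synthetic, strongly nonlinear but perfectly monotone relationship
x = np.linspace(0.0, 5.0, 50)
y = np.exp(2.0 * x)
print("Pearson:", pearson(x, y), "Spearman:", spearman(x, y))
```

    On this example the Spearman coefficient is 1 up to rounding, while the Pearson coefficient is clearly lower because the relationship is far from linear.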

  18. Integrative mining of traditional Chinese medicine literature and MEDLINE for functional gene networks.

    Science.gov (United States)

    Zhou, Xuezhong; Liu, Baoyan; Wu, Zhaohui; Feng, Yi

    2007-10-01

    The amount of biomedical data in different disciplines is growing at an exponential rate. Integrating these significant knowledge sources to generate novel hypotheses for systems biology research is difficult. Traditional Chinese medicine (TCM) is a completely different discipline, and is a complementary knowledge system to modern biomedical science. This paper uses a significant TCM bibliographic literature database in China, together with MEDLINE, to help discover novel gene functional knowledge. We present an integrative mining approach to uncover the functional gene relationships from MEDLINE and TCM bibliographic literature. This paper introduces TCM literature (about 50,000 records) as one knowledge source for constructing literature-based gene networks. We use the TCM diagnosis, TCM syndrome, to automatically congregate the related genes. The syndrome-gene relationships are discovered based on the syndrome-disease relationships extracted from TCM literature and the disease-gene relationships in MEDLINE. Based on the bubble-bootstrapping and relation weight computing methods, we have developed a prototype system called MeDisco/3S, which has name entity and relation extraction, and online analytical processing (OLAP) capabilities, to perform the integrative mining process. We have got about 200,000 syndrome-gene relations, which could help generate syndrome-based gene networks, and help analyze the functional knowledge of genes from syndrome perspective. We take the gene network of Kidney-Yang Deficiency syndrome (KYD syndrome) and the functional analysis of some genes, such as CRH (corticotropin releasing hormone), PTH (parathyroid hormone), PRL (prolactin), BRCA1 (breast cancer 1, early onset) and BRCA2 (breast cancer 2, early onset), to demonstrate the preliminary results. The underlying hypothesis is that the related genes of the same syndrome will have some biological functional relationships, and will constitute a functional network. 
This paper presents

  19. A network approach to predict pathogenic genes for Fusarium graminearum.

    Science.gov (United States)

    Liu, Xiaoping; Tang, Wei-Hua; Zhao, Xing-Ming; Chen, Luonan

    2010-10-04

    Fusarium graminearum is the pathogenic agent of Fusarium head blight (FHB), which is a destructive disease on wheat and barley, thereby causing huge economic loss and health problems to human by contaminating foods. Identifying pathogenic genes can shed light on pathogenesis underlying the interaction between F. graminearum and its plant host. However, it is difficult to detect pathogenic genes for this destructive pathogen by time-consuming and expensive molecular biological experiments in lab. On the other hand, computational methods provide an alternative way to solve this problem. Since pathogenesis is a complicated procedure that involves complex regulations and interactions, the molecular interaction network of F. graminearum can give clues to potential pathogenic genes. Furthermore, the gene expression data of F. graminearum before and after its invasion into plant host can also provide useful information. In this paper, a novel systems biology approach is presented to predict pathogenic genes of F. graminearum based on molecular interaction network and gene expression data. With a small number of known pathogenic genes as seed genes, a subnetwork that consists of potential pathogenic genes is identified from the protein-protein interaction network (PPIN) of F. graminearum, where the genes in the subnetwork are further required to be differentially expressed before and after the invasion of the pathogenic fungus. Therefore, the candidate genes in the subnetwork are expected to be involved in the same biological processes as seed genes, which imply that they are potential pathogenic genes. The prediction results show that most of the pathogenic genes of F. graminearum are enriched in two important signal transduction pathways, including G protein coupled receptor pathway and MAPK signaling pathway, which are known related to pathogenesis in other fungi. 
In addition, several pathogenic genes predicted by our method are verified in other pathogenic fungi, which

  20. A network approach to predict pathogenic genes for Fusarium graminearum.

    Directory of Open Access Journals (Sweden)

    Xiaoping Liu

    Full Text Available Fusarium graminearum is the pathogenic agent of Fusarium head blight (FHB), which is a destructive disease on wheat and barley, thereby causing huge economic loss and health problems to human by contaminating foods. Identifying pathogenic genes can shed light on pathogenesis underlying the interaction between F. graminearum and its plant host. However, it is difficult to detect pathogenic genes for this destructive pathogen by time-consuming and expensive molecular biological experiments in lab. On the other hand, computational methods provide an alternative way to solve this problem. Since pathogenesis is a complicated procedure that involves complex regulations and interactions, the molecular interaction network of F. graminearum can give clues to potential pathogenic genes. Furthermore, the gene expression data of F. graminearum before and after its invasion into plant host can also provide useful information. In this paper, a novel systems biology approach is presented to predict pathogenic genes of F. graminearum based on molecular interaction network and gene expression data. With a small number of known pathogenic genes as seed genes, a subnetwork that consists of potential pathogenic genes is identified from the protein-protein interaction network (PPIN) of F. graminearum, where the genes in the subnetwork are further required to be differentially expressed before and after the invasion of the pathogenic fungus. Therefore, the candidate genes in the subnetwork are expected to be involved in the same biological processes as seed genes, which imply that they are potential pathogenic genes. The prediction results show that most of the pathogenic genes of F. graminearum are enriched in two important signal transduction pathways, including G protein coupled receptor pathway and MAPK signaling pathway, which are known related to pathogenesis in other fungi. In addition, several pathogenic genes predicted by our method are verified in other

  1. An extended Kalman filtering approach to modeling nonlinear dynamic gene regulatory networks via short gene expression time series.

    Science.gov (United States)

    Wang, Zidong; Liu, Xiaohui; Liu, Yurong; Liang, Jinling; Vinciotti, Veronica

    2009-01-01

    In this paper, the extended Kalman filter (EKF) algorithm is applied to model the gene regulatory network from gene time series data. The gene regulatory network is considered as a nonlinear dynamic stochastic model that consists of the gene measurement equation and the gene regulation equation. After specifying the model structure, we apply the EKF algorithm for identifying both the model parameters and the actual value of gene expression levels. It is shown that the EKF algorithm is an online estimation algorithm that can identify a large number of parameters (including parameters of nonlinear functions) through iterative procedure by using a small number of observations. Four real-world gene expression data sets are employed to demonstrate the effectiveness of the EKF algorithm, and the obtained models are evaluated from the viewpoint of bioinformatics.
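
    The predict/update cycle of an extended Kalman filter can be illustrated with a scalar toy model. This is a minimal sketch under assumptions that are not taken from the paper: a single gene whose expression follows x[k+1] = a * tanh(x[k]) + noise, a direct noisy measurement y[k] = x[k] + noise, and hypothetical values for `a`, `q` and `r`. The paper's joint estimation of model parameters is not reproduced.

```python
import math

def ekf_step(x_est, p_est, y, a=0.9, q=0.01, r=0.1):
    """One EKF iteration for x[k+1] = a*tanh(x[k]) + w, y[k] = x[k] + v.

    x_est, p_est: previous state estimate and its variance
    y: new measurement; q, r: process and measurement noise variances
    """
    # Predict: propagate the estimate through the nonlinear model
    x_pred = a * math.tanh(x_est)
    # Linearize: derivative of a*tanh(x) evaluated at the previous estimate
    f_jac = a * (1.0 - math.tanh(x_est) ** 2)
    p_pred = f_jac * p_est * f_jac + q
    # Update: the measurement matrix is 1 because the state is observed directly
    k_gain = p_pred / (p_pred + r)
    x_new = x_pred + k_gain * (y - x_pred)
    p_new = (1.0 - k_gain) * p_pred
    return x_new, p_new
```

    Calling `ekf_step` once per time point processes the series sequentially, which is what makes the filter an online estimator. To estimate parameters as well, a common device is to append them to the state vector so that the same recursion updates both.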

  2. Developing integrated crop knowledge networks to advance candidate gene discovery.

    Science.gov (United States)

    Hassani-Pak, Keywan; Castellote, Martin; Esch, Maria; Hindle, Matthew; Lysenko, Artem; Taubert, Jan; Rawlings, Christopher

    2016-12-01

    The chances of raising crop productivity to enhance global food security would be greatly improved if we had a complete understanding of all the biological mechanisms that underpinned traits such as crop yield, disease resistance or nutrient and water use efficiency. With more crop genomes emerging all the time, we are nearer having the basic information, at the gene-level, to begin assembling crop gene catalogues and using data from other plant species to understand how the genes function and how their interactions govern crop development and physiology. Unfortunately, the task of creating such a complete knowledge base of gene functions, interaction networks and trait biology is technically challenging because the relevant data are dispersed in myriad databases in a variety of data formats with variable quality and coverage. In this paper we present a general approach for building genome-scale knowledge networks that provide a unified representation of heterogeneous but interconnected datasets to enable effective knowledge mining and gene discovery. We describe the datasets and outline the methods, workflows and tools that we have developed for creating and visualising these networks for the major crop species, wheat and barley. We present the global characteristics of such knowledge networks and with an example linking a seed size phenotype to a barley WRKY transcription factor orthologous to TTG2 from Arabidopsis, we illustrate the value of integrated data in biological knowledge discovery. The software we have developed (www.ondex.org) and the knowledge resources (http://knetminer.rothamsted.ac.uk) we have created are all open-source and provide a first step towards systematic and evidence-based gene discovery in order to facilitate crop improvement.

  3. A Network Approach to Analyzing Highly Recombinant Malaria Parasite Genes

    Science.gov (United States)

    Larremore, Daniel B.; Clauset, Aaron; Buckee, Caroline O.

    2013-01-01

    The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences. PMID:24130474
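
    The network construction described in this abstract, where each sequence is a node and two nodes are linked if they share an exact match of sufficient length, can be sketched as follows. The threshold `min_len` is a hypothetical fixed parameter here (the abstract speaks of a match of "significant length" without giving a value), the function names are illustrative, and the community detection step with a stochastic block model is not included.

```python
from itertools import combinations

def shares_block(a, b, min_len):
    """True if strings a and b share an exact substring of length >= min_len.

    Any shared substring of length >= min_len contains a shared substring of
    length exactly min_len, so comparing fixed-length windows is sufficient.
    """
    blocks = {a[i:i + min_len] for i in range(len(a) - min_len + 1)}
    return any(b[i:i + min_len] in blocks for i in range(len(b) - min_len + 1))

def build_network(seqs, min_len):
    """Return the edge set {(i, j)} with i < j over sequence indices."""
    return {(i, j)
            for (i, a), (j, b) in combinations(enumerate(seqs), 2)
            if shares_block(a, b, min_len)}
```

    For example, `build_network(["ABCDEFG", "XXCDEFYY", "QQQQQQQ"], 4)` links only the first two sequences, which share the block `CDEF`.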

  4. A network approach to analyzing highly recombinant malaria parasite genes.

    Science.gov (United States)

    Larremore, Daniel B; Clauset, Aaron; Buckee, Caroline O

    2013-01-01

    The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences.

  5. A network approach to analyzing highly recombinant malaria parasite genes.

    Directory of Open Access Journals (Sweden)

    Daniel B Larremore

    Full Text Available The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences.

  6. Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks.

    Directory of Open Access Journals (Sweden)

    Nolan Priedigkeit

    2015-02-01

    Full Text Available Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.

  7. Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks.

    Science.gov (United States)

    Deeter, Anthony; Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui

    2017-01-01

    The PubMed database offers an extensive set of publication data that can be useful, yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but has the potential to hypothesize new ones as well. We show that our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS-Bcl-2 and RAS-AKT, and finds significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways.

  8. [Weighted gene co-expression network analysis in biomedicine research].

    Science.gov (United States)

    Liu, Wei; Li, Li; Ye, Hua; Tu, Wei

    2017-11-25

    High-throughput biological technologies are now widely applied in biology and medicine, allowing scientists to monitor thousands of parameters simultaneously in a specific sample. However, it is still an enormous challenge to mine useful information from high-throughput data. The emergence of network biology provides deeper insights into complex bio-systems and reveals the modularity in tissue/cellular networks. Correlation networks are increasingly used in bioinformatics applications. The weighted gene co-expression network analysis (WGCNA) tool can detect clusters of highly correlated genes. Therefore, we systematically reviewed the application of WGCNA in the study of disease diagnosis, pathogenesis and other related fields. First, we introduced the principle, workflow, advantages and disadvantages of WGCNA. Second, we presented the application of WGCNA in disease, physiology, drug, evolution and genome annotation. Then, we indicated the application of WGCNA in newly developed high-throughput methods. We hope this review will help to promote the application of WGCNA in biomedicine research.
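
    The abstract does not spell out how a weighted co-expression network is built. As a rough sketch, the commonly described unsigned soft-thresholding step raises the absolute pairwise correlation to a power, a_ij = |cor(x_i, x_j)|^beta. The function names below and the value beta=6 are illustrative assumptions, not taken from the review, and this is not the WGCNA package itself.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: |correlation| ** beta for each gene pair.

    `expr` maps a gene name to its expression values across samples;
    `beta` is an illustrative soft-thresholding power (an assumption).
    """
    genes = sorted(expr)
    adjacency = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            adjacency[(g, h)] = abs(pearson(expr[g], expr[h])) ** beta
    return adjacency


expr = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [1, -1, 1, -1]}
adj = soft_threshold_adjacency(expr)
```

    Raising correlations to a power keeps strong co-expression close to 1 while pushing weak correlations toward 0, which is why the resulting network is weighted rather than hard-thresholded.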

  9. Network-based association of hypoxia-responsive genes with cardiovascular diseases

    International Nuclear Information System (INIS)

    Wang, Rui-Sheng; Oldham, William M; Loscalzo, Joseph

    2014-01-01

    Molecular oxygen is indispensable for cellular viability and function. Hypoxia is a stress condition in which oxygen demand exceeds supply. Low cellular oxygen content induces a number of molecular changes to activate regulatory pathways responsible for increasing the oxygen supply and optimizing cellular metabolism under limited oxygen conditions. Hypoxia plays critical roles in the pathobiology of many diseases, such as cancer, heart failure, myocardial ischemia, stroke, and chronic lung diseases. Although the complicated associations between hypoxia and cardiovascular (and cerebrovascular) diseases (CVD) have been recognized for some time, there are few studies that investigate their biological link from a systems biology perspective. In this study, we integrate hypoxia genes, CVD genes, and the human protein interactome in order to explore the relationship between hypoxia and cardiovascular diseases at a systems level. We show that hypoxia genes are much closer to CVD genes in the human protein interactome than that expected by chance. We also find that hypoxia genes play significant bridging roles in connecting different cardiovascular diseases. We construct a hypoxia-CVD bipartite network and find several interesting hypoxia-CVD modules with significant gene ontology similarity. Finally, we show that hypoxia genes tend to have more CVD interactors in the human interactome than in random networks of matching topology. Based on these observations, we can predict novel genes that may be associated with CVD. This network-based association study gives us a broad view of the relationships between hypoxia and cardiovascular diseases and provides new insights into the role of hypoxia in cardiovascular biology. (paper)

  10. Phylogeographic reconstruction of a bacterial species with high levels of lateral gene transfer

    Science.gov (United States)

    Pearson, T.; Giffard, P.; Beckstrom-Sternberg, S.; Auerbach, R.; Hornstra, H.; Tuanyok, A.; Price, E.P.; Glass, M.B.; Leadem, B.; Beckstrom-Sternberg, J. S.; Allan, G.J.; Foster, J.T.; Wagner, D.M.; Okinaka, R.T.; Sim, S.H.; Pearson, O.; Wu, Z.; Chang, J.; Kaul, R.; Hoffmaster, A.R.; Brettin, T.S.; Robison, R.A.; Mayo, M.; Gee, J.E.; Tan, P.; Currie, B.J.; Keim, P.

    2009-01-01

    Background: Phylogeographic reconstruction of some bacterial populations is hindered by low diversity coupled with high levels of lateral gene transfer. A comparison of recombination levels and diversity at seven housekeeping genes for eleven bacterial species, most of which are commonly cited as having high levels of lateral gene transfer shows that the relative contributions of homologous recombination versus mutation for Burkholderia pseudomallei is over two times higher than for Streptococcus pneumoniae and is thus the highest value yet reported in bacteria. Despite the potential for homologous recombination to increase diversity, B. pseudomallei exhibits a relative lack of diversity at these loci. In these situations, whole genome genotyping of orthologous shared single nucleotide polymorphism loci, discovered using next generation sequencing technologies, can provide very large data sets capable of estimating core phylogenetic relationships. We compared and searched 43 whole genome sequences of B. pseudomallei and its closest relatives for single nucleotide polymorphisms in orthologous shared regions to use in phylogenetic reconstruction. Results: Bayesian phylogenetic analyses of >14,000 single nucleotide polymorphisms yielded completely resolved trees for these 43 strains with high levels of statistical support. These results enable a better understanding of a separate analysis of population differentiation among >1,700 B. pseudomallei isolates as defined by sequence data from seven housekeeping genes. We analyzed this larger data set for population structure and allele sharing that can be attributed to lateral gene transfer. Our results suggest that despite an almost panmictic population, we can detect two distinct populations of B. pseudomallei that conform to biogeographic patterns found in many plant and animal species. That is, separation along Wallace's Line, a biogeographic boundary between Southeast Asia and Australia. Conclusion: We describe an

  11. Phylogeographic reconstruction of a bacterial species with high levels of lateral gene transfer

    Directory of Open Access Journals (Sweden)

    Kaul Rajinder

    2009-11-01

    Full Text Available Background: Phylogeographic reconstruction of some bacterial populations is hindered by low diversity coupled with high levels of lateral gene transfer. A comparison of recombination levels and diversity at seven housekeeping genes for eleven bacterial species, most of which are commonly cited as having high levels of lateral gene transfer shows that the relative contributions of homologous recombination versus mutation for Burkholderia pseudomallei is over two times higher than for Streptococcus pneumoniae and is thus the highest value yet reported in bacteria. Despite the potential for homologous recombination to increase diversity, B. pseudomallei exhibits a relative lack of diversity at these loci. In these situations, whole genome genotyping of orthologous shared single nucleotide polymorphism loci, discovered using next generation sequencing technologies, can provide very large data sets capable of estimating core phylogenetic relationships. We compared and searched 43 whole genome sequences of B. pseudomallei and its closest relatives for single nucleotide polymorphisms in orthologous shared regions to use in phylogenetic reconstruction. Results: Bayesian phylogenetic analyses of >14,000 single nucleotide polymorphisms yielded completely resolved trees for these 43 strains with high levels of statistical support. These results enable a better understanding of a separate analysis of population differentiation among >1,700 B. pseudomallei isolates as defined by sequence data from seven housekeeping genes. We analyzed this larger data set for population structure and allele sharing that can be attributed to lateral gene transfer. Our results suggest that despite an almost panmictic population, we can detect two distinct populations of B. pseudomallei that conform to biogeographic patterns found in many plant and animal species. That is, separation along Wallace's Line, a biogeographic boundary between Southeast Asia and Australia

  12. Discovering hidden relationships between renal diseases and regulated genes through 3D network visualizations

    Directory of Open Access Journals (Sweden)

    Bhavnani Suresh K

    2010-11-01

    Full Text Available Background: In a recent study, two-dimensional (2D) network layouts were used to visualize and quantitatively analyze the relationship between chronic renal diseases and regulated genes. The results revealed complex relationships between disease type, gene specificity, and gene regulation type, which led to important insights about the underlying biological pathways. Here we describe an attempt to extend our understanding of these complex relationships by reanalyzing the data using three-dimensional (3D) network layouts, displayed through 2D and 3D viewing methods. Findings: The 3D network layout (displayed through the 3D viewing method) revealed that genes implicated in many diseases (non-specific genes) tended to be predominantly down-regulated, whereas genes regulated in a few diseases (disease-specific genes) tended to be up-regulated. This new global relationship was quantitatively validated through comparison to 1000 random permutations of networks of the same size and distribution. Our new finding appeared to be the result of using specific features of the 3D viewing method to analyze the 3D renal network. Conclusions: The global relationship between gene regulation and gene specificity is the first clue from human studies that there exist common mechanisms across several renal diseases, which suggest hypotheses for the underlying mechanisms. Furthermore, the study suggests hypotheses for why the 3D visualization helped to make salient a new regularity that was difficult to detect in 2D. Future research that tests these hypotheses should enable a more systematic understanding of when and how to use 3D network visualizations to reveal complex regularities in biological networks.

  13. Co-expression network analysis of duplicate genes in maize (Zea mays L.) reveals no subgenome bias.

    Science.gov (United States)

    Li, Lin; Briskine, Roman; Schaefer, Robert; Schnable, Patrick S; Myers, Chad L; Flagel, Lex E; Springer, Nathan M; Muehlbauer, Gary J

    2016-11-04

    Gene duplication is prevalent in many species and can result in coding and regulatory divergence. Gene duplications can be classified as whole genome duplication (WGD), tandem and inserted (non-syntenic). In maize, WGD resulted in the subgenomes maize1 and maize2, of which maize1 is considered the dominant subgenome. However, the landscape of co-expression network divergence of duplicate genes in maize is still largely uncharacterized. To address the consequence of gene duplication on co-expression network divergence, we developed a gene co-expression network from RNA-seq data derived from 64 different tissues/stages of the maize reference inbred-B73. WGD, tandem and inserted gene duplications exhibited distinct regulatory divergence. Inserted duplicate genes were more likely to be singletons in the co-expression networks, while WGD duplicate genes were likely to be co-expressed with other genes. Tandem duplicate genes were enriched in the co-expression pattern where co-expressed genes were nearly identical for the duplicates in the network. Older gene duplications exhibit more extensive co-expression variation than younger duplications. Overall, non-syntenic genes primarily from inserted duplications show more co-expression divergence. Also, such enlarged co-expression divergence is significantly related to duplication age. Moreover, subgenome dominance was not observed in the co-expression networks - maize1 and maize2 exhibit similar levels of intra subgenome correlations. Intriguingly, the level of inter subgenome co-expression was similar to the level of intra subgenome correlations, genes from specific subgenomes were not likely to be enriched in co-expression network modules, and the hub genes were not predominantly from any specific subgenome in maize. Our work provides a comprehensive analysis of maize co-expression network divergence for three different types of gene duplications and identifies potential relationships between duplication types

  14. Identifying noncoding risk variants using disease-relevant gene regulatory networks.

    Science.gov (United States)

    Gao, Long; Uzun, Yasin; Gao, Peng; He, Bing; Ma, Xiaoke; Wang, Jiahui; Han, Shizhong; Tan, Kai

    2018-02-16

    Identifying noncoding risk variants remains a challenging task. Because noncoding variants exert their effects in the context of a gene regulatory network (GRN), we hypothesize that explicit use of disease-relevant GRNs can significantly improve the inference accuracy of noncoding risk variants. We describe Annotation of Regulatory Variants using Integrated Networks (ARVIN), a general computational framework for predicting causal noncoding variants. It employs a set of novel regulatory network-based features, combined with sequence-based features to infer noncoding risk variants. Using known causal variants in gene promoters and enhancers in a number of diseases, we show ARVIN outperforms state-of-the-art methods that use sequence-based features alone. Additional experimental validation using reporter assay further demonstrates the accuracy of ARVIN. Application of ARVIN to seven autoimmune diseases provides a holistic view of the gene subnetwork perturbed by the combinatorial action of the entire set of risk noncoding mutations.

  15. Integration of gene expression and methylation to unravel biological networks in glioblastoma patients.

    Science.gov (United States)

    Gadaleta, Francesco; Bessonov, Kyrylo; Van Steen, Kristel

    2017-02-01

    The vast amount of heterogeneous omics data, encompassing a broad range of biomolecular information, requires novel methods of analysis, including those that integrate the available levels of information. In this work, we describe Regression2Net, a computational approach that is able to integrate gene expression and genomic or methylation data in two steps. First, penalized regressions are used to build Expression-Expression (EEnet) and Expression-Genomic or Expression-Methylation (EMnet) networks. Second, network theory is used to highlight important communities of genes. When applying our approach, Regression2Net, to gene expression and methylation profiles for individuals with glioblastoma multiforme, we identified, respectively, 284 and 447 potentially interesting genes in relation to glioblastoma pathology. These genes showed at least one connection in the integrated networks ANDnet and XORnet derived from the aforementioned EEnet and EMnet networks. Although the edges in ANDnet occur in both EEnet and EMnet, the edges in XORnet occur in EMnet but not in EEnet. In-depth biological analysis of connected genes in ANDnet and XORnet revealed genes that are related to energy metabolism, cell cycle control (AATF), immune system response, and several cancer types. Importantly, we observed significant overrepresentation of cancer-related pathways including glioma, especially in the XORnet network, suggesting a nonignorable role of methylation in glioblastoma multiforme. In the ANDnet, we furthermore identified potential glioma suppressor genes ACCN3 and ACCN4 linked to the NBPF1 neuroblastoma breakpoint family, as well as numerous ABC transporter genes (ABCA1, ABCB1) suggesting drug resistance of glioblastoma tumors. © 2016 WILEY PERIODICALS, INC.
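
    The ANDnet and XORnet definitions given in this abstract reduce to set operations on edge lists. The sketch below illustrates only that combination step, under the assumption (not stated in the abstract) that edges are undirected; the function names are hypothetical and the penalized-regression step that produces EEnet and EMnet is not shown.

```python
def undirected(edges):
    """Represent each edge as a frozenset so (a, b) and (b, a) compare equal.
    Treating edges as undirected is an assumption made for this sketch."""
    return {frozenset(edge) for edge in edges}


def and_net(ee_edges, em_edges):
    """Edges that occur in both EEnet and EMnet."""
    return undirected(ee_edges) & undirected(em_edges)


def xor_net(ee_edges, em_edges):
    """Edges that occur in EMnet but not in EEnet, as the abstract defines
    XORnet (a set difference rather than a symmetric difference)."""
    return undirected(em_edges) - undirected(ee_edges)


# Toy edge lists with placeholder gene names.
ee = [("A", "B"), ("B", "C")]
em = [("B", "A"), ("C", "D")]
shared = and_net(ee, em)
methylation_only = xor_net(ee, em)
```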

  16. Co-regulation of metabolic genes is better explained by flux coupling than by network distance.

    Directory of Open Access Journals (Sweden)

    Richard A Notebaart

    2008-01-01

    Full Text Available To what extent can modes of gene regulation be explained by systems-level properties of metabolic networks? Prior studies on co-regulation of metabolic genes have mainly focused on graph-theoretical features of metabolic networks and demonstrated a decreasing level of co-expression with increasing network distance, a naïve, but widely used, topological index. Others have suggested that static graph representations can poorly capture dynamic functional associations, e.g., in the form of dependence of metabolic fluxes across genes in the network. Here, we systematically tested the relative importance of metabolic flux coupling and network position on gene co-regulation, using a genome-scale metabolic model of Escherichia coli. After validating the computational method with empirical data on flux correlations, we confirm that genes coupled by their enzymatic fluxes not only show similar expression patterns, but also share transcriptional regulators and frequently reside in the same operon. In contrast, we demonstrate that network distance per se has relatively minor influence on gene co-regulation. Moreover, the type of flux coupling can explain refined properties of the regulatory network that are ignored by simple graph-theoretical indices. Our results underline the importance of studying functional states of cellular networks to define physiologically relevant associations between genes and should stimulate future developments of novel functional genomic tools.

  17. Recurrent neural network-based modeling of gene regulatory network using elephant swarm water search algorithm.

    Science.gov (United States)

    Mandal, Sudip; Saha, Goutam; Pal, Rajat Kumar

    2017-08-01

    Correct inference of genetic regulation inside a cell from biological data such as time-series microarray data is one of the greatest challenges of the post-genomic era for biologists and researchers. The Recurrent Neural Network (RNN) is one of the most popular and simple approaches to model the dynamics as well as to infer correct dependencies among genes. Inspired by the behavior of social elephants, we propose a new metaheuristic, namely the Elephant Swarm Water Search Algorithm (ESWSA), to infer Gene Regulatory Networks (GRNs). This algorithm is mainly based on the water search strategy of intelligent and social elephants during drought, utilizing different types of communication techniques. Initially, the algorithm is tested against benchmark small and medium scale artificial genetic networks with and without different noise levels, and its efficiency is observed in terms of parametric error, minimum fitness value, execution time, accuracy of prediction of true regulation, etc. Next, the proposed algorithm is tested against the real time gene expression data of the Escherichia coli SOS network, and the results are compared with other state-of-the-art optimization methods. The experimental results suggest that ESWSA is very efficient for the GRN inference problem and performs better than other methods in many ways.

  18. Evolutionary conservation and network structure characterize genes of phenotypic relevance for mitosis in human.

    Directory of Open Access Journals (Sweden)

    Marek Ostaszewski

    Full Text Available The impact of gene silencing on cellular phenotypes is difficult to establish due to the complexity of interactions in the associated biological processes and pathways. A recent genome-wide RNA knock-down study both identified and phenotypically characterized a set of important genes for the cell cycle in HeLa cells. Here, we combine a molecular interaction network analysis, based on physical and functional protein interactions, in conjunction with evolutionary information, to elucidate the common biological and topological properties of these key genes. Our results show that these genes tend to be conserved with their corresponding protein interactions across several species and are key constituents of the evolutionary conserved molecular interaction network. Moreover, a group of bistable network motifs is found to be conserved within this network, which are likely to influence the network stability and therefore the robustness of cellular functioning. They form a cluster, which displays functional homogeneity and is significantly enriched in genes phenotypically relevant for mitosis. Additional results reveal a relationship between specific cellular processes and the phenotypic outcomes induced by gene silencing. This study introduces new ideas regarding the relationship between genotype and phenotype in the context of the cell cycle. We show that the analysis of molecular interaction networks can result in the identification of genes relevant to cellular processes, which is a promising avenue for future research.

  19. In-silico gene co-expression network analysis in Paracoccidioides brasiliensis with reference to haloacid dehalogenase superfamily hydrolase gene

    Directory of Open Access Journals (Sweden)

    Raghunath Satpathy

    2015-01-01

    Full Text Available Context: Paracoccidioides brasiliensis, a dimorphic fungus, is the causative agent of paracoccidioidomycosis, a disease globally affecting millions of people. The haloacid dehalogenase (HAD) superfamily hydrolase enzyme in the fungus, in particular, is known to be responsible in the pathogenesis by adhering to the tissue. Hence, identification of novel drug targets is essential. Aims: In-silico based identification of co-expressed genes along with HAD superfamily hydrolase in P. brasiliensis during the morphogenesis from mycelium to yeast to identify possible genes as drug targets. Materials and Methods: In total, four datasets were retrieved from the NCBI-gene expression omnibus (GEO) database, each containing 4340 genes, followed by gene filtration expression of the data set. Further co-expression (CE) study was performed individually and then a combination of these genes was visualized in Cytoscape 2.8.3. Statistical Analysis Used: Mean and standard deviation value of the HAD superfamily hydrolase gene was obtained from the expression data and this value was subsequently used for the CE calculation purpose by selecting specific correlation power and filtering threshold. Results: The 23 genes that were thus obtained are common with respect to the HAD superfamily hydrolase gene. A significant network was selected from the Cytoscape network visualization that contains total 7 genes out of which 5 genes, which do not have significant protein hits, obtained from gene annotation of the expressed sequence tags by BLAST X. For all the proteins, PSI-BLAST was performed against the human genome to find the homology. Conclusions: The gene co-expression network was obtained with respect to HAD superfamily dehalogenase gene in P. brasiliensis.

  20. Long-term oil contamination alters the molecular ecological networks of soil microbial functional genes

    Directory of Open Access Journals (Sweden)

    Yuting Liang

    2016-02-01

    Full Text Available With knowledge on microbial composition and diversity, investigation of within-community interactions is a further step to elucidate microbial ecological functions, such as the biodegradation of hazardous contaminants. In this work, microbial functional molecular ecological networks were studied in both contaminated and uncontaminated soils to determine the possible influences of oil contamination on microbial interactions and potential functions. Soil samples were obtained from an oil-exploring site located in South China, and the microbial functional genes were analyzed with GeoChip, a high-throughput functional microarray. By building random networks based on null model, we demonstrated that overall network structures and properties were significantly different between contaminated and uncontaminated soils (P < 0.001). Network connectivity, module numbers, and modularity were all reduced with contamination. Moreover, the topological roles of the genes (module hubs and connectors) were altered with oil contamination. Subnetworks of genes involved in alkane and polycyclic aromatic hydrocarbon degradation were also constructed. Negative co-occurrence patterns prevailed among functional genes, thereby indicating probable competition relationships. The potential keystone genes, defined as either hubs or genes with highest connectivities in the network, were further identified. The network constructed in this study predicted the potential effects of anthropogenic contamination on microbial community co-occurrence interactions.

  1. Bayesian Models for Streamflow and River Network Reconstruction using Tree Rings

    Science.gov (United States)

    Ravindranath, A.; Devineni, N.

    2016-12-01

    Water systems face non-stationary, dynamically shifting risks due to shifting societal conditions and systematic long-term variations in climate manifesting as quasi-periodic behavior on multi-decadal time scales. Water systems are thus vulnerable to long periods of wet or dry hydroclimatic conditions. Streamflow is a major component of water systems and a primary means by which water is transported to serve ecosystems' and human needs. Thus, our concern is in understanding streamflow variability. Climate variability and impacts on water resources are crucial factors affecting streamflow, and multi-scale variability increases risk to water sustainability and systems. Dam operations are necessary for collecting water brought by streamflow while maintaining downstream ecological health. Rules governing dam operations are based on streamflow records that are woefully short compared to periods of systematic variation present in the climatic factors driving streamflow variability and non-stationarity. We use hierarchical Bayesian regression methods in order to reconstruct paleo-streamflow records for dams within a basin using paleoclimate proxies (e.g. tree rings) to guide the reconstructions. The riverine flow network for the entire basin is subsequently modeled hierarchically using feeder stream and tributary flows. This is a starting point in analyzing streamflow variability and risks to water systems, and developing a scientifically-informed dynamic risk management framework for formulating dam operations and water policies to best hedge such risks. We will apply this work to the Missouri and Delaware River Basins (DRB). Preliminary results of streamflow reconstructions for eight dams in the upper DRB using standard Gaussian regression with regional tree ring chronologies give streamflow records that now span two to two and a half centuries, and modestly smoothed versions of these reconstructed flows indicate physically-justifiable trends in the time series.

  2. Comprehensive phylogenetic reconstruction of amoebozoa based on concatenated analyses of SSU-rDNA and actin genes.

    Directory of Open Access Journals (Sweden)

    Daniel J G Lahr

    Evolutionary relationships within Amoebozoa have been the subject of controversy for two reasons: 1) paucity of morphological characters in traditional surveys and 2) haphazard taxonomic sampling in modern molecular reconstructions. These along with other factors have prevented the erection of a definitive system that resolves confidently both higher and lower-level relationships. Additionally, the recent recognition that many protosteloid amoebae are in fact scattered throughout the Amoebozoa suggests that phylogenetic reconstructions have been excluding an extensive and integral group of organisms. Here we provide a comprehensive phylogenetic reconstruction based on 139 taxa using molecular information from both SSU-rDNA and actin genes. We provide molecular data for 13 of those taxa, 12 of which had not been previously characterized. We explored the dataset extensively by generating 18 alternative reconstructions that assess the effect of missing data, long-branched taxa, unstable taxa, fast evolving sites and inclusion of environmental sequences. We compared reconstructions with each other as well as against previously published phylogenies. Our analyses show that many of the morphologically established lower-level relationships (defined here as relationships roughly equivalent to Order level or below) are congruent with molecular data. However, the data are insufficient to corroborate or reject the large majority of proposed higher-level relationships (above the Order-level), with the exception of Tubulinea, Archamoebae and Myxogastrea, which are consistently recovered. Moreover, contrary to previous expectations, the inclusion of available environmental sequences does not significantly improve the Amoebozoa reconstruction. This is probably because key amoebozoan taxa are not easily amplified by environmental sequencing methodology due to high rates of molecular evolution and regular occurrence of large indels and introns. 
Finally, in an effort

  3. Comprehensive phylogenetic reconstruction of amoebozoa based on concatenated analyses of SSU-rDNA and actin genes.

    Science.gov (United States)

    Lahr, Daniel J G; Grant, Jessica; Nguyen, Truc; Lin, Jian Hua; Katz, Laura A

    2011-01-01

    Evolutionary relationships within Amoebozoa have been the subject of controversy for two reasons: 1) paucity of morphological characters in traditional surveys and 2) haphazard taxonomic sampling in modern molecular reconstructions. These along with other factors have prevented the erection of a definitive system that resolves confidently both higher and lower-level relationships. Additionally, the recent recognition that many protosteloid amoebae are in fact scattered throughout the Amoebozoa suggests that phylogenetic reconstructions have been excluding an extensive and integral group of organisms. Here we provide a comprehensive phylogenetic reconstruction based on 139 taxa using molecular information from both SSU-rDNA and actin genes. We provide molecular data for 13 of those taxa, 12 of which had not been previously characterized. We explored the dataset extensively by generating 18 alternative reconstructions that assess the effect of missing data, long-branched taxa, unstable taxa, fast evolving sites and inclusion of environmental sequences. We compared reconstructions with each other as well as against previously published phylogenies. Our analyses show that many of the morphologically established lower-level relationships (defined here as relationships roughly equivalent to Order level or below) are congruent with molecular data. However, the data are insufficient to corroborate or reject the large majority of proposed higher-level relationships (above the Order-level), with the exception of Tubulinea, Archamoebae and Myxogastrea, which are consistently recovered. Moreover, contrary to previous expectations, the inclusion of available environmental sequences does not significantly improve the Amoebozoa reconstruction. This is probably because key amoebozoan taxa are not easily amplified by environmental sequencing methodology due to high rates of molecular evolution and regular occurrence of large indels and introns. 
Finally, in an effort to facilitate

  4. Construction of coffee transcriptome networks based on gene annotation semantics

    Directory of Open Access Journals (Sweden)

    Castillo Luis F.

    2012-12-01

    Gene annotation is a process that encompasses multiple approaches to the analysis of nucleic acid or protein sequences in order to assign structural and functional characteristics to gene models. When thousands of gene models are being described in an organism's genome, construction and visualization of gene networks impose novel challenges in the understanding of complex expression patterns and the generation of new knowledge in genomics research. In order to take advantage of text data accumulated after conventional gene sequence analysis, this work applied semantics in combination with visualization tools to build transcriptome networks from a set of coffee gene annotations. A set of selected coffee transcriptome sequences, chosen by the quality of the sequence comparison reported by the Basic Local Alignment Search Tool (BLAST) and InterProScan, were filtered by coverage, identity, length of the query, and e-values. Meanwhile, term descriptors for molecular biology and biochemistry were obtained from the WordNet dictionary in order to construct a Resource Description Framework (RDF) using Ruby scripts and Methontology to find associations between concepts. Relationships between sequence annotations and semantic concepts were graphically represented through a total of 6845 oriented vectors, which were reduced to 745 non-redundant associations. A large gene network connecting transcripts by way of relational concepts was created, in which detailed connections remain to be validated for biological significance based on current biochemical and genetics frameworks. Besides reusing text information in the generation of gene connections and for data mining purposes, this tool development opens the possibility of visualizing complex and abundant transcriptome data, and triggers the formulation of new hypotheses in metabolic pathway analysis.

  5. Text mining and network analysis to find functional associations of genes in high altitude diseases.

    Science.gov (United States)

    Bhasuran, Balu; Subramanian, Devika; Natarajan, Jeyakumar

    2018-05-02

    Travel to elevations above 2500 m is associated with the risk of developing one or more forms of acute altitude illness such as acute mountain sickness (AMS), high altitude cerebral edema (HACE) or high altitude pulmonary edema (HAPE). Our work aims to identify the functional association of genes involved in high altitude diseases. In this work we identified the gene networks responsible for high altitude diseases by using the principle of gene co-occurrence statistics from literature and network analysis. First, we mined the literature data from PubMed on high-altitude diseases, and extracted the co-occurring gene pairs. Next, based on their co-occurrence frequency, gene pairs were ranked. Finally, a gene association network was created using statistical measures to explore potential relationships. Network analysis results revealed that EPO, ACE, IL6 and TNF are the top five genes that were found to co-occur with 20 or more genes, while the association between EPAS1 and EGLN1 genes is strongly substantiated. The network constructed from this study proposes a large number of genes that work in-toto in high altitude conditions. Overall, the result provides a good reference for further study of the genetic relationships in high altitude diseases. Copyright © 2018 Elsevier Ltd. All rights reserved.
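
    The co-occurrence ranking step described in this abstract can be illustrated with a minimal sketch. It assumes the gene mentions have already been extracted from each PubMed abstract; the function name `rank_cooccurring_pairs` and the toy input are hypothetical and are not taken from the study, which may use different statistics.

```python
from collections import Counter
from itertools import combinations


def rank_cooccurring_pairs(gene_sets):
    """Count how often each unordered gene pair appears in the same document.

    gene_sets: iterable of collections of gene symbols, one per abstract.
    Returns a list of ((gene_a, gene_b), count) sorted by descending count.
    """
    counts = Counter()
    for genes in gene_sets:
        # sorted(set(...)) gives every pair a canonical order, so that
        # (A, B) and (B, A) are counted as the same pair.
        for pair in combinations(sorted(set(genes)), 2):
            counts[pair] += 1
    return counts.most_common()


# Hypothetical gene mentions per abstract (illustrative only).
abstracts = [
    {"EPAS1", "EGLN1", "EPO"},
    {"EPAS1", "EGLN1"},
    {"ACE", "IL6"},
]
ranked = rank_cooccurring_pairs(abstracts)
```

    In this toy input the EGLN1/EPAS1 pair ranks first because it is the only pair mentioned together in more than one abstract.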

  6. Transcriptional dynamics of a conserved gene expression network associated with craniofacial divergence in Arctic charr.

    Science.gov (United States)

    Ahi, Ehsan Pashay; Kapralova, Kalina Hristova; Pálsson, Arnar; Maier, Valerie Helene; Gudbrandsson, Jóhannes; Snorrason, Sigurdur S; Jónsson, Zophonías O; Franzdóttir, Sigrídur Rut

    2014-01-01

    Understanding the molecular basis of craniofacial variation can provide insights into key developmental mechanisms of adaptive changes and their role in trophic divergence and speciation. Arctic charr (Salvelinus alpinus) is a polymorphic fish species, and, in Lake Thingvallavatn in Iceland, four sympatric morphs have evolved distinct craniofacial structures. We conducted a gene expression study on candidates from a conserved gene coexpression network, focusing on the development of craniofacial elements in embryos of two contrasting Arctic charr morphotypes (benthic and limnetic). Four Arctic charr morphs were studied: one limnetic and two benthic morphs from Lake Thingvallavatn and a limnetic reference aquaculture morph. The presence of morphological differences at developmental stages before the onset of feeding was verified by morphometric analysis. Following up on our previous findings that Mmp2 and Sparc were differentially expressed between morphotypes, we identified a network of genes with conserved coexpression across diverse vertebrate species. A comparative expression study of candidates from this network in developing heads of the four Arctic charr morphs verified the coexpression relationship of these genes and revealed distinct transcriptional dynamics strongly correlated with contrasting craniofacial morphologies (benthic versus limnetic). A literature review and Gene Ontology analysis indicated that a significant proportion of the network genes play a role in extracellular matrix organization and skeletogenesis, and motif enrichment analysis of conserved noncoding regions of network candidates predicted a handful of transcription factors, including Ap1 and Ets2, as potential regulators of the gene network. The expression of Ets2 itself was also found to associate with network gene expression. Genes linked to glucocorticoid signalling were also studied, as both Mmp2 and Sparc are responsive to this pathway. Among those, several transcriptional

  7. Global transcriptional regulatory network for Escherichia coli robustly connects gene expression to transcription factor activities

    Science.gov (United States)

    Fang, Xin; Sastry, Anand; Mih, Nathan; Kim, Donghyuk; Tan, Justin; Lloyd, Colton J.; Gao, Ye; Yang, Laurence; Palsson, Bernhard O.

    2017-01-01

    Transcriptional regulatory networks (TRNs) have been studied intensely for >25 y. Yet, even for the Escherichia coli TRN—probably the best characterized TRN—several questions remain. Here, we address three questions: (i) How complete is our knowledge of the E. coli TRN; (ii) how well can we predict gene expression using this TRN; and (iii) how robust is our understanding of the TRN? First, we reconstructed a high-confidence TRN (hiTRN) consisting of 147 transcription factors (TFs) regulating 1,538 transcription units (TUs) encoding 1,764 genes. The 3,797 high-confidence regulatory interactions were collected from published, validated chromatin immunoprecipitation (ChIP) data and RegulonDB. For 21 different TF knockouts, up to 63% of the differentially expressed genes in the hiTRN were traced to the knocked-out TF through regulatory cascades. Second, we trained supervised machine learning algorithms to predict the expression of 1,364 TUs given TF activities using 441 samples. The algorithms accurately predicted condition-specific expression for 86% (1,174 of 1,364) of the TUs, while 193 TUs (14%) were predicted better than random TRNs. Third, we identified 10 regulatory modules whose definitions were robust against changes to the TRN or expression compendium. Using surrogate variable analysis, we also identified three unmodeled factors that systematically influenced gene expression. Our computational workflow comprehensively characterizes the predictive capabilities and systems-level functions of an organism’s TRN from disparate data types. PMID:28874552

  8. CoryneRegNet: an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks.

    Science.gov (United States)

    Baumbach, Jan; Brinkrolf, Karina; Czaja, Lisa F; Rahmann, Sven; Tauch, Andreas

    2006-02-14

    The application of DNA microarray technology in post-genomic analysis of bacterial genome sequences has allowed the generation of huge amounts of data related to regulatory networks. This data along with literature-derived knowledge on regulation of gene expression has opened the way for genome-wide reconstruction of transcriptional regulatory networks. These large-scale reconstructions can be converted into in silico models of bacterial cells that allow a systematic analysis of network behavior in response to changing environmental conditions. CoryneRegNet was designed to facilitate the genome-wide reconstruction of transcriptional regulatory networks of corynebacteria relevant in biotechnology and human medicine. During the import and integration process of data derived from experimental studies or literature knowledge CoryneRegNet generates links to genome annotations, to identified transcription factors and to the corresponding cis-regulatory elements. CoryneRegNet is based on a multi-layered, hierarchical and modular concept of transcriptional regulation and was implemented by using the relational database management system MySQL and an ontology-based data structure. Reconstructed regulatory networks can be visualized by using the yFiles JAVA graph library. As an application example of CoryneRegNet, we have reconstructed the global transcriptional regulation of a cellular module involved in SOS and stress response of corynebacteria. CoryneRegNet is an ontology-based data warehouse that allows a pertinent data management of regulatory interactions along with the genome-scale reconstruction of transcriptional regulatory networks. These models can further be combined with metabolic networks to build integrated models of cellular function including both metabolism and its transcriptional regulation.

  9. Genetic dissection of acute ethanol responsive gene networks in prefrontal cortex: functional and mechanistic implications.

    Directory of Open Access Journals (Sweden)

    Aaron R Wolen

    Individual differences in initial sensitivity to ethanol are strongly related to the heritable risk of alcoholism in humans. To elucidate key molecular networks that modulate ethanol sensitivity we performed the first systems genetics analysis of ethanol-responsive gene expression in brain regions of the mesocorticolimbic reward circuit (prefrontal cortex, nucleus accumbens, and ventral midbrain) across a highly diverse family of 27 isogenic mouse strains (BXD panel) before and after treatment with ethanol. Acute ethanol altered the expression of ~2,750 genes in one or more regions and 400 transcripts were jointly modulated in all three. Ethanol-responsive gene networks were extracted with a powerful graph theoretical method that efficiently summarized ethanol's effects. These networks correlated with acute behavioral responses to ethanol and other drugs of abuse. As predicted, networks were heavily populated by genes controlling synaptic transmission and neuroplasticity. Several of the most densely interconnected network hubs, including Kcnma1 and Gsk3β, are known to influence behavioral or physiological responses to ethanol, validating our overall approach. Other major hub genes like Grm3, Pten and Nrg3 represent novel targets of ethanol effects. Networks were under strong genetic control by variants that we mapped to a small number of chromosomal loci. Using a novel combination of genetic, bioinformatic and network-based approaches, we identified high priority cis-regulatory candidate genes, including Scn1b, Gria1, Sncb and Nell2. The ethanol-responsive gene networks identified here represent a previously uncharacterized intermediate phenotype between DNA variation and ethanol sensitivity in mice. Networks involved in synaptic transmission were strongly regulated by ethanol and could contribute to behavioral plasticity seen with chronic ethanol. Our novel finding that hub genes and a small number of loci exert major influence over the ethanol

  10. Statistical assessment of crosstalk enrichment between gene groups in biological networks.

    Science.gov (United States)

    McCormack, Theodore; Frings, Oliver; Alexeyenko, Andrey; Sonnhammer, Erik L L

    2013-01-01

    Analyzing groups of functionally coupled genes or proteins in the context of global interaction networks has become an important aspect of bioinformatic investigations. Assessing the statistical significance of crosstalk enrichment between or within groups of genes can be a valuable tool for functional annotation of experimental gene sets. Here we present CrossTalkZ, a statistical method and software to assess the significance of crosstalk enrichment between pairs of gene or protein groups in large biological networks. We demonstrate that the standard z-score is generally an appropriate and unbiased statistic. We further evaluate the ability of four different methods to reliably recover crosstalk within known biological pathways. We conclude that the methods preserving the second-order topological network properties perform best. Finally, we show how CrossTalkZ can be used to annotate experimental gene sets using known pathway annotations and that its performance at this task is superior to gene enrichment analysis (GEA). CrossTalkZ (available at http://sonnhammer.sbc.su.se/download/software/CrossTalkZ/) is implemented in C++, easy to use, fast, accepts various input file formats, and produces a number of statistics. These include z-score, p-value, false discovery rate, and a test of normality for the null distributions.
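
    The z-score mentioned in this abstract compares an observed number of links between two gene groups with the distribution of that number in randomized networks. The sketch below shows only that final arithmetic; it assumes the null link counts have already been produced by some network randomization procedure, and the function name `crosstalk_z_score` and the numbers are hypothetical rather than part of CrossTalkZ.

```python
import statistics


def crosstalk_z_score(observed_links, null_link_counts):
    """Standard z-score of an observed link count against a null sample.

    observed_links: number of links between two gene groups in the real network.
    null_link_counts: link counts between the same groups in randomized networks.
    """
    mean_null = statistics.mean(null_link_counts)
    sd_null = statistics.stdev(null_link_counts)  # sample standard deviation
    if sd_null == 0:
        raise ValueError("null distribution has zero variance")
    return (observed_links - mean_null) / sd_null


# Hypothetical counts from five randomized networks (illustrative only).
null_counts = [8, 10, 12, 10, 10]
z = crosstalk_z_score(16, null_counts)
```

    A large positive value indicates more links between the groups than expected under the null model; interpreting it as a p-value relies on the null distribution being approximately normal, which is why the abstract also lists a test of normality among the outputs.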

  11. Network-Guided Key Gene Discovery for a Given Cellular Process

    DEFF Research Database (Denmark)

    He, Feng Q; Ollert, Markus

    2018-01-01

    Identification of key genes for a given physiological or pathological process is an essential but still very challenging task for the entire biomedical research community. Statistics-based approaches, such as genome-wide association study (GWAS)- or quantitative trait locus (QTL)-related analysis...... have already made enormous contributions to identifying key genes associated with a given disease or phenotype, the success of which is however very much dependent on a huge number of samples. Recent advances in network biology, especially network inference directly from genome-scale data...

  12. Orthoscape: a cytoscape application for grouping and visualization KEGG based gene networks by taxonomy and homology principles.

    Science.gov (United States)

    Mustafin, Zakhar Sergeevich; Lashin, Sergey Alexandrovich; Matushkin, Yury Georgievich; Gunbin, Konstantin Vladimirovich; Afonnikov, Dmitry Arkadievich

    2017-01-27

    There are many available software tools for visualization and analysis of biological networks. Among them, Cytoscape (http://cytoscape.org/) is one of the most comprehensive packages, with many plugins and applications which extend its functionality by providing analysis of protein-protein interaction, gene regulatory and gene co-expression networks, metabolic, signaling, neural as well as ecological-type networks including food webs, community networks, etc. Nevertheless, only three plugins tagged 'network evolution' are found in the official Cytoscape app store and in the literature. We have developed a new Cytoscape 3.0 application, Orthoscape, aimed at facilitating evolutionary analysis of gene networks and visualizing the results. Orthoscape aids in analysis of evolutionary information available for gene sets and networks by highlighting: (1) the orthology relationships between genes; (2) the evolutionary origin of gene network components; (3) the evolutionary pressure mode (diversifying or stabilizing, negative or positive selection) of orthologous groups in general and/or branch-oriented mode. The distinctive feature of Orthoscape is the ability to control all data analysis steps via a user-friendly interface. Orthoscape allows its users to analyze gene networks or separate gene sets in the context of evolution. At each step of data analysis, Orthoscape also provides convenient visualization and data manipulation.

  13. A dynamic evolutionary clustering perspective: Community detection in signed networks by reconstructing neighbor sets

    Science.gov (United States)

    Chen, Jianrui; Wang, Hua; Wang, Lina; Liu, Weiwei

    2016-04-01

    Community detection in social networks has been intensively studied in recent years. In this paper, a novel similarity measurement is defined according to social balance theory for signed networks. Inter-community positive links are found and deleted due to their low similarity. The positive neighbor sets are reconstructed by this method. Then, differential equations are proposed to imitate the constantly changing states of nodes. Each node updates its state based on the difference between its state and the average state of its positive neighbors. Nodes in the same community evolve together over time, and nodes in different communities evolve apart. Communities are ultimately detected when the states of the nodes are stable. Experiments on real-world and synthetic networks are implemented to verify detection performance. Thorough comparisons demonstrate that the presented method is more efficient than two acknowledged high-performing algorithms.
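
    The state-update idea in this abstract can be sketched as a simple discrete-time iteration. This is an illustration under stated assumptions, not the authors' equations: it assumes the inter-community positive links have already been removed, uses a plain explicit Euler step, and omits any term that drives different communities apart. The names `evolve_states` and `group_by_state`, the step size and the toy graph are all hypothetical.

```python
def evolve_states(pos_neighbors, states, step=0.5, iterations=200):
    """Move each node's state toward the average state of its positive neighbors.

    pos_neighbors: dict mapping node -> list of positive neighbors
                   (after low-similarity links have been deleted).
    states: dict mapping node -> initial scalar state.
    """
    x = dict(states)
    for _ in range(iterations):
        new_x = {}
        for node, nbrs in pos_neighbors.items():
            if nbrs:
                avg = sum(x[n] for n in nbrs) / len(nbrs)
                # Update driven by the difference between the neighbor
                # average and the node's own state.
                new_x[node] = x[node] + step * (avg - x[node])
            else:
                new_x[node] = x[node]
        x = new_x
    return x


def group_by_state(states, tol=1e-3):
    """Group nodes whose stable states lie within tol of each other."""
    groups = []
    for node in sorted(states, key=states.get):
        if groups and abs(states[node] - states[groups[-1][-1]]) <= tol:
            groups[-1].append(node)
        else:
            groups.append([node])
    return groups


# Two triangles with no positive links between them (illustrative only).
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
initial = {0: 0.0, 1: 0.1, 2: 0.2, 3: 0.8, 4: 0.9, 5: 1.0}
final = evolve_states(neighbors, initial)
communities = group_by_state(final)
```

    In this toy graph the states within each triangle converge to a common value, so grouping nodes by their stable state recovers the two triangles as communities.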

  14. Summer drought reconstruction in northeastern Spain inferred from a tree ring latewood network since 1734

    Science.gov (United States)

    Tejedor, E.; Saz, M. A.; Esper, J.; Cuadrat, J. M.; de Luis, M.

    2017-08-01

    Drought recurrence in the Mediterranean is regarded as a fundamental factor for socioeconomic development and the resilience of natural systems in the context of global change. However, knowledge of past droughts has been hampered by the absence of high-resolution proxies. We present a drought reconstruction for the northeast of the Iberian Peninsula based on a new dendrochronology network considering the Standardized Precipitation Evapotranspiration Index (SPEI). A total of 774 latewood width series from 387 trees of P. sylvestris and P. uncinata was combined in an interregional chronology. The new chronology, calibrated against gridded climate data, reveals a robust relationship with the SPEI representing drought conditions of July and August. We developed a summer drought reconstruction for the period 1734-2013 representative of the northeastern and central Iberian Peninsula. We identified 16 extremely dry and 17 extremely wet summers and four decadal-scale dry and wet periods, including 2003-2013 as the driest episode of the reconstruction.

  15. Robustness and accuracy in sea urchin developmental gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Smadar Ben-Tabou de-Leon

    2016-02-01

    Developmental gene regulatory networks robustly control the timely activation of regulatory and differentiation genes. The structure of these networks underlies their capacity to buffer intrinsic and extrinsic noise and maintain embryonic morphology. Here I illustrate how the use of specific architectures by the sea urchin developmental regulatory networks enables the robust control of cell fate decisions. The Wnt-β-catenin signaling pathway patterns the primary embryonic axis while the BMP signaling pathway patterns the secondary embryonic axis in the sea urchin embryo and across bilateria. Interestingly, in the sea urchin in both cases, the signaling pathway that defines the axis directly controls the expression of a set of downstream regulatory genes. I propose that this direct activation of a set of regulatory genes enables a uniform regulatory response and a clear-cut cell fate decision in the endoderm and in the dorsal ectoderm. The specification of the mesodermal pigment cell lineage is activated by Delta signaling that initiates a triple positive feedback loop that locks down the pigment specification state. I propose that the use of compound positive feedback circuitry provides the endodermal cells enough time to turn off mesodermal genes and ensures a correct mesoderm vs. endoderm fate decision. Thus, I argue that understanding the control properties of repeatedly used regulatory architectures illuminates their role in embryogenesis and provides possible explanations for their resistance to evolutionary change.

  16. Cooperative adaptive responses in gene regulatory networks with many degrees of freedom.

    Science.gov (United States)

    Inoue, Masayo; Kaneko, Kunihiko

    2013-04-01

    Cells generally adapt to environmental changes by first exhibiting an immediate response and then gradually returning to their original state to achieve homeostasis. Although simple network motifs consisting of a few genes have been shown to exhibit such adaptive dynamics, they do not reflect the complexity of real cells, where the expression of a large number of genes activates or represses other genes, permitting adaptive behaviors. Here, we investigated the responses of gene regulatory networks containing many genes that have undergone numerical evolution to achieve high fitness due to the adaptive response of only a single target gene; this single target gene responds to changes in external inputs and later returns to basal levels. Despite setting a single target, most genes showed adaptive responses after evolution. Such adaptive dynamics were not due to common motifs within a few genes; even without such motifs, almost all genes showed adaptation, albeit sometimes partial adaptation, in the sense that expression levels did not always return to original levels. The genes split into two groups: genes in the first group exhibited an initial increase in expression and then returned to basal levels, while genes in the second group exhibited the opposite changes in expression. From this model, genes in the first group received positive input from other genes within the first group, but negative input from genes in the second group, and vice versa. Thus, the adaptation dynamics of genes from both groups were consolidated. This cooperative adaptive behavior was commonly observed if the number of genes involved was larger than the order of ten. These results have implications in the collective responses of gene expression networks in microarray measurements of yeast Saccharomyces cerevisiae and the significance to the biological homeostasis of systems with many components.

  17. Challenges for modeling global gene regulatory networks during development: insights from Drosophila.

    Science.gov (United States)

    Wilczynski, Bartek; Furlong, Eileen E M

    2010-04-15

    Development is regulated by dynamic patterns of gene expression, which are orchestrated through the action of complex gene regulatory networks (GRNs). Substantial progress has been made in modeling transcriptional regulation in recent years, ranging from qualitative "coarse-grain" models operating at the gene level to very "fine-grain" quantitative models operating at the biophysical "transcription factor-DNA level". Recent advances in genome-wide studies have revealed an enormous increase in the size and complexity of GRNs. Even relatively simple developmental processes can involve hundreds of regulatory molecules, with extensive interconnectivity and cooperative regulation. This leads to an explosion in the number of regulatory functions, effectively impeding Boolean-based qualitative modeling approaches. At the same time, the lack of information on the biophysical properties for the majority of transcription factors within a global network restricts quantitative approaches. In this review, we explore the current challenges in moving from modeling medium-scale well-characterized networks to more poorly characterized global networks. We suggest integrating coarse- and fine-grain approaches to model gene regulatory networks in cis. We focus on two very well-studied examples from Drosophila, which likely represent typical developmental regulatory modules across metazoans. Copyright (c) 2009 Elsevier Inc. All rights reserved.

  18. A new algorithm for $H\rightarrow\tau\bar{\tau}$ invariant mass reconstruction using Deep Neural Networks

    CERN Document Server

    Dietrich, Felix

    2017-01-01

    Reconstructing the invariant mass in a Higgs boson decay event containing tau leptons turns out to be a challenging endeavour. The aim of this summer student project is to implement a new algorithm for this task, using deep neural networks and machine learning. The results are compared to SVFit, an existing algorithm that uses dynamical likelihood techniques. A neural network is found that reaches the accuracy of SVFit at low masses and even surpasses it at higher masses, while at the same time providing results a thousand times faster.

  19. Mining for novel candidate clock genes in the circadian regulatory network

    OpenAIRE

    Bhargava, Anuprabha; Herzel, Hanspeter; Ananthasubramaniam, Bharath

    2015-01-01

    Background: Most physiological processes in mammals are temporally regulated by means of a master circadian clock in the brain and peripheral oscillators in most other tissues. A transcriptional-translation feedback network of clock genes produces near 24 h oscillations in clock gene and protein expression. Here, we aim to identify novel additions to the clock network using a meta-analysis of public chromatin immunoprecipitation sequencing (ChIP-seq), proteomics and protein-protein interaction...

  20. A gene network simulator to assess reverse engineering algorithms.

    Science.gov (United States)

    Di Camillo, Barbara; Toffolo, Gianna; Cobelli, Claudio

    2009-03-01

    In the context of reverse engineering of biological networks, simulators are helpful to test and compare the accuracy of different reverse-engineering approaches in a variety of experimental conditions. A novel gene-network simulator is presented that resembles some of the main features of transcriptional regulatory networks related to topology, interaction among regulators of transcription, and expression dynamics. The simulator generates network topology according to the current knowledge of biological network organization, including scale-free distribution of the connectivity and clustering coefficient independent of the number of nodes in the network. It uses fuzzy logic to represent interactions among the regulators of each gene, integrated with differential equations to generate continuous data, comparable to real data for variety and dynamic complexity. Finally, the simulator accounts for saturation in the response to regulation and transcription activation thresholds and shows robustness to perturbations. It therefore provides a reliable and versatile test bed for reverse engineering algorithms applied to microarray data. Since the simulator describes regulatory interactions and expression dynamics as two distinct, although interconnected aspects of regulation, it can also be used to test reverse engineering approaches that use both microarray and protein-protein interaction data in the process of learning. A first software release is available at http://www.dei.unipd.it/~dicamill/software/netsim as an R programming language package.

  1. Generic Properties of Random Gene Regulatory Networks.

    Science.gov (United States)

    Li, Zhiyuan; Bianco, Simone; Zhang, Zhaoyang; Tang, Chao

    2013-12-01

    Modeling gene regulatory networks (GRNs) is an important topic in systems biology. Although there has been much work focusing on various specific systems, the generic behavior of GRNs with continuous variables is still elusive. In particular, it is not clear how attractors are typically partitioned among the three types of orbits (steady state, periodic and chaotic), or how the dynamical properties change with the network's topological characteristics. In this work, we first investigated these questions in random GRNs with different network sizes, connectivity, fraction of inhibitory links and transcription regulation rules. Then we searched for the core motifs that govern the dynamic behavior of large GRNs. We show that the stability of a random GRN is typically governed by a few embedding motifs of small sizes, and can therefore in general be understood in the context of these short motifs. Our results provide insights for the study and design of genetic networks.

  2. Managing uncertainty in metabolic network structure and improving predictions using EnsembleFBA.

    Directory of Open Access Journals (Sweden)

    Matthew B Biggs

    2017-03-01

    Genome-scale metabolic network reconstructions (GENREs) are repositories of knowledge about the metabolic processes that occur in an organism. GENREs have been used to discover and interpret metabolic functions, and to engineer novel network structures. A major barrier preventing more widespread use of GENREs, particularly to study non-model organisms, is the extensive time required to produce a high-quality GENRE. Many automated approaches have been developed which reduce this time requirement, but automatically-reconstructed draft GENREs still require curation before useful predictions can be made. We present a novel approach to the analysis of GENREs which improves the predictive capabilities of draft GENREs by representing many alternative network structures, all equally consistent with available data, and generating predictions from this ensemble. This ensemble approach is compatible with many reconstruction methods. We refer to this new approach as Ensemble Flux Balance Analysis (EnsembleFBA). We validate EnsembleFBA by predicting growth and gene essentiality in the model organism Pseudomonas aeruginosa UCBPP-PA14. We demonstrate how EnsembleFBA can be included in a systems biology workflow by predicting essential genes in six Streptococcus species and mapping the essential genes to small molecule ligands from DrugBank. We found that some metabolic subsystems contributed disproportionately to the set of predicted essential reactions in a way that was unique to each Streptococcus species, leading to species-specific outcomes from small molecule interactions. Through our analyses of P. aeruginosa and six Streptococci, we show that ensembles increase the quality of predictions without drastically increasing reconstruction time, thus making GENRE approaches more practical for applications which require predictions for many non-model organisms. All of our functions and accompanying example code are available in an open online repository.

  3. Circuit-wide Transcriptional Profiling Reveals Brain Region-Specific Gene Networks Regulating Depression Susceptibility.

    Science.gov (United States)

    Bagot, Rosemary C; Cates, Hannah M; Purushothaman, Immanuel; Lorsch, Zachary S; Walker, Deena M; Wang, Junshi; Huang, Xiaojie; Schlüter, Oliver M; Maze, Ian; Peña, Catherine J; Heller, Elizabeth A; Issler, Orna; Wang, Minghui; Song, Won-Min; Stein, Jason L; Liu, Xiaochuan; Doyle, Marie A; Scobie, Kimberly N; Sun, Hao Sheng; Neve, Rachael L; Geschwind, Daniel; Dong, Yan; Shen, Li; Zhang, Bin; Nestler, Eric J

    2016-06-01

    Depression is a complex, heterogeneous disorder and a leading contributor to the global burden of disease. Most previous research has focused on individual brain regions and genes contributing to depression. However, emerging evidence in humans and animal models suggests that dysregulated circuit function and gene expression across multiple brain regions drive depressive phenotypes. Here, we performed RNA sequencing on four brain regions from control animals and those susceptible or resilient to chronic social defeat stress at multiple time points. We employed an integrative network biology approach to identify transcriptional networks and key driver genes that regulate susceptibility to depressive-like symptoms. Further, we validated in vivo several key drivers and their associated transcriptional networks that regulate depression susceptibility and confirmed their functional significance at the levels of gene transcription, synaptic regulation, and behavior. Our study reveals novel transcriptional networks that control stress susceptibility and offers fundamentally new leads for antidepressant drug discovery.

  4. Population genomics of the Arabidopsis thaliana flowering time gene network.

    Science.gov (United States)

    Flowers, Jonathan M; Hanzawa, Yoshie; Hall, Megan C; Moore, Richard C; Purugganan, Michael D

    2009-11-01

    The time to flowering is a key component of the life-history strategy of the model plant Arabidopsis thaliana that varies quantitatively among genotypes. A significant problem for evolutionary and ecological genetics is to understand how natural selection may operate on this ecologically significant trait. Here, we conduct a population genomic study of resequencing data from 52 genes in the flowering time network. McDonald-Kreitman tests of neutrality suggested a strong excess of amino acid polymorphism when pooling across loci. This excess of replacement polymorphism across the flowering time network and a skewed derived frequency spectrum toward rare alleles for both replacement and noncoding polymorphisms relative to synonymous changes is consistent with a large class of deleterious polymorphisms segregating in these genes. Assuming selective neutrality of synonymous changes, we estimate that approximately 30% of amino acid polymorphisms are deleterious. Evidence of adaptive substitution is less prominent in our analysis. The photoperiod regulatory gene, CO, and a gibberellic acid transcription factor, AtMYB33, show evidence of adaptive fixation of amino acid mutations. A test for extended haplotypes revealed no examples of flowering time alleles with haplotypes comparable in length to those associated with the null fri(Col) allele reported previously. This suggests that the FRI gene likely has a uniquely intense or recent history of selection among the flowering time genes considered here. Although there is some evidence for adaptive evolution in these life-history genes, it appears that slightly deleterious polymorphisms are a major component of natural molecular variation in the flowering time network of A. thaliana.
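
The McDonald-Kreitman comparison mentioned in this abstract can be illustrated with a short Python sketch that computes the neutrality index from a 2x2 table of polymorphic and fixed sites. The counts below are hypothetical and are not taken from the study, the function names are assumptions of this sketch, and the sketch does not reproduce the authors' estimate of the deleterious fraction. A neutrality index above 1 indicates an excess of amino acid (replacement) polymorphism relative to divergence, which is the pattern the abstract reports.

```python
def neutrality_index(pn, ps, dn, ds):
    """Neutrality index NI = (Pn / Ps) / (Dn / Ds).

    pn, ps: nonsynonymous and synonymous polymorphisms within the species
    dn, ds: nonsynonymous and synonymous fixed differences between species
    """
    if min(ps, dn, ds) <= 0:
        raise ValueError("ps, dn and ds must be positive")
    return (pn / ps) / (dn / ds)


def alpha(pn, ps, dn, ds):
    """Estimated proportion of adaptive substitutions, alpha = 1 - NI."""
    return 1.0 - neutrality_index(pn, ps, dn, ds)


# Hypothetical counts chosen for illustration only (not data from the study).
ni = neutrality_index(pn=30, ps=30, dn=20, ds=40)
print(f"NI = {ni:.2f}, alpha = {alpha(30, 30, 20, 40):.2f}")
```

A negative alpha, as in this toy table, is what an excess of segregating replacement polymorphism produces; it should be read as a sign of deleterious variation rather than as a literal proportion.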

  5. Pareto evolution of gene networks: an algorithm to optimize multiple fitness objectives

    International Nuclear Information System (INIS)

    Warmflash, Aryeh; Siggia, Eric D; Francois, Paul

    2012-01-01

    The computational evolution of gene networks functions like a forward genetic screen to generate, without preconceptions, all networks that can be assembled from a defined list of parts to implement a given function. Frequently networks are subject to multiple design criteria that cannot all be optimized simultaneously. To explore how these tradeoffs interact with evolution, we implement Pareto optimization in the context of gene network evolution. In response to a temporal pulse of a signal, we evolve networks whose output turns on slowly after the pulse begins, and shuts down rapidly when the pulse terminates. The best performing networks under our conditions do not fall into categories such as feed forward and negative feedback that also encode the input–output relation we used for selection. Pareto evolution can more efficiently search the space of networks than optimization based on a single ad hoc combination of the design criteria.

  6. Pareto evolution of gene networks: an algorithm to optimize multiple fitness objectives.

    Science.gov (United States)

    Warmflash, Aryeh; Francois, Paul; Siggia, Eric D

    2012-10-01

    The computational evolution of gene networks functions like a forward genetic screen to generate, without preconceptions, all networks that can be assembled from a defined list of parts to implement a given function. Frequently networks are subject to multiple design criteria that cannot all be optimized simultaneously. To explore how these tradeoffs interact with evolution, we implement Pareto optimization in the context of gene network evolution. In response to a temporal pulse of a signal, we evolve networks whose output turns on slowly after the pulse begins, and shuts down rapidly when the pulse terminates. The best performing networks under our conditions do not fall into categories such as feed forward and negative feedback that also encode the input-output relation we used for selection. Pareto evolution can more efficiently search the space of networks than optimization based on a single ad hoc combination of the design criteria.
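
The core idea of Pareto optimization, keeping every candidate that is not beaten on all objectives at once, can be sketched in a few lines of Python. This is a generic non-dominated filter, not the authors' evolutionary algorithm; the two objectives (lower is better) and the candidate scores are hypothetical placeholders.

```python
def dominates(a, b):
    """True if score tuple a is no worse than b in every objective
    and strictly better in at least one (lower values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores):
    """Return the indices of all non-dominated score tuples."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (objective_1, objective_2) costs for five candidate networks.
candidates = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.6, 0.6), (0.9, 0.9)]
front = pareto_front(candidates)
print(front)
```

Selecting on the front rather than on a weighted sum avoids fixing a single ad hoc combination of the criteria, which is the contrast the abstract draws.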

  7. Network analysis of inflammatory genes and their transcriptional regulators in coronary artery disease.

    Directory of Open Access Journals (Sweden)

    Jiny Nair

    Network analysis is a novel method to understand the complex pathogenesis of inflammation-driven atherosclerosis. Using this approach, we attempted to identify key inflammatory genes and their core transcriptional regulators in coronary artery disease (CAD). Initially, we obtained 124 candidate genes associated with inflammation and CAD using Polysearch and the CADgene database, for which a protein-protein interaction network was generated using STRING 9.0 (Search Tool for the Retrieval of Interacting Genes) and visualized using Cytoscape v 2.8.3. Based on betweenness centrality (BC) and node degree as key topological parameters, we identified interleukin-6 (IL-6), vascular endothelial growth factor A (VEGFA), interleukin-1 beta (IL-1B), tumor necrosis factor (TNF) and prostaglandin-endoperoxide synthase 2 (PTGS2) as hub nodes. The backbone network constructed with these five hub genes showed 111 nodes connected via 348 edges, with IL-6 having the largest degree and highest BC. Nuclear factor kappa B1 (NFKB1), signal transducer and activator of transcription 3 (STAT3) and JUN were identified as the three core transcription factors from the regulatory network derived using MatInspector. For the purpose of validation of the hub genes, 97 test networks were constructed, which revealed the accuracy of the backbone network to be 0.7763 while the frequency of the hub nodes remained largely unaltered. Pathway enrichment analysis with ClueGO, KEGG and REACTOME showed significant enrichment of six validated CAD pathways: smooth muscle cell proliferation, acute-phase response, calcidiol 1-monooxygenase activity, toll-like receptor signaling, NOD-like receptor signaling and adipocytokine signaling pathways. Experimental verification of the above findings in 64 cases and 64 controls showed increased expression of the five candidate genes and the three transcription factors in the cases relative to the controls (p<0.05). Thus, analysis of complex networks aid in the
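
Ranking nodes by degree and betweenness centrality, the two topological parameters used in this record to pick hub nodes, can be sketched in plain Python. The study itself used Cytoscape; the code below is an independent illustration using Brandes' algorithm for unweighted, undirected graphs, and the toy edge list and node names are hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                   # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2.0 for v, b in bc.items()}


# Hypothetical toy network, not the network from the study.
adj = build_adjacency([("hub", "a"), ("hub", "b"), ("hub", "c"), ("c", "d")])
bc = betweenness(adj)
ranked = sorted(adj, key=lambda v: (bc[v], len(adj[v])), reverse=True)
print(ranked[:2])
```

Nodes that score highly on both measures are the natural hub candidates; in practice a dedicated graph library would be used for networks of realistic size.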

  8. On the role of sparseness in the evolution of modularity in gene regulatory networks.

    Science.gov (United States)

    Espinosa-Soto, Carlos

    2018-05-01

    Modularity is a widespread property in biological systems. It implies that interactions occur mainly within groups of system elements. A modular arrangement facilitates adjustment of one module without perturbing the rest of the system. Therefore, modularity of developmental mechanisms is a major factor for evolvability, the potential to produce beneficial variation from random genetic change. Understanding how modularity evolves in gene regulatory networks, that create the distinct gene activity patterns that characterize different parts of an organism, is key to developmental and evolutionary biology. One hypothesis for the evolution of modules suggests that interactions between some sets of genes become maladaptive when selection favours additional gene activity patterns. The removal of such interactions by selection would result in the formation of modules. A second hypothesis suggests that modularity evolves in response to sparseness, the scarcity of interactions within a system. Here I simulate the evolution of gene regulatory networks and analyse diverse experimentally sustained networks to study the relationship between sparseness and modularity. My results suggest that sparseness alone is neither sufficient nor necessary to explain modularity in gene regulatory networks. However, sparseness amplifies the effects of forms of selection that, like selection for additional gene activity patterns, already produce an increase in modularity. That evolution of new gene activity patterns is frequent across evolution also supports that it is a major factor in the evolution of modularity. That sparseness is widespread across gene regulatory networks indicates that it may have facilitated the evolution of modules in a wide variety of cases.

  9. Toxic Diatom Aldehydes Affect Defence Gene Networks in Sea Urchins.

    Directory of Open Access Journals (Sweden)

    Stefano Varrella

    Marine organisms possess a series of cellular strategies to counteract the negative effects of toxic compounds, including the massive reorganization of gene expression networks. Here we report the modulated dose-dependent response of activated genes by diatom polyunsaturated aldehydes (PUAs) in the sea urchin Paracentrotus lividus. PUAs are secondary metabolites derived from the oxidation of fatty acids; they have deleterious effects on the reproduction and development of planktonic and benthic organisms that feed on these unicellular algae, and they also show anti-cancer activity. Our previous results showed that PUAs target several genes, implicated in different functional processes in this sea urchin. Using interactomic Ingenuity Pathway Analysis we now show that the genes targeted by PUAs are correlated with four HUB genes, NF-κB, p53, δ-2-catenin and HIF1A, which have not been previously reported for P. lividus. We propose a working model describing hypothetical pathways potentially involved in toxic aldehyde stress response in sea urchins. This represents the first report on gene networks affected by PUAs, opening new perspectives in understanding the cellular mechanisms underlying the response of benthic organisms to diatom exposure.

  10. Using gene co-expression network analysis to predict biomarkers for chronic lymphocytic leukemia

    Directory of Open Access Journals (Sweden)

    Borlawsky Tara B

    2010-10-01

    Background: Chronic lymphocytic leukemia (CLL) is the most common adult leukemia. It is a highly heterogeneous disease, and can be divided roughly into indolent and progressive stages based on classic clinical markers. Immunoglobulin heavy chain variable region (IgVH) mutational status was found to be associated with patient survival outcome, and biomarkers linked to the IgVH status have been a focus in the CLL prognosis research field. However, biomarkers highly correlated with IgVH mutational status which can accurately predict the survival outcome are yet to be discovered. Results: In this paper, we investigate the use of gene co-expression network analysis to identify potential biomarkers for CLL. Specifically, we focused on the co-expression network involving ZAP70, a well characterized biomarker for CLL. We selected 23 microarray datasets corresponding to multiple types of cancer from the Gene Expression Omnibus (GEO) and used the frequent network mining algorithm CODENSE to identify highly connected gene co-expression networks spanning the entire genome, then evaluated the genes in the co-expression network in which ZAP70 is involved. We then applied a set of feature selection methods to further select genes which are capable of predicting IgVH mutation status from the ZAP70 co-expression network. Conclusions: We have identified a set of genes that are potential CLL prognostic biomarkers (IL2RB, CD8A, CD247, LAG3 and KLRK1), which can predict CLL patient IgVH mutational status with high accuracy. Their prognostic capabilities were cross-validated by applying these biomarker candidates to classify patients into different outcome groups using a CLL microarray dataset with clinical information.

  11. Common gene-network signature of different neurological disorders and their potential implications to neuroAIDS.

    Directory of Open Access Journals (Sweden)

    Vidya Sagar

    The neurological complications of AIDS (neuroAIDS) during infection with human immunodeficiency virus (HIV) are characterized by non-specific, multifaceted neurological conditions; defining specific diagnosis or treatment mechanisms for this neuro-complexity at the molecular level therefore remains elusive. Using an in silico based integrated gene network analysis we discovered that HIV infection shares convergent gene networks with each of twelve neurological disorders selected in this study. Importantly, a common gene network was identified among HIV infection, Alzheimer's disease, Parkinson's disease, multiple sclerosis, and age-related macular degeneration. An mRNA microarray analysis in HIV-infected monocytes showed significant changes in the expression of several genes of this in silico derived common pathway, which suggests the possible physiological relevance of this gene circuit in driving the neuroAIDS condition. Further, this unique gene network was compared with another in silico derived novel, convergent gene network which is shared by seven major neurological disorders (Alzheimer's disease, Parkinson's disease, multiple sclerosis, age-related macular degeneration, amyotrophic lateral sclerosis, vascular dementia, and restless leg syndrome). These networks differed in their gene circuits; however, they largely involved innate immunity signaling pathways, which suggests commonalities in the immunological basis of different neuropathogenesis. The common gene circuits reported here can provide a prospective platform to understand how gene circuits belonging to other neuro-disorders may be convoluted during the real-time neuroAIDS condition, and may elucidate the underlying, and so far unknown, genetic overlap between HIV infection and neuroAIDS risk. Also, it may lead to a new paradigm in understanding disease progression, identifying biomarkers, and developing therapies.

  12. Reconstruction of neutron spectra using neural networks starting from the Bonner spheres spectrometric system

    International Nuclear Information System (INIS)

    Ortiz R, J.M.; Martinez B, M.R.; Arteaga A, T.; Vega C, H.R.; Hernandez D, V.M.; Manzanares A, E.

    2005-01-01

    Artificial neural networks (ANNs) have been used successfully to solve a wide variety of problems. However, determining an appropriate set of values for their structural and learning parameters remains a difficult task. In contrast to previous works, here a set of neural networks is designed to reconstruct neutron spectra from the count rates of the detectors of the Bonner spheres system, using a systematic and experimental strategy for the robust design of multilayer feed-forward neural networks trained by backpropagation. The robust design is formulated as a Taguchi parameter design problem. A set of 53 neutron spectra compiled by the International Atomic Energy Agency was selected, the count rates that they would produce in a Bonner spheres system were calculated, and the set was arranged according to the shape of the spectra. With these data, and applying the Taguchi methodology to determine the best parameters of the network topology, the network was trained and tested with the spectra.

  13. iCN718, an Updated and Improved Genome-Scale Metabolic Network Reconstruction of Acinetobacter baumannii AYE.

    Science.gov (United States)

    Norsigian, Charles J; Kavvas, Erol; Seif, Yara; Palsson, Bernhard O; Monk, Jonathan M

    2018-01-01

    Acinetobacter baumannii has become an urgent clinical threat due to the recent emergence of multi-drug resistant strains. There is thus a significant need to discover new therapeutic targets in this organism. One means for doing so is through the use of high-quality genome-scale reconstructions. Well-curated and accurate genome-scale models (GEMs) of A. baumannii would be useful for improving treatment options. We present an updated and improved genome-scale reconstruction of A. baumannii AYE, named iCN718, that improves and standardizes previous A. baumannii AYE reconstructions. iCN718 has 80% accuracy for predicting gene essentiality data and additionally can predict large-scale phenotypic data with as much as 89% accuracy, a new capability for an A. baumannii reconstruction. We further demonstrate that iCN718 can be used to analyze conserved metabolic functions in the A. baumannii core genome and to build strain-specific GEMs of 74 other A. baumannii strains from genome sequence alone. iCN718 will serve as a resource to integrate and synthesize new experimental data being generated for this urgent threat pathogen.

  14. ReTrOS: a MATLAB toolbox for reconstructing transcriptional activity from gene and protein expression data.

    Science.gov (United States)

    Minas, Giorgos; Momiji, Hiroshi; Jenkins, Dafyd J; Costa, Maria J; Rand, David A; Finkenstädt, Bärbel

    2017-06-26

    Given the development of high-throughput experimental techniques, an increasing number of whole genome transcription profiling time series data sets, with good temporal resolution, are becoming available to researchers. The ReTrOS toolbox (Reconstructing Transcription Open Software) provides MATLAB-based implementations of two related methods, namely ReTrOS-Smooth and ReTrOS-Switch, for reconstructing the temporal transcriptional activity profile of a gene from given mRNA expression time series or protein reporter time series. The methods are based on fitting a differential equation model incorporating the processes of transcription, translation and degradation. The toolbox provides a framework for model fitting along with statistical analyses of the model with a graphical interface and model visualisation. We highlight several applications of the toolbox, including the reconstruction of the temporal cascade of transcriptional activity inferred from mRNA expression data and protein reporter data in the core circadian clock in Arabidopsis thaliana, and how such reconstructed transcription profiles can be used to study the effects of different cell lines and conditions. The ReTrOS toolbox allows users to analyse gene and/or protein expression time series where, with appropriate formulation of prior information about a minimum of kinetic parameters, in particular rates of degradation, users are able to infer timings of changes in transcriptional activity. Data from any organism and obtained from a range of technologies can be used as input due to the flexible and generic nature of the model and implementation. The output from this software provides a useful analysis of time series data and can be incorporated into further modelling approaches or in hypothesis generation.
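
The kind of differential equation model this abstract refers to, coupling transcription, translation and degradation, can be illustrated with a generic two-variable sketch in Python. The equations dM/dt = tau(t) - delta_m * M and dP/dt = alpha * P_source - delta_p * P (with P_source = M) are a common textbook form assumed here for illustration; they are not taken from the ReTrOS toolbox, which is MATLAB-based, and all parameter values and the switching transcription rate are hypothetical.

```python
def simulate(tau, delta_m, alpha, delta_p, t_end, dt=0.01, m0=0.0, p0=0.0):
    """Forward-Euler integration of
        dM/dt = tau(t) - delta_m * M      (transcription, mRNA degradation)
        dP/dt = alpha * M - delta_p * P   (translation, protein degradation)
    Returns lists of time points, mRNA levels and protein levels."""
    n_steps = int(round(t_end / dt))
    times, mrna, protein = [0.0], [m0], [p0]
    m, p = m0, p0
    for k in range(n_steps):
        t = k * dt
        dm = tau(t) - delta_m * m
        dp = alpha * m - delta_p * p
        m += dt * dm
        p += dt * dp
        times.append(t + dt)
        mrna.append(m)
        protein.append(p)
    return times, mrna, protein


def switching_rate(t):
    """Hypothetical transcription rate that switches down at t = 50."""
    return 2.0 if t < 50.0 else 0.5


times, mrna, protein = simulate(
    switching_rate, delta_m=0.5, alpha=1.0, delta_p=0.25, t_end=100.0
)
print(round(mrna[-1], 3), round(protein[-1], 3))
```

Reconstructing transcriptional activity is the inverse of this simulation: given observed mRNA or protein series and prior information on the degradation rates, one infers the transcription profile tau(t), including the timing of any switches.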

  15. CoryneRegNet: An ontology-based data warehouse of corynebacterial transcription factors and regulatory networks

    Directory of Open Access Journals (Sweden)

    Czaja Lisa F

    2006-02-01

    Background: The application of DNA microarray technology in post-genomic analysis of bacterial genome sequences has allowed the generation of huge amounts of data related to regulatory networks. This data along with literature-derived knowledge on regulation of gene expression has opened the way for genome-wide reconstruction of transcriptional regulatory networks. These large-scale reconstructions can be converted into in silico models of bacterial cells that allow a systematic analysis of network behavior in response to changing environmental conditions. Description: CoryneRegNet was designed to facilitate the genome-wide reconstruction of transcriptional regulatory networks of corynebacteria relevant in biotechnology and human medicine. During the import and integration process of data derived from experimental studies or literature knowledge CoryneRegNet generates links to genome annotations, to identified transcription factors and to the corresponding cis-regulatory elements. CoryneRegNet is based on a multi-layered, hierarchical and modular concept of transcriptional regulation and was implemented by using the relational database management system MySQL and an ontology-based data structure. Reconstructed regulatory networks can be visualized by using the yFiles Java graph library. As an application example of CoryneRegNet, we have reconstructed the global transcriptional regulation of a cellular module involved in SOS and stress response of corynebacteria. Conclusion: CoryneRegNet is an ontology-based data warehouse that allows pertinent data management of regulatory interactions along with the genome-scale reconstruction of transcriptional regulatory networks. These models can further be combined with metabolic networks to build integrated models of cellular function including both metabolism and its transcriptional regulation.

  16. Deregulation of an imprinted gene network in prostate cancer.

    Science.gov (United States)

    Ribarska, Teodora; Goering, Wolfgang; Droop, Johanna; Bastian, Klaus-Marius; Ingenwerth, Marc; Schulz, Wolfgang A

    2014-05-01

    Multiple epigenetic alterations contribute to prostate cancer progression by deregulating gene expression. Epigenetic mechanisms, especially differential DNA methylation at imprinting control regions (termed DMRs), normally ensure the exclusive expression of imprinted genes from one specific parental allele. We therefore wondered to what extent imprinted genes become deregulated in prostate cancer and, if so, whether deregulation is due to altered DNA methylation at DMRs. To address this, we selected presumptive deregulated imprinted genes from a previously conducted in silico analysis and from the literature and analyzed their expression in prostate cancer tissues by qRT-PCR. We found significantly diminished expression of PLAGL1/ZAC1, MEG3, NDN, CDKN1C, IGF2, and H19, while LIT1 was significantly overexpressed. The PPP1R9A gene, which is imprinted in selected tissues only, was strongly overexpressed, but was expressed biallelically in benign and cancerous prostatic tissues. Expression of many of these genes was strongly correlated, suggesting co-regulation, as in an imprinted gene network (IGN) reported in mice. Deregulation of the network genes also correlated with EZH2 and HOXC6 overexpression. Pyrosequencing analysis of all relevant DMRs revealed generally stable DNA methylation between benign and cancerous prostatic tissues, but frequent hypo- and hyper-methylation was observed at the H19 DMR in both benign and cancerous tissues. Re-expression of the ZAC1 transcription factor induced H19, CDKN1C and IGF2, supporting its function as a nodal regulator of the IGN. Our results indicate that a group of imprinted genes are coordinately deregulated in prostate cancers, independently of DNA methylation changes.

  17. Ground rules of the pluripotency gene regulatory network.

    KAUST Repository

    Li, Mo

    2017-01-03

    Pluripotency is a state that exists transiently in the early embryo and, remarkably, can be recapitulated in vitro by deriving embryonic stem cells or by reprogramming somatic cells to become induced pluripotent stem cells. The state of pluripotency, which is stabilized by an interconnected network of pluripotency-associated genes, integrates external signals and exerts control over the decision between self-renewal and differentiation at the transcriptional, post-transcriptional and epigenetic levels. Recent evidence of alternative pluripotency states indicates the regulatory flexibility of this network. Insights into the underlying principles of the pluripotency network may provide unprecedented opportunities for studying development and for regenerative medicine.

  18. Ground rules of the pluripotency gene regulatory network.

    KAUST Repository

    Li, Mo; Belmonte, Juan Carlos Izpisua

    2017-01-01

    Pluripotency is a state that exists transiently in the early embryo and, remarkably, can be recapitulated in vitro by deriving embryonic stem cells or by reprogramming somatic cells to become induced pluripotent stem cells. The state of pluripotency, which is stabilized by an interconnected network of pluripotency-associated genes, integrates external signals and exerts control over the decision between self-renewal and differentiation at the transcriptional, post-transcriptional and epigenetic levels. Recent evidence of alternative pluripotency states indicates the regulatory flexibility of this network. Insights into the underlying principles of the pluripotency network may provide unprecedented opportunities for studying development and for regenerative medicine.

  19. A big data pipeline: Identifying dynamic gene regulatory networks from time-course Gene Expression Omnibus data with applications to influenza infection.

    Science.gov (United States)

    Carey, Michelle; Ramírez, Juan Camilo; Wu, Shuang; Wu, Hulin

    2018-07-01

    A biological host response to an external stimulus or intervention such as a disease or infection is a dynamic process, which is regulated by an intricate network of many genes and their products. Understanding the dynamics of this gene regulatory network allows us to infer the mechanisms involved in a host response to an external stimulus, and hence aids the discovery of biomarkers of phenotype and biological function. In this article, we propose a modeling/analysis pipeline for dynamic gene expression data, called Pipeline4DGEData, which consists of a series of statistical modeling techniques to construct dynamic gene regulatory networks from the large volumes of high-dimensional time-course gene expression data that are freely available in the Gene Expression Omnibus repository. This pipeline has a consistent and scalable structure that allows it to simultaneously analyze a large number of time-course gene expression data sets, and then integrate the results across different studies. We apply the proposed pipeline to influenza infection data from nine studies and demonstrate that interesting biological findings can be discovered with its implementation.

  20. Memory functions reveal structural properties of gene regulatory networks

    Science.gov (United States)

    Perez-Carrasco, Ruben

    2018-01-01

    Gene regulatory networks (GRNs) control cellular function and decision making during tissue development and homeostasis. Mathematical tools based on dynamical systems theory are often used to model these networks, but the size and complexity of these models mean that their behaviour is not always intuitive and the underlying mechanisms can be difficult to decipher. For this reason, methods that simplify and aid exploration of complex networks are necessary. To this end we develop a broadly applicable form of the Zwanzig-Mori projection. By first converting a thermodynamic state ensemble model of gene regulation into mass action reactions we derive a general method that produces a set of time evolution equations for a subset of components of a network. The influence of the rest of the network, the bulk, is captured by memory functions that describe how the subnetwork reacts to its own past state via components in the bulk. These memory functions provide probes of near-steady state dynamics, revealing information not easily accessible otherwise. We illustrate the method on a simple cross-repressive transcriptional motif to show that memory functions not only simplify the analysis of the subnetwork but also have a natural interpretation. We then apply the approach to a GRN from the vertebrate neural tube, a well characterised developmental transcriptional network composed of four interacting transcription factors. The memory functions reveal the function of specific links within the neural tube network and identify features of the regulatory structure that specifically increase the robustness of the network to initial conditions. Taken together, the study provides evidence that Zwanzig-Mori projections offer powerful and effective tools for simplifying and exploring the behaviour of GRNs. PMID:29470492

  1. Determining Regulatory Networks Governing the Differentiation of Embryonic Stem Cells to Pancreatic Lineage

    Science.gov (United States)

    Banerjee, Ipsita

    2009-03-01

    Knowledge of the pathways governing cellular differentiation to a specific phenotype will enable generation of desired cell fates through careful alteration of the governing network by adequate manipulation of the cellular environment. With this aim, we have developed a novel method to reconstruct the underlying regulatory architecture of a differentiating cell population from discrete temporal gene expression data. We utilize an inherent feature of biological networks, that of sparsity, in formulating the network reconstruction problem as a bi-level mixed-integer programming problem. The formulation optimizes the network topology at the upper level and the network connectivity strength at the lower level. The method is first validated on in-silico data before being applied to the complex system of embryonic stem (ES) cell differentiation. This formulation enables efficient identification of the underlying network topology, which could accurately predict the steps necessary for directing differentiation to subsequent stages. Concurrent experimental verification demonstrated excellent agreement with model prediction.

  2. An additional k-means clustering step improves the biological features of WGCNA gene co-expression networks.

    Science.gov (United States)

    Botía, Juan A; Vandrovcova, Jana; Forabosco, Paola; Guelfi, Sebastian; D'Sa, Karishma; Hardy, John; Lewis, Cathryn M; Ryten, Mina; Weale, Michael E

    2017-04-12

    Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for the generation of gene co-expression networks (GCN). WGCNA generates both a GCN and a derived partitioning of clusters of genes (modules). We propose k-means clustering as an additional processing step to conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network, https://github.com/juanbot/km2gcn). We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (x3.1 on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs); (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices. The results obtained from our investigations indicate that our k-means method, applied as an adjunct to standard WGCNA, results in better network partitions. These improved partitions enable more fruitful downstream analyses, as gene modules are more biologically meaningful.
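
    The idea of refining an existing module partition with k-means can be illustrated with a minimal sketch. It assumes that each module's centroid is the mean expression profile of its genes and that genes are reassigned by squared Euclidean distance; the abstract does not state which centroid or distance km2gcn actually uses, so the function names, the toy data, and both of these choices are hypothetical.

```python
def mean_profile(rows):
    """Column-wise mean of a list of equal-length expression profiles."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def refine_modules(profiles, labels, max_iter=50):
    """Refine an initial module assignment with k-means style iterations.

    Assumption (not taken from the abstract): centroids are mean profiles
    and genes move to the nearest centroid until nothing changes.
    """
    labels = list(labels)
    for _ in range(max_iter):
        modules = sorted(set(labels))
        centroids = {
            m: mean_profile([p for p, lab in zip(profiles, labels) if lab == m])
            for m in modules
        }
        new_labels = [
            min(modules, key=lambda m: sq_dist(p, centroids[m]))
            for p in profiles
        ]
        if new_labels == labels:  # converged: no gene changed module
            break
        labels = new_labels
    return labels


# Made-up toy data: the last gene resembles module "A" but starts in "B".
profiles = [[1.0, 1.0], [1.2, 0.9], [5.0, 5.0], [5.1, 4.8], [0.9, 1.1]]
initial = ["A", "A", "B", "B", "B"]
refined = refine_modules(profiles, initial)
print(refined)
```

    In this toy case the misplaced gene moves to the module whose centroid it resembles, which is the kind of correction the abstract describes as "few or zero misplaced genes".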

  3. Variable disparity estimation based intermediate view reconstruction in dynamic flow allocation over EPON-based access networks

    Science.gov (United States)

    Bae, Kyung-Hoon; Lee, Jungjoon; Kim, Eun-Soo

    2008-06-01

    In this paper, a variable disparity estimation (VDE)-based intermediate view reconstruction (IVR) in dynamic flow allocation (DFA) over an Ethernet passive optical network (EPON)-based access network is proposed. In the proposed system, the stereoscopic images are estimated by a variable block-matching algorithm (VBMA), and they are transmitted to the receiver through DFA over EPON. This scheme improves a priority-based access network by converting it to a flow-based access network with a new access mechanism and scheduling algorithm, and then 16-view images are synthesized by the IVR using VDE. Some experimental results indicate that the proposed system improves the peak-signal-to-noise ratio (PSNR) to as high as 4.86 dB and reduces the processing time to 3.52 s. Additionally, the network service provider can provide upper limits of transmission delays by the flow. The modeling and simulation results, including mathematical analyses, from this scheme are also provided.

  4. Chronic obstructive pulmonary disease candidate gene prioritization based on metabolic networks and functional information.

    Directory of Open Access Journals (Sweden)

    Xinyan Wang

    Chronic obstructive pulmonary disease (COPD) is a multi-factor disease in which metabolic disturbances play important roles. In this paper, functional information was integrated into a COPD-related metabolic network to assess similarity between genes. Then a gene prioritization method was applied to the COPD-related metabolic network to prioritize COPD candidate genes. The gene prioritization method was superior to ToppGene and ToppNet in both literature validation and functional enrichment analysis. Top-ranked genes prioritized from the metabolic perspective with functional information could promote a better understanding of the molecular mechanism of this disease. The top 100 genes might be potential markers for diagnosis and effective therapies.

  5. Genotet: An Interactive Web-based Visual Exploration Framework to Support Validation of Gene Regulatory Networks.

    Science.gov (United States)

    Yu, Bowen; Doraiswamy, Harish; Chen, Xi; Miraldi, Emily; Arrieta-Ortiz, Mario Luis; Hafemeister, Christoph; Madar, Aviv; Bonneau, Richard; Silva, Cláudio T

    2014-12-01

    Elucidation of transcriptional regulatory networks (TRNs) is a fundamental goal in biology, and among the most important components of TRNs are transcription factors (TFs), proteins that specifically bind to gene promoter and enhancer regions to alter target gene expression patterns. Advances in genomic technologies as well as advances in computational biology have led to multiple large regulatory network models (directed networks), each with a large corpus of supporting data and gene annotation. There are multiple possible biological motivations for exploring large regulatory network models, including: validating TF-target gene relationships, figuring out co-regulation patterns, and exploring the coordination of cell processes in response to changes in cell state or environment. Here we focus on queries aimed at validating regulatory network models, and on coordinating visualization of primary data and directed weighted gene regulatory networks. The large size of both the network models and the primary data can make such coordinated queries cumbersome with existing tools and, in particular, inhibits the sharing of results between collaborators. In this work, we develop and demonstrate a web-based framework for coordinating visualization and exploration of expression data (RNA-seq, microarray), network models and gene-binding data (ChIP-seq). Using specialized data structures and multiple coordinated views, we design an efficient querying model to support interactive analysis of the data. Finally, we show the effectiveness of our framework through case studies for the mouse immune system (a dataset focused on a subset of key cellular functions) and a model bacterium (a small genome with high data-completeness).

  6. High resolution depth reconstruction from monocular images and sparse point clouds using deep convolutional neural network

    Science.gov (United States)

    Dimitrievski, Martin; Goossens, Bart; Veelaert, Peter; Philips, Wilfried

    2017-09-01

    Understanding the 3D structure of the environment is advantageous for many tasks in the field of robotics and autonomous vehicles. From the robot's point of view, 3D perception is often formulated as a depth image reconstruction problem. In the literature, dense depth images are often recovered deterministically from stereo image disparities. Other systems use an expensive LiDAR sensor to produce accurate, but semi-sparse depth images. With the advent of deep learning there have also been attempts to estimate depth by only using monocular images. In this paper we combine the best of the two worlds, focusing on a combination of monocular images and low cost LiDAR point clouds. We explore the idea that very sparse depth information accurately captures the global scene structure while variations in image patches can be used to reconstruct local depth to a high resolution. The main contribution of this paper is a supervised learning depth reconstruction system based on a deep convolutional neural network. The network is trained on RGB image patches reinforced with sparse depth information and the output is a depth estimate for each pixel. Using image and point cloud data from the KITTI vision dataset we are able to learn a correspondence between local RGB information and local depth, while at the same time preserving the global scene structure. Our results are evaluated on sequences from the KITTI dataset and our own recordings using a low cost camera and LiDAR setup.

  7. Analysis of the robustness of network-based disease-gene prioritization methods reveals redundancy in the human interactome and functional diversity of disease-genes.

    Directory of Open Access Journals (Sweden)

    Emre Guney

    Complex biological systems usually pose a trade-off between robustness and fragility where a small number of perturbations can substantially disrupt the system. Although biological systems are robust against changes in many external and internal conditions, even a single mutation can perturb the system substantially, giving rise to a pathophenotype. Recent advances in identifying and analyzing the sequential variations beneath human disorders help to comprehend a systemic view of the mechanisms underlying various disease phenotypes. Network-based disease-gene prioritization methods rank the relevance of genes in a disease under the hypothesis that genes whose proteins interact with each other tend to exhibit similar phenotypes. In this study, we have tested the robustness of several network-based disease-gene prioritization methods with respect to the perturbations of the system using various disease phenotypes from the Online Mendelian Inheritance in Man database. These perturbations have been introduced either in the protein-protein interaction network or in the set of known disease-gene associations. As the network-based disease-gene prioritization methods are based on the connectivity between known disease-gene associations, we have further used these methods to categorize the pathophenotypes with respect to the recoverability of hidden disease-genes. Our results have suggested that, in general, disease-genes are connected through multiple paths in the human interactome. Moreover, even when these paths are disturbed, network-based prioritization can reveal hidden disease-gene associations in some pathophenotypes such as breast cancer, cardiomyopathy, diabetes, leukemia, Parkinson disease and obesity to a greater extent compared to the rest of the pathophenotypes tested in this study. Gene Ontology (GO) analysis highlighted the role of functional diversity for such diseases.

  8. Structural influence of gene networks on their inference: analysis of C3NET

    Directory of Open Access Journals (Sweden)

    Emmert-Streib Frank

    2011-06-01

    Background: The availability of large-scale high-throughput data poses considerable challenges for their functional analysis. For this reason, gene network inference methods have gained considerable interest. However, our current knowledge, especially about the influence of the structure of a gene network on its inference, is limited. Results: In this paper we present a comprehensive investigation of the structural influence of gene networks on the inferential characteristics of C3NET, a recently introduced gene network inference algorithm. We employ local as well as global performance metrics in combination with an ensemble approach. The results from our numerical study for various biological and synthetic network structures and simulation conditions, also comparing C3NET with other inference algorithms, lead to a multitude of theoretical and practical insights into the working behavior of C3NET. In addition, in order to facilitate the practical usage of C3NET we provide a user-friendly R package, called c3net, and describe its functionality. It is available from https://r-forge.r-project.org/projects/c3net and from the CRAN package repository. Conclusions: The availability of gene network inference algorithms with known inferential properties opens a new era of large-scale screening experiments that could be equally beneficial for basic biological and biomedical research with auspicious prospects. The availability of our easy-to-use software package c3net may contribute to the popularization of such methods. Reviewers: This article was reviewed by Lev Klebanov, Joel Bader and Yuriy Gusev.

  9. Listening to the Noise: Random Fluctuations Reveal Gene Network Parameters

    Science.gov (United States)

    Munsky, Brian; Trinh, Brooke; Khammash, Mustafa

    2010-03-01

    The cellular environment is abuzz with noise originating from the inherent random motion of reacting molecules in the living cell. In this noisy environment, clonal cell populations exhibit cell-to-cell variability that can manifest significant prototypical differences. Noise-induced stochastic fluctuations in cellular constituents can be measured and their statistics quantified using flow cytometry, single molecule fluorescence in situ hybridization, time lapse fluorescence microscopy and other single cell and single molecule measurement techniques. We show that these random fluctuations carry within them valuable information about the underlying genetic network. Far from being a nuisance, the ever-present cellular noise acts as a rich source of excitation that, when processed through a gene network, carries its distinctive fingerprint that encodes a wealth of information about that network. We demonstrate that in some cases the analysis of these random fluctuations enables the full identification of network parameters, including those that may otherwise be difficult to measure. We use theoretical investigations to establish experimental guidelines for the identification of gene regulatory networks, and we apply these guidelines to experimentally identify predictive models for different regulatory mechanisms in bacteria and yeast.

  10. A systems approach identifies networks and genes linking sleep and stress: implications for neuropsychiatric disorders.

    Science.gov (United States)

    Jiang, Peng; Scarpa, Joseph R; Fitzpatrick, Karrie; Losic, Bojan; Gao, Vance D; Hao, Ke; Summa, Keith C; Yang, He S; Zhang, Bin; Allada, Ravi; Vitaterna, Martha H; Turek, Fred W; Kasarskis, Andrew

    2015-05-05

    Sleep dysfunction and stress susceptibility are comorbid complex traits that often precede and predispose patients to a variety of neuropsychiatric diseases. Here, we demonstrate multilevel organizations of genetic landscape, candidate genes, and molecular networks associated with 328 stress and sleep traits in a chronically stressed population of 338 (C57BL/6J × A/J) F2 mice. We constructed striatal gene co-expression networks, revealing functionally and cell-type-specific gene co-regulations important for stress and sleep. Using a composite ranking system, we identified network modules most relevant for 15 independent phenotypic categories, highlighting a mitochondria/synaptic module that links sleep and stress. The key network regulators of this module are overrepresented with genes implicated in neuropsychiatric diseases. Our work suggests that the interplay among sleep, stress, and neuropathology emerges from genetic influences on gene expression and their collective organization through complex molecular networks, providing a framework for interrogating the mechanisms underlying sleep, stress susceptibility, and related neuropsychiatric disorders. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  11. Parallel logic gates in synthetic gene networks induced by non-Gaussian noise.

    Science.gov (United States)

    Xu, Yong; Jin, Xiaoqin; Zhang, Huiqing

    2013-11-01

    The recent idea of logical stochastic resonance is verified in synthetic gene networks induced by non-Gaussian noise. We realize the switching between two kinds of logic gates under optimal moderate noise intensity by varying two different tunable parameters in a single gene network. Furthermore, in order to obtain more logic operations, thus providing additional information processing capacity, we obtain two complementary logic gates in a two-dimensional toggle switch model and realize the transformation between the two logic gates by changing different parameters. These simulated results contribute to improving the computational power and functionality of the networks.

  12. Genes and Gene Networks Involved in Sodium Fluoride-Elicited Cell Death Accompanying Endoplasmic Reticulum Stress in Oral Epithelial Cells

    Directory of Open Access Journals (Sweden)

    Yoshiaki Tabuchi

    2014-05-01

    Here, to understand the molecular mechanisms underlying cell death induced by sodium fluoride (NaF), we analyzed gene expression patterns in rat oral epithelial ROE2 cells exposed to NaF using global-scale microarrays and bioinformatics tools. A relatively high concentration of NaF (2 mM) induced cell death concomitant with decreases in mitochondrial membrane potential, chromatin condensation and caspase-3 activation. Using 980 probe sets, we identified 432 up-regulated and 548 down-regulated genes that were differentially expressed by >2.5-fold in the cells treated with 2 mM of NaF and categorized them into 4 groups by K-means clustering. Ingenuity® pathway analysis revealed several gene networks from gene clusters. The gene networks Up-I and Up-II included many up-regulated genes that were mainly associated with the biological function of induction or prevention of cell death, respectively, such as Atf3, Ddit3 and Fos (for Up-I) and Atf4 and Hspa5 (for Up-II). Interestingly, knockdown of Ddit3 and Hspa5 significantly increased and decreased the number of viable cells, respectively. Moreover, several endoplasmic reticulum (ER) stress-related genes, including Ddit3, Atf4 and Hspa5, were observed in these gene networks. These findings will provide further insight into the molecular mechanisms of NaF-induced cell death accompanying ER stress in oral epithelial cells.

  13. Integration of heterogeneous molecular networks to unravel gene-regulation in Mycobacterium tuberculosis.

    Science.gov (United States)

    van Dam, Jesse C J; Schaap, Peter J; Martins dos Santos, Vitor A P; Suárez-Diez, María

    2014-09-26

    Different methods have been developed to infer regulatory networks from heterogeneous omics datasets and to construct co-expression networks. Each algorithm produces different networks and efforts have been devoted to automatically integrate them into consensus sets. However, each separate set has an intrinsic value that is diluted and partly lost when building a consensus network. Here we present a methodology to generate co-expression networks and, instead of a consensus network, we propose an integration framework where the different networks are kept and analysed with additional tools to efficiently combine the information extracted from each network. We developed a workflow to efficiently analyse information generated by different inference and prediction methods. Our methodology relies on providing the user the means to simultaneously visualise and analyse the coexisting networks generated by different algorithms, heterogeneous datasets, and a suite of analysis tools. As a showcase, we have analysed the gene co-expression networks of Mycobacterium tuberculosis generated using over 600 expression experiments. Regarding DNA damage repair, we identified SigC as a key control element, 12 new targets for LexA, an updated LexA binding motif, and a potential mismatch repair system. We expanded the DevR regulon with 27 genes while identifying 9 targets wrongly assigned to this regulon. We discovered 10 new genes linked to zinc uptake and a new regulatory mechanism for ZuR. The use of co-expression networks to perform system level analysis allows the development of custom made methodologies. As showcases we implemented a pipeline to integrate ChIP-seq data and another method to uncover multiple regulatory layers. Our workflow is based on representing the multiple types of information as network representations and presenting these networks in a synchronous framework that allows their simultaneous visualization while keeping specific associations from the different

  14. Analyzing the genes related to Alzheimer's disease via a network and pathway-based approach.

    Science.gov (United States)

    Hu, Yan-Shi; Xin, Juncai; Hu, Ying; Zhang, Lei; Wang, Ju

    2017-04-27

    Our understanding of the molecular mechanisms underlying Alzheimer's disease (AD) remains incomplete. Previous studies have revealed that genetic factors provide a significant contribution to the pathogenesis and development of AD. In the past years, numerous genes implicated in this disease have been identified via genetic association studies on candidate genes or at the genome-wide level. However, in many cases, the roles of these genes and their interactions in AD are still unclear. A comprehensive and systematic analysis focusing on the biological function and interactions of these genes in the context of AD will therefore provide valuable insights to understand the molecular features of the disease. In this study, we collected genes potentially associated with AD by screening publications on genetic association studies deposited in PubMed. The major biological themes linked with these genes were then revealed by function and biochemical pathway enrichment analysis, and the relation between the pathways was explored by pathway crosstalk analysis. Furthermore, the network features of these AD-related genes were analyzed in the context of human interactome and an AD-specific network was inferred using the Steiner minimal tree algorithm. We compiled 430 human genes reported to be associated with AD from 823 publications. Biological theme analysis indicated that the biological processes and biochemical pathways related to neurodevelopment, metabolism, cell growth and/or survival, and immunology were enriched in these genes. Pathway crosstalk analysis then revealed that the significantly enriched pathways could be grouped into three interlinked modules (a neuronal and metabolic module, a cell growth/survival and neuroendocrine pathway module, and an immune response-related module), indicating an AD-specific immune-endocrine-neuronal regulatory network. Furthermore, an AD-specific protein network was inferred and novel genes potentially associated with AD were identified.
By

  15. Metabolic Network Topology Reveals Transcriptional Regulatory Signatures of Type 2 Diabetes

    DEFF Research Database (Denmark)

    Zelezniak, Aleksej; Pers, Tune Hannes; Pinho Soares, Simao Pedro

    2010-01-01

    mechanisms underlying these transcriptional changes and their impact on the cellular metabolic phenotype is a challenging task due to the complexity of transcriptional regulation and the highly interconnected nature of the metabolic network. In this study we integrate skeletal muscle gene expression datasets...... with human metabolic network reconstructions to identify key metabolic regulatory features of T2DM. These features include reporter metabolites—metabolites with significant collective transcriptional response in the associated enzyme-coding genes, and transcription factors with significant enrichment...... factor regulatory network connecting several parts of metabolism. The identified transcription factors include members of the CREB, NRF1 and PPAR family, among others, and represent regulatory targets for further experimental analysis. Overall, our results provide a holistic picture of key metabolic...

  16. Occluded object reconstruction for first responders with augmented reality glasses using conditional generative adversarial networks

    OpenAIRE

    Yun, Kyongsik; Lu, Thomas; Chow, Edward

    2018-01-01

    Firefighters face a variety of life-threatening risks, including line-of-duty deaths, injuries, and exposures to hazardous substances. Support for reducing these risks is important. We built a partially occluded object reconstruction method on augmented reality glasses for first responders. We used deep learning based on conditional generative adversarial networks to train associations between the various images of flammable and hazardous objects and their partially occluded counterparts....

  17. Integrative analysis for finding genes and networks involved in diabetes and other complex diseases

    DEFF Research Database (Denmark)

    Bergholdt, R.; Størling, Zenia, Marian; Hansen, Kasper Lage

    2007-01-01

    We have developed an integrative analysis method combining genetic interactions, identified using type 1 diabetes genome scan data, and a high-confidence human protein interaction network. Resulting networks were ranked by the significance of the enrichment of proteins from interacting regions. We...... identified a number of new protein network modules and novel candidate genes/proteins for type 1 diabetes. We propose this type of integrative analysis as a general method for the elucidation of genes and networks involved in diabetes and other complex diseases....

  18. Comparison of evolutionary algorithms in gene regulatory network model inference.

    LENUS (Irish Health Repository)

    2010-01-01

    ABSTRACT: BACKGROUND: The evolution of high throughput technologies that measure gene expression levels has created a data base for inferring GRNs (a process also known as reverse engineering of GRNs). However, the nature of these data has made this process very difficult. At the moment, several methods of discovering qualitative causal relationships between genes with high accuracy from microarray data exist, but large scale quantitative analysis on real biological datasets cannot be performed, to date, as existing approaches are not suitable for real microarray data which are noisy and insufficient. RESULTS: This paper performs an analysis of several existing evolutionary algorithms for quantitative gene regulatory network modelling. The aim is to present the techniques used and offer a comprehensive comparison of approaches, under a common framework. Algorithms are applied to both synthetic and real gene expression data from DNA microarrays, and ability to reproduce biological behaviour, scalability and robustness to noise are assessed and compared. CONCLUSIONS: Presented is a comparison framework for assessment of evolutionary algorithms, used to infer gene regulatory networks. Promising methods are identified and a platform for development of appropriate model formalisms is established.

  19. Development and application of an interaction network ontology for literature mining of vaccine-associated gene-gene interactions.

    Science.gov (United States)

    Hur, Junguk; Özgür, Arzucan; Xiang, Zuoshuang; He, Yongqun

    2015-01-01

    Literature mining of gene-gene interactions has been enhanced by ontology-based name classifications. However, in biomedical literature mining, interaction keywords have not been carefully studied and used beyond a collection of keywords. In this study, we report the development of a new Interaction Network Ontology (INO) that classifies >800 interaction keywords and incorporates interaction terms from the PSI Molecular Interactions (PSI-MI) and Gene Ontology (GO). Using INO-based literature mining results, a modified Fisher's exact test was established to analyze significantly over- and under-represented enriched gene-gene interaction types within a specific area. Such a strategy was applied to study the vaccine-mediated gene-gene interactions using all PubMed abstracts. The Vaccine Ontology (VO) and INO were used to support the retrieval of vaccine terms and interaction keywords from the literature. INO is aligned with the Basic Formal Ontology (BFO) and imports terms from 10 other existing ontologies. Current INO includes 540 terms. In terms of interaction-related terms, INO imports and aligns PSI-MI and GO interaction terms and includes over 100 newly generated ontology terms with 'INO_' prefix. A new annotation property, 'has literature mining keywords', was generated to allow the listing of different keywords mapping to the interaction types in INO. Using all PubMed documents published as of 12/31/2013, approximately 266,000 vaccine-associated documents were identified, and a total of 6,116 gene-pairs were associated with at least one INO term. Out of 78 INO interaction terms associated with at least five gene-pairs of the vaccine-associated sub-network, 14 terms were significantly over-represented (i.e., more frequently used) and 17 under-represented based on our modified Fisher's exact test. These over-represented and under-represented terms share some common top-level terms but are distinct at the bottom levels of the INO hierarchy. 
The analysis of these
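
    The enrichment step above can be illustrated with the standard one-sided Fisher's exact test, computed from the hypergeometric distribution. This is a sketch of the textbook test only: the abstract does not describe how the authors' modified version differs, so the modification is not reproduced here. The function names and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, total, annotated, drawn):
    """P(X = k): k annotated items among `drawn`, sampled without replacement
    from `total` items of which `annotated` carry the term."""
    return comb(annotated, k) * comb(total - annotated, drawn - k) / comb(total, drawn)


def over_representation_p(k, total, annotated, drawn):
    """One-sided p-value P(X >= k) for over-representation."""
    upper = min(annotated, drawn)
    return sum(hypergeom_pmf(i, total, annotated, drawn) for i in range(k, upper + 1))


def under_representation_p(k, total, annotated, drawn):
    """One-sided p-value P(X <= k) for under-representation."""
    lower = max(0, drawn - (total - annotated))
    return sum(hypergeom_pmf(i, total, annotated, drawn) for i in range(lower, k + 1))


# Made-up counts: 20 gene pairs overall, 5 annotated with an interaction
# term, 5 pairs in the sub-network of interest, 4 of them carrying the term.
p_over = over_representation_p(4, total=20, annotated=5, drawn=5)
p_under = under_representation_p(0, total=20, annotated=5, drawn=5)
print(p_over, p_under)
```

    A small `p_over` suggests the term is used more often in the sub-network than chance would predict; in practice the p-values for many terms would also need a multiple-testing correction, which this sketch omits.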

  20. Fractal gene regulatory networks for robust locomotion control of modular robots

    DEFF Research Database (Denmark)

    Zahadat, Payam; Christensen, David Johan; Schultz, Ulrik Pagh

    2010-01-01

    Designing controllers for modular robots is difficult due to the distributed and dynamic nature of the robots. In this paper fractal gene regulatory networks are evolved to control modular robots in a distributed way. Experiments with different morphologies of modular robot are performed and the ......

  1. Effects of threshold on the topology of gene co-expression networks.

    Science.gov (United States)

    Couto, Cynthia Martins Villar; Comin, César Henrique; Costa, Luciano da Fontoura

    2017-09-26

    Several developments regarding the analysis of gene co-expression profiles using complex network theory have been reported recently. Such approaches usually start with the construction of an unweighted gene co-expression network, therefore requiring the selection of a suitable threshold defining which pairs of vertices will be connected. We aimed at addressing this important problem by suggesting and comparing five different approaches for threshold selection. Each of the methods considers a respective biologically-motivated criterion for selecting a potentially suitable threshold. A set of 21 microarray experiments from different biological groups was used to investigate the effect of applying the five proposed criteria to several biological situations. For each experiment, we used the Pearson correlation coefficient to measure the relationship between each gene pair, and the resulting weight matrices were thresholded considering several values, generating respective adjacency matrices (co-expression networks). Each of the five proposed criteria was then applied in order to select the respective threshold value. The effects of these thresholding approaches on the topology of the resulting networks were compared by using several measurements, and we verified that, depending on the database, the impact on the topological properties can be large. However, a group of databases was verified to be similarly affected by most of the considered criteria. Based on such results, it can be suggested that when the generated networks present similar measurements, the thresholding method can be chosen with greater freedom. If the generated networks are markedly different, the thresholding method that better suits the interests of each specific research study represents a reasonable choice.
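
    The construction described above (Pearson correlation for every gene pair, followed by thresholding into an adjacency matrix) can be sketched as follows. The sketch assumes that the absolute value of the correlation is compared with the threshold, which the abstract does not specify, and it does not implement any of the five selection criteria. The function names and the toy expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: correlation undefined, treated as no link
    return cov / math.sqrt(var_x * var_y)


def coexpression_adjacency(profiles, threshold):
    """Unweighted adjacency matrix with an edge where |r| >= threshold
    (using |r| rather than r is an assumption of this sketch)."""
    n = len(profiles)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if abs(pearson(profiles[i], profiles[j])) >= threshold:
                adj[i][j] = adj[j][i] = 1
    return adj


def edge_density(adj):
    """Fraction of possible gene pairs that are connected."""
    n = len(adj)
    edges = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return edges / (n * (n - 1) / 2)


# Made-up profiles (rows = genes, columns = samples).
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with gene 0
    [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with gene 0
    [1.0, 3.0, 2.0, 4.0],  # |r| = 0.8 with each of the other genes
]
for t in (0.7, 0.9):
    print(t, edge_density(coexpression_adjacency(profiles, t)))
```

    Raising the threshold from 0.7 to 0.9 removes every edge of the fourth gene, halving the edge density; this is a small instance of how the threshold choice changes the topology of the resulting network.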

  2. Topological and organizational properties of the products of house-keeping and tissue-specific genes in protein-protein interaction networks.

    Science.gov (United States)

    Lin, Wen-Hsien; Liu, Wei-Chung; Hwang, Ming-Jing

    2009-03-11

    Human cells of various tissue types differ greatly in morphology despite having the same set of genetic information. Some genes are expressed in all cell types to perform house-keeping functions, while some are selectively expressed to perform tissue-specific functions. In this study, we wished to elucidate how proteins encoded by human house-keeping genes and tissue-specific genes are organized in human protein-protein interaction networks. We constructed protein-protein interaction networks for different tissue types using two gene expression datasets and one protein-protein interaction database. We then calculated three network indices of topological importance, the degree, closeness, and betweenness centralities, to measure the network position of proteins encoded by house-keeping and tissue-specific genes, and quantified their local connectivity structure. Compared to a random selection of proteins, house-keeping gene-encoded proteins tended to have a greater number of directly interacting neighbors and occupy network positions in several shortest paths of interaction between protein pairs, whereas tissue-specific gene-encoded proteins did not. In addition, house-keeping gene-encoded proteins tended to connect with other house-keeping gene-encoded proteins in all tissue types, whereas tissue-specific gene-encoded proteins also tended to connect with other tissue-specific gene-encoded proteins, but only in approximately half of the tissue types examined. Our analysis showed that house-keeping gene-encoded proteins tend to occupy important network positions, while those encoded by tissue-specific genes do not. The biological implications of our findings were discussed and we proposed a hypothesis regarding how cells organize their protein tools in protein-protein interaction networks. Our results led us to speculate that house-keeping gene-encoded proteins might form a core in human protein-protein interaction networks, while clusters of tissue-specific gene

  3. A systems biology approach to construct the gene regulatory network of systemic inflammation via microarray and databases mining

    Directory of Open Access Journals (Sweden)

    Lan Chung-Yu

    2008-09-01

    Full Text Available Background: Inflammation is a hallmark of many human diseases. Elucidating the mechanisms underlying systemic inflammation has long been an important topic in basic and clinical research. When primary pathogenetic events remain unclear due to their immense complexity, construction and analysis of the gene regulatory network of inflammation at times becomes the best way to understand the detrimental effects of disease. However, it is difficult to recognize and evaluate relevant biological processes from the huge quantities of experimental data. It is hence appealing to find an algorithm which can generate a gene regulatory network of systemic inflammation from high-throughput genomic studies of human diseases. Such a network will be essential for us to extract valuable information from the complex and chaotic network under diseased conditions. Results: In this study, we construct a gene regulatory network of inflammation using data extracted from the Ensembl and JASPAR databases. We also integrate and apply a number of systematic algorithms, such as cross correlation threshold, maximum likelihood estimation method and Akaike Information Criterion (AIC), on time-lapsed microarray data to refine the genome-wide transcriptional regulatory network in response to bacterial endotoxins in the context of dynamic activated genes, which are regulated by transcription factors (TFs) such as NF-κB. This systematic approach is used to investigate the stochastic interaction represented by the dynamic leukocyte gene expression profiles of human subjects exposed to an inflammatory stimulus (bacterial endotoxin). Based on the kinetic parameters of the dynamic gene regulatory network, we identify important properties (such as susceptibility to infection) of the immune system, which may be useful for translational research. Finally, robustness of the inflammatory gene network is also inferred by analyzing the hubs and "weak ties" structures of the gene network

  4. A protocol for generating a high-quality genome-scale metabolic reconstruction.

    Science.gov (United States)

    Thiele, Ines; Palsson, Bernhard Ø

    2010-01-01

    Network reconstructions are a common denominator in systems biology. Bottom-up metabolic network reconstructions have been developed over the last 10 years. These reconstructions represent structured knowledge bases that abstract pertinent information on the biochemical transformations taking place within specific target organisms. The conversion of a reconstruction into a mathematical format facilitates a myriad of computational biological studies, including evaluation of network content, hypothesis testing and generation, analysis of phenotypic characteristics and metabolic engineering. To date, genome-scale metabolic reconstructions for more than 30 organisms have been published and this number is expected to increase rapidly. However, these reconstructions differ in quality and coverage that may minimize their predictive potential and use as knowledge bases. Here we present a comprehensive protocol describing each step necessary to build a high-quality genome-scale metabolic reconstruction, as well as the common trials and tribulations. Therefore, this protocol provides a helpful manual for all stages of the reconstruction process.

  5. Aboveground Biomass Estimation Using Reconstructed Feature of Airborne Discrete-Return LIDAR by Auto-Encoder Neural Network

    Science.gov (United States)

    Li, T.; Wang, Z.; Peng, J.

    2018-04-01

    Aboveground biomass (AGB) estimation is critical for quantifying carbon stocks and essential for evaluating the carbon cycle. In recent years, airborne LiDAR has shown great ability for high-precision AGB estimation. Most studies estimate AGB from feature metrics extracted from the canopy height distribution of the point cloud, which is calculated based on a precise digital terrain model (DTM). However, if forest canopy density is high, the probability of the LiDAR signal penetrating the canopy is lower, so there are not enough ground points to establish the DTM. The distribution of forest canopy height is then imprecise, and some critical feature metrics that correlate strongly with biomass, such as percentiles, maximums, means and standard deviations of the canopy point cloud, can hardly be extracted correctly. To address this issue, we propose a strategy of first reconstructing LiDAR feature metrics through an Auto-Encoder neural network and then using the reconstructed feature metrics to estimate AGB. To assess the prediction ability of the reconstructed feature metrics, both original and reconstructed feature metrics were regressed against field-observed AGB using multiple stepwise regression (MS) and partial least squares regression (PLS), respectively. The results showed that the estimation model using reconstructed feature metrics improved R2 by 5.44 % and 18.09 %, decreased RMSE by 10.06 % and 22.13 %, and reduced RMSEcv by 10.00 % and 21.70 % for AGB, respectively. Therefore, reconstructing LiDAR point feature metrics has potential for addressing the AGB estimation challenge in dense canopy areas.

  6. Network Analysis Reveals Putative Genes Affecting Meat Quality in Angus Cattle.

    Science.gov (United States)

    Mateescu, Raluca G; Garrick, Dorian J; Reecy, James M

    2017-01-01

    Improvements in eating satisfaction will benefit consumers and should increase beef demand, which is of interest to the beef industry. Tenderness, juiciness, and flavor are major determinants of the palatability of beef and are often used to reflect eating satisfaction. Carcass qualities are used as indicator traits for meat quality, with higher quality grade carcasses expected to relate to more tender and palatable meat. However, meat quality is a complex concept determined by many component traits, making genome-wide association studies (GWAS) on any one component challenging to interpret. Recent approaches combining traditional GWAS with gene network interactions theory could be more efficient in dissecting the genetic architecture of complex traits. Phenotypic measures of 23 traits reflecting carcass characteristics and components of meat quality, along with mineral and peptide concentrations, were used with Illumina 54k bovine SNP genotypes to derive an annotated gene network associated with meat quality in 2,110 Angus beef cattle. The efficient mixed model association (EMMAX) approach in combination with a genomic relationship matrix was used to directly estimate the associations between 54k SNP genotypes and each of the 23 component traits. Genomic correlated regions were identified by partial correlations which were further used along with an information theory algorithm to derive gene network clusters. Correlated SNP across 23 component traits were subjected to network scoring and visualization software to identify significant SNP. Significant pathways implicated in the meat quality complex through GO term enrichment analysis included angiogenesis, inflammation, transmembrane transporter activity, and receptor activity. These results suggest that network analysis using partial correlations and annotation of significant SNP can reveal the genetic architecture of complex traits and provide novel information regarding biological mechanisms

  7. VarWalker: personalized mutation network analysis of putative cancer genes from next-generation sequencing data.

    Science.gov (United States)

    Jia, Peilin; Zhao, Zhongming

    2014-02-01

    A major challenge in interpreting the large volume of mutation data identified by next-generation sequencing (NGS) is to distinguish driver mutations from neutral passenger mutations to facilitate the identification of targetable genes and new drugs. Current approaches are primarily based on mutation frequencies of single-genes, which lack the power to detect infrequently mutated driver genes and ignore functional interconnection and regulation among cancer genes. We propose a novel mutation network method, VarWalker, to prioritize driver genes in large scale cancer mutation data. VarWalker fits generalized additive models for each sample based on sample-specific mutation profiles and builds on the joint frequency of both mutation genes and their close interactors. These interactors are selected and optimized using the Random Walk with Restart algorithm in a protein-protein interaction network. We applied the method in >300 tumor genomes in two large-scale NGS benchmark datasets: 183 lung adenocarcinoma samples and 121 melanoma samples. In each cancer, we derived a consensus mutation subnetwork containing significantly enriched consensus cancer genes and cancer-related functional pathways. These cancer-specific mutation networks were then validated using independent datasets for each cancer. Importantly, VarWalker prioritizes well-known, infrequently mutated genes, which are shown to interact with highly recurrently mutated genes yet have been ignored by conventional single-gene-based approaches. Utilizing VarWalker, we demonstrated that network-assisted approaches can be effectively adapted to facilitate the detection of cancer driver genes in NGS data.
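
    The Random Walk with Restart step mentioned in this abstract can be illustrated with a generic sketch. It is not the VarWalker implementation: the function name, the restart probability of 0.5, the convergence tolerance and the toy interaction network are assumptions chosen for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that, at each step,
    jumps back to the seed set with probability `restart` and otherwise
    moves to a uniformly chosen neighbour.

    `adj` maps each node to a list of its neighbours; every node must have
    at least one neighbour. `seeds` is a non-empty set of start nodes."""
    nodes = sorted(adj)
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart mass goes back to the seeds ...
        nxt = {v: restart * p0[v] for v in nodes}
        # ... and the remaining mass is spread evenly over neighbours.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        if sum(abs(nxt[v] - p[v]) for v in nodes) < tol:
            return nxt
        p = nxt
    return p

# Hypothetical toy interaction network: a chain A - B - C - D.
ppi = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

# Treat A as the mutated (seed) gene and rank the others by proximity.
scores = random_walk_with_restart(ppi, {"A"})
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 4))
```

    Nodes closer to the seed accumulate more probability mass, which is the property that makes the scores usable for selecting close interactors of a mutated gene.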

  8. Inference of the Genetic Network Regulating Lateral Root Initiation in Arabidopsis thaliana

    KAUST Repository

    Muraro, D.

    2013-01-01

    Regulation of gene expression is crucial for organism growth, and it is one of the challenges in systems biology to reconstruct the underlying regulatory biological networks from transcriptomic data. The formation of lateral roots in Arabidopsis thaliana is stimulated by a cascade of regulators of which only the interactions of its initial elements have been identified. Using simulated gene expression data with known network topology, we compare the performance of inference algorithms, based on different approaches, for which ready-to-use software is available. We show that their performance improves with the network size and the inclusion of mutants. We then analyze two sets of genes, whose activity is likely to be relevant to lateral root initiation in Arabidopsis, and assess causality of their regulatory interactions by integrating sequence analysis with the intersection of the results of the best performing methods on time series and mutants. The methods applied capture known interactions between genes that are candidate regulators at early stages of development. The network inferred from genes significantly expressed during lateral root formation exhibits distinct scale free, small world and hierarchical properties and the nodes with a high out-degree may warrant further investigation. © 2004-2012 IEEE.

  9. A Preliminary List of Horizontally Transferred Genes in Prokaryotes Determined by Tree Reconstruction and Reconciliation

    Directory of Open Access Journals (Sweden)

    Hyeonsoo Jeong

    2017-08-01

    Full Text Available Genome-wide global detection of genes involved in horizontal gene transfer (HGT) remains an active area of research in medical microbiology and evolutionary genomics. Utilizing the explicit evolutionary method of comparing topologies of a total of 154,805 orthologous gene trees against corresponding 16S rRNA “reference” trees, we previously detected a total of 660,894 candidate HGT events in 2,472 completely-sequenced prokaryotic genomes. Here, we report an HGT-index for each individual gene-reference tree pair reconciliation, representing the total number of detected HGT events on the gene tree divided by the total number of genomes (taxa) that are members of that tree. HGT-index is thus a simple measure indicating the sensitivity of prokaryotic genes to participate (or not participate) in HGT. Our preliminary list provides HGT-indices for a total of 69,365 genes (detected in >10 and <50% available prokaryotic genomes) that are involved in a wide range of biological processes such as metabolism, information, and bacterial response to environment. Identification of horizontally-derived genes is important to combat antibiotic resistance and is a step forward toward reconstructions of improved phylogenies describing the history of life. Our effort is thus expected to benefit ongoing research in the fields of clinical microbiology and evolutionary biology.
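
    The HGT-index defined in this abstract can be written compactly. The symbols below are introduced here for illustration and are not the authors' notation: for a gene tree g, N_HGT(g) is the number of HGT events detected by reconciliation with the reference tree, and N_taxa(g) is the number of genomes (taxa) in that tree.

```latex
\[
  \mathrm{HGT\text{-}index}(g) \;=\; \frac{N_{\mathrm{HGT}}(g)}{N_{\mathrm{taxa}}(g)}
\]
```

    As a purely hypothetical example, a gene tree with 12 detected events over 48 genomes would have an HGT-index of 12/48 = 0.25.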

  10. Co-expression networks reveal the tissue-specific regulation of transcription and splicing.

    Science.gov (United States)

    Saha, Ashis; Kim, Yungil; Gewirtz, Ariel D H; Jo, Brian; Gao, Chuan; McDowell, Ian C; Engelhardt, Barbara E; Battle, Alexis

    2017-11-01

    Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues. © 2017 Saha et al.; Published by Cold Spring Harbor Laboratory Press.

  11. Portrait of Candida Species Biofilm Regulatory Network Genes.

    Science.gov (United States)

    Araújo, Daniela; Henriques, Mariana; Silva, Sónia

    2017-01-01

    Most cases of candidiasis have been attributed to Candida albicans, but Candida glabrata, Candida parapsilosis and Candida tropicalis, designated as non-C. albicans Candida (NCAC), have been identified as frequent human pathogens. Moreover, Candida biofilms are an escalating clinical problem associated with significant rates of mortality. Biofilms have distinct developmental phases, including adhesion/colonisation, maturation and dispersal, controlled by complex regulatory networks. This review discusses recent advances regarding Candida species biofilm regulatory network genes, which are key components for candidiasis. Copyright © 2016 Elsevier Ltd. All rights reserved.

  12. Integration of multiple networks and pathways identifies cancer driver genes in pan-cancer analysis.

    Science.gov (United States)

    Cava, Claudia; Bertoli, Gloria; Colaprico, Antonio; Olsen, Catharina; Bontempi, Gianluca; Castiglioni, Isabella

    2018-01-06

    Modern high-throughput genomic technologies represent a comprehensive hallmark of molecular changes in pan-cancer studies. Although different cancer gene signatures have been revealed, the mechanism of tumourigenesis has yet to be completely understood. Pathways and networks are important tools to explain the role of genes in functional genomic studies. However, few methods consider the functional non-equal roles of genes in pathways and the complex gene-gene interactions in a network. We present a novel method in pan-cancer analysis that identifies de-regulated genes with a functional role by integrating pathway and network data. A pan-cancer analysis of 7158 tumour/normal samples from 16 cancer types identified 895 genes with a central role in pathways and de-regulated in cancer. Comparing our approach with 15 current tools that identify cancer driver genes, we found that 35.6% of the 895 genes identified by our method have been found as cancer driver genes with at least 2/15 tools. Finally, we applied a machine learning algorithm on 16 independent GEO cancer datasets to validate the diagnostic role of cancer driver genes for each cancer. We obtained a list of the top-ten cancer driver genes for each cancer considered in this study. Our analysis 1) confirmed that there are several known cancer driver genes in common among different types of cancer, 2) highlighted that cancer driver genes are able to regulate crucial pathways.

  13. Potential energy landscape and robustness of a gene regulatory network: toggle switch.

    Directory of Open Access Journals (Sweden)

    Keun-Young Kim

    2007-03-01

    Full Text Available Finding a multidimensional potential landscape is the key for addressing important global issues, such as the robustness of cellular networks. We have uncovered the underlying potential energy landscape of a simple gene regulatory network: a toggle switch. This was realized by explicitly constructing the steady state probability of the gene switch in the protein concentration space in the presence of the intrinsic statistical fluctuations due to the small number of proteins in the cell. We explored the global phase space for the system. We found that when the protein synthesis rate and the unbinding rate of proteins from the gene are small relative to the protein degradation rate, the gene switch is monostable with only one stable basin of attraction. When both the protein synthesis rate and the unbinding rate of proteins from the gene are large compared with the protein degradation rate, two global basins of attraction emerge for a toggle switch. These basins correspond to the biologically stable functional states. The potential energy barrier between the two basins determines the time scale of conversion from one to the other. We found that as the protein synthesis rate and the protein unbinding rate from the gene became larger relative to the protein degradation rate, the potential energy barrier became larger. This also corresponded to systems with less noise, that is, smaller fluctuations in the protein numbers. It leads to the robustness of the biological basins of the gene switches. The technique used here is general and can be applied to explore the potential energy landscape of the gene networks.

  14. rSNPBase 3.0: an updated database of SNP-related regulatory elements, element-gene pairs and SNP-based gene regulatory networks.

    Science.gov (United States)

    Guo, Liyuan; Wang, Jing

    2018-01-04

    Here, we present the updated rSNPBase 3.0 database (http://rsnp3.psych.ac.cn), which provides human SNP-related regulatory elements, element-gene pairs and SNP-based regulatory networks. This database is the updated version of the SNP regulatory annotation databases rSNPBase and rVarBase. In comparison to the last two versions, there are both structural and data adjustments in rSNPBase 3.0: (i) The most significant new feature is the expansion of analysis scope from SNP-related regulatory elements to include regulatory element-target gene pairs (E-G pairs), so that it can provide SNP-based gene regulatory networks. (ii) Web function was modified according to data content and a new network search module is provided in rSNPBase 3.0 in addition to the previous regulatory SNP (rSNP) search module. The two search modules support data query for detailed information (related-elements, element-gene pairs, and other extended annotations) on specific SNPs and SNP-related graphic networks constructed by interacting transcription factors (TFs), miRNAs and genes. (iii) The type of regulatory elements was modified and enriched. To the best of our knowledge, the updated rSNPBase 3.0 is the first data tool that supports SNP functional analysis from a regulatory network perspective; it will provide both a comprehensive understanding and concrete guidance for SNP-related regulatory studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. Ganoderma lucidum polysaccharides in human monocytic leukemia cells: from gene expression to network construction

    Directory of Open Access Journals (Sweden)

    Ou Chern-Han

    2007-11-01

    Full Text Available Background: Ganoderma lucidum has been widely used as a herbal medicine for promoting health and longevity in China and other Asian countries. Polysaccharide extracts from Ganoderma lucidum have been reported to exhibit immuno-modulating and anti-tumor activities. In previous studies, F3, the active component of the polysaccharide extract, was found to activate various cytokines such as IL-1, IL-6, IL-12, and TNF-α. This gave rise to our investigation on how F3 stimulates immuno-modulating or anti-tumor effects in human leukemia THP-1 cells. Results: Here, we integrated time-course DNA microarray analysis, quantitative PCR assays, and bioinformatics methods to study the F3-induced effects in THP-1 cells. Significantly disturbed pathways induced by F3 were identified with statistical analysis on microarray data. The apoptosis induction through the DR3 and DR4/5 death receptors was found to be one of the most significant pathways and to play a key role in THP-1 cells after F3 treatment. Based on time-course gene expression measurements of the identified pathway, we reconstructed a plausible regulatory network of the involved genes using a reverse-engineering computational approach. Conclusion: Our results showed that F3 may induce death receptor ligands to initiate signaling via receptor oligomerization, recruitment of specialized adaptor proteins and activation of caspase cascades.

  16. Ganoderma lucidum polysaccharides in human monocytic leukemia cells: from gene expression to network construction.

    Science.gov (United States)

    Cheng, Kun-Chieh; Huang, Hsuan-Cheng; Chen, Jenn-Han; Hsu, Jia-Wei; Cheng, Hsu-Chieh; Ou, Chern-Han; Yang, Wen-Bin; Chen, Shui-Tein; Wong, Chi-Huey; Juan, Hsueh-Fen

    2007-11-09

    Ganoderma lucidum has been widely used as a herbal medicine for promoting health and longevity in China and other Asian countries. Polysaccharide extracts from Ganoderma lucidum have been reported to exhibit immuno-modulating and anti-tumor activities. In previous studies, F3, the active component of the polysaccharide extract, was found to activate various cytokines such as IL-1, IL-6, IL-12, and TNF-alpha. This gave rise to our investigation on how F3 stimulates immuno-modulating or anti-tumor effects in human leukemia THP-1 cells. Here, we integrated time-course DNA microarray analysis, quantitative PCR assays, and bioinformatics methods to study the F3-induced effects in THP-1 cells. Significantly disturbed pathways induced by F3 were identified with statistical analysis on microarray data. The apoptosis induction through the DR3 and DR4/5 death receptors was found to be one of the most significant pathways and to play a key role in THP-1 cells after F3 treatment. Based on time-course gene expression measurements of the identified pathway, we reconstructed a plausible regulatory network of the involved genes using a reverse-engineering computational approach. Our results showed that F3 may induce death receptor ligands to initiate signaling via receptor oligomerization, recruitment of specialized adaptor proteins and activation of caspase cascades.

  17. Developmental evolution in social insects: regulatory networks from genes to societies.

    Science.gov (United States)

    Linksvayer, Timothy A; Fewell, Jennifer H; Gadau, Jürgen; Laubichler, Manfred D

    2012-05-01

    The evolution and development of complex phenotypes in social insect colonies, such as queen-worker dimorphism or division of labor, can, in our opinion, only be fully understood within an expanded mechanistic framework of Developmental Evolution. Conversely, social insects offer a fertile research area in which fundamental questions of Developmental Evolution can be addressed empirically. We review the concept of gene regulatory networks (GRNs) that aims to fully describe the battery of interacting genomic modules that are differentially expressed during the development of individual organisms. We discuss how distinct types of network models have been used to study different levels of biological organization in social insects, from GRNs to social networks. We propose that these hierarchical networks spanning different organizational levels from genes to societies should be integrated and incorporated into full GRN models to elucidate the evolutionary and developmental mechanisms underlying social insect phenotypes. Finally, we discuss prospects and approaches to achieve such an integration. © 2012 WILEY PERIODICALS, INC.

  18. Comparison of Generated Parallel Capillary Arrays to Three-Dimensional Reconstructed Capillary Networks in Modeling Oxygen Transport in Discrete Microvascular Volumes

    Science.gov (United States)

    Fraser, Graham M.; Goldman, Daniel; Ellis, Christopher G.

    2013-01-01

    Objective We compare Reconstructed Microvascular Networks (RMN) to Parallel Capillary Arrays (PCA) under several simulated physiological conditions to determine how the use of different vascular geometry affects oxygen transport solutions. Methods Three discrete networks were reconstructed from intravital video microscopy of rat skeletal muscle (84×168×342 μm, 70×157×268 μm and 65×240×571 μm) and hemodynamic measurements were made in individual capillaries. PCAs were created based on statistical measurements from RMNs. Blood flow and O2 transport models were applied and the resulting solutions for RMN and PCA models were compared under 4 conditions (rest, exercise, ischemia and hypoxia). Results Predicted tissue PO2 was consistently lower in all RMN simulations compared to the paired PCA. PO2 for 3D reconstructions at rest were 28.2±4.8, 28.1±3.5, and 33.0±4.5 mmHg for networks I, II, and III compared to the PCA mean values of 31.2±4.5, 30.6±3.4, and 33.8±4.6 mmHg. Simulated exercise yielded mean tissue PO2 in the RMN of 10.1±5.4, 12.6±5.7, and 19.7±5.7 mmHg compared to 15.3±7.3, 18.8±5.3, and 21.7±6.0 in PCA. Conclusions These findings suggest that volume matched PCA yield different results compared to reconstructed microvascular geometries when applied to O2 transport modeling; the predominant characteristic of this difference being an over estimate of mean tissue PO2. Despite this limitation, PCA models remain important for theoretical studies as they produce PO2 distributions with similar shape and parameter dependence as RMN. PMID:23841679

  19. Gene Expression Networks in the Murine Pulmonary Myocardium Provide Insight into the Pathobiology of Atrial Fibrillation

    Directory of Open Access Journals (Sweden)

    Jordan K. Boutilier

    2017-09-01

    Full Text Available The pulmonary myocardium is a muscular coat surrounding the pulmonary and caval veins. Although its definitive physiological function is unknown, it may have a pathological role as the source of ectopic beats initiating atrial fibrillation. How the pulmonary myocardium gains pacemaker function is not clearly defined, although recent evidence indicates that changed transcriptional gene expression networks are at fault. The gene expression profile of this distinct cell type in situ was examined to investigate underlying molecular events that might contribute to atrial fibrillation. Via systems genetics, a whole-lung transcriptome data set from the BXD recombinant inbred mouse resource was analyzed, uncovering a pulmonary cardiomyocyte gene network of 24 transcripts, coordinately regulated by chromosome 1 and 2 loci. Promoter enrichment analysis and interrogation of publicly available ChIP-seq data suggested that transcription of this gene network may be regulated by the concerted activity of NKX2-5, serum response factor, myocyte enhancer factor 2, and also, at a post-transcriptional level, by RNA binding protein motif 20. Gene ontology terms indicate that this gene network overlaps with molecular markers of the stressed heart. Therefore, we propose that perturbed regulation of this gene network might lead to altered calcium handling, myocyte growth, and contractile force contributing to the aberrant electrophysiological properties observed in atrial fibrillation. We reveal novel molecular interactions and pathways representing possible therapeutic targets for atrial fibrillation. In addition, we highlight the utility of recombinant inbred mouse resources in detecting and characterizing gene expression networks of relatively small populations of cells that have a pathological significance.

  20. Evaluation of artificial time series microarray data for dynamic gene regulatory network inference.

    Science.gov (United States)

    Xenitidis, P; Seimenis, I; Kakolyris, S; Adamopoulos, A

    2017-08-07

    High-throughput technology like microarrays is widely used in the inference of gene regulatory networks (GRNs). We focused on time series data since we are interested in the dynamics of GRNs and the identification of dynamic networks. We evaluated the amount of information that exists in artificial time series microarray data and the ability of an inference process to produce accurate models based on them. We used dynamic artificial gene regulatory networks in order to create artificial microarray data. Key features that characterize microarray data such as the time separation of directly triggered genes, the percentage of directly triggered genes and the triggering function type were altered in order to reveal the limits that are imposed by the nature of microarray data on the inference process. We examined the effect of various factors on the inference performance such as the network size, the presence of noise in microarray data, and the network sparseness. We used a system theory approach and examined the relationship between the pole placement of the inferred system and the inference performance. We examined the relationship between the inference performance in the time domain and the true system parameter identification. Simulation results indicated that time separation and the percentage of directly triggered genes are crucial factors. Also, network sparseness, the triggering function type and noise in input data affect the inference performance. When two factors were simultaneously varied, it was found that variation of one parameter significantly affects the dynamic response of the other. Crucial factors were also examined using a real GRN and acquired results confirmed simulation findings with artificial data. Different initial conditions were also used as an alternative triggering approach. Relevant results confirmed that the number of datasets constitutes the most significant parameter with regard to the inference performance. 
Copyright © 2017 Elsevier
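The system-theory step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the model class, so the discrete-time linear model x[t+1] = A x[t], the least-squares fit, the toy matrix `a_true`, and the function name `fit_linear_dynamics` are all assumptions made here for illustration. The poles of such a system are the eigenvalues of A, and the system is stable when they all lie inside the unit circle.

```python
import numpy as np

def fit_linear_dynamics(series):
    """Least-squares estimate of A in x[t+1] = A x[t].

    series: array of shape (time points, genes). Hypothetical helper.
    """
    past, future = series[:-1], series[1:]
    # Each row satisfies future_row = past_row @ A.T, so solve for A.T.
    a_transposed, *_ = np.linalg.lstsq(past, future, rcond=None)
    return a_transposed.T

# Toy two-gene network (assumed values): gene 0 decays, gene 1 is driven by gene 0.
a_true = np.array([[0.9, 0.0],
                   [0.5, 0.5]])
states = [np.array([1.0, 0.0])]
for _ in range(9):
    states.append(a_true @ states[-1])
series = np.vstack(states)  # noise-free artificial time series

a_hat = fit_linear_dynamics(series)
poles = np.linalg.eigvals(a_hat)          # poles of the inferred system
is_stable = bool(np.all(np.abs(poles) < 1.0))
```

With noise-free data the fit recovers the true matrix; adding noise to `series` before fitting shows how the inferred poles drift away from the true ones.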

  1. Prediction of disease-related genes based on weighted tissue-specific networks by using DNA methylation.

    Science.gov (United States)

    Li, Min; Zhang, Jiayi; Liu, Qing; Wang, Jianxin; Wu, Fang-Xiang

    2014-01-01

    Predicting disease-related genes is one of the most important tasks in bioinformatics and systems biology. With the advances in high-throughput techniques, a large number of protein-protein interactions are available, which makes it possible to identify disease-related genes at the network level. However, network-based identification of disease-related genes remains a challenge because the currently available protein interaction networks (PIN) still contain a considerable number of false positives. Considering the fact that the majority of genetic disorders tend to manifest only in a single or a few tissues, we constructed tissue-specific networks (TSN) by integrating PIN and tissue-specific data. We further weighted the constructed tissue-specific network (WTSN) by using DNA methylation, as it plays an irreplaceable role in the development of complex diseases. A PageRank-based method was developed to identify disease-related genes from the constructed networks. To validate the effectiveness of the proposed method, we constructed PIN, weighted PIN (WPIN), TSN, and WTSN for colon cancer and leukemia, respectively. The experimental results on colon cancer and leukemia show that the combination of tissue-specific data and DNA methylation can help to identify disease-related genes more accurately. Moreover, the PageRank-based method was effective in predicting disease-related genes in the case studies of colon cancer and leukemia. Tissue-specific data and DNA methylation are two important factors in the study of human diseases. The same method implemented on the WTSN can achieve better results than when implemented on the original PIN, WPIN, or TSN. The PageRank-based method outperforms the degree centrality-based method for identifying disease-related genes from WTSN.
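The ranking step can be sketched as PageRank by power iteration on a weighted network. This is not the authors' implementation: the toy weight matrix, the damping factor of 0.85, and the name `weighted_pagerank` are assumptions, and the edge weights merely stand in for methylation-derived weights.

```python
import numpy as np

def weighted_pagerank(weights, damping=0.85, tol=1e-10, max_iter=1000):
    """Power-iteration PageRank on a non-negative weighted adjacency matrix.

    weights[i, j] is the weight of the edge from node i to node j.
    """
    w = np.asarray(weights, dtype=float)
    n = w.shape[0]
    out_strength = w.sum(axis=1)
    safe = np.where(out_strength > 0, out_strength, 1.0)
    transition = w / safe[:, None]           # row-normalise edge weights
    transition[out_strength == 0] = 1.0 / n  # dangling nodes jump uniformly
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = (1.0 - damping) / n + damping * (rank @ transition)
        if np.abs(new_rank - rank).sum() < tol:
            return new_rank
        rank = new_rank
    return rank

# Toy undirected network (assumed): gene 0 is a hub, and its link to gene 1
# carries twice the weight of its links to genes 2 and 3.
toy_weights = np.array([[0, 2, 1, 1],
                        [2, 0, 0, 0],
                        [1, 0, 0, 0],
                        [1, 0, 0, 0]])
scores = weighted_pagerank(toy_weights)
ranking = np.argsort(-scores)  # candidate genes, highest score first
```

In this toy case the hub receives the highest score, and the more heavily weighted neighbour outranks the other two, which is the effect that weighting the network is meant to have.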

  2. Pseudo-proxy evaluation of climate field reconstruction methods of North Atlantic climate based on an annually resolved marine proxy network

    Directory of Open Access Journals (Sweden)

    M. Pyrina

    2017-10-01

    Two statistical methods are tested to reconstruct the interannual variations in past sea surface temperatures (SSTs) of the North Atlantic (NA) Ocean over the past millennium based on annually resolved and absolutely dated marine proxy records of the bivalve mollusk Arctica islandica. The methods are tested in a pseudo-proxy experiment (PPE) setup using state-of-the-art climate models (CMIP5 Earth system models) and reanalysis data from the COBE2 SST data set. The methods were applied in the virtual reality provided by global climate simulations and reanalysis data to reconstruct the past NA SSTs using pseudo-proxy records that mimic the statistical characteristics and network of Arctica islandica. The multivariate linear regression methods evaluated here are principal component regression and canonical correlation analysis. Differences in the skill of the climate field reconstruction (CFR) are assessed according to different calibration periods and different proxy locations within the NA basin. The choice of the climate model used as a surrogate reality in the PPE has a more profound effect on the CFR skill than the calibration period and the statistical reconstruction method. The differences between the two methods are clearer for the MPI-ESM model due to its higher spatial resolution in the NA basin. The pseudo-proxy results of the CCSM4 model are closer to the pseudo-proxy results based on the reanalysis data set COBE2. Conducting PPEs using noise-contaminated pseudo-proxies instead of noise-free pseudo-proxies is important for the evaluation of the methods, as more spatial differences in the reconstruction skill are revealed. Both methods are appropriate for the reconstruction of the temporal evolution of the NA SSTs, even though they lead to a great loss of variance away from the proxy sites.
Under reasonable assumptions about the characteristics of the non-climate noise in the proxy records, our results show that the marine network of Arctica

  3. Comprehensive Reconstruction and Visualization of Non-Coding Regulatory Networks in Human

    Science.gov (United States)

    Bonnici, Vincenzo; Russo, Francesco; Bombieri, Nicola; Pulvirenti, Alfredo; Giugno, Rosalba

    2014-01-01

    Considerable research effort has been devoted to understanding the functional roles of non-coding RNAs (ncRNAs). Many studies have demonstrated their deregulation in cancer and other human disorders. ncRNAs are also present in extracellular human body fluids such as serum and plasma, giving them great potential as non-invasive biomarkers. However, non-coding RNAs were discovered relatively recently, and a comprehensive database including all of them is still missing. Reconstructing and visualizing the network of ncRNA interactions are important steps toward understanding their regulatory mechanism in complex systems. This work presents ncRNA-DB, a NoSQL database that integrates ncRNA interaction data from a large number of well-established online repositories. The interactions involve RNA, DNA, proteins, and diseases. ncRNA-DB is available at http://ncrnadb.scienze.univr.it/ncrnadb/. It is equipped with three interfaces: web-based, command-line, and a Cytoscape app called ncINetView. By accessing only one resource, users can search for ncRNAs and their interactions, build a network annotated with all known ncRNAs and associated diseases, and use all visual and mining features available in Cytoscape. PMID:25540777

  4. Comprehensive reconstruction and visualization of non-coding regulatory networks in human.

    Science.gov (United States)

    Bonnici, Vincenzo; Russo, Francesco; Bombieri, Nicola; Pulvirenti, Alfredo; Giugno, Rosalba

    2014-01-01

    Considerable research effort has been devoted to understanding the functional roles of non-coding RNAs (ncRNAs). Many studies have demonstrated their deregulation in cancer and other human disorders. ncRNAs are also present in extracellular human body fluids such as serum and plasma, giving them great potential as non-invasive biomarkers. However, non-coding RNAs were discovered relatively recently, and a comprehensive database including all of them is still missing. Reconstructing and visualizing the network of ncRNA interactions are important steps toward understanding their regulatory mechanism in complex systems. This work presents ncRNA-DB, a NoSQL database that integrates ncRNA interaction data from a large number of well-established online repositories. The interactions involve RNA, DNA, proteins, and diseases. ncRNA-DB is available at http://ncrnadb.scienze.univr.it/ncrnadb/. It is equipped with three interfaces: web-based, command-line, and a Cytoscape app called ncINetView. By accessing only one resource, users can search for ncRNAs and their interactions, build a network annotated with all known ncRNAs and associated diseases, and use all visual and mining features available in Cytoscape.

  5. Changes in the topology of gene expression networks by human immunodeficiency virus type 1 (HIV-1) integration in macrophages.

    Science.gov (United States)

    Soto-Girón, María Juliana; García-Vallejo, Felipe

    2012-01-01

    One key step of human immunodeficiency virus type 1 (HIV-1) infection is the integration of its viral cDNA. This process is mediated through complex networks of host-virus interactions that alter several normal cell functions of the host. To study the complexity of disturbances in cell gene expression networks caused by HIV-1 integration, we constructed a network of human macrophage genes located close to chromatin regions rich in proviruses. To perform the network analysis, we selected 28 genes previously identified as targets of cDNA integration, and their transcriptional profiles were obtained from GEO Profiles (NCBI). A total of 2770 interactions among the 28 genes located around the HIV-1 proviruses in human macrophages formed a highly dense main network connected to five sub-networks. The overall network was significantly enriched in genes associated with signal transduction, cellular communication and regulatory processes. To simulate the effects of HIV-1 integration in infected macrophages, the five genes with the most interactions in the normal network were turned off by setting the corresponding expression values to zero. The HIV-1 infected network showed changes in its topology and alterations in macrophage functions, reflected in a re-programming of biosynthetic and general metabolic processes. Understanding the complex virus-host interactions that occur during HIV-1 integration may provide valuable genomic information for developing new antiviral treatments focused on the management of specific gene expression networks associated with viral integration. This is the first gene network that describes the interactions among human macrophage genes related to HIV-1 integration. Copyright © 2011 Elsevier B.V. All rights reserved.
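The in silico knockout described in this abstract can be sketched as follows. The abstract does not say how interactions were derived, so the correlation threshold of 0.6, the toy expression matrix, and the helper names `coexpression_adjacency` and `knock_out_hubs` are assumptions for illustration only.

```python
import numpy as np

def coexpression_adjacency(expr, threshold=0.6):
    """Link two genes when |Pearson r| of their profiles is >= threshold.

    expr: array of shape (genes, samples). Hypothetical helper.
    """
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.corrcoef(expr)
    corr = np.nan_to_num(corr)  # silenced (constant) genes get r = 0
    adj = np.abs(corr) >= threshold
    np.fill_diagonal(adj, False)
    return adj

def knock_out_hubs(expr, k, threshold=0.6):
    """Set the expression of the k most connected genes to zero."""
    degree = coexpression_adjacency(expr, threshold).sum(axis=1)
    hubs = np.argsort(-degree, kind="stable")[:k]
    silenced = expr.copy()
    silenced[hubs, :] = 0.0
    return hubs, silenced

# Toy data (assumed): gene 0 is the sum of genes 1 and 2; gene 3 is unrelated.
expr = np.array([[2, 0, 0, -2],
                 [1, -1, 1, -1],
                 [1, 1, -1, -1],
                 [1, -1, -1, 1]], dtype=float)

edges_before = int(coexpression_adjacency(expr).sum()) // 2
hubs, silenced = knock_out_hubs(expr, k=1)
edges_after = int(coexpression_adjacency(silenced).sum()) // 2
```

Comparing `edges_before` with `edges_after` shows the change in topology once the most connected gene is silenced.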

  6. An improved, bias-reduced probabilistic functional gene network of baker's yeast, Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Insuk Lee

    2007-10-01

    Probabilistic functional gene networks are powerful theoretical frameworks for integrating heterogeneous functional genomics and proteomics data into objective models of cellular systems. Such networks provide syntheses of millions of discrete experimental observations, spanning DNA microarray experiments, physical protein interactions, genetic interactions, and comparative genomics; the resulting networks can then be easily applied to generate testable hypotheses regarding specific gene functions and associations. We report a significantly improved version (v. 2) of a probabilistic functional gene network of the baker's yeast, Saccharomyces cerevisiae. We describe our optimization methods and illustrate their effects in three major areas: the reduction of functional bias in network training reference sets, the application of a probabilistic model for calculating confidences in pair-wise protein physical or genetic interactions, and the introduction of simple thresholds that eliminate many false positive mRNA co-expression relationships. Using the network, we predict and experimentally verify the function of the yeast RNA binding protein Puf6 in 60S ribosomal subunit biogenesis. YeastNet v. 2, constructed using these optimizations together with additional data, shows significant reduction in bias and improvements in precision and recall, in total covering 102,803 linkages among 5,483 yeast proteins (95% of the validated proteome). YeastNet is available from http://www.yeastnet.org.

  7. Elucidating gene function and function evolution through comparison of co-expression networks in plants

    Directory of Open Access Journals (Sweden)

    Marek Mutwil

    2014-08-01

    The analysis of gene expression data has shown that transcriptionally coordinated (co-expressed) genes are often functionally related, enabling scientists to use expression data in gene function prediction. This Focused Review discusses our original paper (Large-scale co-expression approach to dissect secondary cell wall formation across plant species, Frontiers in Plant Science 2:23). In this paper we applied cross-species analysis to co-expression networks of genes involved in cellulose biosynthesis. We show that the co-expression networks from different species are highly similar, indicating that whole biological pathways are conserved across species. This finding has two important implications. First, the analysis can transfer gene function annotation from well-studied plants, such as Arabidopsis, to other, uncharacterized plant species. As the analysis finds genes that have similar sequence and similar expression pattern across different organisms, functionally equivalent genes can be identified. Second, since co-expression analyses are often noisy, a comparative analysis should have higher performance, as parts of co-expression networks that are conserved are more likely to be functionally relevant. In this Focused Review, we outline the comparative analysis done in the original paper and comment on the recent advances and approaches that allow comparative analyses of co-function networks. We hypothesize that, in comparison to simple co-expression analysis, comparative analysis would yield more accurate gene function predictions. Finally, by combining comparative analysis with genomic information of green plants, we propose a possible composition of cellulose biosynthesis machinery during earlier stages of plant evolution.

  8. Large-scale modeling of condition-specific gene regulatory networks by information integration and inference.

    Science.gov (United States)

    Ellwanger, Daniel Christian; Leonhardt, Jörn Florian; Mewes, Hans-Werner

    2014-12-01

    Understanding how regulatory networks globally coordinate the response of a cell to changing conditions, such as perturbations by shifting environments, is an elementary challenge in systems biology which has yet to be met. Genome-wide gene expression measurements are high dimensional because they reflect the condition-specific interplay of thousands of cellular components. The integration of prior biological knowledge into the modeling process of systems-wide gene regulation enables the large-scale interpretation of gene expression signals in the context of known regulatory relations. We developed COGERE (http://mips.helmholtz-muenchen.de/cogere), a method for the inference of condition-specific gene regulatory networks in human and mouse. We integrated existing knowledge of regulatory interactions from multiple sources into a comprehensive model of prior information. COGERE infers condition-specific regulation by evaluating the mutual dependency between regulator (transcription factor or miRNA) and target gene expression using prior information. This dependency is scored by the non-parametric, nonlinear correlation coefficient η² (eta squared), which is derived from a two-way analysis of variance. We show that COGERE significantly outperforms alternative methods in predicting condition-specific gene regulatory networks on simulated data sets. Furthermore, by inferring the cancer-specific gene regulatory network from the NCI-60 expression study, we demonstrate the utility of COGERE to promote hypothesis-driven clinical research. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. A deep convolutional neural network using directional wavelets for low-dose X-ray CT reconstruction.

    Science.gov (United States)

    Kang, Eunhee; Min, Junhong; Ye, Jong Chul

    2017-10-01

    Due to the potential risk of inducing cancer, radiation exposure by X-ray CT devices should be reduced for routine patient scanning. However, in low-dose X-ray CT, severe artifacts typically occur due to photon starvation, beam hardening, and other causes, all of which decrease the reliability of the diagnosis. Thus, a high-quality reconstruction method from low-dose X-ray CT data has become a major research topic in the CT community. Conventional model-based de-noising approaches are, however, computationally very expensive, and image-domain de-noising approaches cannot readily remove CT-specific noise patterns. To tackle these problems, we want to develop a new low-dose X-ray CT algorithm based on a deep-learning approach. We propose an algorithm which uses a deep convolutional neural network (CNN) which is applied to the wavelet transform coefficients of low-dose CT images. More specifically, using a directional wavelet transform to extract the directional component of artifacts and exploit the intra- and inter- band correlations, our deep network can effectively suppress CT-specific noise. In addition, our CNN is designed with a residual learning architecture for faster network training and better performance. Experimental results confirm that the proposed algorithm effectively removes complex noise patterns from CT images derived from a reduced X-ray dose. In addition, we show that the wavelet-domain CNN is efficient when used to remove noise from low-dose CT compared to existing approaches. Our results were rigorously evaluated by several radiologists at the Mayo Clinic and won second place at the 2016 "Low-Dose CT Grand Challenge." To the best of our knowledge, this work is the first deep-learning architecture for low-dose CT reconstruction which has been rigorously evaluated and proven to be effective. 
In addition, the proposed algorithm, in contrast to existing model-based iterative reconstruction (MBIR) methods, has considerable potential to benefit from

  10. Directed partial correlation: inferring large-scale gene regulatory network through induced topology disruptions.

    Directory of Open Access Journals (Sweden)

    Yinyin Yuan

    Inferring regulatory relationships among many genes based on their temporal variation in transcript abundance has been a popular research topic. Due to the nature of microarray experiments, classical tools for time series analysis lose power since the number of variables far exceeds the number of samples. In this paper, we describe some of the existing multivariate inference techniques that are applicable to hundreds of variables and show the potential challenges for small-sample, large-scale data. We propose a directed partial correlation (DPC) method as an efficient and effective solution to regulatory network inference using these data. Specifically for genomic data, the proposed method is designed to deal with large-scale datasets. It combines the efficiency of partial correlation for setting up network topology by testing conditional independence, and the concept of Granger causality to assess topology change with induced interruptions. The idea is that when a transcription factor is induced artificially within a gene network, the disruption of the network by the induction signifies a gene's role in transcriptional regulation. The benchmarking results using GeneNetWeaver, the simulator for the DREAM challenges, provide strong evidence of the outstanding performance of the proposed DPC method. When applied to real biological data, the inferred starch metabolism network in Arabidopsis reveals many biologically meaningful network modules worthy of further investigation. These results collectively suggest DPC is a versatile tool for genomics research. The R package DPC is available for download (http://code.google.com/p/dpcnet/).
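The partial-correlation building block can be illustrated with a short sketch. It shows ordinary (undirected) partial correlation computed from the inverse covariance matrix, not the directed DPC method itself; the simulated chain x -> y -> z, the sample size, and the function name `partial_correlation` are assumptions. The pseudo-inverse is used so the code still runs when variables outnumber samples, although that small-sample case needs more careful estimation than this sketch provides.

```python
import numpy as np

def partial_correlation(data):
    """Partial correlation of every variable pair given all other variables.

    data: array of shape (samples, variables). Hypothetical helper.
    """
    cov = np.cov(data, rowvar=False)
    precision = np.linalg.pinv(cov)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Simulated regulatory chain (assumed): x drives y, and y drives z.
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = x + rng.normal(size=2000)
z = y + rng.normal(size=2000)
data = np.column_stack([x, y, z])

marginal_xz = np.corrcoef(x, z)[0, 1]  # large: x and z are indirectly linked
pcor = partial_correlation(data)       # pcor[0, 2] is near zero given y
```

The marginal correlation between x and z is substantial, while their partial correlation given y is close to zero, which is why conditional independence tests remove indirect edges from the topology.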

  11. Reverse engineering validation using a benchmark synthetic gene circuit in human cells.

    Science.gov (United States)

    Kang, Taek; White, Jacob T; Xie, Zhen; Benenson, Yaakov; Sontag, Eduardo; Bleris, Leonidas

    2013-05-17

    Multicomponent biological networks are often understood incompletely, in large part due to the lack of reliable and robust methodologies for network reverse engineering and characterization. As a consequence, developing automated and rigorously validated methodologies for unraveling the complexity of biomolecular networks in human cells remains a central challenge to life scientists and engineers. Today, when it comes to experimental and analytical requirements, there exists a great deal of diversity in reverse engineering methods, which renders the independent validation and comparison of their predictive capabilities difficult. In this work we introduce an experimental platform customized for the development and verification of reverse engineering and pathway characterization algorithms in mammalian cells. Specifically, we stably integrate a synthetic gene network in human kidney cells and use it as a benchmark for validating reverse engineering methodologies. The network, which is orthogonal to endogenous cellular signaling, contains a small set of regulatory interactions that can be used to quantify the reconstruction performance. By performing successive perturbations to each modular component of the network and comparing protein and RNA measurements, we study the conditions under which we can reliably reconstruct the causal relationships of the integrated synthetic network.

  12. Reconstruction and signal propagation analysis of the Syk signaling network in breast cancer cells.

    Directory of Open Access Journals (Sweden)

    Aurélien Naldi

    2017-03-01

    The ability to build in-depth cell signaling networks from vast experimental data is a key objective of computational biology. The spleen tyrosine kinase (Syk) protein, a well-characterized key player in immune cell signaling, was surprisingly first shown by our group to exhibit an onco-suppressive function in mammary epithelial cells, a finding corroborated by many other studies, but the molecular mechanisms of this function remain largely unsolved. Based on existing proteomic data, we report here the generation of an interaction-based network of signaling pathways controlled by Syk in breast cancer cells. Pathway enrichment of the Syk targets previously identified by quantitative phospho-proteomics indicated that Syk is engaged in cell adhesion, motility, growth and death. Using the components and interactions of these pathways, we bootstrapped the reconstruction of a comprehensive network covering Syk signaling in breast cancer cells. To generate in silico hypotheses on Syk signaling propagation, we developed a method to rank paths between Syk and its targets. We first annotated the network according to experimental datasets. We then combined shortest path computation with random walk processes to estimate the importance of individual interactions and selected biologically relevant pathways in the network. Molecular and cell biology experiments allowed us to distinguish candidate mechanisms that underlie the impact of Syk on the regulation of cortactin and ezrin, both involved in actin-mediated cell adhesion and motility. The Syk network was further completed with the results of our biological validation experiments. The resulting Syk signaling sub-networks can be explored via an online visualization platform.

  13. Pseudogenes regulate parental gene expression via ceRNA network.

    Science.gov (United States)

    An, Yang; Furber, Kendra L; Ji, Shaoping

    2017-01-01

    The concept of competitive endogenous RNA (ceRNA) was first proposed by Salmena and colleagues. Evidence suggests that pseudogene RNAs can act as a 'sponge' through competitive binding of common miRNA, releasing or attenuating repression through sequestering miRNAs away from parental mRNA. In theory, ceRNAs refer to all transcripts such as mRNA, tRNA, rRNA, long non-coding RNA, pseudogene RNA and circular RNA, because all of them may become the targets of miRNA depending on the spatiotemporal situation. As binding of miRNA to the target RNA is not 100% complementary, it is possible that one miRNA can bind to multiple target RNAs and vice versa. All RNAs crosstalk through competitively binding to miRNA via miRNA response elements (MREs) contained within the RNA sequences, thus forming a complex regulatory network. The ratio of a subset of miRNAs to the corresponding number of MREs determines repression strength on a given mRNA translation or stability. An increase in pseudogene RNA level can sequester miRNA and release repression on the parental gene, leading to an increase in parental gene expression. A massive number of transcripts constitute a complicated network that regulates each other through this proposed mechanism, though some regulatory significance may be mild or even undetectable. It is possible that the regulation of gene and pseudogene expression occurring in this manner involves all RNAs bearing common MREs. In this review, we will primarily discuss how pseudogene transcripts regulate expression of parental genes via the ceRNA network and the biological significance of this regulation. © 2016 The Authors. Journal of Cellular and Molecular Medicine published by John Wiley & Sons Ltd and Foundation for Cellular and Molecular Medicine.

  14. A model of gene expression based on random dynamical systems reveals modularity properties of gene regulatory networks.

    Science.gov (United States)

    Antoneli, Fernando; Ferreira, Renata C; Briones, Marcelo R S

    2016-06-01

    Here we propose a new approach to modeling gene expression based on the theory of random dynamical systems (RDS) that provides a general coupling prescription between the nodes of any given regulatory network, given that the dynamics of each node are modeled by an RDS. The main virtues of this approach are the following: (i) it provides a natural way to obtain arbitrarily large networks by coupling together simple basic pieces, thus revealing the modularity of regulatory networks; (ii) the assumptions about the stochastic processes used in the modeling are fairly general, in the sense that the only requirement is stationarity; (iii) there is a well developed mathematical theory, which is a blend of smooth dynamical systems theory, ergodic theory and stochastic analysis, that allows one to extract relevant dynamical and statistical information without solving the system; (iv) one may obtain the classical rate equations from the corresponding stochastic version by averaging the dynamic random variables (small noise limit). It is important to emphasize that unlike the deterministic case, where coupling two equations is a trivial matter, coupling two RDS is non-trivial, especially in our case, where the coupling is performed between a state variable of one gene and the switching stochastic process of another gene and, hence, it is not a priori true that the resulting coupled system will satisfy the definition of a random dynamical system. We shall provide the necessary arguments that ensure that our coupling prescription does indeed furnish a coupled regulatory network of random dynamical systems. Finally, the fact that classical rate equations are the small noise limit of our stochastic model ensures that any validation or prediction made on the basis of the classical theory is also a validation or prediction of our model. We illustrate our framework with some simple examples of single-gene systems and network motifs. Copyright © 2016 Elsevier Inc. All rights reserved.

  15. Predictive minimum description length principle approach to inferring gene regulatory networks.

    Science.gov (United States)

    Chaitankar, Vijender; Zhang, Chaoyang; Ghosh, Preetam; Gong, Ping; Perkins, Edward J; Deng, Youping

    2011-01-01

    Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is determining the threshold that defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as a control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we propose a new inference algorithm that incorporates mutual information (MI), conditional mutual information (CMI), and the predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes, and the PMDL principle method attempts to determine the best MI threshold without the need for a user-specified fine tuning parameter. The performance of the proposed algorithm is evaluated using both synthetic time series data sets and a biological time series data set (Saccharomyces cerevisiae). The results show that the proposed algorithm produced fewer false edges and significantly improved the precision when compared to the existing MDL algorithm.
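The role of the MI threshold can be made concrete with a small sketch. It computes pairwise mutual information from binned expression profiles and keeps edges above a fixed threshold. The equal-width binning, the hand-picked threshold of 0.5 bits, the toy profiles, and the function names are assumptions; the CMI pruning and the PMDL-based threshold selection described in the abstract are not implemented here.

```python
import numpy as np

def mutual_information(x, y, bins=3):
    """Mutual information (in bits) of two profiles after equal-width binning."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nonzero = pxy > 0
    ratio = pxy[nonzero] / (px @ py)[nonzero]
    return float((pxy[nonzero] * np.log2(ratio)).sum())

def mi_network(expr, threshold, bins=3):
    """Connect gene pairs whose mutual information is >= threshold.

    expr: array of shape (genes, time points). Hypothetical helper.
    """
    n = expr.shape[0]
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            if mutual_information(expr[i], expr[j], bins) >= threshold:
                adj[i, j] = adj[j, i] = True
    return adj

# Toy profiles (assumed): gene 1 is a rescaled copy of gene 0,
# gene 2 varies independently of both.
expr = np.array([[0, 0, 0, 1, 1, 1, 2, 2, 2],
                 [1, 1, 1, 3, 3, 3, 5, 5, 5],
                 [0, 1, 2, 0, 1, 2, 0, 1, 2]], dtype=float)

mi_01 = mutual_information(expr[0], expr[1])
mi_02 = mutual_information(expr[0], expr[2])
adj = mi_network(expr, threshold=0.5)
```

Raising or lowering `threshold` adds or removes edges, which is the sensitivity that a data-driven threshold choice is meant to remove.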

  16. Reconstruction of gene regulatory modules from RNA silencing of IFN-α modulators: experimental set-up and inference method.

    Science.gov (United States)

    Grassi, Angela; Di Camillo, Barbara; Ciccarese, Francesco; Agnusdei, Valentina; Zanovello, Paola; Amadori, Alberto; Finesso, Lorenzo; Indraccolo, Stefano; Toffolo, Gianna Maria

    2016-03-12

    Inference of gene regulation from expression data may help to unravel regulatory mechanisms involved in complex diseases or in the action of specific drugs. A challenging task for many researchers working in the field of systems biology is to design an experiment with a limited budget and produce a dataset suitable for reconstructing putative regulatory modules worthy of biological validation. Here, we focus on small-scale gene expression screens and we introduce a novel experimental set-up and a customized method of analysis to make inference on regulatory modules starting from genetic perturbation data, e.g. knockdown and overexpression data. To illustrate the utility of our strategy, it was applied to produce and analyze a dataset of quantitative real-time RT-PCR data, in which interferon-α (IFN-α) transcriptional response in endothelial cells is investigated by RNA silencing of two candidate IFN-α modulators, STAT1 and IFIH1. A putative regulatory module was reconstructed by our method, revealing an intriguing feed-forward loop, in which STAT1 regulates IFIH1 and they both negatively regulate IFNAR1. STAT1 regulation of IFNAR1 was the object of experimental validation at the protein level. A detailed description of the experimental set-up and of the analysis procedure is reported, with the intent of inspiring other scientists who want to carry out similar experiments to reconstruct gene regulatory modules starting from perturbations of possible regulators. Application of our approach to the study of IFN-α transcriptional response modulators in endothelial cells has led to many interesting novel findings and new biological hypotheses worthy of validation.

  17. Nearest Neighbor Networks: clustering expression data based on gene neighborhoods

    Directory of Open Access Journals (Sweden)

    Olszewski Kellen L

    2007-07-01

    Background: The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes). Results: We developed Nearest Neighbor Networks (NNN), a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion: The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related.
It is particularly attractive due to its simplicity, its success in the
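
    The mutual-nearest-neighbor idea in the abstract above can be sketched as follows. This is a minimal illustration, not the published NNN implementation: it builds only the mutual-neighbor graph and omits the overlapping-clique step. The use of Pearson correlation as the similarity measure, the function names, and the toy profiles are all assumptions made for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def mutual_nearest_neighbor_edges(profiles, k):
    """Return edges {g, h} where g is among h's k most similar genes
    AND h is among g's k most similar genes (hypothetical helper)."""
    genes = list(profiles)
    top = {}
    for g in genes:
        ranked = sorted((h for h in genes if h != g),
                        key=lambda h: pearson(profiles[g], profiles[h]),
                        reverse=True)
        top[g] = set(ranked[:k])
    return {frozenset((g, h)) for g in genes for h in top[g] if g in top[h]}

# Toy data (invented): two pairs of co-varying genes.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [1.1, 2.1, 2.9, 4.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [3.9, 3.1, 2.2, 0.8],
}
edges = mutual_nearest_neighbor_edges(profiles, k=1)
```

    Because an edge requires both genes to rank each other highly, a gene that is nobody's close neighbor ends up with no edges, which matches the abstract's point that such genes can remain unclustered.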

  18. Gene regulatory networks in lactation: identification of global principles using bioinformatics

    Directory of Open Access Journals (Sweden)

    Pollard Katherine S

    2007-11-01

    Full Text Available Abstract Background: The molecular events underlying mammary development during pregnancy, lactation, and involution are incompletely understood. Results: Mammary gland microarray data, cellular localization data, protein-protein interactions, and literature-mined genes were integrated and analyzed using statistics, principal component analysis, gene ontology analysis, pathway analysis, and network analysis to identify global biological principles that govern molecular events during pregnancy, lactation, and involution. Conclusion: Several key principles were derived: (1) nearly a third of the transcriptome fluctuates to build, run, and disassemble the lactation apparatus; (2) genes encoding the secretory machinery are transcribed prior to lactation; (3) the diversity of the endogenous portion of the milk proteome is derived from fewer than 100 transcripts; (4) while some genes are differentially transcribed near the onset of lactation, the lactation switch is primarily post-transcriptionally mediated; (5) the secretion of materials during lactation occurs not by up-regulation of novel genomic functions, but by widespread transcriptional suppression of functions such as protein degradation and cell-environment communication; (6) the involution switch is primarily transcriptionally mediated; and (7) during early involution, the transcriptional state is partially reverted to the pre-lactation state. A new hypothesis for secretory diminution is suggested: milk production gradually declines because the secretory machinery is not transcriptionally replenished. A comprehensive network of protein interactions during lactation is assembled and new regulatory gene targets are identified. Fewer than one fifth of the transcriptionally regulated nodes in this lactation network have been previously explored in the context of lactation. Implications for future research in mammary and cancer biology are discussed.

  19. Systems Nutrigenomics Reveals Brain Gene Networks Linking Metabolic and Brain Disorders.

    Science.gov (United States)

    Meng, Qingying; Ying, Zhe; Noble, Emily; Zhao, Yuqi; Agrawal, Rahul; Mikhail, Andrew; Zhuang, Yumei; Tyagi, Ethika; Zhang, Qing; Lee, Jae-Hyung; Morselli, Marco; Orozco, Luz; Guo, Weilong; Kilts, Tina M; Zhu, Jun; Zhang, Bin; Pellegrini, Matteo; Xiao, Xinshu; Young, Marian F; Gomez-Pinilla, Fernando; Yang, Xia

    2016-05-01

    Nutrition plays a significant role in the increasing prevalence of metabolic and brain disorders. Here we employ systems nutrigenomics to scrutinize the genomic bases of nutrient-host interaction underlying disease predisposition or therapeutic potential. We conducted transcriptome and epigenome sequencing of hypothalamus (metabolic control) and hippocampus (cognitive processing) from a rodent model of fructose consumption, and identified significant reprogramming of DNA methylation, transcript abundance, alternative splicing, and gene networks governing cell metabolism, cell communication, inflammation, and neuronal signaling. These signals converged with genetic causal risks of metabolic, neurological, and psychiatric disorders revealed in humans. Gene network modeling uncovered the extracellular matrix genes Bgn and Fmod as main orchestrators of the effects of fructose, as validated using two knockout mouse models. We further demonstrate that an omega-3 fatty acid, DHA, reverses the genomic and network perturbations elicited by fructose, providing molecular support for nutritional interventions to counteract diet-induced metabolic and brain disorders. Our integrative approach complementing rodent and human studies supports the applicability of nutrigenomics principles to predict disease susceptibility and to guide personalized medicine. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  20. Ethylene-Related Gene Expression Networks in Wood Formation

    Directory of Open Access Journals (Sweden)

    Carolin Seyfferth

    2018-03-01

    Full Text Available Thickening of tree stems is the result of secondary growth, accomplished by the meristematic activity of the vascular cambium. Secondary growth of the stem entails developmental cascades resulting in the formation of secondary phloem outwards and secondary xylem (i.e., wood) inwards of the stem. Signaling and transcriptional reprogramming by the phytohormone ethylene modifies cambial growth and cell differentiation, but the molecular link between ethylene and secondary growth remains unknown. We addressed this shortcoming by analyzing expression profiles and co-expression networks of ethylene pathway genes using the AspWood transcriptome database which covers all stages of secondary growth in aspen (Populus tremula) stems. ACC synthase expression suggests that the ethylene precursor 1-aminocyclopropane-1-carboxylic acid (ACC) is synthesized during xylem expansion and xylem cell maturation. Ethylene-mediated transcriptional reprogramming occurs during all stages of secondary growth, as deduced from AspWood expression profiles of ethylene-responsive genes. A network centrality analysis of the AspWood dataset identified EIN3D and 11 ERFs as hubs. No overlap was found between the co-expressed genes of the EIN3 and ERF hubs, suggesting target diversification and hence independent roles for these transcription factor families during normal wood formation. The EIN3D hub was part of a large co-expression gene module, which contained 16 transcription factors, among them several new candidates that have not been earlier connected to wood formation and a VND-INTERACTING 2 (VNI2) homolog. We experimentally demonstrated Populus EIN3D function in ethylene signaling in Arabidopsis thaliana. The ERF hubs ERF118 and ERF119 were connected on the basis of their expression pattern and gene co-expression module composition to xylem cell expansion and secondary cell wall formation, respectively. We hereby establish data resources for ethylene-responsive genes and

  1. Network Based Integrated Analysis of Phenotype-Genotype Data for Prioritization of Candidate Symptom Genes

    Directory of Open Access Journals (Sweden)

    Xing Li

    2014-01-01

    Full Text Available Background. Symptoms and signs (symptoms in brief) are the essential clinical manifestations for individualized diagnosis and treatment in traditional Chinese medicine (TCM). To gain insights into the molecular mechanism of symptoms, we develop a computational approach to identify the candidate genes of symptoms. Methods. This paper presents a network-based approach for the integrated analysis of multiple phenotype-genotype data sources and the prioritization of genes for the associated symptoms. The method first calculates the similarities between symptoms and diseases based on the symptom-disease relationships retrieved from the PubMed bibliographic database. Then the disease-gene associations and protein-protein interactions are utilized to construct a phenotype-genotype network. The PRINCE algorithm is finally used to rank the potential genes for the associated symptoms. Results. The proposed method yields a reliable gene ranking, with an AUC (area under the curve) of 0.616 in classification. Some novel genes such as CALCA, ESR1, and MTHFR were predicted to be associated with headache symptoms; these are not recorded in the benchmark data set but have been reported in recently published literature. Conclusions. Our study demonstrates that integrating phenotype-genotype relationships into a complex network framework provides an effective approach to identifying candidate genes of symptoms.

  2. Transcriptional profiles of supragranular-enriched genes associate with corticocortical network architecture in the human brain.

    Science.gov (United States)

    Krienen, Fenna M; Yeo, B T Thomas; Ge, Tian; Buckner, Randy L; Sherwood, Chet C

    2016-01-26

    The human brain is patterned with disproportionately large, distributed cerebral networks that connect multiple association zones in the frontal, temporal, and parietal lobes. The expansion of the cortical surface, along with the emergence of long-range connectivity networks, may be reflected in changes to the underlying molecular architecture. Using the Allen Institute's human brain transcriptional atlas, we demonstrate that genes particularly enriched in supragranular layers of the human cerebral cortex relative to mouse distinguish major cortical classes. The topography of transcriptional expression reflects large-scale brain network organization consistent with estimates from functional connectivity MRI and anatomical tracing in nonhuman primates. Microarray expression data for genes preferentially expressed in human upper layers (II/III), but enriched only in lower layers (V/VI) of mouse, were cross-correlated to identify molecular profiles across the cerebral cortex of postmortem human brains (n = 6). Unimodal sensory and motor zones have similar molecular profiles, despite being distributed across the cortical mantle. Sensory/motor profiles were anticorrelated with paralimbic and certain distributed association network profiles. Tests of alternative gene sets did not consistently distinguish sensory and motor regions from paralimbic and association regions: (i) genes enriched in supragranular layers in both humans and mice, (ii) genes cortically enriched in humans relative to nonhuman primates, (iii) genes related to connectivity in rodents, (iv) genes associated with human and mouse connectivity, and (v) 1,454 gene sets curated from known gene ontologies. Molecular innovations of upper cortical layers may be an important component in the evolution of long-range corticocortical projections.

  3. Discovering time-lagged rules from microarray data using gene profile classifiers

    Directory of Open Access Journals (Sweden)

    Ponzoni Ignacio

    2011-04-01

    Full Text Available Abstract Background: Gene regulatory networks have an essential role in every process of life. In this regard, genome-wide time series data are becoming increasingly available, providing the opportunity to discover the time-delayed gene regulatory networks that govern the majority of these molecular processes. Results: This paper aims at reconstructing gene regulatory networks from multiple genome-wide microarray time series datasets. To this end, a new model-free algorithm called GRNCOP2 (Gene Regulatory Network inference by Combinatorial OPtimization 2), which is a significant evolution of the GRNCOP algorithm, was developed using combinatorial optimization of gene profile classifiers. The method is capable of inferring potential time-delay relationships with any span of time between genes from various time series datasets given as input. The proposed algorithm was applied to time series data composed of twenty yeast genes that are highly relevant for the cell-cycle study, and the results were compared against several related approaches. The outcomes have shown that GRNCOP2 outperforms the contrasted methods in terms of the proposed metrics, and that the results are consistent with previous biological knowledge. Additionally, a genome-wide study on multiple publicly available time series data was performed. In this case, the experimentation has exhibited the soundness and scalability of the new method, which inferred highly related, statistically significant gene associations. Conclusions: A novel method for inferring time-delayed gene regulatory networks from genome-wide time series datasets is proposed in this paper. The method was carefully validated with several publicly available data sets. The results have demonstrated that the algorithm constitutes a usable model-free approach capable of predicting meaningful relationships between genes, revealing the time-trends of gene regulation.

  4. A novel gene network inference algorithm using predictive minimum description length approach.

    Science.gov (United States)

    Chaitankar, Vijender; Ghosh, Preetam; Perkins, Edward J; Gong, Ping; Deng, Youping; Zhang, Chaoyang

    2010-05-28

    Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is determining the threshold that defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as a control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we proposed a new inference algorithm which incorporated mutual information (MI), conditional mutual information (CMI) and the predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes and the PMDL principle method attempts to determine the best MI threshold without the need for a user-specified fine tuning parameter. The performance of the proposed algorithm was evaluated using both synthetic time series data sets and a biological time series data set for the yeast Saccharomyces cerevisiae. The benchmark quantities precision and recall were used as performance measures. The results show that the proposed algorithm produced fewer false edges and significantly improved the precision, as compared to the existing algorithm. For further analysis the performance of the algorithms was observed over different sizes of data. We have proposed a new algorithm that implements the PMDL principle for inferring gene regulatory networks from time series DNA microarray data that eliminates the need for a fine tuning parameter. The evaluation results obtained from both synthetic and actual biological data sets show that the
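
    The threshold problem described in the abstract above can be illustrated with a small sketch of MI-based edge calling. It is not the PMDL algorithm: the threshold here is set by hand, which is exactly the user-specified parameter the authors aim to eliminate. The discretized expression states, gene names, and function names are invented for the example.

```python
from collections import Counter
from math import log2

def mutual_information(x, y):
    """Empirical mutual information (in bits) of two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

def infer_edges(states, threshold):
    """Call an undirected edge for every gene pair whose MI meets the
    hand-picked threshold (hypothetical helper, not PMDL)."""
    genes = sorted(states)
    return [(g, h) for i, g in enumerate(genes) for h in genes[i + 1:]
            if mutual_information(states[g], states[h]) >= threshold]

# Toy binarized expression states over eight time points (invented).
states = {
    "g1": [0, 0, 1, 1, 0, 1, 0, 1],
    "g2": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to g1
    "g3": [0, 1, 0, 1, 0, 1, 1, 0],  # empirically independent of g1
}
called = infer_edges(states, threshold=0.5)
```

    Moving the threshold up or down changes which pairs are called, which is why a principled way of choosing it matters.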

  5. Simultaneous inference of phenotype-associated genes and relevant tissues from GWAS data via Bayesian integration of multiple tissue-specific gene networks.

    Science.gov (United States)

    Wu, Mengmeng; Lin, Zhixiang; Ma, Shining; Chen, Ting; Jiang, Rui; Wong, Wing Hung

    2017-12-01

    Although genome-wide association studies (GWAS) have successfully identified thousands of genomic loci associated with hundreds of complex traits in the past decade, the debate about such problems as missing heritability and weak interpretability has called for effective computational methods to facilitate the advanced analysis of the vast volume of existing and anticipated genetic data. Towards this goal, gene-level integrative GWAS analysis, which assumes that genes associated with a phenotype tend to be enriched in biological gene sets or gene networks, has recently attracted much attention due to such advantages as straightforward interpretation, a lower multiple-testing burden, and robustness across studies. However, existing methods in this category usually exploit non-tissue-specific gene networks and thus lack the ability to utilize informative tissue-specific characteristics. To overcome this limitation, we proposed a Bayesian approach called SIGNET (Simultaneously Inference of GeNEs and Tissues) to integrate GWAS data and multiple tissue-specific gene networks for the simultaneous inference of phenotype-associated genes and relevant tissues. Through extensive simulation studies, we showed the effectiveness of our method in finding both associated genes and relevant tissues for a phenotype. In applications to real GWAS data of 14 complex phenotypes, we demonstrated the power of our method in both deciphering the genetic basis of a phenotype and discovering biological insights into it. With this understanding, we expect to see SIGNET as a valuable tool for integrative GWAS analysis, thereby boosting the prevention, diagnosis, and treatment of human inherited diseases and eventually facilitating precision medicine.

  6. Modeling genome-wide dynamic regulatory network in mouse lungs with influenza infection using high-dimensional ordinary differential equations.

    Science.gov (United States)

    Wu, Shuang; Liu, Zhi-Ping; Qiu, Xing; Wu, Hulin

    2014-01-01

    The immune response to viral infection is regulated by an intricate network of many genes and their products. The reverse engineering of gene regulatory networks (GRNs) using mathematical models from time course gene expression data collected after influenza infection is key to our understanding of the mechanisms involved in controlling influenza infection within a host. A five-step pipeline: detection of temporally differentially expressed genes, clustering genes into co-expressed modules, identification of network structure, parameter estimate refinement, and functional enrichment analysis, is developed for reconstructing high-dimensional dynamic GRNs from genome-wide time course gene expression data. Applying the pipeline to the time course gene expression data from influenza-infected mouse lungs, we have identified 20 distinct temporal expression patterns in the differentially expressed genes and constructed a module-based dynamic network using a linear ODE model. Both intra-module and inter-module annotations and regulatory relationships of our inferred network show some interesting findings and are highly consistent with existing knowledge about the immune response in mice after influenza infection. The proposed method is a computationally efficient, data-driven pipeline bridging experimental data, mathematical modeling, and statistical analysis. The application to the influenza infection data elucidates the potentials of our pipeline in providing valuable insights into systematic modeling of complicated biological processes.

  7. Depth Reconstruction from Single Images Using a Convolutional Neural Network and a Condition Random Field Model

    Directory of Open Access Journals (Sweden)

    Dan Liu

    2018-04-01

    Full Text Available This paper presents an effective approach for depth reconstruction from a single image through the incorporation of semantic information and local details from the image. A unified framework for depth acquisition is constructed by joining a deep Convolutional Neural Network (CNN) and a continuous pairwise Conditional Random Field (CRF) model. Semantic information and relative depth trends of local regions inside the image are integrated into the framework. A deep CNN network is firstly used to automatically learn a hierarchical feature representation of the image. To get more local details in the image, the relative depth trends of local regions are incorporated into the network. Combined with semantic information of the image, a continuous pairwise CRF is then established and is used as the loss function of the unified model. Experiments on real scenes demonstrate that the proposed approach is effective and that the approach obtains satisfactory results.

  8. Depth Reconstruction from Single Images Using a Convolutional Neural Network and a Condition Random Field Model.

    Science.gov (United States)

    Liu, Dan; Liu, Xuejun; Wu, Yiguang

    2018-04-24

    This paper presents an effective approach for depth reconstruction from a single image through the incorporation of semantic information and local details from the image. A unified framework for depth acquisition is constructed by joining a deep Convolutional Neural Network (CNN) and a continuous pairwise Conditional Random Field (CRF) model. Semantic information and relative depth trends of local regions inside the image are integrated into the framework. A deep CNN network is firstly used to automatically learn a hierarchical feature representation of the image. To get more local details in the image, the relative depth trends of local regions are incorporated into the network. Combined with semantic information of the image, a continuous pairwise CRF is then established and is used as the loss function of the unified model. Experiments on real scenes demonstrate that the proposed approach is effective and that the approach obtains satisfactory results.

  9. Inferring Drosophila gap gene regulatory network: Pattern analysis of simulated gene expression profiles and stability analysis

    OpenAIRE

    Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.

    2009-01-01

    Abstract Background Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori assumptions about the interactions, which all simulate the observed patterns. It is important to analyze the properties of the circuits. Findings We have analyzed the simulated gene expression ...

  10. Biomarker Gene Signature Discovery Integrating Network Knowledge

    Directory of Open Access Journals (Sweden)

    Holger Fröhlich

    2012-02-01

    Full Text Available Discovery of prognostic and diagnostic biomarker gene signatures for diseases, such as cancer, is seen as a major step towards better personalized medicine. During the last decade various methods, mainly coming from the machine learning or statistical domain, have been proposed for that purpose. However, one important obstacle to making gene signatures a standard tool in clinical diagnosis is the typically low reproducibility of these signatures combined with the difficulty of achieving a clear biological interpretation. For that reason, in recent years there has been growing interest in approaches that try to integrate information from molecular interaction networks. Here we review the current state of research in this field by giving an overview of the approaches proposed so far.

  11. Gene Network for Identifying the Entropy Changes of Different Modules in Pediatric Sepsis

    Directory of Open Access Journals (Sweden)

    Jing Yang

    2016-12-01

    Full Text Available Background/Aims: Pediatric sepsis is a disease that threatens the lives of children. The incidence of pediatric sepsis is higher in developing countries due to various reasons, such as insufficient immunization and nutrition, water and air pollution, etc. Exploring the potential genes via different methods is of significance for the prevention and treatment of pediatric sepsis. This study aimed to identify potential genes associated with pediatric sepsis utilizing analysis of gene network and entropy. Methods: The mRNA expression in the blood samples collected from 20 septic children and 30 healthy controls was quantified by using Affymetrix HG-U133A microarray. Two condition-specific protein-protein interaction networks (PINs), one for the healthy control and the other one for the children with sepsis, were deduced by combining the fundamental human PINs with gene expression profiles in the two phenotypes. Subsequently, distinct modules from the two conditional networks were extracted by adopting a maximal clique-merging approach. Delta entropy (ΔS) was calculated between sepsis and control modules. Results: Then, key genes displaying changes in gene composition were identified by matching the control and sepsis modules. Two objective modules were obtained, in which ribosomal proteins RPL4 and RPL9 as well as TOP2A were probably considered as the key genes differentiating sepsis from healthy controls. Conclusion: According to previous reports and this work, TOP2A is a potential gene therapy target for pediatric sepsis. The relationship between pediatric sepsis and RPL4 and RPL9 needs further investigation.

  12. Causal structure of oscillations in gene regulatory networks: Boolean analysis of ordinary differential equation attractors.

    Science.gov (United States)

    Sun, Mengyang; Cheng, Xianrui; Socolar, Joshua E S

    2013-06-01

    A common approach to the modeling of gene regulatory networks is to represent activating or repressing interactions using ordinary differential equations for target gene concentrations that include Hill function dependences on regulator gene concentrations. An alternative formulation represents the same interactions using Boolean logic with time delays associated with each network link. We consider the attractors that emerge from the two types of models in the case of a simple but nontrivial network: a figure-8 network with one positive and one negative feedback loop. We show that the different modeling approaches give rise to the same qualitative set of attractors with the exception of a possible fixed point in the ordinary differential equation model in which concentrations sit at intermediate values. The properties of the attractors are most easily understood from the Boolean perspective, suggesting that time-delay Boolean modeling is a useful tool for understanding the logic of regulatory networks.
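
    The two formulations contrasted in the abstract above can be written out in a generic form. This is a standard textbook parameterization, not necessarily the one used by the authors; the symbols (production rate \alpha_i, degradation rate \gamma_i, threshold K, Hill coefficient n, delay \tau, and logic function f_i) are assumptions introduced for illustration, with x_i the target concentration and r the regulator concentration.

```latex
% ODE model with Hill-function regulation of target x_i by regulator r
\frac{dx_i}{dt} = \alpha_i \, \frac{r^{n}}{K^{n} + r^{n}} - \gamma_i x_i
  \quad \text{(activation)},
\qquad
\frac{dx_i}{dt} = \alpha_i \, \frac{K^{n}}{K^{n} + r^{n}} - \gamma_i x_i
  \quad \text{(repression)}

% Boolean counterpart with a time delay \tau on the network link
x_i(t) = f_i\bigl( r(t - \tau) \bigr), \qquad x_i,\, r \in \{0, 1\}
```

    As n grows, the Hill terms approach step functions of r around K, which is the sense in which the Boolean description approximates the ODE one.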

  13. Improving functional modules discovery by enriching interaction networks with gene profiles

    KAUST Repository

    Salem, Saeed

    2013-05-01

    Recent advances in proteomic and transcriptomic technologies have resulted in the accumulation of vast amounts of high-throughput data that span multiple biological processes and characteristics in different organisms. Much of the data come in the form of interaction networks and mRNA expression arrays. An important task in systems biology is functional module discovery, where the goal is to uncover well-connected sub-networks (modules). These discovered modules help to unravel the underlying mechanisms of the observed biological processes. While most of the existing module discovery methods use only the interaction data, in this work we propose CLARM, which discovers biological modules by incorporating gene profile data with protein-protein interaction networks. We demonstrate the effectiveness of CLARM on Yeast and Human interaction datasets, and gene expression and molecular function profiles. Experiments on these real datasets show that the CLARM approach is competitive with well-established functional module discovery methods.

  14. Genome-wide Reconstruction of OxyR and SoxRS Transcriptional Regulatory Networks under Oxidative Stress in Escherichia coli K-12 MG1655

    Directory of Open Access Journals (Sweden)

    Sang Woo Seo

    2015-08-01

    Full Text Available Three transcription factors (TFs), OxyR, SoxR, and SoxS, play a critical role in transcriptional regulation of the defense system for oxidative stress in bacteria. However, their full genome-wide regulatory potential is unknown. Here, we perform a genome-scale reconstruction of the OxyR, SoxR, and SoxS regulons in Escherichia coli K-12 MG1655. Integrative data analysis reveals that a total of 68 genes in 51 transcription units (TUs) belong to these regulons. Among them, 48 genes showed more than 2-fold changes in expression level under single-TF-knockout conditions. This reconstruction expands the genome-wide roles of these factors to include direct activation of genes related to amino acid biosynthesis (methionine and aromatic amino acids), cell wall synthesis (lipid A biosynthesis and peptidoglycan growth), and divalent metal ion transport (Mn2+, Zn2+, and Mg2+). Investigating the co-regulation of these genes with other stress-response TFs reveals that they are independently regulated by stress-specific TFs.

  15. Increasing the power to detect causal associations by combining genotypic and expression data in segregating populations.

    Directory of Open Access Journals (Sweden)

    Jun Zhu

    2007-04-01

    Full Text Available To dissect common human diseases such as obesity and diabetes, a systematic approach is needed to study how genes interact with one another, and with genetic and environmental factors, to determine clinical end points or disease phenotypes. Bayesian networks provide a convenient framework for extracting relationships from noisy data and are frequently applied to large-scale data to derive causal relationships among variables of interest. Given the complexity of molecular networks underlying common human disease traits, and the fact that biological networks can change depending on environmental conditions and genetic factors, large datasets, generally involving multiple perturbations (experiments), are required to reconstruct and reliably extract information from these networks. With limited resources, the balance of coverage of multiple perturbations and multiple subjects in a single perturbation needs to be considered in the experimental design. Increasing the number of experiments, or the number of subjects in an experiment, is an expensive and time-consuming way to improve network reconstruction. Integrating multiple types of data from existing subjects might be more efficient. For example, it has recently been demonstrated that combining genotypic and gene expression data in a segregating population leads to improved network reconstruction, which in turn may lead to better predictions of the effects of experimental perturbations on any given gene. Here we simulate data based on networks reconstructed from biological data collected in a segregating mouse population and quantify the improvement in network reconstruction achieved using genotypic and gene expression data, compared with reconstruction using gene expression data alone. We demonstrate that networks reconstructed using the combined genotypic and gene expression data achieve a level of reconstruction accuracy that exceeds networks reconstructed from expression data alone, and that

  16. Inference of gene regulatory networks from time series by Tsallis entropy

    Directory of Open Access Journals (Sweden)

    de Oliveira Evaldo A

    2011-05-01

    Full Text Available Abstract Background: The inference of gene regulatory networks (GRNs) from large-scale expression profiles is one of the most challenging problems of Systems Biology nowadays. Many techniques and models have been proposed for this task. However, it is not generally possible to recover the original topology with great accuracy, mainly because the time series data are short relative to the high complexity of the networks and the intrinsic noise of the expression measurements. In order to improve the accuracy of GRN inference methods based on entropy (mutual information), a new criterion function is proposed here. Results: In this paper we introduce the use of the generalized entropy proposed by Tsallis for the inference of GRNs from time series expression profiles. The inference process is based on a feature selection approach and the conditional entropy is applied as the criterion function. In order to assess the proposed methodology, the algorithm is applied to recover the network topology from temporal expressions generated by an artificial gene network (AGN) model as well as from the DREAM challenge. The adopted AGN is based on theoretical models of complex networks and its gene transference function is obtained from random drawing on the set of possible Boolean functions, thus creating its dynamics. On the other hand, DREAM time series data present variation in network size, and their topologies are based on real networks. The dynamics are generated by continuous differential equations with noise and perturbation. By adopting both data sources, it is possible to estimate the average quality of the inference with respect to different network topologies, transfer functions and network sizes. Conclusions: A remarkable improvement in accuracy was observed in the experimental results, with the non-Shannon entropy reducing the number of false connections in the inferred topology.
The obtained best free parameter of the Tsallis entropy was on average in the range 2.5 â
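
    As an illustration of the criterion function described in this record, the following minimal sketch computes the Tsallis entropy S_q = (1 - sum_i p_i^q) / (q - 1) and a conditional version of it for discretized expression states. The weighting of each predictor state by its observed frequency, the function names, and the toy series are assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

def tsallis_entropy(probs, q):
    """Tsallis entropy S_q = (1 - sum p^q) / (q - 1); Shannon entropy (nats) at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)

def conditional_tsallis(predictor_states, target_states, q):
    """Frequency-weighted mean Tsallis entropy of the target within each predictor state."""
    groups = defaultdict(list)
    for x, y in zip(predictor_states, target_states):
        groups[x].append(y)
    n = len(target_states)
    total = 0.0
    for ys in groups.values():
        probs = [count / len(ys) for count in Counter(ys).values()]
        total += (len(ys) / n) * tsallis_entropy(probs, q)
    return total

# Hypothetical Boolean series: predictor values at time t, target values at time t + 1.
gene_a = [0, 0, 1, 1, 0, 1, 0, 1]
gene_b = [0, 1, 0, 1, 0, 1, 1, 0]
target = [0, 0, 1, 1, 0, 1, 0, 1]   # fully determined by gene_a

h_a = conditional_tsallis(gene_a, target, q=2.5)
h_b = conditional_tsallis(gene_b, target, q=2.5)
```

    In a feature selection setting, the predictor (or predictor set) with the lowest conditional entropy would be preferred; here gene_a leaves no uncertainty about the target, while gene_b does.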

  17. In-Silico Integration Approach to Identify a Key miRNA Regulating a Gene Network in Aggressive Prostate Cancer

    Science.gov (United States)

    Colaprico, Antonio; Bontempi, Gianluca; Castiglioni, Isabella

    2018-01-01

    Like other cancers, prostate cancer (PC) is caused by the accumulation of genetic alterations in cells that drive malignant growth. These alterations are revealed by gene profiling and copy number alteration (CNA) analysis. Moreover, recent evidence suggests that microRNAs also play an important role in PC development. Despite efforts to profile PC, the alterations (gene, CNA, and miRNA) and biological processes that correlate with disease development and progression remain partially elusive. Many gene signatures proposed as diagnostic or prognostic tools in cancer overlap poorly. Identifying co-expressed genes that are functionally related can reveal a core network of genes associated with PC with better reproducibility. By combining different approaches, including the integration of mRNA expression profiles, CNAs, and miRNA expression levels, we identified a gene signature of four genes that overlaps with other published gene signatures and is able to distinguish, in silico, high Gleason-scored PC from normal human tissue; this signature was further enriched to 19 genes by gene co-expression analysis. From the analysis of miRNAs possibly regulating this network, we found that hsa-miR-153 was highly connected to the genes in the network. Our results identify a four-gene signature with diagnostic and prognostic value in PC and suggest an interesting gene network that could play a key regulatory role in PC development and progression. Furthermore, hsa-miR-153, which controls this network, could be a potential biomarker for theranostics in high Gleason-scored PC. PMID:29562723

  18. Identification of conserved drought stress responsive gene-network across tissues and developmental stages in rice.

    Science.gov (United States)

    Smita, Shuchi; Katiyar, Amit; Pandey, Dev Mani; Chinnusamy, Viswanathan; Archak, Sunil; Bansal, Kailash Chander

    2013-01-01

    Identification of genes that are coexpressed across various tissues and environmental stresses is biologically interesting, since they may play a coordinated role in similar biological processes. Genes with correlated expression patterns can best be identified by coexpression network analysis of transcriptome data. In the present study, we analyzed the temporal-spatial coordination of gene expression in root, leaf and panicle of rice under drought stress and constructed a network using WGCNA and Cytoscape. A total of 2199 differentially expressed genes (DEGs) were identified in three or more tissues, of which 88 genes have a coordinated expression profile among all six tissues under drought stress. These 88 highly coordinated genes were further subjected to module identification in the coexpression network. Based on the chief topological properties, we identified 18 hub genes such as ABC transporter, ATP-binding protein, dehydrin, protein phosphatase 2C, LTPL153 - Protease inhibitor, phosphatidylethanolamine-binding protein, lactose permease-related, NADP-dependent malic enzyme, etc. Motif enrichment analysis showed the presence of ABRE cis-elements in the promoters of > 62% of the coordinately expressed genes. Our results suggest that drought stress-mediated upregulation of gene expression was coordinated through an ABA-dependent signaling pathway across tissues, at least for the subset of genes identified in this study, while downregulation appears to be controlled by tissue-specific pathways in rice.

  19. GeneNetWeaver: in silico benchmark generation and performance profiling of network inference methods.

    Science.gov (United States)

    Schaffter, Thomas; Marbach, Daniel; Floreano, Dario

    2011-08-15

    Over the last decade, numerous methods have been developed for inference of regulatory networks from gene expression data. However, accurate and systematic evaluation of these methods is hampered by the difficulty of constructing adequate benchmarks and the lack of tools for a differentiated analysis of network predictions on such benchmarks. Here, we describe a novel and comprehensive method for in silico benchmark generation and performance profiling of network inference methods available to the community as an open-source software called GeneNetWeaver (GNW). In addition to the generation of detailed dynamical models of gene regulatory networks to be used as benchmarks, GNW provides a network motif analysis that reveals systematic prediction errors, thereby indicating potential ways of improving inference methods. The accuracy of network inference methods is evaluated using standard metrics such as precision-recall and receiver operating characteristic curves. We show how GNW can be used to assess the performance and identify the strengths and weaknesses of six inference methods. Furthermore, we used GNW to provide the international Dialogue for Reverse Engineering Assessments and Methods (DREAM) competition with three network inference challenges (DREAM3, DREAM4 and DREAM5). GNW is available at http://gnw.sourceforge.net along with its Java source code, user manual and supporting data. Supplementary data are available at Bioinformatics online. Contact: dario.floreano@epfl.ch.
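
    The precision-recall evaluation mentioned in this record can be sketched as follows. This is a generic illustration of scoring a ranked list of predicted edges against a gold-standard network, not GNW's own code; the function name and the toy edges are hypothetical.

```python
def precision_recall_points(ranked_edges, gold_edges):
    """Return (precision, recall) after each prediction in a ranked list of directed edges."""
    gold = set(gold_edges)
    points, hits = [], 0
    for k, edge in enumerate(ranked_edges, start=1):
        if edge in gold:
            hits += 1
        points.append((hits / k, hits / len(gold)))
    return points

# Hypothetical gold standard and predictions, most confident prediction first.
gold = [("G1", "G2"), ("G2", "G3")]
ranked = [("G1", "G2"), ("G3", "G1"), ("G2", "G3")]
curve = precision_recall_points(ranked, gold)
```

    Each point corresponds to cutting the ranked list at one more prediction, so a method that ranks true edges first keeps precision high as recall grows.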

  20. Biophysical Constraints Arising from Compositional Context in Synthetic Gene Networks.

    Science.gov (United States)

    Yeung, Enoch; Dy, Aaron J; Martin, Kyle B; Ng, Andrew H; Del Vecchio, Domitilla; Beck, James L; Collins, James J; Murray, Richard M

    2017-07-26

    Synthetic gene expression is highly sensitive to intragenic compositional context (promoter structure, spacing regions between promoter and coding sequences, and ribosome binding sites). However, much less is known about the effects of intergenic compositional context (spatial arrangement and orientation of entire genes on DNA) on expression levels in synthetic gene networks. We compare expression of induced genes arranged in convergent, divergent, or tandem orientations. Induction of convergent genes yielded up to 400% higher expression, greater ultrasensitivity, and dynamic range than divergent- or tandem-oriented genes. Orientation affects gene expression whether one or both genes are induced. We postulate that transcriptional interference in divergent and tandem genes, mediated by supercoiling, can explain differences in expression and validate this hypothesis through modeling and in vitro supercoiling relaxation experiments. Treatment with gyrase abrogated intergenic context effects, bringing expression levels within 30% of each other. We rebuilt the toggle switch with convergent genes, taking advantage of supercoiling effects to improve threshold detection and switch stability. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Genome-scale reconstruction of the sigma factor network in Escherichia coli: topology and functional states

    DEFF Research Database (Denmark)

    Cho, Byung-Kwan; Kim, Donghyuk; Knight, Eric M.

    2014-01-01

    Background: At the beginning of the transcription process, the RNA polymerase (RNAP) core enzyme requires a sigma-factor to recognize the genomic location at which the process initiates. Although the crucial role of sigma-factors has long been appreciated and characterized for many individual...... to transcription units (TUs), representing an increase of more than 300% over what has been previously reported. The reconstructed network was used to investigate competition between alternative sigma-factors (the sigma(70) and sigma(38) regulons), confirming the competition model of sigma substitution...

  2. Dynamic and modular gene regulatory networks drive the development of gametogenesis.

    Science.gov (United States)

    Che, Dongxue; Wang, Yang; Bai, Weiyang; Li, Leijie; Liu, Guiyou; Zhang, Liangcai; Zuo, Yongchun; Tao, Shiheng; Hua, Jinlian; Liao, Mingzhi

    2017-07-01

    Gametogenesis is a complex process that includes mitosis and meiosis and results in the production of ova and sperm. The development of gametogenesis is dynamic and requires many different genes to work synergistically, but research on this process from a global perspective is lacking. In this study, we examined the dynamic process of gametogenesis from the perspective of systems biology based on protein-protein interaction networks (PPINs) and functional analysis. Results showed that gametogenesis genes have strong synergistic effects in PPINs within and between different phases of development. In addition to these synergistic effects on molecular networks, gametogenesis genes showed functional consistency within and between different phases, which provides further evidence of the dynamic process during the development of gametogenesis. Finally, we detected and provided the core molecular modules of the different phases of gametogenesis. The gametogenesis genes and related modules can be obtained from our Web site, Gametogenesis Molecule Online (GMO, http://gametsonline.nwsuaflmz.com/index.php), which is freely accessible. GMO may be helpful for the reference and application of these genes and modules in the future identification of key genes in gametogenesis. In summary, this work provides a computational perspective and framework for the analysis of gametogenesis dynamics and modularity in both human and mouse. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  3. Coexpression landscape in ATTED-II: usage of gene list and gene network for various types of pathways.

    Science.gov (United States)

    Obayashi, Takeshi; Kinoshita, Kengo

    2010-05-01

    Gene coexpression analyses are a powerful method to predict the function of genes and/or to identify genes that are functionally related to query genes. The basic idea of gene coexpression analyses is that genes with similar functions should have similar expression patterns under many different conditions. This approach is now widely used by many experimental researchers, especially in the field of plant biology. In this review, we will summarize recent successful examples obtained by using our gene coexpression database, ATTED-II. Specifically, the examples will describe the identification of new genes, such as the subunits of a complex protein, the enzymes in a metabolic pathway and transporters. In addition, we will discuss the discovery of a new intercellular signaling factor and new regulatory relationships between transcription factors and their target genes. In ATTED-II, we provide two basic views of gene coexpression, a gene list view and a gene network view, which can be used as guide gene approach and narrow-down approach, respectively. In addition, we will discuss the coexpression effectiveness for various types of gene sets.

  4. Step patterns on vicinal reconstructed surfaces

    Science.gov (United States)

    Vilfan, Igor

    1996-04-01

    Step patterns on vicinal (2 × 1) reconstructed surfaces of the noble metals Au(110) and Pt(110), miscut towards the (100) orientation, are investigated. The free energy of the reconstructed surface with a network of crossing opposite steps is calculated in the strong chirality regime, in which the steps cannot make overhangs. It is explained why the steps are not perpendicular to the direction of the miscut but instead form, in equilibrium, a network of crossing steps that makes the surface look like fish skin. The network formation is the consequence of competition between the predominantly elastic energy loss and the entropy gain. This is in agreement with recent scanning tunnelling microscopy observations on vicinal Au(110) and Pt(110) surfaces.

  5. Inferring transcriptional gene regulation network of starch metabolism in Arabidopsis thaliana leaves using graphical Gaussian model

    Directory of Open Access Journals (Sweden)

    Ingkasuwan Papapit

    2012-08-01

    Full Text Available. Background: Starch serves as temporary storage of carbohydrates in plant leaves during day/night cycles. To study the transcriptional regulatory modules of this dynamic metabolic process, we conducted gene regulation network analysis based on small-sample inference of a graphical Gaussian model (GGM). Results: Time-series significance analysis was applied to Arabidopsis leaf transcriptome data to obtain a set of genes that are highly regulated under a diurnal cycle. A total of 1,480 diurnally regulated genes included 21 starch metabolic enzymes, 6 clock-associated genes, and 106 transcription factors (TFs). A starch-clock-TF gene regulation network comprising 117 nodes and 266 edges was constructed by GGM from these 133 significant genes that are potentially related to the diurnal control of starch metabolism. From this network, we found that β-amylase 3 (b-amy3: At4g17090), which participates in starch degradation in the chloroplast, is the most frequently connected gene (a hub gene). The robustness of the gene-to-gene regulatory network was further analyzed by TF binding site prediction and by evaluating global co-expression of TFs and target starch metabolic enzymes. As a result, two TFs, indeterminate domain 5 (AtIDD5: At2g02070) and constans-like (COL: At2g21320), were identified as positive regulators of starch synthase 4 (SS4: At4g18240). The inference model of AtIDD5-dependent positive regulation of SS4 gene expression was experimentally supported by decreased SS4 mRNA accumulation in Atidd5 mutant plants during the light period of both short and long day conditions. COL was also shown to positively control SS4 mRNA accumulation. Furthermore, the knockout of AtIDD5 and COL led to deformation of the chloroplast and the starch granules it contains. This deformity also affected the number of starch granules per chloroplast, which increased significantly in both knockout mutant lines. Conclusions: In this study, we utilized a systematic approach of microarray

  6. Indian-ink perfusion based method for reconstructing continuous vascular networks in whole mouse brain.

    Directory of Open Access Journals (Sweden)

    Songchao Xue

    Full Text Available. The topology of the cerebral vasculature, which is the energy transport corridor of the brain, can be used to study cerebral circulatory pathways. Limited by the restrictions of the vascular markers and imaging methods, studies on cerebral vascular structure now mainly focus on either observation of the macro vessels in a whole brain or imaging of the micro vessels in a small region. Simultaneous vascular studies of arteries, veins and capillaries have not been achieved in the whole brain of mammals. Here, we have combined the improved gelatin-Indian ink vessel perfusion process with Micro-Optical Sectioning Tomography for imaging the vessel network of an entire mouse brain. With 17 days of work, an integral dataset for the entire cerebral vessels was acquired. The voxel resolution is 0.35 × 0.4 × 2.0 µm³ for the whole brain. Besides the observations of fine and complex vascular networks in the reconstructed slices and entire brain views, a representative continuous vascular tracking has been demonstrated in the deep thalamus. This study provided an effective method for studying the entire macro and micro vascular networks of the mouse brain simultaneously.

  7. Using Morpholinos to Probe Gene Networks in Sea Urchin.

    Science.gov (United States)

    Materna, Stefan C

    2017-01-01

    The control processes that underlie the progression of development can be summarized in maps of gene regulatory networks (GRNs). A critical step in their assembly is the systematic perturbation of network candidates. In sea urchins the most important method for interfering with expression in a gene-specific way is application of morpholino antisense oligonucleotides (MOs). MOs act by binding to their sequence complement in transcripts resulting in a block in translation or a change in splicing and thus result in a loss of function. Despite the tremendous success of this technology, recent comparisons to mutants generated by genome editing have led to renewed criticism and challenged its reliability. As with all methods based on sequence recognition, MOs are prone to off-target binding that may result in phenotypes that are erroneously ascribed to the loss of the intended target. However, the slow progression of development in sea urchins has enabled extremely detailed studies of gene activity in the embryo. This wealth of knowledge paired with the simplicity of the sea urchin embryo enables careful analysis of MO phenotypes through a variety of methods that do not rely on terminal phenotypes. This article summarizes the use of MOs in probing GRNs and the steps that should be taken to assure their specificity.

  8. Reconstruction and analysis of the lncRNA-miRNA-mRNA network based on competitive endogenous RNA reveal functional lncRNAs in rheumatoid arthritis.

    Science.gov (United States)

    Jiang, Hui; Ma, Rong; Zou, Shubiao; Wang, Yongzhong; Li, Zhuqing; Li, Weiping

    2017-06-01

    Rheumatoid arthritis (RA) is an autoimmune disease with an unknown etiology, occurring in approximately 1.0% of general population. More and more studies have suggested that long non-coding RNAs (lncRNAs) could play important roles in various biological processes and be associated with the pathogenesis of different kinds of diseases including RA. Although a large number of lncRNAs have been found, our knowledge of their function and physiological/pathological significance is still in its infancy. In order to reveal functional lncRNAs and identify the key lncRNAs in RA, we reconstructed a global triple network based on the competitive endogenous RNA (ceRNA) theory using the data from National Center for Biotechnology Information Gene Expression Omnibus and our previous paper. Meanwhile, Gene Ontology (GO) and pathway analysis were performed using Cytoscape plug-in BinGO and Database for Annotation, Visualization, and Integration Discovery (DAVID), respectively. We found that the lncRNA-miRNA-mRNA network was composed of 7 lncRNA nodes, 90 mRNA nodes, 24 miRNA nodes, and 301 edges. The functional assay showed that 147 GO terms and 23 pathways were enriched. In addition, three lncRNAs (S5645.1, XR_006437.1, J01878) were highly related to RA, and therefore, were selected as key lncRNAs. This study suggests that specific lncRNAs are associated with the development of RA, and three lncRNAs (S5645.1, XR_006437.1, J01878) could be used as potential diagnostic biomarkers and therapeutic targets.

  9. Network-Based Method for Identifying Co-Regeneration Genes in Bone, Dentin, Nerve and Vessel Tissues.

    Science.gov (United States)

    Chen, Lei; Pan, Hongying; Zhang, Yu-Hang; Feng, Kaiyan; Kong, XiangYin; Huang, Tao; Cai, Yu-Dong

    2017-10-02

    Bone and dental diseases are serious public health problems. Most current clinical treatments for these diseases can produce side effects. Regeneration is a promising therapy for bone and dental diseases, yielding natural tissue recovery with few side effects. Because soft tissues inside the bone and dentin are densely populated with nerves and vessels, the study of bone and dentin regeneration should also consider the co-regeneration of nerves and vessels. In this study, a network-based method to identify co-regeneration genes for bone, dentin, nerve and vessel was constructed based on an extensive network of protein-protein interactions. Three procedures were applied in the network-based method. The first procedure, searching, sought the shortest paths connecting regeneration genes of one tissue type with regeneration genes of other tissues, thereby extracting possible co-regeneration genes. The second procedure, testing, employed a permutation test to evaluate whether possible genes were false discoveries; these genes were excluded by the testing procedure. The last procedure, screening, employed two rules, the betweenness ratio rule and interaction score rule, to select the most essential genes. A total of seventeen genes were inferred by the method, which were deemed to contribute to co-regeneration of at least two tissues. All these seventeen genes were extensively discussed to validate the utility of the method.
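
    The searching procedure described in this record can be illustrated with a breadth-first search over an unweighted interaction graph. The sketch below is a simplification under stated assumptions: it keeps only one shortest path per gene pair, ignores interaction scores, and uses hypothetical node names; the permutation test and screening rules are not reproduced.

```python
from collections import deque

def shortest_path(graph, source, target):
    """Breadth-first search; returns one shortest path as a list of nodes, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None

def candidate_genes(graph, genes_a, genes_b):
    """Collect interior nodes on shortest paths linking two sets of known genes."""
    known = set(genes_a) | set(genes_b)
    found = set()
    for a in genes_a:
        for b in genes_b:
            path = shortest_path(graph, a, b)
            if path:
                found.update(n for n in path[1:-1] if n not in known)
    return found

# Hypothetical protein-protein interaction graph as an adjacency list.
ppi = {
    "BONE1": ["X"],
    "X": ["BONE1", "NERVE1", "Y"],
    "NERVE1": ["X"],
    "Y": ["X", "Z"],
    "Z": ["Y"],
}
candidates = candidate_genes(ppi, ["BONE1"], ["NERVE1"])
```

    In this toy graph only X lies between the two known genes, so it is the single candidate passed on to later filtering steps.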

  10. Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks

    Directory of Open Access Journals (Sweden)

    Kohane Isaac S

    2005-09-01

    Full Text Available. Background: Biological processes are carried out by coordinated modules of interacting molecules. As clustering methods demonstrate that genes with similar expression display increased likelihood of being associated with a common functional module, networks of coexpressed genes provide one framework for assigning gene function. This has informed the guilt-by-association (GBA) heuristic, widely invoked in functional genomics. Yet although the idea of GBA is accepted, the breadth of GBA applicability is uncertain. Results: We developed methods to systematically explore the breadth of GBA across a large and varied corpus of expression data to answer the following question: To what extent is the GBA heuristic broadly applicable to the transcriptome and, conversely, how broadly is GBA captured by a priori knowledge represented in the Gene Ontology (GO)? Our study provides an investigation of the functional organization of five coexpression networks using data from three mammalian organisms. Our method calculates a probabilistic score between each gene and each Gene Ontology category that reflects coexpression enrichment of a GO module. For each GO category we use Receiver Operating Curves to assess whether these probabilistic scores reflect GBA. This methodology applied to five different coexpression networks demonstrates that the signature of guilt-by-association is ubiquitous and reproducible and that the GBA heuristic is broadly applicable across the population of nine hundred Gene Ontology categories. We also demonstrate the existence of highly reproducible patterns of coexpression between some pairs of GO categories. Conclusion: We conclude that GBA has universal value and that transcriptional control may be more modular than previously realized. Our analyses also suggest that methodologies combining coexpression measurements across multiple genes in a biologically-defined module can aid in characterizing gene function or in characterizing

  11. Network analysis of differential expression for the identification of disease-causing genes.

    Directory of Open Access Journals (Sweden)

    Daniela Nitsch

    Full Text Available. Genetic studies (in particular linkage and association studies) identify chromosomal regions involved in a disease or phenotype of interest, but those regions often contain many candidate genes, only a few of which can be followed up for biological validation. Recently, computational methods to identify (prioritize) the most promising candidates within a region have been proposed, but they are usually not applicable to cases where little is known about the phenotype (no or few confirmed disease genes, fragmentary understanding of the biological cascades involved). We seek to overcome this limitation by replacing knowledge about the biological process with experimental data on differential gene expression between affected and healthy individuals. Considering the problem from the perspective of a gene/protein network, we assess a candidate gene by considering the level of differential expression in its neighborhood, under the assumption that strong candidates will tend to be surrounded by differentially expressed neighbors. We define a notion of soft neighborhood in which each gene is given a contributing weight that decreases with the distance from the candidate gene on the protein network. To account for multiple paths between genes, we define the distance using the Laplacian exponential diffusion kernel. We score candidates by aggregating the differential expression of neighbors weighted as a function of distance. Through a randomization procedure, we rank candidates by p-values. We illustrate our approach on four monogenic diseases and successfully prioritize the known disease-causing genes.
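
    To make the soft-neighborhood idea concrete, the sketch below builds a Laplacian exponential diffusion kernel K = exp(-beta * L), with L = D - A the graph Laplacian, and scores each gene by the kernel-weighted sum of differential expression. The truncated Taylor series, the value of beta, the plain weighted sum, and the toy network are assumptions for illustration; the randomization step that yields p-values is omitted.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diffusion_kernel(adjacency, beta, terms=40):
    """Approximate K = exp(-beta * L), L = D - A, with a truncated Taylor series."""
    n = len(adjacency)
    # m = -beta * L, where L has node degrees on the diagonal and -A elsewhere.
    m = [[-beta * ((sum(adjacency[i]) if i == j else 0.0) - adjacency[i][j])
          for j in range(n)] for i in range(n)]
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]
    for k in range(1, terms):
        # After this step, term holds m^k / k!.
        term = [[v / k for v in row] for row in mat_mul(term, m)]
        for i in range(n):
            for j in range(n):
                kernel[i][j] += term[i][j]
    return kernel

def neighborhood_scores(kernel, diff_expr):
    """Score each gene by the kernel-weighted sum of differential expression."""
    return [sum(w * d for w, d in zip(row, diff_expr)) for row in kernel]

# Hypothetical path-shaped network 0 - 1 - 2 - 3; only gene 3 is differentially expressed.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
K = diffusion_kernel(adjacency, beta=0.5)
scores = neighborhood_scores(K, [0.0, 0.0, 0.0, 5.0])
```

    Genes closer to the differentially expressed node receive larger scores, and because the kernel sums over all walks, genes connected through several paths are weighted more heavily than a shortest-path distance alone would suggest.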

  12. Prediction of quantitative phenotypes based on genetic networks: a case study in yeast sporulation

    Directory of Open Access Journals (Sweden)

    Shen Li

    2010-09-01

    Full Text Available. Background: An exciting application of genetic networks is to predict the phenotypic consequences of environmental cues or genetic perturbations. However, de novo prediction of quantitative phenotypes based on network topology is a challenging task. Results: Using yeast sporulation as a model system, we assembled a genetic network from the literature and used a Boolean network to predict the change in sporulation efficiency upon deletion of individual genes. We observe that predictions based on the curated network correlate well with the experimentally measured values. In addition, computational analysis reveals the robustness and hysteresis of the yeast sporulation network and uncovers several patterns of sporulation efficiency change caused by double gene deletion. These discoveries may guide future investigation of the underlying mechanisms. We have also shown that a hybridized genetic network reconstructed from both temporal microarray data and the literature is able to achieve satisfactory prediction accuracy for the same quantitative phenotypes. Conclusions: This case study illustrates the value of predicting quantitative phenotypes based on genetic networks and provides a generic approach.
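
    A Boolean network with simulated gene deletion, as used in this record, can be sketched in a few lines. The node names and update rules below are hypothetical and do not represent the curated yeast sporulation network; the sketch only shows how a deletion is modeled by clamping a node to False and comparing the resulting steady state with the wild type.

```python
def simulate(rules, initial_state, deleted=(), max_steps=50):
    """Synchronously update a Boolean network until a fixed point (or max_steps)."""
    state = {g: (False if g in deleted else v) for g, v in initial_state.items()}
    for _ in range(max_steps):
        new_state = {g: (False if g in deleted else bool(rule(state)))
                     for g, rule in rules.items()}
        if new_state == state:
            break
        state = new_state
    return state

# Hypothetical four-node network: a signal turns on an activator and turns off a repressor.
rules = {
    "starvation": lambda s: s["starvation"],                     # external input, held constant
    "activator": lambda s: s["starvation"],
    "repressor": lambda s: not s["starvation"],
    "effector": lambda s: s["activator"] and not s["repressor"],
}
start = {"starvation": True, "activator": False, "repressor": True, "effector": False}

wild_type = simulate(rules, start)
activator_deleted = simulate(rules, start, deleted={"activator"})
repressor_deleted = simulate(rules, start, deleted={"repressor"})
```

    Comparing the effector state across runs gives a qualitative prediction of which deletions abolish the output; a quantitative phenotype such as sporulation efficiency would need an additional mapping that this sketch does not attempt.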

  13. Inferring regulatory networks from expression data using tree-based methods.

    Directory of Open Access Journals (Sweden)

    Vân Anh Huynh-Thu

    2010-09-01

    Full Text Available. One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs) using high-throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was best performer in the DREAM4 In Silico Multifactorial challenge. GENIE3 decomposes the prediction of a regulatory network between p genes into p different regression problems. In each of the regression problems, the expression pattern of one of the genes (target gene) is predicted from the expression patterns of all the other genes (input genes), using the tree-based ensemble methods Random Forests or Extra-Trees. The importance of an input gene in the prediction of the target gene expression pattern is taken as an indication of a putative regulatory link. Putative regulatory links are then aggregated over all genes to provide a ranking of interactions from which the whole network is reconstructed. In addition to performing well on the DREAM4 In Silico Multifactorial challenge simulated data, we show that GENIE3 compares favorably with existing algorithms to decipher the genetic regulatory network of Escherichia coli. It doesn't make any assumption about the nature of gene regulation, can deal with combinatorial and non-linear interactions, produces directed GRNs, and is fast and scalable. In conclusion, we propose a new algorithm for GRN inference that performs well on both synthetic and real gene expression data. The algorithm, based on feature selection with tree-based ensemble methods, is simple and generic, making it adaptable to other types of genomic data and interactions.
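
    The decomposition into one scoring problem per target gene, followed by aggregation into a single ranking, can be sketched as below. This is not GENIE3 itself: the tree-based importance is replaced by a squared Pearson correlation stand-in so that the example needs only the standard library, which also means the scores here are symmetric and lose the directedness that the tree-based method provides. All names and data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def correlation_importance(inputs, target_values):
    """Stand-in importance: squared correlation of each input gene with the target."""
    return {g: pearson(x, target_values) ** 2 for g, x in inputs.items()}

def rank_links(expression, importance):
    """Solve one scoring problem per target gene, then pool all links into one ranking."""
    links = []
    for target, y in expression.items():
        inputs = {g: x for g, x in expression.items() if g != target}
        for regulator, score in importance(inputs, y).items():
            links.append((score, regulator, target))
    return sorted(links, reverse=True)

# Hypothetical expression profiles across five samples.
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.1, 3.9, 6.2, 8.0, 9.9],   # closely tracks g1
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],   # weakly related to both
}
ranking = rank_links(expression, correlation_importance)
```

    Swapping correlation_importance for a function that fits a tree ensemble on the input genes and returns its feature importances would recover the structure described in the abstract, with p regressions for p genes.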

  14. Deep Convolutional Networks for Event Reconstruction and Particle Tagging on NOvA and DUNE

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    Deep Convolutional Neural Networks (CNNs) have been widely applied in computer vision to solve complex problems in image recognition and analysis. In recent years many efforts have emerged to extend the use of this technology to HEP applications, including the Convolutional Visual Network (CVN), our implementation for identification of neutrino events. In this presentation I will describe the core concepts of CNNs, the details of our particular implementation in the Caffe framework and our application to identify NOvA events. NOvA is a long baseline neutrino experiment whose main goal is the measurement of neutrino oscillations. This relies on the accurate identification and reconstruction of the neutrino flavor in the interactions we observe. In 2016 the NOvA experiment released results for the observation of oscillations in the νμ → νe channel, the first HEP result employing CNNs. I will also discuss our approach to event identification on NOvA as well as recent developments in the application of CNN...

  15. Distilling a Visual Network of Retinitis Pigmentosa Gene-Protein Interactions to Uncover New Disease Candidates.

    Directory of Open Access Journals (Sweden)

    Daniel Boloc

    Full Text Available. Retinitis pigmentosa (RP) is a highly heterogeneous genetic visual disorder with more than 70 known causative genes, some of them shared with other non-syndromic retinal dystrophies (e.g. Leber congenital amaurosis, LCA). The identification of RP genes has increased steadily during the last decade, and the 30% of the cases that still remain unassigned will soon decrease after the advent of exome/genome sequencing. A considerable amount of genetic and functional data on single RD genes and mutations has been gathered, but a comprehensive view of the RP genes and their interacting partners is still very fragmentary. This is the main gap that needs to be filled in order to understand how mutations relate to progressive blinding disorders and devise effective therapies. We have built an RP-specific network (RPGeNet) by merging data from different sources: high-throughput data from BioGRID and STRING databases, manually curated data for interactions retrieved from iHOP, as well as interactions filtered out by syntactical parsing from up-to-date abstracts and full-text papers related to the RP research field. The paths emerging when known RP genes were used as baits over the whole interactome have been analysed, and the minimal number of connections among the RP genes and their close neighbors were distilled in order to simplify the search space. In contrast to the analysis of single isolated genes, finding the networks linking disease genes renders powerful etiopathological insights. We here provide an interactive interface, RPGeNet, for the molecular biologist to explore the network centered on the non-syndromic and syndromic RP and LCA causative genes. By integrating tissue-specific expression levels and phenotypic data on top of that network, a more comprehensive biological view will highlight key molecular players of retinal degeneration and unveil new RP disease candidates.

  16. Genome-wide identification of key modulators of gene-gene interaction networks in breast cancer.

    Science.gov (United States)

    Chiu, Yu-Chiao; Wang, Li-Ju; Hsiao, Tzu-Hung; Chuang, Eric Y; Chen, Yidong

    2017-10-03

    With the advances in high-throughput gene profiling technologies, a large volume of gene interaction maps has been constructed. A higher-level layer of gene-gene interaction, namely modulated gene interaction, is composed of gene pairs whose interaction strengths are modulated by (i.e., dependent on) the expression level of a key modulator gene. Systematic investigations into the modulation by estrogen receptor (ER), the best-known modulator gene, have revealed its functional and prognostic significance in breast cancer. However, a genome-wide identification of key modulator genes that may further unveil the landscape of modulated gene interaction is still lacking. We proposed a systematic workflow to screen for key modulators based on genome-wide gene expression profiles. We designed four modularity parameters to measure the ability of a putative modulator to perturb gene interaction networks. Applying the method to a dataset of 286 breast tumors, we comprehensively characterized the modularity parameters and identified a total of 973 key modulator genes. The modularity of these modulators was verified in three independent breast cancer datasets. ESR1, the gene encoding ER, appeared in the list, and abundant novel modulators were revealed. For instance, SFRP1, a prognostic predictor of breast cancer, was found to be the second-ranked modulator. Functional annotation analysis of the 973 modulators revealed involvement in ER-related cellular processes as well as immune- and tumor-associated functions. Here we present, as far as we know, the first comprehensive analysis of key modulator genes on a genome-wide scale. The validity of the filtering parameters as well as the conservation of modulators among cohorts were corroborated. Our data bring new insights into the modulated layer of gene-gene interaction and provide candidates for further biological investigations.

  17. Image reconstruction by domain-transform manifold learning

    Science.gov (United States)

    Zhu, Bo; Liu, Jeremiah Z.; Cauley, Stephen F.; Rosen, Bruce R.; Rosen, Matthew S.

    2018-03-01

    Image reconstruction is essential for imaging applications across the physical and life sciences, including optical and radar systems, magnetic resonance imaging, X-ray computed tomography, positron emission tomography, ultrasound imaging and radio astronomy. During image acquisition, the sensor encodes an intermediate representation of an object in the sensor domain, which is subsequently reconstructed into an image by an inversion of the encoding function. Image reconstruction is challenging because analytic knowledge of the exact inverse transform may not exist a priori, especially in the presence of sensor non-idealities and noise. Thus, the standard reconstruction approach involves approximating the inverse function with multiple ad hoc stages in a signal processing chain, the composition of which depends on the details of each acquisition strategy, and often requires expert parameter tuning to optimize reconstruction performance. Here we present a unified framework for image reconstruction—automated transform by manifold approximation (AUTOMAP)—which recasts image reconstruction as a data-driven supervised learning task that allows a mapping between the sensor and the image domain to emerge from an appropriate corpus of training data. We implement AUTOMAP with a deep neural network and exhibit its flexibility in learning reconstruction transforms for various magnetic resonance imaging acquisition strategies, using the same network architecture and hyperparameters. We further demonstrate that manifold learning during training results in sparse representations of domain transforms along low-dimensional data manifolds, and observe superior immunity to noise and a reduction in reconstruction artefacts compared with conventional handcrafted reconstruction methods. In addition to improving the reconstruction performance of existing acquisition methodologies, we anticipate that AUTOMAP and other learned reconstruction approaches will accelerate the development

  18. A contribution to the study of plant development evolution based on gene co-expression networks

    Directory of Open Access Journals (Sweden)

    Francisco J. Romero-Campero

    2013-08-01

    Full Text Available Phototrophic eukaryotes are among the most successful organisms on Earth due to their unparalleled efficiency at capturing light energy and fixing carbon dioxide to produce organic molecules. A conserved and efficient network of light-dependent regulatory modules could be the basis of this success. This regulatory system conferred early advantages on phototrophic eukaryotes that allowed for specialization, complex developmental processes and modern plant characteristics. We have studied light-dependent gene regulatory modules from algae to plants, employing integrative omics approaches based on gene co-expression networks. Our study reveals some remarkably conserved ways in which eukaryotic phototrophs deal with day length and light signaling. Here we describe how a family of Arabidopsis transcription factors involved in photoperiod response has evolved from a single algal gene according to the innovation, amplification and divergence theory of gene evolution by duplication. These modifications of the gene co-expression networks from the ancient unicellular green alga Chlamydomonas reinhardtii to the modern brassica Arabidopsis thaliana may hint at the evolution and specialization of plants and other organisms.

  19. Towards the prediction of essential genes by integration of network topology, cellular localization and biological process information

    Directory of Open Access Journals (Sweden)

    Lemke Ney

    2009-09-01

    Full Text Available Abstract. Background: The identification of essential genes is important for understanding the minimal requirements for cellular life and for practical purposes, such as drug design. However, the experimental techniques for essential gene discovery are labor-intensive and time-consuming. Considering these experimental constraints, a computational approach capable of accurately predicting essential genes would be of great value. We therefore present here a machine learning-based computational approach relying on network topological features, cellular localization and biological process information for prediction of essential genes. Results: We constructed a decision tree-based meta-classifier and trained it on datasets with individual and grouped attributes (network topological features, cellular compartments and biological processes) to generate various predictors of essential genes. We showed that the best-performing predictors are those generated from datasets with integrated attributes. Using all attributes, i.e., network topological features, cellular compartments and biological processes, we obtained the best predictor of essential genes, which was then used to classify yeast genes with unknown essentiality status. Finally, we generated decision trees by training the J48 algorithm on datasets with all network topological features, cellular localization and biological process information to discover cellular rules for essentiality. We found that the number of protein physical interactions, the nuclear localization of proteins and the number of regulating transcription factors are the most important factors determining gene essentiality. Conclusion: We were able to demonstrate that network topological features, cellular localization and biological process information are reliable predictors of essential genes. Moreover, by constructing decision trees based on these data, we could discover cellular rules governing

  20. Reconstruction of in-plane strain maps using hybrid dense sensor network composed of sensing skin

    International Nuclear Information System (INIS)

    Downey, Austin; Laflamme, Simon; Ubertini, Filippo

    2016-01-01

    The authors have recently developed a soft-elastomeric capacitive (SEC)-based thin film sensor for monitoring strain on mesosurfaces. Arranged in a network configuration, the sensing system is analogous to a biological skin, where local strain can be monitored over a global area. Under plane stress conditions, the sensor output contains the additive measurement of the two principal strain components over the monitored surface. In applications where the evaluation of strain maps is useful, for instance in structural health monitoring, such a signal must be decomposed into linear strain components along orthogonal directions. Previous work led to an algorithm that enabled such a decomposition by leveraging a dense sensor network configuration with the addition of assumed boundary conditions. Here, we significantly improve the algorithm’s accuracy by leveraging mature off-the-shelf solutions to create a hybrid dense sensor network (HDSN) that improves on the boundary condition assumptions. The system’s boundary conditions are enforced using unidirectional resistive strain gauges (RSGs) and assumed virtual sensors. Results from an extensive experimental investigation demonstrate the good performance of the proposed algorithm and its robustness with respect to sensor layout. Overall, the proposed algorithm is seen to effectively leverage the advantages of a hybrid dense network for application of the thin film sensor to reconstruct surface strain fields over large surfaces.
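
    The decomposition problem described above can be sketched in a few lines, under strong simplifying assumptions that are not taken from the paper: each strain component is assumed to vary linearly over the surface, the additive sensors are assumed to measure exactly the sum of the two components, and the unidirectional gauges are assumed to measure one component only. The helper names `basis` and `decompose` and all numerical values are hypothetical. The sum alone cannot separate the two fields; the few unidirectional readings make the least-squares problem well posed.

```python
import numpy as np

def basis(pts):
    """Linear basis [1, x, y] evaluated at each sensor location."""
    pts = np.asarray(pts, dtype=float)
    return np.column_stack([np.ones(len(pts)), pts[:, 0], pts[:, 1]])

def decompose(sum_pts, sum_vals, uni_pts, uni_vals):
    """Least-squares fit of eps_x = a . [1, x, y] and eps_y = b . [1, x, y].
    sum_* : additive sensors measuring eps_x + eps_y
    uni_* : unidirectional gauges measuring eps_x only"""
    Bs, Bu = basis(sum_pts), basis(uni_pts)
    # Unknown vector is [a0, a1, a2, b0, b1, b2].
    A = np.vstack([
        np.hstack([Bs, Bs]),                 # eps_x + eps_y rows
        np.hstack([Bu, np.zeros_like(Bu)]),  # eps_x-only rows
    ])
    rhs = np.concatenate([sum_vals, uni_vals])
    coef, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return coef[:3], coef[3:]

# Synthetic, noise-free example with made-up coefficients (microstrain).
a_true = np.array([100.0, 20.0, -5.0])
b_true = np.array([-30.0, 4.0, 10.0])
sum_pts = np.array([(i, j) for i in range(4) for j in range(4)], dtype=float)
uni_pts = np.array([(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)])  # non-collinear
sum_vals = basis(sum_pts) @ (a_true + b_true)
uni_vals = basis(uni_pts) @ a_true

a_hat, b_hat = decompose(sum_pts, sum_vals, uni_pts, uni_vals)
print(a_hat, b_hat)
```

    With at least three non-collinear unidirectional gauges the stacked system has full column rank, so both fields are recovered; with the additive sensors alone only the sum a + b would be identifiable.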