DREAM3: network inference using dynamic context likelihood of relatedness and the inferelator.
Directory of Open Access Journals (Sweden)
Aviv Madar
2010-03-01
Full Text Available Many current works aiming to learn regulatory networks from systems biology data must balance model complexity with respect to data availability and quality. Methods that learn regulatory associations based on unit-less metrics, such as Mutual Information, are attractive in that they scale well and reduce the number of free parameters (model complexity per interaction to a minimum. In contrast, methods for learning regulatory networks based on explicit dynamical models are more complex and scale less gracefully, but are attractive as they may allow direct prediction of transcriptional dynamics and resolve the directionality of many regulatory interactions.We aim to investigate whether scalable information based methods (like the Context Likelihood of Relatedness method and more explicit dynamical models (like Inferelator 1.0 prove synergistic when combined. We test a pipeline where a novel modification of the Context Likelihood of Relatedness (mixed-CLR, modified to use time series data is first used to define likely regulatory interactions and then Inferelator 1.0 is used for final model selection and to build an explicit dynamical model.Our method ranked 2nd out of 22 in the DREAM3 100-gene in silico networks challenge. Mixed-CLR and Inferelator 1.0 are complementary, demonstrating a large performance gain relative to any single tested method, with precision being especially high at low recall values. Partitioning the provided data set into four groups (knock-down, knock-out, time-series, and combined revealed that using comprehensive knock-out data alone provides optimal performance. Inferelator 1.0 proved particularly powerful at resolving the directionality of regulatory interactions, i.e. "who regulates who" (approximately of identified true positives were correctly resolved. Performance drops for high in-degree genes, i.e. as the number of regulators per target gene increases, but not with out-degree, i.e. performance is not affected by
Inference in hybrid Bayesian networks
DEFF Research Database (Denmark)
Lanseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael
2009-01-01
Since the 1980s, Bayesian Networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability-techniques (like fault trees...... decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability....
Inferring network structure from cascades
Ghonge, Sushrut; Vural, Dervis Can
2017-07-01
Many physical, biological, and social phenomena can be described by cascades taking place on a network. Often, the activity can be empirically observed, but not the underlying network of interactions. In this paper we offer three topological methods to infer the structure of any directed network given a set of cascade arrival times. Our formulas hold for a very general class of models where the activation probability of a node is a generic function of its degree and the number of its active neighbors. We report high success rates for synthetic and real networks, for several different cascade models.
Schaffter, Thomas; Marbach, Daniel; Floreano, Dario
2011-08-15
Over the last decade, numerous methods have been developed for inference of regulatory networks from gene expression data. However, accurate and systematic evaluation of these methods is hampered by the difficulty of constructing adequate benchmarks and the lack of tools for a differentiated analysis of network predictions on such benchmarks. Here, we describe a novel and comprehensive method for in silico benchmark generation and performance profiling of network inference methods available to the community as an open-source software called GeneNetWeaver (GNW). In addition to the generation of detailed dynamical models of gene regulatory networks to be used as benchmarks, GNW provides a network motif analysis that reveals systematic prediction errors, thereby indicating potential ways of improving inference methods. The accuracy of network inference methods is evaluated using standard metrics such as precision-recall and receiver operating characteristic curves. We show how GNW can be used to assess the performance and identify the strengths and weaknesses of six inference methods. Furthermore, we used GNW to provide the international Dialogue for Reverse Engineering Assessments and Methods (DREAM) competition with three network inference challenges (DREAM3, DREAM4 and DREAM5). GNW is available at http://gnw.sourceforge.net along with its Java source code, user manual and supporting data. Supplementary data are available at Bioinformatics online. dario.floreano@epfl.ch.
Inference in hybrid Bayesian networks
International Nuclear Information System (INIS)
Langseth, Helge; Nielsen, Thomas D.; Rumi, Rafael; Salmeron, Antonio
2009-01-01
Since the 1980s, Bayesian networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability techniques (like fault trees and reliability block diagrams). However, limitations in the BNs' calculation engine have prevented BNs from becoming equally popular for domains containing mixtures of both discrete and continuous variables (the so-called hybrid domains). In this paper we focus on these difficulties, and summarize some of the last decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability.
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Darwiche, Adnan; Chavira, Mark
2006-01-01
We describe in this paper a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference...
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan
2004-01-01
We describe a system for exact inference with relational Bayesian networks as defined in the publicly available \\primula\\ tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating...
Inferring Phylogenetic Networks Using PhyloNet.
Wen, Dingqiao; Yu, Yun; Zhu, Jiafan; Nakhleh, Luay
2018-07-01
PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyzes and generates phylogenetic networks in the extended Newick format that is readily viewable by existing visualization software.
An Intuitive Dashboard for Bayesian Network Inference
International Nuclear Information System (INIS)
Reddy, Vikas; Farr, Anna Charisse; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K D V
2014-01-01
Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++
An Intuitive Dashboard for Bayesian Network Inference
Reddy, Vikas; Charisse Farr, Anna; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K. D. V.
2014-03-01
Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++.
A Bayesian Network Schema for Lessening Database Inference
National Research Council Canada - National Science Library
Chang, LiWu; Moskowitz, Ira S
2001-01-01
.... The authors introduce a formal schema for database inference analysis, based upon a Bayesian network structure, which identifies critical parameters involved in the inference problem and represents...
Nonparametric inference of network structure and dynamics
Peixoto, Tiago P.
The network structure of complex systems determine their function and serve as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and how to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion, that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among
Impact of noise on molecular network inference.
Directory of Open Access Journals (Sweden)
Radhakrishnan Nagarajan
Full Text Available Molecular entities work in concert as a system and mediate phenotypic outcomes and disease states. There has been recent interest in modelling the associations between molecular entities from their observed expression profiles as networks using a battery of algorithms. These networks have proven to be useful abstractions of the underlying pathways and signalling mechanisms. Noise is ubiquitous in molecular data and can have a pronounced effect on the inferred network. Noise can be an outcome of several factors including: inherent stochastic mechanisms at the molecular level, variation in the abundance of molecules, heterogeneity, sensitivity of the biological assay or measurement artefacts prevalent especially in high-throughput settings. The present study investigates the impact of discrepancies in noise variance on pair-wise dependencies, conditional dependencies and constraint-based Bayesian network structure learning algorithms that incorporate conditional independence tests as a part of the learning process. Popular network motifs and fundamental connections, namely: (a common-effect, (b three-chain, and (c coherent type-I feed-forward loop (FFL are investigated. The choice of these elementary networks can be attributed to their prevalence across more complex networks. Analytical expressions elucidating the impact of discrepancies in noise variance on pairwise dependencies and conditional dependencies for special cases of these motifs are presented. Subsequently, the impact of noise on two popular constraint-based Bayesian network structure learning algorithms such as Grow-Shrink (GS and Incremental Association Markov Blanket (IAMB that implicitly incorporate tests for conditional independence is investigated. Finally, the impact of noise on networks inferred from publicly available single cell molecular expression profiles is investigated. While discrepancies in noise variance are overlooked in routine molecular network inference, the
Inferring network topology from complex dynamics
International Nuclear Information System (INIS)
Shandilya, Srinivas Gorur; Timme, Marc
2011-01-01
Inferring the network topology from dynamical observations is a fundamental problem pervading research on complex systems. Here, we present a simple, direct method for inferring the structural connection topology of a network, given an observation of one collective dynamical trajectory. The general theoretical framework is applicable to arbitrary network dynamical systems described by ordinary differential equations. No interference (external driving) is required and the type of dynamics is hardly restricted in any way. In particular, the observed dynamics may be arbitrarily complex; stationary, invariant or transient; synchronous or asynchronous and chaotic or periodic. Presupposing a knowledge of the functional form of the dynamical units and of the coupling functions between them, we present an analytical solution to the inverse problem of finding the network topology from observing a time series of state variables only. Robust reconstruction is achieved in any sufficiently long generic observation of the system. We extend our method to simultaneously reconstructing both the entire network topology and all parameters appearing linear in the system's equations of motion. Reconstruction of network topology and system parameters is viable even in the presence of external noise that distorts the original dynamics substantially. The method provides a conceptually new step towards reconstructing a variety of real-world networks, including gene and protein interaction networks and neuronal circuits.
Inferring epidemic network topology from surveillance data.
Directory of Open Access Journals (Sweden)
Xiang Wan
Full Text Available The transmission of infectious diseases can be affected by many or even hidden factors, making it difficult to accurately predict when and where outbreaks may emerge. One approach at the moment is to develop and deploy surveillance systems in an effort to detect outbreaks as timely as possible. This enables policy makers to modify and implement strategies for the control of the transmission. The accumulated surveillance data including temporal, spatial, clinical, and demographic information, can provide valuable information with which to infer the underlying epidemic networks. Such networks can be quite informative and insightful as they characterize how infectious diseases transmit from one location to another. The aim of this work is to develop a computational model that allows inferences to be made regarding epidemic network topology in heterogeneous populations. We apply our model on the surveillance data from the 2009 H1N1 pandemic in Hong Kong. The inferred epidemic network displays significant effect on the propagation of infectious diseases.
Quantum Enhanced Inference in Markov Logic Networks.
Wittek, Peter; Gogolin, Christian
2017-04-19
Markov logic networks (MLNs) reconcile two opposing schools in machine learning and artificial intelligence: causal networks, which account for uncertainty extremely well, and first-order logic, which allows for formal deduction. An MLN is essentially a first-order logic template to generate Markov networks. Inference in MLNs is probabilistic and it is often performed by approximate methods such as Markov chain Monte Carlo (MCMC) Gibbs sampling. An MLN has many regular, symmetric structures that can be exploited at both first-order level and in the generated Markov network. We analyze the graph structures that are produced by various lifting methods and investigate the extent to which quantum protocols can be used to speed up Gibbs sampling with state preparation and measurement schemes. We review different such approaches, discuss their advantages, theoretical limitations, and their appeal to implementations. We find that a straightforward application of a recent result yields exponential speedup compared to classical heuristics in approximate probabilistic inference, thereby demonstrating another example where advanced quantum resources can potentially prove useful in machine learning.
Facility Activity Inference Using Radiation Networks
Energy Technology Data Exchange (ETDEWEB)
Rao, Nageswara S. [ORNL; Ramirez Aviles, Camila A. [ORNL
2017-11-01
We consider the problem of inferring the operational status of a reactor facility using measurements from a radiation sensor network deployed around the facility’s ventilation off-gas stack. The intensity of stack emissions decays with distance, and the sensor counts or measurements are inherently random with parameters determined by the intensity at the sensor’s location. We utilize the measurements to estimate the intensity at the stack, and use it in a one-sided Sequential Probability Ratio Test (SPRT) to infer on/off status of the reactor. We demonstrate the superior performance of this method over conventional majority fusers and individual sensors using (i) test measurements from a network of 21 NaI detectors, and (ii) effluence measurements collected at the stack of a reactor facility. We also analytically establish the superior detection performance of the network over individual sensors with fixed and adaptive thresholds by utilizing the Poisson distribution of the counts. We quantify the performance improvements of the network detection over individual sensors using the packing number of the intensity space.
Quantum Enhanced Inference in Markov Logic Networks
Wittek, Peter; Gogolin, Christian
2017-04-01
Markov logic networks (MLNs) reconcile two opposing schools in machine learning and artificial intelligence: causal networks, which account for uncertainty extremely well, and first-order logic, which allows for formal deduction. An MLN is essentially a first-order logic template to generate Markov networks. Inference in MLNs is probabilistic and it is often performed by approximate methods such as Markov chain Monte Carlo (MCMC) Gibbs sampling. An MLN has many regular, symmetric structures that can be exploited at both first-order level and in the generated Markov network. We analyze the graph structures that are produced by various lifting methods and investigate the extent to which quantum protocols can be used to speed up Gibbs sampling with state preparation and measurement schemes. We review different such approaches, discuss their advantages, theoretical limitations, and their appeal to implementations. We find that a straightforward application of a recent result yields exponential speedup compared to classical heuristics in approximate probabilistic inference, thereby demonstrating another example where advanced quantum resources can potentially prove useful in machine learning.
Inferring Phylogenetic Networks from Gene Order Data
Directory of Open Access Journals (Sweden)
Alexey Anatolievich Morozov
2013-01-01
Full Text Available Existing algorithms allow us to infer phylogenetic networks from sequences (DNA, protein or binary, sets of trees, and distance matrices, but there are no methods to build them using the gene order data as an input. Here we describe several methods to build split networks from the gene order data, perform simulation studies, and use our methods for analyzing and interpreting different real gene order datasets. All proposed methods are based on intermediate data, which can be generated from genome structures under study and used as an input for network construction algorithms. Three intermediates are used: set of jackknife trees, distance matrix, and binary encoding. According to simulations and case studies, the best intermediates are jackknife trees and distance matrix (when used with Neighbor-Net algorithm. Binary encoding can also be useful, but only when the methods mentioned above cannot be used.
Network inference via adaptive optimal design
Directory of Open Access Journals (Sweden)
Stigter Johannes D
2012-09-01
Full Text Available Abstract Background Current research in network reverse engineering for genetic or metabolic networks very often does not include a proper experimental and/or input design. In this paper we address this issue in more detail and suggest a method that includes an iterative design of experiments based, on the most recent data that become available. The presented approach allows a reliable reconstruction of the network and addresses an important issue, i.e., the analysis and the propagation of uncertainties as they exist in both the data and in our own knowledge. These two types of uncertainties have their immediate ramifications for the uncertainties in the parameter estimates and, hence, are taken into account from the very beginning of our experimental design. Findings The method is demonstrated for two small networks that include a genetic network for mRNA synthesis and degradation and an oscillatory network describing a molecular network underlying adenosine 3’-5’ cyclic monophosphate (cAMP as observed in populations of Dyctyostelium cells. In both cases a substantial reduction in parameter uncertainty was observed. Extension to larger scale networks is possible but needs a more rigorous parameter estimation algorithm that includes sparsity as a constraint in the optimization procedure. Conclusion We conclude that a careful experiment design very often (but not always pays off in terms of reliability in the inferred network topology. For large scale networks a better parameter estimation algorithm is required that includes sparsity as an additional constraint. These algorithms are available in the literature and can also be used in an adaptive optimal design setting as demonstrated in this paper.
Inferring gene networks from discrete expression data
Zhang, L.
2013-07-18
The modeling of gene networks from transcriptional expression data is an important tool in biomedical research to reveal signaling pathways and to identify treatment targets. Current gene network modeling is primarily based on the use of Gaussian graphical models applied to continuous data, which give a closedformmarginal likelihood. In this paper,we extend network modeling to discrete data, specifically data from serial analysis of gene expression, and RNA-sequencing experiments, both of which generate counts of mRNAtranscripts in cell samples.We propose a generalized linear model to fit the discrete gene expression data and assume that the log ratios of the mean expression levels follow a Gaussian distribution.We restrict the gene network structures to decomposable graphs and derive the graphs by selecting the covariance matrix of the Gaussian distribution with the hyper-inverse Wishart priors. Furthermore, we incorporate prior network models based on gene ontology information, which avails existing biological information on the genes of interest. We conduct simulation studies to examine the performance of our discrete graphical model and apply the method to two real datasets for gene network inference. © The Author 2013. Published by Oxford University Press. All rights reserved.
Inferring general relations between network characteristics from specific network ensembles.
Cardanobile, Stefano; Pernice, Volker; Deger, Moritz; Rotter, Stefan
2012-01-01
Different network models have been suggested for the topology underlying complex interactions in natural systems. These models are aimed at replicating specific statistical features encountered in real-world networks. However, it is rarely considered to which degree the results obtained for one particular network class can be extrapolated to real-world networks. We address this issue by comparing different classical and more recently developed network models with respect to their ability to generate networks with large structural variability. In particular, we consider the statistical constraints which the respective construction scheme imposes on the generated networks. After having identified the most variable networks, we address the issue of which constraints are common to all network classes and are thus suitable candidates for being generic statistical laws of complex networks. In fact, we find that generic, not model-related dependencies between different network characteristics do exist. This makes it possible to infer global features from local ones using regression models trained on networks with high generalization power. Our results confirm and extend previous findings regarding the synchronization properties of neural networks. Our method seems especially relevant for large networks, which are difficult to map completely, like the neural networks in the brain. The structure of such large networks cannot be fully sampled with the present technology. Our approach provides a method to estimate global properties of under-sampled networks in good approximation. Finally, we demonstrate on three different data sets (C. elegans neuronal network, R. prowazekii metabolic network, and a network of synonyms extracted from Roget's Thesaurus) that real-world networks have statistical relations compatible with those obtained using regression models.
Probabilistic logic networks a comprehensive framework for uncertain inference
Goertzel, Ben; Goertzel, Izabela Freire; Heljakka, Ari
2008-01-01
This comprehensive book describes Probabilistic Logic Networks (PLN), a novel conceptual, mathematical and computational approach to uncertain inference. A broad scope of reasoning types are considered.
Fused Regression for Multi-source Gene Regulatory Network Inference.
Directory of Open Access Journals (Sweden)
Kari Y Lam
2016-12-01
Full Text Available Understanding gene regulatory networks is critical to understanding cellular differentiation and response to external stimuli. Methods for global network inference have been developed and applied to a variety of species. Most approaches consider the problem of network inference independently in each species, despite evidence that gene regulation can be conserved even in distantly related species. Further, network inference is often confined to single data-types (single platforms and single cell types. We introduce a method for multi-source network inference that allows simultaneous estimation of gene regulatory networks in multiple species or biological processes through the introduction of priors based on known gene relationships such as orthology incorporated using fused regression. This approach improves network inference performance even when orthology mapping and conservation are incomplete. We refine this method by presenting an algorithm that extracts the true conserved subnetwork from a larger set of potentially conserved interactions and demonstrate the utility of our method in cross species network inference. Last, we demonstrate our method's utility in learning from data collected on different experimental platforms.
An algebra-based method for inferring gene regulatory networks.
Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard
2014-03-26
The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the
Inferring time-varying network topologies from gene expression data.
Rao, Arvind; Hero, Alfred O; States, David J; Engel, James Douglas
2007-01-01
Most current methods for gene regulatory network identification lead to the inference of steady-state networks, that is, networks prevalent over all times, a hypothesis which has been challenged. There has been a need to infer and represent networks in a dynamic, that is, time-varying fashion, in order to account for different cellular states affecting the interactions amongst genes. In this work, we present an approach, regime-SSM, to understand gene regulatory networks within such a dynamic setting. The approach uses a clustering method based on these underlying dynamics, followed by system identification using a state-space model for each learnt cluster--to infer a network adjacency matrix. We finally indicate our results on the mouse embryonic kidney dataset as well as the T-cell activation-based expression dataset and demonstrate conformity with reported experimental evidence.
Lim, Néhémy; Senbabaoglu, Yasin; Michailidis, George; d'Alché-Buc, Florence
2013-06-01
Reverse engineering of gene regulatory networks remains a central challenge in computational systems biology, despite recent advances facilitated by benchmark in silico challenges that have aided in calibrating their performance. A number of approaches using either perturbation (knock-out) or wild-type time-series data have appeared in the literature addressing this problem, with the latter using linear temporal models. Nonlinear dynamical models are particularly appropriate for this inference task, given the generation mechanism of the time-series data. In this study, we introduce a novel nonlinear autoregressive model based on operator-valued kernels that simultaneously learns the model parameters, as well as the network structure. A flexible boosting algorithm (OKVAR-Boost) that shares features from L2-boosting and randomization-based algorithms is developed to perform the tasks of parameter learning and network inference for the proposed model. Specifically, at each boosting iteration, a regularized Operator-valued Kernel-based Vector AutoRegressive model (OKVAR) is trained on a random subnetwork. The final model consists of an ensemble of such models. The empirical estimation of the ensemble model's Jacobian matrix provides an estimation of the network structure. The performance of the proposed algorithm is first evaluated on a number of benchmark datasets from the DREAM3 challenge and then on real datasets related to the In vivo Reverse-Engineering and Modeling Assessment (IRMA) and T-cell networks. The high-quality results obtained strongly indicate that it outperforms existing approaches. The OKVAR-Boost Matlab code is available as the archive: http://amis-group.fr/sourcecode-okvar-boost/OKVARBoost-v1.0.zip. Supplementary data are available at Bioinformatics online.
Congested Link Inference Algorithms in Dynamic Routing IP Network
Directory of Open Access Journals (Sweden)
Yu Chen
2017-01-01
Full Text Available The performance descending of current congested link inference algorithms is obviously in dynamic routing IP network, such as the most classical algorithm CLINK. To overcome this problem, based on the assumptions of Markov property and time homogeneity, we build a kind of Variable Structure Discrete Dynamic Bayesian (VSDDB network simplified model of dynamic routing IP network. Under the simplified VSDDB model, based on the Bayesian Maximum A Posteriori (BMAP and Rest Bayesian Network Model (RBNM, we proposed an Improved CLINK (ICLINK algorithm. Considering the concurrent phenomenon of multiple link congestion usually happens, we also proposed algorithm CLILRS (Congested Link Inference algorithm based on Lagrangian Relaxation Subgradient to infer the set of congested links. We validated our results by the experiments of analogy, simulation, and actual Internet.
Bayesian Inference and Online Learning in Poisson Neuronal Networks.
Huang, Yanping; Rao, Rajesh P N
2016-08-01
Motivated by the growing evidence for Bayesian computation in the brain, we show how a two-layer recurrent network of Poisson neurons can perform both approximate Bayesian inference and learning for any hidden Markov model. The lower-layer sensory neurons receive noisy measurements of hidden world states. The higher-layer neurons infer a posterior distribution over world states via Bayesian inference from inputs generated by sensory neurons. We demonstrate how such a neuronal network with synaptic plasticity can implement a form of Bayesian inference similar to Monte Carlo methods such as particle filtering. Each spike in a higher-layer neuron represents a sample of a particular hidden world state. The spiking activity across the neural population approximates the posterior distribution over hidden states. In this model, variability in spiking is regarded not as a nuisance but as an integral feature that provides the variability necessary for sampling during inference. We demonstrate how the network can learn the likelihood model, as well as the transition probabilities underlying the dynamics, using a Hebbian learning rule. We present results illustrating the ability of the network to perform inference and learning for arbitrary hidden Markov models.
Correlated measurement error hampers association network inference
Kaduk, M.; Hoefsloot, H.C.J.; Vis, D.J.; Reijmers, T.; Greef, J. van der; Smilde, A.K.; Hendriks, M.M.W.B.
2014-01-01
Modern chromatography-based metabolomics measurements generate large amounts of data in the form of abundances of metabolites. An increasingly popular way of representing and analyzing such data is by means of association networks. Ideally, such a network can be interpreted in terms of the
MPE inference in conditional linear gaussian networks
DEFF Research Database (Denmark)
Salmerón, Antonio; Rumí, Rafael; Langseth, Helge
2015-01-01
Given evidence on a set of variables in a Bayesian network, the most probable explanation (MPE) is the problem of finding a configuration of the remaining variables with maximum posterior probability. This problem has previously been addressed for discrete Bayesian networks and can be solved using...
Causal inference in biology networks with integrated belief propagation.
Chang, Rui; Karr, Jonathan R; Schadt, Eric E
2015-01-01
Inferring causal relationships among molecular and higher order phenotypes is a critical step in elucidating the complexity of living systems. Here we propose a novel method for inferring causality that is no longer constrained by the conditional dependency arguments that limit the ability of statistical causal inference methods to resolve causal relationships within sets of graphical models that are Markov equivalent. Our method utilizes Bayesian belief propagation to infer the responses of perturbation events on molecular traits given a hypothesized graph structure. A distance measure between the inferred response distribution and the observed data is defined to assess the 'fitness' of the hypothesized causal relationships. To test our algorithm, we infer causal relationships within equivalence classes of gene networks in which the form of the functional interactions that are possible are assumed to be nonlinear, given synthetic microarray and RNA sequencing data. We also apply our method to infer causality in real metabolic network with v-structure and feedback loop. We show that our method can recapitulate the causal structure and recover the feedback loop only from steady-state data which conventional method cannot.
Inferring gene networks from discrete expression data
Zhang, L.; Mallick, B. K.
2013-01-01
graphical models applied to continuous data, which give a closedformmarginal likelihood. In this paper,we extend network modeling to discrete data, specifically data from serial analysis of gene expression, and RNA-sequencing experiments, both of which
Information-Theoretic Inference of Large Transcriptional Regulatory Networks
Directory of Open Access Journals (Sweden)
Meyer Patrick
2007-01-01
Full Text Available The paper presents MRNET, an original method for inferring genetic networks from microarray data. The method is based on maximum relevance/minimum redundancy (MRMR, an effective information-theoretic technique for feature selection in supervised learning. The MRMR principle consists in selecting among the least redundant variables the ones that have the highest mutual information with the target. MRNET extends this feature selection principle to networks in order to infer gene-dependence relationships from microarray data. The paper assesses MRNET by benchmarking it against RELNET, CLR, and ARACNE, three state-of-the-art information-theoretic methods for large (up to several thousands of genes network inference. Experimental results on thirty synthetically generated microarray datasets show that MRNET is competitive with these methods.
Information-Theoretic Inference of Large Transcriptional Regulatory Networks
Directory of Open Access Journals (Sweden)
Patrick E. Meyer
2007-06-01
Full Text Available The paper presents MRNET, an original method for inferring genetic networks from microarray data. The method is based on maximum relevance/minimum redundancy (MRMR, an effective information-theoretic technique for feature selection in supervised learning. The MRMR principle consists in selecting among the least redundant variables the ones that have the highest mutual information with the target. MRNET extends this feature selection principle to networks in order to infer gene-dependence relationships from microarray data. The paper assesses MRNET by benchmarking it against RELNET, CLR, and ARACNE, three state-of-the-art information-theoretic methods for large (up to several thousands of genes network inference. Experimental results on thirty synthetically generated microarray datasets show that MRNET is competitive with these methods.
Functional networks inference from rule-based machine learning models.
Lazzarini, Nicola; Widera, Paweł; Williamson, Stuart; Heer, Rakesh; Krasnogor, Natalio; Bacardit, Jaume
2016-01-01
Functional networks play an important role in the analysis of biological processes and systems. The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, the similarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumes a functional relationship between genes which are expressed at similar levels across different samples. An alternative to this paradigm is the inference of relationships from the structure of machine learning models. These models are able to capture complex relationships between variables, that often are different/complementary to the similarity-based methods. We propose a protocol to infer functional networks from machine learning models, called FuNeL. It assumes, that genes used together within a rule-based machine learning model to classify the samples, might also be functionally related at a biological level. The protocol is first tested on synthetic datasets and then evaluated on a test suite of 8 real-world datasets related to human cancer. The networks inferred from the real-world data are compared against gene co-expression networks of equal size, generated with 3 different methods. The comparison is performed from two different points of view. We analyse the enriched biological terms in the set of network nodes and the relationships between known disease-associated genes in a context of the network topology. The comparison confirms both the biological relevance and the complementary character of the knowledge captured by the FuNeL networks in relation to similarity-based methods and demonstrates its potential to identify known disease associations as core elements of the network. Finally, using a prostate cancer dataset as a case study, we confirm that the biological knowledge captured by our method is relevant to the disease and consistent with the specialised literature and with an independent dataset not used in the inference process. The
Automatic physical inference with information maximizing neural networks
Charnock, Tom; Lavaux, Guilhem; Wandelt, Benjamin D.
2018-04-01
Compressing large data sets to a manageable number of summaries that are informative about the underlying parameters vastly simplifies both frequentist and Bayesian inference. When only simulations are available, these summaries are typically chosen heuristically, so they may inadvertently miss important information. We introduce a simulation-based machine learning technique that trains artificial neural networks to find nonlinear functionals of data that maximize Fisher information: information maximizing neural networks (IMNNs). In test cases where the posterior can be derived exactly, likelihood-free inference based on automatically derived IMNN summaries produces nearly exact posteriors, showing that these summaries are good approximations to sufficient statistics. In a series of numerical examples of increasing complexity and astrophysical relevance we show that IMNNs are robustly capable of automatically finding optimal, nonlinear summaries of the data even in cases where linear compression fails: inferring the variance of Gaussian signal in the presence of noise, inferring cosmological parameters from mock simulations of the Lyman-α forest in quasar spectra, and inferring frequency-domain parameters from LISA-like detections of gravitational waveforms. In this final case, the IMNN summary outperforms linear data compression by avoiding the introduction of spurious likelihood maxima. We anticipate that the automatic physical inference method described in this paper will be essential to obtain both accurate and precise cosmological parameter estimates from complex and large astronomical data sets, including those from LSST and Euclid.
Causal Inference and Explaining Away in a Spiking Network
Moreno-Bote, Rubén; Drugowitsch, Jan
2015-01-01
While the brain uses spiking neurons for communication, theoretical research on brain computations has mostly focused on non-spiking networks. The nature of spike-based algorithms that achieve complex computations, such as object probabilistic inference, is largely unknown. Here we demonstrate that a family of high-dimensional quadratic optimization problems with non-negativity constraints can be solved exactly and efficiently by a network of spiking neurons. The network naturally imposes the non-negativity of causal contributions that is fundamental to causal inference, and uses simple operations, such as linear synapses with realistic time constants, and neural spike generation and reset non-linearities. The network infers the set of most likely causes from an observation using explaining away, which is dynamically implemented by spike-based, tuned inhibition. The algorithm performs remarkably well even when the network intrinsically generates variable spike trains, the timing of spikes is scrambled by external sources of noise, or the network is mistuned. This type of network might underlie tasks such as odor identification and classification. PMID:26621426
IPV6 Network Infrastructure and Stability Inference
2014-09-01
the Too Big Trick ( TBT ) to induce the remote targets to return fragmented responses. By evaluating the responses, the uptime for approximately 35% of...Internet. Approximately 50,000 IPv6 addresses were probed continuously from March to June 2014, using the Too Big Trick ( TBT ) to induce the remote targets...SNMP Simple Network Management Protocol TSval Timestamp Value TCP Transmission Control Protocol TBT Too Big Trick USG U.S. government xiv
Inferring the conservative causal core of gene regulatory networks
Directory of Open Access Journals (Sweden)
Emmert-Streib Frank
2010-09-01
Full Text Available Abstract Background Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. Results In this paper, we introduce a novel gene regulatory network inference (GRNI algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. Conclusions For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.
Inferring the conservative causal core of gene regulatory networks.
Altay, Gökmen; Emmert-Streib, Frank
2010-09-28
Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.
Inferring the gene network underlying the branching of tomato inflorescence.
Directory of Open Access Journals (Sweden)
Laura Astola
Full Text Available The architecture of tomato inflorescence strongly affects flower production and subsequent crop yield. To understand the genetic activities involved, insight into the underlying network of genes that initiate and control the sympodial growth in the tomato is essential. In this paper, we show how the structure of this network can be derived from available data of the expressions of the involved genes. Our approach starts from employing biological expert knowledge to select the most probable gene candidates behind branching behavior. To find how these genes interact, we develop a stepwise procedure for computational inference of the network structure. Our data consists of expression levels from primary shoot meristems, measured at different developmental stages on three different genotypes of tomato. With the network inferred by our algorithm, we can explain the dynamics corresponding to all three genotypes simultaneously, despite their apparent dissimilarities. We also correctly predict the chronological order of expression peaks for the main hubs in the network. Based on the inferred network, using optimal experimental design criteria, we are able to suggest an informative set of experiments for further investigation of the mechanisms underlying branching behavior.
Inferring topologies of complex networks with hidden variables.
Wu, Xiaoqun; Wang, Weihan; Zheng, Wei Xing
2012-10-01
Network topology plays a crucial role in determining a network's intrinsic dynamics and function, thus understanding and modeling the topology of a complex network will lead to greater knowledge of its evolutionary mechanisms and to a better understanding of its behaviors. In the past few years, topology identification of complex networks has received increasing interest and wide attention. Many approaches have been developed for this purpose, including synchronization-based identification, information-theoretic methods, and intelligent optimization algorithms. However, inferring interaction patterns from observed dynamical time series is still challenging, especially in the absence of knowledge of nodal dynamics and in the presence of system noise. The purpose of this work is to present a simple and efficient approach to inferring the topologies of such complex networks. The proposed approach is called "piecewise partial Granger causality." It measures the cause-effect connections of nonlinear time series influenced by hidden variables. One commonly used testing network, two regular networks with a few additional links, and small-world networks are used to evaluate the performance and illustrate the influence of network parameters on the proposed approach. Application to experimental data further demonstrates the validity and robustness of our method.
Effective network inference through multivariate information transfer estimation
Dahlqvist, Carl-Henrik; Gnabo, Jean-Yves
2018-06-01
Network representation has steadily gained in popularity over the past decades. In many disciplines such as finance, genetics, neuroscience or human travel to cite a few, the network may not directly be observable and needs to be inferred from time-series data, leading to the issue of separating direct interactions between two entities forming the network from indirect interactions coming through its remaining part. Drawing on recent contributions proposing strategies to deal with this problem such as the so-called "global silencing" approach of Barzel and Barabasi or "network deconvolution" of Feizi et al. (2013), we propose a novel methodology to infer an effective network structure from multivariate conditional information transfers. Its core principal is to test the information transfer between two nodes through a step-wise approach by conditioning the transfer for each pair on a specific set of relevant nodes as identified by our algorithm from the rest of the network. The methodology is model free and can be applied to high-dimensional networks with both inter-lag and intra-lag relationships. It outperforms state-of-the-art approaches for eliminating the redundancies and more generally retrieving simulated artificial networks in our Monte-Carlo experiments. We apply the method to stock market data at different frequencies (15 min, 1 h, 1 day) to retrieve the network of US largest financial institutions and then document how bank's centrality measurements relate to bank's systemic vulnerability.
Inference of financial networks using the normalised mutual information rate
2018-01-01
In this paper, we study data from financial markets, using the normalised Mutual Information Rate. We show how to use it to infer the underlying network structure of interrelations in the foreign currency exchange rates and stock indices of 15 currency areas. We first present the mathematical method and discuss its computational aspects, and apply it to artificial data from chaotic dynamics and to correlated normal-variates data. We then apply the method to infer the structure of the financial system from the time-series of currency exchange rates and stock indices. In particular, we study and reveal the interrelations among the various foreign currency exchange rates and stock indices in two separate networks, of which we also study their structural properties. Our results show that both inferred networks are small-world networks, sharing similar properties and having differences in terms of assortativity. Importantly, our work shows that global economies tend to connect with other economies world-wide, rather than creating small groups of local economies. Finally, the consistent interrelations depicted among the 15 currency areas are further supported by a discussion from the viewpoint of economics. PMID:29420644
Inference of financial networks using the normalised mutual information rate.
Goh, Yong Kheng; Hasim, Haslifah M; Antonopoulos, Chris G
2018-01-01
In this paper, we study data from financial markets, using the normalised Mutual Information Rate. We show how to use it to infer the underlying network structure of interrelations in the foreign currency exchange rates and stock indices of 15 currency areas. We first present the mathematical method and discuss its computational aspects, and apply it to artificial data from chaotic dynamics and to correlated normal-variates data. We then apply the method to infer the structure of the financial system from the time-series of currency exchange rates and stock indices. In particular, we study and reveal the interrelations among the various foreign currency exchange rates and stock indices in two separate networks, of which we also study their structural properties. Our results show that both inferred networks are small-world networks, sharing similar properties and having differences in terms of assortativity. Importantly, our work shows that global economies tend to connect with other economies world-wide, rather than creating small groups of local economies. Finally, the consistent interrelations depicted among the 15 currency areas are further supported by a discussion from the viewpoint of economics.
Statistical inference to advance network models in epidemiology.
Welch, David; Bansal, Shweta; Hunter, David R
2011-03-01
Contact networks are playing an increasingly important role in the study of epidemiology. Most of the existing work in this area has focused on considering the effect of underlying network structure on epidemic dynamics by using tools from probability theory and computer simulation. This work has provided much insight on the role that heterogeneity in host contact patterns plays on infectious disease dynamics. Despite the important understanding afforded by the probability and simulation paradigm, this approach does not directly address important questions about the structure of contact networks such as what is the best network model for a particular mode of disease transmission, how parameter values of a given model should be estimated, or how precisely the data allow us to estimate these parameter values. We argue that these questions are best answered within a statistical framework and discuss the role of statistical inference in estimating contact networks from epidemiological data. Copyright © 2011 Elsevier B.V. All rights reserved.
Inferring personal economic status from social network location
Luo, Shaojun; Morone, Flaviano; Sarraute, Carlos; Travizano, Matías; Makse, Hernán A.
2017-05-01
It is commonly believed that patterns of social ties affect individuals' economic status. Here we translate this concept into an operational definition at the network level, which allows us to infer the economic well-being of individuals through a measure of their location and influence in the social network. We analyse two large-scale sources: telecommunications and financial data of a whole country's population. Our results show that an individual's location, measured as the optimal collective influence to the structural integrity of the social network, is highly correlated with personal economic status. The observed social network patterns of influence mimic the patterns of economic inequality. For pragmatic use and validation, we carry out a marketing campaign that shows a threefold increase in response rate by targeting individuals identified by our social network metrics as compared to random targeting. Our strategy can also be useful in maximizing the effects of large-scale economic stimulus policies.
Sensorimotor Network Crucial for Inferring Amusement from Smiles.
Paracampo, Riccardo; Tidoni, Emmanuele; Borgomaneri, Sara; di Pellegrino, Giuseppe; Avenanti, Alessio
2017-11-01
Understanding whether another's smile reflects authentic amusement is a key challenge in social life, yet, the neural bases of this ability have been largely unexplored. Here, we combined transcranial magnetic stimulation (TMS) with a novel empathic accuracy (EA) task to test whether sensorimotor and mentalizing networks are critical for understanding another's amusement. Participants were presented with dynamic displays of smiles and explicitly requested to infer whether the smiling individual was feeling authentic amusement or not. TMS over sensorimotor regions representing the face (i.e., in the inferior frontal gyrus (IFG) and ventral primary somatosensory cortex (SI)), disrupted the ability to infer amusement authenticity from observed smiles. The same stimulation did not affect performance on a nonsocial task requiring participants to track the smiling expression but not to infer amusement. Neither TMS over prefrontal and temporo-parietal areas supporting mentalizing, nor peripheral control stimulations, affected performance on either task. Thus, motor and somatosensory circuits for controlling and sensing facial movements are causally essential for inferring amusement from another's smile. These findings highlight the functional relevance of IFG and SI to amusement understanding and suggest that EA abilities may be grounded in sensorimotor networks for moving and feeling the body. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Sign Inference for Dynamic Signed Networks via Dictionary Learning
Directory of Open Access Journals (Sweden)
Yi Cen
2013-01-01
Full Text Available Mobile online social network (mOSN is a burgeoning research area. However, most existing works referring to mOSNs deal with static network structures and simply encode whether relationships among entities exist or not. In contrast, relationships in signed mOSNs can be positive or negative and may be changed with time and locations. Applying certain global characteristics of social balance, in this paper, we aim to infer the unknown relationships in dynamic signed mOSNs and formulate this sign inference problem as a low-rank matrix estimation problem. Specifically, motivated by the Singular Value Thresholding (SVT algorithm, a compact dictionary is selected from the observed dataset. Based on this compact dictionary, the relationships in the dynamic signed mOSNs are estimated via solving the formulated problem. Furthermore, the estimation accuracy is improved by employing a dictionary self-updating mechanism.
An Algebraic Approach to Inference in Complex Networked Structures
2015-07-09
44], [45],[46] where the shift is the elementary non-trivial filter that generates, under an appropriate notion of shift invariance, all linear ... elementary filter, and its output is a graph signal with the value at vertex n of the graph given approximately by a weighted linear combination of...AFRL-AFOSR-VA-TR-2015-0265 An Algebraic Approach to Inference in Complex Networked Structures Jose Moura CARNEGIE MELLON UNIVERSITY Final Report 07
Inference of Transmission Network Structure from HIV Phylogenetic Trees.
Giardina, Federica; Romero-Severson, Ethan Obie; Albert, Jan; Britton, Tom; Leitner, Thomas
2017-01-01
Phylogenetic inference is an attractive means to reconstruct transmission histories and epidemics. However, there is not a perfect correspondence between transmission history and virus phylogeny. Both node height and topological differences may occur, depending on the interaction between within-host evolutionary dynamics and between-host transmission patterns. To investigate these interactions, we added a within-host evolutionary model in epidemiological simulations and examined if the resulting phylogeny could recover different types of contact networks. To further improve realism, we also introduced patient-specific differences in infectivity across disease stages, and on the epidemic level we considered incomplete sampling and the age of the epidemic. Second, we implemented an inference method based on approximate Bayesian computation (ABC) to discriminate among three well-studied network models and jointly estimate both network parameters and key epidemiological quantities such as the infection rate. Our ABC framework used both topological and distance-based tree statistics for comparison between simulated and observed trees. Overall, our simulations showed that a virus time-scaled phylogeny (genealogy) may be substantially different from the between-host transmission tree. This has important implications for the interpretation of what a phylogeny reveals about the underlying epidemic contact network. In particular, we found that while the within-host evolutionary process obscures the transmission tree, the diversification process and infectivity dynamics also add discriminatory power to differentiate between different types of contact networks. We also found that the possibility to differentiate contact networks depends on how far an epidemic has progressed, where distance-based tree statistics have more power early in an epidemic. Finally, we applied our ABC inference on two different outbreaks from the Swedish HIV-1 epidemic.
Probabilistic Inference of Biological Networks via Data Integration
Directory of Open Access Journals (Sweden)
Mark F. Rogers
2015-01-01
Full Text Available There is significant interest in inferring the structure of subcellular networks of interaction. Here we consider supervised interactive network inference in which a reference set of known network links and nonlinks is used to train a classifier for predicting new links. Many types of data are relevant to inferring functional links between genes, motivating the use of data integration. We use pairwise kernels to predict novel links, along with multiple kernel learning to integrate distinct sources of data into a decision function. We evaluate various pairwise kernels to establish which are most informative and compare individual kernel accuracies with accuracies for weighted combinations. By associating a probability measure with classifier predictions, we enable cautious classification, which can increase accuracy by restricting predictions to high-confidence instances, and data cleaning that can mitigate the influence of mislabeled training instances. Although one pairwise kernel (the tensor product pairwise kernel appears to work best, different kernels may contribute complimentary information about interactions: experiments in S. cerevisiae (yeast reveal that a weighted combination of pairwise kernels applied to different types of data yields the highest predictive accuracy. Combined with cautious classification and data cleaning, we can achieve predictive accuracies of up to 99.6%.
fastBMA: scalable network inference and transitive reduction.
Hung, Ling-Hong; Shi, Kaiyuan; Wu, Migao; Young, William Chad; Raftery, Adrian E; Yeung, Ka Yee
2017-10-01
Inferring genetic networks from genome-wide expression data is extremely demanding computationally. We have developed fastBMA, a distributed, parallel, and scalable implementation of Bayesian model averaging (BMA) for this purpose. fastBMA also includes a computationally efficient module for eliminating redundant indirect edges in the network by mapping the transitive reduction to an easily solved shortest-path problem. We evaluated the performance of fastBMA on synthetic data and experimental genome-wide time series yeast and human datasets. When using a single CPU core, fastBMA is up to 100 times faster than the next fastest method, LASSO, with increased accuracy. It is a memory-efficient, parallel, and distributed application that scales to human genome-wide expression data. A 10 000-gene regulation network can be obtained in a matter of hours using a 32-core cloud cluster (2 nodes of 16 cores). fastBMA is a significant improvement over its predecessor ScanBMA. It is more accurate and orders of magnitude faster than other fast network inference methods such as the 1 based on LASSO. The improved scalability allows it to calculate networks from genome scale data in a reasonable time frame. The transitive reduction method can improve accuracy in denser networks. fastBMA is available as code (M.I.T. license) from GitHub (https://github.com/lhhunghimself/fastBMA), as part of the updated networkBMA Bioconductor package (https://www.bioconductor.org/packages/release/bioc/html/networkBMA.html) and as ready-to-deploy Docker images (https://hub.docker.com/r/biodepot/fastbma/). © The Authors 2017. Published by Oxford University Press.
Inferring the role of transcription factors in regulatory networks
Directory of Open Access Journals (Sweden)
Le Borgne Michel
2008-05-01
Full Text Available Abstract Background Expression profiles obtained from multiple perturbation experiments are increasingly used to reconstruct transcriptional regulatory networks, from well studied, simple organisms up to higher eukaryotes. Admittedly, a key ingredient in developing a reconstruction method is its ability to integrate heterogeneous sources of information, as well as to comply with practical observability issues: measurements can be scarce or noisy. In this work, we show how to combine a network of genetic regulations with a set of expression profiles, in order to infer the functional effect of the regulations, as inducer or repressor. Our approach is based on a consistency rule between a network and the signs of variation given by expression arrays. Results We evaluate our approach in several settings of increasing complexity. First, we generate artificial expression data on a transcriptional network of E. coli extracted from the literature (1529 nodes and 3802 edges, and we estimate that 30% of the regulations can be annotated with about 30 profiles. We additionally prove that at most 40.8% of the network can be inferred using our approach. Second, we use this network in order to validate the predictions obtained with a compendium of real expression profiles. We describe a filtering algorithm that generates particularly reliable predictions. Finally, we apply our inference approach to S. cerevisiae transcriptional network (2419 nodes and 4344 interactions, by combining ChIP-chip data and 15 expression profiles. We are able to detect and isolate inconsistencies between the expression profiles and a significant portion of the model (15% of all the interactions. In addition, we report predictions for 14.5% of all interactions. Conclusion Our approach does not require accurate expression levels nor times series. Nevertheless, we show on both data, real and artificial, that a relatively small number of perturbation experiments are enough to determine
Mathematical inference and control of molecular networks from perturbation experiments
Mohammed-Rasheed, Mohammed
One of the main challenges facing biologists and mathematicians in the post genomic era is to understand the behavior of molecular networks and harness this understanding into an educated intervention of the cell. The cell maintains its function via an elaborate network of interconnecting positive and negative feedback loops of genes, RNA and proteins that send different signals to a large number of pathways and molecules. These structures are referred to as genetic regulatory networks (GRNs) or molecular networks. GRNs can be viewed as dynamical systems with inherent properties and mechanisms, such as steady-state equilibriums and stability, that determine the behavior of the cell. The biological relevance of the mathematical concepts are important as they may predict the differentiation of a stem cell, the maintenance of a normal cell, the development of cancer and its aberrant behavior, and the design of drugs and response to therapy. Uncovering the underlying GRN structure from gene/protein expression data, e.g., microarrays or perturbation experiments, is called inference or reverse engineering of the molecular network. Because of the high cost and time consuming nature of biological experiments, the number of available measurements or experiments is very small compared to the number of molecules (genes, RNA and proteins). In addition, the observations are noisy, where the noise is due to the measurements imperfections as well as the inherent stochasticity of genetic expression levels. Intra-cellular activities and extra-cellular environmental attributes are also another source of variability. Thus, the inference of GRNs is, in general, an under-determined problem with a highly noisy set of observations. The ultimate goal of GRN inference and analysis is to be able to intervene within the network, in order to force it away from undesirable cellular states and into desirable ones. However, it remains a major challenge to design optimal intervention strategies
Qualitative reasoning for biological network inference from systematic perturbation experiments.
Badaloni, Silvana; Di Camillo, Barbara; Sambo, Francesco
2012-01-01
The systematic perturbation of the components of a biological system has been proven among the most informative experimental setups for the identification of causal relations between the components. In this paper, we present Systematic Perturbation-Qualitative Reasoning (SPQR), a novel Qualitative Reasoning approach to automate the interpretation of the results of systematic perturbation experiments. Our method is based on a qualitative abstraction of the experimental data: for each perturbation experiment, measured values of the observed variables are modeled as lower, equal or higher than the measurements in the wild type condition, when no perturbation is applied. The algorithm exploits a set of IF-THEN rules to infer causal relations between the variables, analyzing the patterns of propagation of the perturbation signals through the biological network, and is specifically designed to minimize the rate of false positives among the inferred relations. Tested on both simulated and real perturbation data, SPQR indeed exhibits a significantly higher precision than the state of the art.
Network inference from functional experimental data (Conference Presentation)
Desrosiers, Patrick; Labrecque, Simon; Tremblay, Maxime; Bélanger, Mathieu; De Dorlodot, Bertrand; Côté, Daniel C.
2016-03-01
Functional connectivity maps of neuronal networks are critical tools to understand how neurons form circuits, how information is encoded and processed by neurons, how memory is shaped, and how these basic processes are altered under pathological conditions. Current light microscopy allows to observe calcium or electrical activity of thousands of neurons simultaneously, yet assessing comprehensive connectivity maps directly from such data remains a non-trivial analytical task. There exist simple statistical methods, such as cross-correlation and Granger causality, but they only detect linear interactions between neurons. Other more involved inference methods inspired by information theory, such as mutual information and transfer entropy, identify more accurately connections between neurons but also require more computational resources. We carried out a comparative study of common connectivity inference methods. The relative accuracy and computational cost of each method was determined via simulated fluorescence traces generated with realistic computational models of interacting neurons in networks of different topologies (clustered or non-clustered) and sizes (10-1000 neurons). To bridge the computational and experimental works, we observed the intracellular calcium activity of live hippocampal neuronal cultures infected with the fluorescent calcium marker GCaMP6f. The spontaneous activity of the networks, consisting of 50-100 neurons per field of view, was recorded from 20 to 50 Hz on a microscope controlled by a homemade software. We implemented all connectivity inference methods in the software, which rapidly loads calcium fluorescence movies, segments the images, extracts the fluorescence traces, and assesses the functional connections (with strengths and directions) between each pair of neurons. We used this software to assess, in real time, the functional connectivity from real calcium imaging data in basal conditions, under plasticity protocols, and epileptic
Assessment of network inference methods: how to cope with an underdetermined problem.
Directory of Open Access Journals (Sweden)
Caroline Siegenthaler
Full Text Available The inference of biological networks is an active research area in the field of systems biology. The number of network inference algorithms has grown tremendously in the last decade, underlining the importance of a fair assessment and comparison among these methods. Current assessments of the performance of an inference method typically involve the application of the algorithm to benchmark datasets and the comparison of the network predictions against the gold standard or reference networks. While the network inference problem is often deemed underdetermined, implying that the inference problem does not have a (unique solution, the consequences of such an attribute have not been rigorously taken into consideration. Here, we propose a new procedure for assessing the performance of gene regulatory network (GRN inference methods. The procedure takes into account the underdetermined nature of the inference problem, in which gene regulatory interactions that are inferable or non-inferable are determined based on causal inference. The assessment relies on a new definition of the confusion matrix, which excludes errors associated with non-inferable gene regulations. For demonstration purposes, the proposed assessment procedure is applied to the DREAM 4 In Silico Network Challenge. The results show a marked change in the ranking of participating methods when taking network inferability into account.
The Network Completion Problem: Inferring Missing Nodes and Edges in Networks
Energy Technology Data Exchange (ETDEWEB)
Kim, M; Leskovec, J
2011-11-14
Network structures, such as social networks, web graphs and networks from systems biology, play important roles in many areas of science and our everyday lives. In order to study the networks one needs to first collect reliable large scale network data. While the social and information networks have become ubiquitous, the challenge of collecting complete network data still persists. Many times the collected network data is incomplete with nodes and edges missing. Commonly, only a part of the network can be observed and we would like to infer the unobserved part of the network. We address this issue by studying the Network Completion Problem: Given a network with missing nodes and edges, can we complete the missing part? We cast the problem in the Expectation Maximization (EM) framework where we use the observed part of the network to fit a model of network structure, and then we estimate the missing part of the network using the model, re-estimate the parameters and so on. We combine the EM with the Kronecker graphs model and design a scalable Metropolized Gibbs sampling approach that allows for the estimation of the model parameters as well as the inference about missing nodes and edges of the network. Experiments on synthetic and several real-world networks show that our approach can effectively recover the network even when about half of the nodes in the network are missing. Our algorithm outperforms not only classical link-prediction approaches but also the state of the art Stochastic block modeling approach. Furthermore, our algorithm easily scales to networks with tens of thousands of nodes.
Comparison of evolutionary algorithms in gene regulatory network model inference.
LENUS (Irish Health Repository)
2010-01-01
ABSTRACT: BACKGROUND: The evolution of high throughput technologies that measure gene expression levels has created a data base for inferring GRNs (a process also known as reverse engineering of GRNs). However, the nature of these data has made this process very difficult. At the moment, several methods of discovering qualitative causal relationships between genes with high accuracy from microarray data exist, but large scale quantitative analysis on real biological datasets cannot be performed, to date, as existing approaches are not suitable for real microarray data which are noisy and insufficient. RESULTS: This paper performs an analysis of several existing evolutionary algorithms for quantitative gene regulatory network modelling. The aim is to present the techniques used and offer a comprehensive comparison of approaches, under a common framework. Algorithms are applied to both synthetic and real gene expression data from DNA microarrays, and ability to reproduce biological behaviour, scalability and robustness to noise are assessed and compared. CONCLUSIONS: Presented is a comparison framework for assessment of evolutionary algorithms, used to infer gene regulatory networks. Promising methods are identified and a platform for development of appropriate model formalisms is established.
Supervised dictionary learning for inferring concurrent brain networks.
Zhao, Shijie; Han, Junwei; Lv, Jinglei; Jiang, Xi; Hu, Xintao; Zhao, Yu; Ge, Bao; Guo, Lei; Liu, Tianming
2015-10-01
Task-based fMRI (tfMRI) has been widely used to explore functional brain networks via predefined stimulus paradigm in the fMRI scan. Traditionally, the general linear model (GLM) has been a dominant approach to detect task-evoked networks. However, GLM focuses on task-evoked or event-evoked brain responses and possibly ignores the intrinsic brain functions. In comparison, dictionary learning and sparse coding methods have attracted much attention recently, and these methods have shown the promise of automatically and systematically decomposing fMRI signals into meaningful task-evoked and intrinsic concurrent networks. Nevertheless, two notable limitations of current data-driven dictionary learning method are that the prior knowledge of task paradigm is not sufficiently utilized and that the establishment of correspondences among dictionary atoms in different brains have been challenging. In this paper, we propose a novel supervised dictionary learning and sparse coding method for inferring functional networks from tfMRI data, which takes both of the advantages of model-driven method and data-driven method. The basic idea is to fix the task stimulus curves as predefined model-driven dictionary atoms and only optimize the other portion of data-driven dictionary atoms. Application of this novel methodology on the publicly available human connectome project (HCP) tfMRI datasets has achieved promising results.
Perturbation Biology: Inferring Signaling Networks in Cellular Systems
Miller, Martin L.; Gauthier, Nicholas P.; Jing, Xiaohong; Kaushik, Poorvi; He, Qin; Mills, Gordon; Solit, David B.; Pratilas, Christine A.; Weigt, Martin; Braunstein, Alfredo; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris
2013-01-01
We present a powerful experimental-computational technology for inferring network models that predict the response of cells to perturbations, and that may be useful in the design of combinatorial therapy against cancer. The experiments are systematic series of perturbations of cancer cell lines by targeted drugs, singly or in combination. The response to perturbation is quantified in terms of relative changes in the measured levels of proteins, phospho-proteins and cellular phenotypes such as viability. Computational network models are derived de novo, i.e., without prior knowledge of signaling pathways, and are based on simple non-linear differential equations. The prohibitively large solution space of all possible network models is explored efficiently using a probabilistic algorithm, Belief Propagation (BP), which is three orders of magnitude faster than standard Monte Carlo methods. Explicit executable models are derived for a set of perturbation experiments in SKMEL-133 melanoma cell lines, which are resistant to the therapeutically important inhibitor of RAF kinase. The resulting network models reproduce and extend known pathway biology. They empower potential discoveries of new molecular interactions and predict efficacious novel drug perturbations, such as the inhibition of PLK1, which is verified experimentally. This technology is suitable for application to larger systems in diverse areas of molecular biology. PMID:24367245
Goyal, Ravi; De Gruttola, Victor
2018-01-30
Analysis of sexual history data intended to describe sexual networks presents many challenges arising from the fact that most surveys collect information on only a very small fraction of the population of interest. In addition, partners are rarely identified and responses are subject to reporting biases. Typically, each network statistic of interest, such as mean number of sexual partners for men or women, is estimated independently of other network statistics. There is, however, a complex relationship among networks statistics; and knowledge of these relationships can aid in addressing concerns mentioned earlier. We develop a novel method that constrains a posterior predictive distribution of a collection of network statistics in order to leverage the relationships among network statistics in making inference about network properties of interest. The method ensures that inference on network properties is compatible with an actual network. Through extensive simulation studies, we also demonstrate that use of this method can improve estimates in settings where there is uncertainty that arises both from sampling and from systematic reporting bias compared with currently available approaches to estimation. To illustrate the method, we apply it to estimate network statistics using data from the Chicago Health and Social Life Survey. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Approximation Methods for Inference and Learning in Belief Networks: Progress and Future Directions
National Research Council Canada - National Science Library
Pazzan, Michael
1997-01-01
.... In this research project, we have investigated methods and implemented algorithms for efficiently making certain classes of inference in belief networks, and for automatically learning certain...
Data identification for improving gene network inference using computational algebra.
Dimitrova, Elena; Stigler, Brandilyn
2014-11-01
Identification of models of gene regulatory networks is sensitive to the amount of data used as input. Considering the substantial costs in conducting experiments, it is of value to have an estimate of the amount of data required to infer the network structure. To minimize wasted resources, it is also beneficial to know which data are necessary to identify the network. Knowledge of the data and knowledge of the terms in polynomial models are often required a priori in model identification. In applications, it is unlikely that the structure of a polynomial model will be known, which may force data sets to be unnecessarily large in order to identify a model. Furthermore, none of the known results provides any strategy for constructing data sets to uniquely identify a model. We provide a specialization of an existing criterion for deciding when a set of data points identifies a minimal polynomial model when its monomial terms have been specified. Then, we relax the requirement of the knowledge of the monomials and present results for model identification given only the data. Finally, we present a method for constructing data sets that identify minimal polynomial models.
DEFF Research Database (Denmark)
Møller, Jesper
2010-01-01
Chapter 9: This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods based on a maximum likelihood or Bayesian approach combined with markov chain Monte Carlo...... (MCMC) techniques. Due to space limitations the focus is on spatial point processes....
Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks
Directory of Open Access Journals (Sweden)
Alina Sîrbu
2015-05-01
Full Text Available Microarray technologies have been the basis of numerous important findings regarding gene expression in the few last decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions. Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, as well as indicating which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.
Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks.
Sîrbu, Alina; Crane, Martin; Ruskin, Heather J
2015-05-14
Microarray technologies have been the basis of numerous important findings regarding gene expression in the few last decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions). Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, as well as indicating which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.
A Network Inference Workflow Applied to Virulence-Related Processes in Salmonella typhimurium
Energy Technology Data Exchange (ETDEWEB)
Taylor, Ronald C.; Singhal, Mudita; Weller, Jennifer B.; Khoshnevis, Saeed; Shi, Liang; McDermott, Jason E.
2009-04-20
Inference of the structure of mRNA transcriptional regulatory networks, protein regulatory or interaction networks, and protein activation/inactivation-based signal transduction networks are critical tasks in systems biology. In this article we discuss a workflow for the reconstruction of parts of the transcriptional regulatory network of the pathogenic bacterium Salmonella typhimurium based on the information contained in sets of microarray gene expression data now available for that organism, and describe our results obtained by following this workflow. The primary tool is one of the network inference algorithms deployed in the Software Environment for BIological Network Inference (SEBINI). Specifically, we selected the algorithm called Context Likelihood of Relatedness (CLR), which uses the mutual information contained in the gene expression data to infer regulatory connections. The associated analysis pipeline automatically stores the inferred edges from the CLR runs within SEBINI and, upon request, transfers the inferred edges into either Cytoscape or the plug-in Collective Analysis of Biological of Biological Interaction Networks (CABIN) tool for further post-analysis of the inferred regulatory edges. The following article presents the outcome of this workflow, as well as the protocols followed for microarray data collection, data cleansing, and network inference. Our analysis revealed several interesting interactions, functional groups, metabolic pathways, and regulons in S. typhimurium.
DEFF Research Database (Denmark)
Møller, Jesper
(This text written by Jesper Møller, Aalborg University, is submitted for the collection ‘Stochastic Geometry: Highlights, Interactions and New Perspectives', edited by Wilfrid S. Kendall and Ilya Molchanov, to be published by ClarendonPress, Oxford, and planned to appear as Section 4.1 with the ......(This text written by Jesper Møller, Aalborg University, is submitted for the collection ‘Stochastic Geometry: Highlights, Interactions and New Perspectives', edited by Wilfrid S. Kendall and Ilya Molchanov, to be published by ClarendonPress, Oxford, and planned to appear as Section 4.......1 with the title ‘Inference'.) This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods using Markov chain Monte Carlo (MCMC) simulations. Due to space limitations the focus...
Connectivity in the yeast cell cycle transcription network: inferences from neural networks.
Directory of Open Access Journals (Sweden)
Christopher E Hart
2006-12-01
Full Text Available A current challenge is to develop computational approaches to infer gene network regulatory relationships based on multiple types of large-scale functional genomic data. We find that single-layer feed-forward artificial neural network (ANN models can effectively discover gene network structure by integrating global in vivo protein:DNA interaction data (ChIP/Array with genome-wide microarray RNA data. We test this on the yeast cell cycle transcription network, which is composed of several hundred genes with phase-specific RNA outputs. These ANNs were robust to noise in data and to a variety of perturbations. They reliably identified and ranked 10 of 12 known major cell cycle factors at the top of a set of 204, based on a sum-of-squared weights metric. Comparative analysis of motif occurrences among multiple yeast species independently confirmed relationships inferred from ANN weights analysis. ANN models can capitalize on properties of biological gene networks that other kinds of models do not. ANNs naturally take advantage of patterns of absence, as well as presence, of factor binding associated with specific expression output; they are easily subjected to in silico "mutation" to uncover biological redundancies; and they can use the full range of factor binding values. A prominent feature of cell cycle ANNs suggested an analogous property might exist in the biological network. This postulated that "network-local discrimination" occurs when regulatory connections (here between MBF and target genes are explicitly disfavored in one network module (G2, relative to others and to the class of genes outside the mitotic network. If correct, this predicts that MBF motifs will be significantly depleted from the discriminated class and that the discrimination will persist through evolution. Analysis of distantly related Schizosaccharomyces pombe confirmed this, suggesting that network-local discrimination is real and complements well-known enrichment of
Inferring influenza global transmission networks without complete phylogenetic information.
Aris-Brosou, Stéphane
2014-03-01
Influenza is one of the most severe respiratory infections affecting humans throughout the world, yet the dynamics of its global transmission network are still contentious. Here, I describe a novel combination of phylogenetics, time series, and graph theory to analyze 14.25 years of data stratified in space and in time, focusing on the main target of the human immune response, the hemagglutinin gene. While bypassing the complete phylogenetic inference of huge data sets, the method still extracts information suggesting that waves of genetic or of nucleotide diversity circulate continuously around the globe for subtypes that undergo sustained transmission over several seasons, such as H3N2 and pandemic H1N1/09, while diversity of prepandemic H1N1 viruses had until 2009 a noncontinuous transmission pattern consistent with a source/sink model. Irrespective of the shift in the structure of H1N1 diversity circulation with the emergence of the pandemic H1N1/09 strain, US prevalence peaks during the winter months when genetic diversity is at its lowest. This suggests that a dominant strain is generally responsible for epidemics and that monitoring genetic and/or nucleotide diversity in real time could provide public health agencies with an indirect estimate of prevalence.
Narimani, Zahra; Beigy, Hamid; Ahmad, Ashar; Masoudi-Nejad, Ali; Fröhlich, Holger
2017-01-01
Inferring the structure of molecular networks from time series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity and thus make large scale inference infeasible. This is specifically true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive against several state-of-the-art existing network inference methods.
Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui
2017-10-06
Gene regulatory networks (GRNs) research reveals complex life phenomena from the perspective of gene interaction, which is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on dynamic Bayesian network (DBN) to construct the multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. DBNCS algorithm first uses CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profiles learning, namely the construction of search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reduce the false positive rate. Secondly, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of DBNCS algorithm is evaluated by the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in Escherichia coli , and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs.
Inference of neuronal network spike dynamics and topology from calcium imaging data
Directory of Open Access Journals (Sweden)
Henry eLütcke
2013-12-01
Full Text Available Two-photon calcium imaging enables functional analysis of neuronal circuits by inferring action potential (AP occurrence ('spike trains' from cellular fluorescence signals. It remains unclear how experimental parameters such as signal-to-noise ratio (SNR and acquisition rate affect spike inference and whether additional information about network structure can be extracted. Here we present a simulation framework for quantitatively assessing how well spike dynamics and network topology can be inferred from noisy calcium imaging data. For simulated AP-evoked calcium transients in neocortical pyramidal cells, we analyzed the quality of spike inference as a function of SNR and data acquisition rate using a recently introduced peeling algorithm. Given experimentally attainable values of SNR and acquisition rate, neural spike trains could be reconstructed accurately and with up to millisecond precision. We then applied statistical neuronal network models to explore how remaining uncertainties in spike inference affect estimates of network connectivity and topological features of network organization. We define the experimental conditions suitable for inferring whether the network has a scale-free structure and determine how well hub neurons can be identified. Our findings provide a benchmark for future calcium imaging studies that aim to reliably infer neuronal network properties.
Multi-Agent Inference in Social Networks: A Finite Population Learning Approach.
Fan, Jianqing; Tong, Xin; Zeng, Yao
When people in a society want to make inference about some parameter, each person may want to use data collected by other people. Information (data) exchange in social networks is usually costly, so to make reliable statistical decisions, people need to trade off the benefits and costs of information acquisition. Conflicts of interests and coordination problems will arise in the process. Classical statistics does not consider people's incentives and interactions in the data collection process. To address this imperfection, this work explores multi-agent Bayesian inference problems with a game theoretic social network model. Motivated by our interest in aggregate inference at the societal level, we propose a new concept, finite population learning , to address whether with high probability, a large fraction of people in a given finite population network can make "good" inference. Serving as a foundation, this concept enables us to study the long run trend of aggregate inference quality as population grows.
Learning a Markov Logic network for supervised gene regulatory network inference.
Brouard, Céline; Vrain, Christel; Dubois, Julie; Castel, David; Debily, Marie-Anne; d'Alché-Buc, Florence
2013-09-12
Gene regulatory network inference remains a challenging problem in systems biology despite the numerous approaches that have been proposed. When substantial knowledge on a gene regulatory network is already available, supervised network inference is appropriate. Such a method builds a binary classifier able to assign a class (Regulation/No regulation) to an ordered pair of genes. Once learnt, the pairwise classifier can be used to predict new regulations. In this work, we explore the framework of Markov Logic Networks (MLN) that combine features of probabilistic graphical models with the expressivity of first-order logic rules. We propose to learn a Markov Logic network, e.g. a set of weighted rules that conclude on the predicate "regulates", starting from a known gene regulatory network involved in the switch proliferation/differentiation of keratinocyte cells, a set of experimental transcriptomic data and various descriptions of genes all encoded into first-order logic. As training data are unbalanced, we use asymmetric bagging to learn a set of MLNs. The prediction of a new regulation can then be obtained by averaging predictions of individual MLNs. As a side contribution, we propose three in silico tests to assess the performance of any pairwise classifier in various network inference tasks on real datasets. A first test consists of measuring the average performance on balanced edge prediction problem; a second one deals with the ability of the classifier, once enhanced by asymmetric bagging, to update a given network. Finally our main result concerns a third test that measures the ability of the method to predict regulations with a new set of genes. As expected, MLN, when provided with only numerical discretized gene expression data, does not perform as well as a pairwise SVM in terms of AUPR. However, when a more complete description of gene properties is provided by heterogeneous sources, MLN achieves the same performance as a black-box model such as a
Inference method using bayesian network for diagnosis of pulmonary nodules
International Nuclear Information System (INIS)
Kawagishi, Masami; Iizuka, Yoshio; Yamamoto, Hiroyuki; Yakami, Masahiro; Kubo, Takeshi; Fujimoto, Koji; Togashi, Kaori
2010-01-01
This report describes the improvements of a naive Bayes model that infers the diagnosis of pulmonary nodules in chest CT images based on the findings obtained when a radiologist interprets the CT images. We have previously introduced an inference model using a naive Bayes classifier and have reported its clinical value based on evaluation using clinical data. In the present report, we introduce the following improvements to the original inference model: the selection of findings based on correlations and the generation of a model using only these findings, and the introduction of classifiers that integrate several simple classifiers each of which is specialized for specific diagnosis. These improvements were found to increase the inference accuracy by 10.4% (p<.01) as compared to the original model in 100 cases (222 nodules) based on leave-one-out evaluation. (author)
Efficient design and inference in distributed Bayesian networks: an overview
de Oude, P.; Groen, F.C.A.; Pavlin, G.; Bezhanishvili, N.; Löbner, S.; Schwabe, K.; Spada, L.
2011-01-01
This paper discusses an approach to distributed Bayesian modeling and inference, which is relevant for an important class of contemporary real world situation assessment applications. By explicitly considering the locality of causal relations, the presented approach (i) supports coherent distributed
Inferring Trust Relationships in Web-Based Social Networks
National Research Council Canada - National Science Library
Golbeck, Jennifer; Hendler, James
2006-01-01
The growth of web-based social networking and the properties of those networks have created great potential for producing intelligent software that integrates a user's social network and preferences...
Inference of the Genetic Network Regulating Lateral Root Initiation in Arabidopsis thaliana
Muraro, D.; Voss, U.; Wilson, M.; Bennett, M.; Byrne, H.; De Smet, I.; Hodgman, C.; King, J.
2013-01-01
thaliana is stimulated by a cascade of regulators of which only the interactions of its initial elements have been identified. Using simulated gene expression data with known network topology, we compare the performance of inference algorithms, based
Mocapy++ - a toolkit for inference and learning in dynamic Bayesian networks
DEFF Research Database (Denmark)
Paluszewski, Martin; Hamelryck, Thomas Wim
2010-01-01
Background Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations...
An Efficient Forward-Reverse EM Algorithm for Statistical Inference in Stochastic Reaction Networks
Bayer, Christian; Moraes, Alvaro; Tempone, Raul; Vilanova, Pedro
2016-01-01
In this work [1], we present an extension of the forward-reverse algorithm by Bayer and Schoenmakers [2] to the context of stochastic reaction networks (SRNs). We then apply this bridge-generation technique to the statistical inference problem
Pecevski, Dejan; Buesing, Lars; Maass, Wolfgang
2011-12-01
An important open problem of computational neuroscience is the generic organization of computations in networks of neurons in the brain. We show here through rigorous theoretical analysis that inherent stochastic features of spiking neurons, in combination with simple nonlinear computational operations in specific network motifs and dendritic arbors, enable networks of spiking neurons to carry out probabilistic inference through sampling in general graphical models. In particular, it enables them to carry out probabilistic inference in Bayesian networks with converging arrows ("explaining away") and with undirected loops, that occur in many real-world tasks. Ubiquitous stochastic features of networks of spiking neurons, such as trial-to-trial variability and spontaneous activity, are necessary ingredients of the underlying computational organization. We demonstrate through computer simulations that this approach can be scaled up to neural emulations of probabilistic inference in fairly large graphical models, yielding some of the most complex computations that have been carried out so far in networks of spiking neurons.
Inferring causal molecular networks: empirical assessment through a community-based effort.
Hill, Steven M; Heiser, Laura M; Cokelaer, Thomas; Unger, Michael; Nesser, Nicole K; Carlin, Daniel E; Zhang, Yang; Sokolov, Artem; Paull, Evan O; Wong, Chris K; Graim, Kiley; Bivol, Adrian; Wang, Haizhou; Zhu, Fan; Afsari, Bahman; Danilova, Ludmila V; Favorov, Alexander V; Lee, Wai Shing; Taylor, Dane; Hu, Chenyue W; Long, Byron L; Noren, David P; Bisberg, Alexander J; Mills, Gordon B; Gray, Joe W; Kellen, Michael; Norman, Thea; Friend, Stephen; Qutub, Amina A; Fertig, Elana J; Guan, Yuanfang; Song, Mingzhou; Stuart, Joshua M; Spellman, Paul T; Koeppl, Heinz; Stolovitzky, Gustavo; Saez-Rodriguez, Julio; Mukherjee, Sach
2016-04-01
It remains unclear whether causal, rather than merely correlational, relationships in molecular networks can be inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess inferred molecular networks in a causal sense.
Efficient inference of overlapping communities in complex networks
DEFF Research Database (Denmark)
Fruergaard, Bjarne Ørum; Herlau, Tue
2014-01-01
We discuss two views on extending existing methods for complex network modeling which we dub the communities first and the networks first view, respectively. Inspired by the networks first view that we attribute to White, Boorman, and Breiger (1976)[1], we formulate the multiple-networks stochastic...
Jia, Bin; Wang, Xiaodong
2013-12-17
: The extended Kalman filter (EKF) has been applied to inferring gene regulatory networks. However, it is well known that the EKF becomes less accurate when the system exhibits high nonlinearity. In addition, certain prior information about the gene regulatory network exists in practice, and no systematic approach has been developed to incorporate such prior information into the Kalman-type filter for inferring the structure of the gene regulatory network. In this paper, an inference framework based on point-based Gaussian approximation filters that can exploit the prior information is developed to solve the gene regulatory network inference problem. Different point-based Gaussian approximation filters, including the unscented Kalman filter (UKF), the third-degree cubature Kalman filter (CKF3), and the fifth-degree cubature Kalman filter (CKF5) are employed. Several types of network prior information, including the existing network structure information, sparsity assumption, and the range constraint of parameters, are considered, and the corresponding filters incorporating the prior information are developed. Experiments on a synthetic network of eight genes and the yeast protein synthesis network of five genes are carried out to demonstrate the performance of the proposed framework. The results show that the proposed methods provide more accurate inference results than existing methods, such as the EKF and the traditional UKF.
Evaluation of artificial time series microarray data for dynamic gene regulatory network inference.
Xenitidis, P; Seimenis, I; Kakolyris, S; Adamopoulos, A
2017-08-07
High-throughput technology like microarrays is widely used in the inference of gene regulatory networks (GRNs). We focused on time series data since we are interested in the dynamics of GRNs and the identification of dynamic networks. We evaluated the amount of information that exists in artificial time series microarray data and the ability of an inference process to produce accurate models based on them. We used dynamic artificial gene regulatory networks in order to create artificial microarray data. Key features that characterize microarray data such as the time separation of directly triggered genes, the percentage of directly triggered genes and the triggering function type were altered in order to reveal the limits that are imposed by the nature of microarray data on the inference process. We examined the effect of various factors on the inference performance such as the network size, the presence of noise in microarray data, and the network sparseness. We used a system theory approach and examined the relationship between the pole placement of the inferred system and the inference performance. We examined the relationship between the inference performance in the time domain and the true system parameter identification. Simulation results indicated that time separation and the percentage of directly triggered genes are crucial factors. Also, network sparseness, the triggering function type and noise in input data affect the inference performance. When two factors were simultaneously varied, it was found that variation of one parameter significantly affects the dynamic response of the other. Crucial factors were also examined using a real GRN and acquired results confirmed simulation findings with artificial data. Different initial conditions were also used as an alternative triggering approach. Relevant results confirmed that the number of datasets constitutes the most significant parameter with regard to the inference performance. Copyright © 2017 Elsevier
Orhan, A Emin; Ma, Wei Ji
2017-07-26
Animals perform near-optimal probabilistic inference in a wide range of psychophysical tasks. Probabilistic inference requires trial-to-trial representation of the uncertainties associated with task variables and subsequent use of this representation. Previous work has implemented such computations using neural networks with hand-crafted and task-dependent operations. We show that generic neural networks trained with a simple error-based learning rule perform near-optimal probabilistic inference in nine common psychophysical tasks. In a probabilistic categorization task, error-based learning in a generic network simultaneously explains a monkey's learning curve and the evolution of qualitative aspects of its choice behavior. In all tasks, the number of neurons required for a given level of performance grows sublinearly with the input population size, a substantial improvement on previous implementations of probabilistic inference. The trained networks develop a novel sparsity-based probabilistic population code. Our results suggest that probabilistic inference emerges naturally in generic neural networks trained with error-based learning rules.Behavioural tasks often require probability distributions to be inferred about task specific variables. Here, the authors demonstrate that generic neural networks can be trained using a simple error-based learning rule to perform such probabilistic computations efficiently without any need for task specific operations.
Software defined network inference with evolutionary optimal observation matrices
Malboubi, M; Gong, Y; Yang, Z; Wang, X; Chuah, CN; Sharma, P
2017-01-01
© 2017 Elsevier B.V. A key requirement for network management is the accurate and reliable monitoring of relevant network characteristics. In today's large-scale networks, this is a challenging task due to the scarcity of network measurement resources and the hard constraints that this imposes. This paper proposes a new framework, called SNIPER, which leverages the flexibility provided by Software-Defined Networking (SDN) to design the optimal observation or measurement matrix that can lead t...
Lee, Wei-Po; Hsiao, Yu-Ting; Hwang, Wei-Che
2014-01-16
To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it becomes a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: most popular is the method of MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use a well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. By coupling the parallel model population-based optimization method and the parallel computational framework, high
Bottlenecks and Hubs in Inferred Networks Are Important for Virulence in Salmonella typhimurium
Energy Technology Data Exchange (ETDEWEB)
McDermott, Jason E.; Taylor, Ronald C.; Yoon, Hyunjin; Heffron, Fred
2009-02-01
Recent advances in experimental methods have provided sufficient data to consider systems as large networks of interconnected components. High-throughput determination of protein-protein interaction networks has led to the observation that topological bottlenecks, that is proteins defined by high centrality in the network, are enriched in proteins with systems-level phenotypes such as essentiality. Global transcriptional profiling by microarray analysis has been used extensively to characterize systems, for example, cellular response to environmental conditions and genetic mutations. These transcriptomic datasets have been used to infer regulatory and functional relationship networks based on co-regulation. We use the context likelihood of relatedness (CLR) method to infer networks from two datasets gathered from the pathogen Salmonella typhimurium; one under a range of environmental culture conditions and the other from deletions of 15 regulators found to be essential in virulence. Bottleneck nodes were identified from these inferred networks and we show that these nodes are significantly more likely to be essential for virulence than their non-bottleneck counterparts. A network generated using Pearson correlation did not display this behavior. Overall this study demonstrates that topology of networks inferred from global transcriptional profiles provides information about the systems-level roles of bottleneck genes. Analysis of the differences between the two CLR-derived networks suggests that the bottleneck nodes are either mediators of transitions between system states or sentinels that reflect the dynamics of these transitions.
Inferring Structure and Forecasting Dynamics on Evolving Networks
2016-01-05
Graphs ........................................................................................................................ 23 7. Sacred Values...5) Team Formation; (6) Games of Graphs; (7) Sacred Values and Legitimacy in Network Interactions; (8) Network processes in Geo-Social Context. 1...Authority, Cooperation and Competition in Religious Networks Key Papers: McBride 2015a [72] and McBride 2015b [73] McBride (2015a) examines
A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks
Energy Technology Data Exchange (ETDEWEB)
Santra, Tapesh, E-mail: tapesh.santra@ucd.ie [Systems Biology Ireland, University College Dublin, Dublin (Ireland)
2014-05-20
Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based method in some circumstances.
A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks
International Nuclear Information System (INIS)
Santra, Tapesh
2014-01-01
Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based method in some circumstances.
Peng, Yefei
2010-01-01
An ontology mapping neural network (OMNN) is proposed in order to learn and infer correspondences among ontologies. It extends the Identical Elements Neural Network (IENN)'s ability to represent and map complex relationships. The learning dynamics of simultaneous (interlaced) training of similar tasks interact at the shared connections of the…
Directory of Open Access Journals (Sweden)
Ji Wei
2010-10-01
Full Text Available Abstract Background Microarray data discretization is a basic preprocess for many algorithms of gene regulatory network inference. Some common discretization methods in informatics are used to discretize microarray data. Selection of the discretization method is often arbitrary and no systematic comparison of different discretization has been conducted, in the context of gene regulatory network inference from time series gene expression data. Results In this study, we propose a new discretization method "bikmeans", and compare its performance with four other widely-used discretization methods using different datasets, modeling algorithms and number of intervals. Sensitivities, specificities and total accuracies were calculated and statistical analysis was carried out. Bikmeans method always gave high total accuracies. Conclusions Our results indicate that proper discretization methods can consistently improve gene regulatory network inference independent of network modeling algorithms and datasets. Our new method, bikmeans, resulted in significant better total accuracies than other methods.
Frank, Laurence Emmanuelle
2006-01-01
Feature Network Models (FNM) are graphical structures that represent proximity data in a discrete space with the use of features. A statistical inference theory is introduced, based on the additivity properties of networks and the linear regression framework. Considering features as predictor
Inference of gene regulatory networks from time series by Tsallis entropy
Directory of Open Access Journals (Sweden)
de Oliveira Evaldo A
2011-05-01
Full Text Available Abstract Background The inference of gene regulatory networks (GRNs from large-scale expression profiles is one of the most challenging problems of Systems Biology nowadays. Many techniques and models have been proposed for this task. However, it is not generally possible to recover the original topology with great accuracy, mainly due to the short time series data in face of the high complexity of the networks and the intrinsic noise of the expression measurements. In order to improve the accuracy of GRNs inference methods based on entropy (mutual information, a new criterion function is here proposed. Results In this paper we introduce the use of generalized entropy proposed by Tsallis, for the inference of GRNs from time series expression profiles. The inference process is based on a feature selection approach and the conditional entropy is applied as criterion function. In order to assess the proposed methodology, the algorithm is applied to recover the network topology from temporal expressions generated by an artificial gene network (AGN model as well as from the DREAM challenge. The adopted AGN is based on theoretical models of complex networks and its gene transference function is obtained from random drawing on the set of possible Boolean functions, thus creating its dynamics. On the other hand, DREAM time series data presents variation of network size and its topologies are based on real networks. The dynamics are generated by continuous differential equations with noise and perturbation. By adopting both data sources, it is possible to estimate the average quality of the inference with respect to different network topologies, transfer functions and network sizes. Conclusions A remarkable improvement of accuracy was observed in the experimental results by reducing the number of false connections in the inferred topology by the non-Shannon entropy. The obtained best free parameter of the Tsallis entropy was on average in the range 2.5
Saxena, Anupam; Lipson, Hod; Valero-Cuevas, Francisco J
2012-01-01
In systems and computational biology, much effort is devoted to functional identification of systems and networks at the molecular-or cellular scale. However, similarly important networks exist at anatomical scales such as the tendon network of human fingers: the complex array of collagen fibers that transmits and distributes muscle forces to finger joints. This network is critical to the versatility of the human hand, and its function has been debated since at least the 16(th) century. Here, we experimentally infer the structure (both topology and parameter values) of this network through sparse interrogation with force inputs. A population of models representing this structure co-evolves in simulation with a population of informative future force inputs via the predator-prey estimation-exploration algorithm. Model fitness depends on their ability to explain experimental data, while the fitness of future force inputs depends on causing maximal functional discrepancy among current models. We validate our approach by inferring two known synthetic Latex networks, and one anatomical tendon network harvested from a cadaver's middle finger. We find that functionally similar but structurally diverse models can exist within a narrow range of the training set and cross-validation errors. For the Latex networks, models with low training set error [functional structure of complex anatomical networks. This work expands current bioinformatics inference approaches by demonstrating that sparse, yet informative interrogation of biological specimens holds significant computational advantages in accurate and efficient inference over random testing, or assuming model topology and only inferring parameters values. These findings also hold clues to both our evolutionary history and the development of versatile machines.
Directory of Open Access Journals (Sweden)
Junha Shin
Full Text Available Phylogenetic profiling, a network inference method based on gene inheritance profiles, has been widely used to construct functional gene networks in microbes. However, its utility for network inference in higher eukaryotes has been limited. An improved algorithm with an in-depth understanding of pathway evolution may overcome this limitation. In this study, we investigated the effects of taxonomic structures on co-inheritance analysis using 2,144 reference species in four query species: Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, and Homo sapiens. We observed three clusters of reference species based on a principal component analysis of the phylogenetic profiles, which correspond to the three domains of life-Archaea, Bacteria, and Eukaryota-suggesting that pathways inherit primarily within specific domains or lower-ranked taxonomic groups during speciation. Hence, the co-inheritance pattern within a taxonomic group may be eroded by confounding inheritance patterns from irrelevant taxonomic groups. We demonstrated that co-inheritance analysis within domains substantially improved network inference not only in microbe species but also in the higher eukaryotes, including humans. Although we observed two sub-domain clusters of reference species within Eukaryota, co-inheritance analysis within these sub-domain taxonomic groups only marginally improved network inference. Therefore, we conclude that co-inheritance analysis within domains is the optimal approach to network inference with the given reference species. The construction of a series of human gene networks with increasing sample sizes of the reference species for each domain revealed that the size of the high-accuracy networks increased as additional reference species genomes were included, suggesting that within-domain co-inheritance analysis will continue to expand human gene networks as genomes of additional species are sequenced. Taken together, we propose that co
Inferring topologies via driving-based generalized synchronization of two-layer networks
Wang, Yingfei; Wu, Xiaoqun; Feng, Hui; Lu, Jun-an; Xu, Yuhua
2016-05-01
The interaction topology among the constituents of a complex network plays a crucial role in the network’s evolutionary mechanisms and functional behaviors. However, some network topologies are usually unknown or uncertain. Meanwhile, coupling delays are ubiquitous in various man-made and natural networks. Hence, it is necessary to gain knowledge of the whole or partial topology of a complex dynamical network by taking into consideration communication delay. In this paper, topology identification of complex dynamical networks is investigated via generalized synchronization of a two-layer network. Particularly, based on the LaSalle-type invariance principle of stochastic differential delay equations, an adaptive control technique is proposed by constructing an auxiliary layer and designing proper control input and updating laws so that the unknown topology can be recovered upon successful generalized synchronization. Numerical simulations are provided to illustrate the effectiveness of the proposed method. The technique provides a certain theoretical basis for topology inference of complex networks. In particular, when the considered network is composed of systems with high-dimension or complicated dynamics, a simpler response layer can be constructed, which is conducive to circuit design. Moreover, it is practical to take into consideration perturbations caused by control input. Finally, the method is applicable to infer topology of a subnetwork embedded within a complex system and locate hidden sources. We hope the results can provide basic insight into further research endeavors on understanding practical and economical topology inference of networks.
Vlaic, Sebastian; Hoffmann, Bianca; Kupfer, Peter; Weber, Michael; Dräger, Andreas
2013-09-01
GRN2SBML automatically encodes gene regulatory networks derived from several inference tools in systems biology markup language. Providing a graphical user interface, the networks can be annotated via the simple object access protocol (SOAP)-based application programming interface of BioMart Central Portal and minimum information required in the annotation of models registry. Additionally, we provide an R-package, which processes the output of supported inference algorithms and automatically passes all required parameters to GRN2SBML. Therefore, GRN2SBML closes a gap in the processing pipeline between the inference of gene regulatory networks and their subsequent analysis, visualization and storage. GRN2SBML is freely available under the GNU Public License version 3 and can be downloaded from http://www.hki-jena.de/index.php/0/2/490. General information on GRN2SBML, examples and tutorials are available at the tool's web page.
An analysis pipeline for the inference of protein-protein interaction networks
Energy Technology Data Exchange (ETDEWEB)
Taylor, Ronald C.; Singhal, Mudita; Daly, Don S.; Gilmore, Jason M.; Cannon, William R.; Domico, Kelly O.; White, Amanda M.; Auberry, Deanna L.; Auberry, Kenneth J.; Hooker, Brian S.; Hurst, G. B.; McDermott, Jason E.; McDonald, W. H.; Pelletier, Dale A.; Schmoyer, Denise A.; Wiley, H. S.
2009-12-01
An analysis pipeline has been created for deployment of a novel algorithm, the Bayesian Estimator of Protein-Protein Association Probabilities (BEPro), for use in the reconstruction of protein-protein interaction networks. We have combined the Software Environment for BIological Network Inference (SEBINI), an interactive environment for the deployment and testing of network inference algorithms that use high-throughput data, and the Collective Analysis of Biological Interaction Networks (CABIN), software that allows integration and analysis of protein-protein interaction and gene-to-gene regulatory evidence obtained from multiple sources, to allow interactions computed by BEPro to be stored, visualized, and further analyzed. Incorporating BEPro into SEBINI and automatically feeding the resulting inferred network into CABIN, we have created a structured workflow for protein-protein network inference and supplemental analysis from sets of mass spectrometry bait-prey experiment data. SEBINI demo site: https://www.emsl.pnl.gov /SEBINI/ Contact: ronald.taylor@pnl.gov. BEPro is available at http://www.pnl.gov/statistics/BEPro3/index.htm. Contact: ds.daly@pnl.gov. CABIN is available at http://www.sysbio.org/dataresources/cabin.stm. Contact: mudita.singhal@pnl.gov.
Directory of Open Access Journals (Sweden)
Yinyin Yuan
Full Text Available Inferring regulatory relationships among many genes based on their temporal variation in transcript abundance has been a popular research topic. Due to the nature of microarray experiments, classical tools for time series analysis lose power since the number of variables far exceeds the number of the samples. In this paper, we describe some of the existing multivariate inference techniques that are applicable to hundreds of variables and show the potential challenges for small-sample, large-scale data. We propose a directed partial correlation (DPC method as an efficient and effective solution to regulatory network inference using these data. Specifically for genomic data, the proposed method is designed to deal with large-scale datasets. It combines the efficiency of partial correlation for setting up network topology by testing conditional independence, and the concept of Granger causality to assess topology change with induced interruptions. The idea is that when a transcription factor is induced artificially within a gene network, the disruption of the network by the induction signifies a genes role in transcriptional regulation. The benchmarking results using GeneNetWeaver, the simulator for the DREAM challenges, provide strong evidence of the outstanding performance of the proposed DPC method. When applied to real biological data, the inferred starch metabolism network in Arabidopsis reveals many biologically meaningful network modules worthy of further investigation. These results collectively suggest DPC is a versatile tool for genomics research. The R package DPC is available for download (http://code.google.com/p/dpcnet/.
A Hierarchical Poisson Log-Normal Model for Network Inference from RNA Sequencing Data
Gallopin, Mélina; Rau, Andrea; Jaffrézic, Florence
2013-01-01
Gene network inference from transcriptomic data is an important methodological challenge and a key aspect of systems biology. Although several methods have been proposed to infer networks from microarray data, there is a need for inference methods able to model RNA-seq data, which are count-based and highly variable. In this work we propose a hierarchical Poisson log-normal model with a Lasso penalty to infer gene networks from RNA-seq data; this model has the advantage of directly modelling discrete data and accounting for inter-sample variance larger than the sample mean. Using real microRNA-seq data from breast cancer tumors and simulations, we compare this method to a regularized Gaussian graphical model on log-transformed data, and a Poisson log-linear graphical model with a Lasso penalty on power-transformed data. For data simulated with large inter-sample dispersion, the proposed model performs better than the other methods in terms of sensitivity, specificity and area under the ROC curve. These results show the necessity of methods specifically designed for gene network inference from RNA-seq data. PMID:24147011
Directory of Open Access Journals (Sweden)
A. A. Zolotin
2015-07-01
Full Text Available Posteriori inference is one of the three kinds of probabilistic-logic inferences in the probabilistic graphical models theory and the base for processing of knowledge patterns with probabilistic uncertainty using Bayesian networks. The paper deals with a task of local posteriori inference description in algebraic Bayesian networks that represent a class of probabilistic graphical models by means of matrix-vector equations. The latter are essentially based on the use of tensor product of matrices, Kronecker degree and Hadamard product. Matrix equations for calculating posteriori probabilities vectors within posteriori inference in knowledge patterns with quanta propositions are obtained. Similar equations of the same type have already been discussed within the confines of the theory of algebraic Bayesian networks, but they were built only for the case of posteriori inference in the knowledge patterns on the ideals of conjuncts. During synthesis and development of matrix-vector equations on quanta propositions probability vectors, a number of earlier results concerning normalizing factors in posteriori inference and assignment of linear projective operator with a selector vector was adapted. We consider all three types of incoming evidences - deterministic, stochastic and inaccurate - combined with scalar and interval estimation of probability truth of propositional formulas in the knowledge patterns. Linear programming problems are formed. Their solution gives the desired interval values of posterior probabilities in the case of inaccurate evidence or interval estimates in a knowledge pattern. That sort of description of a posteriori inference gives the possibility to extend the set of knowledge pattern types that we can use in the local and global posteriori inference, as well as simplify complex software implementation by use of existing third-party libraries, effectively supporting submission and processing of matrices and vectors when
Infering and Calibrating Triadic Closure in a Dynamic Network
Mantzaris, Alexander V.; Higham, Desmond J.
In the social sciences, the hypothesis of triadic closure contends that new links in a social contact network arise preferentially between those who currently share neighbours. Here, in a proof-of-principle study, we show how to calibrate a recently proposed evolving network model to time-dependent connectivity data. The probabilistic edge birth rate in the model contains a triadic closure term, so we are also able to assess statistically the evidence for this effect. The approach is shown to work on data generated synthetically from the model. We then apply this methodology to some real, large-scale data that records the build up of connections in a business-related social networking site, and find evidence for triadic closure.
The causal inference of cortical neural networks during music improvisations.
Wan, Xiaogeng; Crüts, Björn; Jensen, Henrik Jeldtoft
2014-01-01
We present an EEG study of two music improvisation experiments. Professional musicians with high level of improvisation skills were asked to perform music either according to notes (composed music) or in improvisation. Each piece of music was performed in two different modes: strict mode and "let-go" mode. Synchronized EEG data was measured from both musicians and listeners. We used one of the most reliable causality measures: conditional Mutual Information from Mixed Embedding (MIME), to analyze directed correlations between different EEG channels, which was combined with network theory to construct both intra-brain and cross-brain networks. Differences were identified in intra-brain neural networks between composed music and improvisation and between strict mode and "let-go" mode. Particular brain regions such as frontal, parietal and temporal regions were found to play a key role in differentiating the brain activities between different playing conditions. By comparing the level of degree centralities in intra-brain neural networks, we found a difference between the response of musicians and the listeners when comparing the different playing conditions.
The Causal Inference of Cortical Neural Networks during Music Improvisations
Wan, Xiaogeng; Crüts, Björn; Jensen, Henrik Jeldtoft
2014-01-01
We present an EEG study of two music improvisation experiments. Professional musicians with high level of improvisation skills were asked to perform music either according to notes (composed music) or in improvisation. Each piece of music was performed in two different modes: strict mode and “let-go” mode. Synchronized EEG data was measured from both musicians and listeners. We used one of the most reliable causality measures: conditional Mutual Information from Mixed Embedding (MIME), to analyze directed correlations between different EEG channels, which was combined with network theory to construct both intra-brain and cross-brain networks. Differences were identified in intra-brain neural networks between composed music and improvisation and between strict mode and “let-go” mode. Particular brain regions such as frontal, parietal and temporal regions were found to play a key role in differentiating the brain activities between different playing conditions. By comparing the level of degree centralities in intra-brain neural networks, we found a difference between the response of musicians and the listeners when comparing the different playing conditions. PMID:25489852
Network Inference and Maximum Entropy Estimation on Information Diagrams
Czech Academy of Sciences Publication Activity Database
Martin, E.A.; Hlinka, J.; Meinke, A.; Děchtěrenko, Filip; Tintěra, J.; Oliver, I.; Davidsen, J.
2017-01-01
Roč. 7, č. 1 (2017), s. 1-15, č. článku 7062. ISSN 2045-2322 R&D Projects: GA ČR GA13-23940S Institutional support: RVO:68081740 Keywords : complex networks * mutual information * entropy maximization * fMRI Subject RIV: AN - Psychology OBOR OECD: Cognitive sciences Impact factor: 4.259, year: 2016
Unifying Inference of Meso-Scale Structures in Networks.
Tunç, Birkan; Verma, Ragini
2015-01-01
Networks are among the most prevalent formal representations in scientific studies, employed to depict interactions between objects such as molecules, neuronal clusters, or social groups. Studies performed at meso-scale that involve grouping of objects based on their distinctive interaction patterns form one of the main lines of investigation in network science. In a social network, for instance, meso-scale structures can correspond to isolated social groupings or groups of individuals that serve as a communication core. Currently, the research on different meso-scale structures such as community and core-periphery structures has been conducted via independent approaches, which precludes the possibility of an algorithmic design that can handle multiple meso-scale structures and deciding which structure explains the observed data better. In this study, we propose a unified formulation for the algorithmic detection and analysis of different meso-scale structures. This facilitates the investigation of hybrid structures that capture the interplay between multiple meso-scale structures and statistical comparison of competing structures, all of which have been hitherto unavailable. We demonstrate the applicability of the methodology in analyzing the human brain network, by determining the dominant organizational structure (communities) of the brain, as well as its auxiliary characteristics (core-periphery).
Unifying Inference of Meso-Scale Structures in Networks.
Directory of Open Access Journals (Sweden)
Birkan Tunç
Full Text Available Networks are among the most prevalent formal representations in scientific studies, employed to depict interactions between objects such as molecules, neuronal clusters, or social groups. Studies performed at meso-scale that involve grouping of objects based on their distinctive interaction patterns form one of the main lines of investigation in network science. In a social network, for instance, meso-scale structures can correspond to isolated social groupings or groups of individuals that serve as a communication core. Currently, the research on different meso-scale structures such as community and core-periphery structures has been conducted via independent approaches, which precludes the possibility of an algorithmic design that can handle multiple meso-scale structures and deciding which structure explains the observed data better. In this study, we propose a unified formulation for the algorithmic detection and analysis of different meso-scale structures. This facilitates the investigation of hybrid structures that capture the interplay between multiple meso-scale structures and statistical comparison of competing structures, all of which have been hitherto unavailable. We demonstrate the applicability of the methodology in analyzing the human brain network, by determining the dominant organizational structure (communities of the brain, as well as its auxiliary characteristics (core-periphery.
The causal inference of cortical neural networks during music improvisations.
Directory of Open Access Journals (Sweden)
Xiaogeng Wan
Full Text Available We present an EEG study of two music improvisation experiments. Professional musicians with high level of improvisation skills were asked to perform music either according to notes (composed music or in improvisation. Each piece of music was performed in two different modes: strict mode and "let-go" mode. Synchronized EEG data was measured from both musicians and listeners. We used one of the most reliable causality measures: conditional Mutual Information from Mixed Embedding (MIME, to analyze directed correlations between different EEG channels, which was combined with network theory to construct both intra-brain and cross-brain networks. Differences were identified in intra-brain neural networks between composed music and improvisation and between strict mode and "let-go" mode. Particular brain regions such as frontal, parietal and temporal regions were found to play a key role in differentiating the brain activities between different playing conditions. By comparing the level of degree centralities in intra-brain neural networks, we found a difference between the response of musicians and the listeners when comparing the different playing conditions.
NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.
Directory of Open Access Journals (Sweden)
Joeri Ruyssinck
Full Text Available One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made
CompareSVM: supervised, Support Vector Machine (SVM) inference of gene regularity networks.
Gillani, Zeeshan; Akash, Muhammad Sajid Hamid; Rahaman, M D Matiur; Chen, Ming
2014-11-30
Predication of gene regularity network (GRN) from expression data is a challenging task. There are many methods that have been developed to address this challenge ranging from supervised to unsupervised methods. Most promising methods are based on support vector machine (SVM). There is a need for comprehensive analysis on prediction accuracy of supervised method SVM using different kernels on different biological experimental conditions and network size. We developed a tool (CompareSVM) based on SVM to compare different kernel methods for inference of GRN. Using CompareSVM, we investigated and evaluated different SVM kernel methods on simulated datasets of microarray of different sizes in detail. The results obtained from CompareSVM showed that accuracy of inference method depends upon the nature of experimental condition and size of the network. For network with nodes (SVM Gaussian kernel outperform on knockout, knockdown, and multifactorial datasets compared to all the other inference methods. For network with large number of nodes (~500), choice of inference method depend upon nature of experimental condition. CompareSVM is available at http://bis.zju.edu.cn/CompareSVM/ .
Fine-granularity inference and estimations to network traffic for SDN.
Directory of Open Access Journals (Sweden)
Dingde Jiang
Full Text Available An end-to-end network traffic matrix is significantly helpful for network management and for Software Defined Networks (SDN. However, the end-to-end network traffic matrix's inferences and estimations are a challenging problem. Moreover, attaining the traffic matrix in high-speed networks for SDN is a prohibitive challenge. This paper investigates how to estimate and recover the end-to-end network traffic matrix in fine time granularity from the sampled traffic traces, which is a hard inverse problem. Different from previous methods, the fractal interpolation is used to reconstruct the finer-granularity network traffic. Then, the cubic spline interpolation method is used to obtain the smooth reconstruction values. To attain an accurate the end-to-end network traffic in fine time granularity, we perform a weighted-geometric-average process for two interpolation results that are obtained. The simulation results show that our approaches are feasible and effective.
Fine-granularity inference and estimations to network traffic for SDN.
Jiang, Dingde; Huo, Liuwei; Li, Ya
2018-01-01
An end-to-end network traffic matrix is significantly helpful for network management and for Software Defined Networks (SDN). However, the end-to-end network traffic matrix's inferences and estimations are a challenging problem. Moreover, attaining the traffic matrix in high-speed networks for SDN is a prohibitive challenge. This paper investigates how to estimate and recover the end-to-end network traffic matrix in fine time granularity from the sampled traffic traces, which is a hard inverse problem. Different from previous methods, the fractal interpolation is used to reconstruct the finer-granularity network traffic. Then, the cubic spline interpolation method is used to obtain the smooth reconstruction values. To attain an accurate the end-to-end network traffic in fine time granularity, we perform a weighted-geometric-average process for two interpolation results that are obtained. The simulation results show that our approaches are feasible and effective.
Probabilistic reasoning in intelligent systems networks of plausible inference
Pearl, Judea
1988-01-01
Probabilistic Reasoning in Intelligent Systems is a complete and accessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty. The author provides a coherent explication of probability as a language for reasoning with partial belief and offers a unifying perspective on other AI approaches to uncertainty, such as the Dempster-Shafer formalism, truth maintenance systems, and nonmonotonic logic. The author distinguishes syntactic and semantic approaches to uncertainty--and offers techniques, based on belief networks, that provid
Bellot, Pau; Olsen, Catharina; Salembier, Philippe; Oliveras-Vergés, Albert; Meyer, Patrick E
2015-09-29
In the last decade, a great number of methods for reconstructing gene regulatory networks from expression data have been proposed. However, very few tools and datasets allow to evaluate accurately and reproducibly those methods. Hence, we propose here a new tool, able to perform a systematic, yet fully reproducible, evaluation of transcriptional network inference methods. Our open-source and freely available Bioconductor package aggregates a large set of tools to assess the robustness of network inference algorithms against different simulators, topologies, sample sizes and noise intensities. The benchmarking framework that uses various datasets highlights the specialization of some methods toward network types and data. As a result, it is possible to identify the techniques that have broad overall performances.
Structural influence of gene networks on their inference: analysis of C3NET
Directory of Open Access Journals (Sweden)
Emmert-Streib Frank
2011-06-01
Full Text Available Abstract Background The availability of large-scale high-throughput data possesses considerable challenges toward their functional analysis. For this reason gene network inference methods gained considerable interest. However, our current knowledge, especially about the influence of the structure of a gene network on its inference, is limited. Results In this paper we present a comprehensive investigation of the structural influence of gene networks on the inferential characteristics of C3NET - a recently introduced gene network inference algorithm. We employ local as well as global performance metrics in combination with an ensemble approach. The results from our numerical study for various biological and synthetic network structures and simulation conditions, also comparing C3NET with other inference algorithms, lead a multitude of theoretical and practical insights into the working behavior of C3NET. In addition, in order to facilitate the practical usage of C3NET we provide an user-friendly R package, called c3net, and describe its functionality. It is available from https://r-forge.r-project.org/projects/c3net and from the CRAN package repository. Conclusions The availability of gene network inference algorithms with known inferential properties opens a new era of large-scale screening experiments that could be equally beneficial for basic biological and biomedical research with auspicious prospects. The availability of our easy to use software package c3net may contribute to the popularization of such methods. Reviewers This article was reviewed by Lev Klebanov, Joel Bader and Yuriy Gusev.
Network Model-Assisted Inference from Respondent-Driven Sampling Data.
Gile, Krista J; Handcock, Mark S
2015-06-01
Respondent-Driven Sampling is a widely-used method for sampling hard-to-reach human populations by link-tracing over their social networks. Inference from such data requires specialized techniques because the sampling process is both partially beyond the control of the researcher, and partially implicitly defined. Therefore, it is not generally possible to directly compute the sampling weights for traditional design-based inference, and likelihood inference requires modeling the complex sampling process. As an alternative, we introduce a model-assisted approach, resulting in a design-based estimator leveraging a working network model. We derive a new class of estimators for population means and a corresponding bootstrap standard error estimator. We demonstrate improved performance compared to existing estimators, including adjustment for an initial convenience sample. We also apply the method and an extension to the estimation of HIV prevalence in a high-risk population.
Inference of Transcriptional Network for Pluripotency in Mouse Embryonic Stem Cells
International Nuclear Information System (INIS)
Aburatani, S
2015-01-01
In embryonic stem cells, various transcription factors (TFs) maintain pluripotency. To gain insights into the regulatory system controlling pluripotency, I inferred the regulatory relationships between the TFs expressed in ES cells. In this study, I applied a method based on structural equation modeling (SEM), combined with factor analysis, to 649 expression profiles of 19 TF genes measured in mouse Embryonic Stem Cells (ESCs). The factor analysis identified 19 TF genes that were regulated by several unmeasured factors. Since the known cell reprogramming TF genes (Pou5f1, Sox2 and Nanog) are regulated by different factors, each estimated factor is considered to be an input for signal transduction to control pluripotency in mouse ESCs. In the inferred network model, TF proteins were also arranged as unmeasured factors that control other TFs. The interpretation of the inferred network model revealed the regulatory mechanism for controlling pluripotency in ES cells
Network Model-Assisted Inference from Respondent-Driven Sampling Data
Gile, Krista J.; Handcock, Mark S.
2015-01-01
Summary Respondent-Driven Sampling is a widely-used method for sampling hard-to-reach human populations by link-tracing over their social networks. Inference from such data requires specialized techniques because the sampling process is both partially beyond the control of the researcher, and partially implicitly defined. Therefore, it is not generally possible to directly compute the sampling weights for traditional design-based inference, and likelihood inference requires modeling the complex sampling process. As an alternative, we introduce a model-assisted approach, resulting in a design-based estimator leveraging a working network model. We derive a new class of estimators for population means and a corresponding bootstrap standard error estimator. We demonstrate improved performance compared to existing estimators, including adjustment for an initial convenience sample. We also apply the method and an extension to the estimation of HIV prevalence in a high-risk population. PMID:26640328
Energy Technology Data Exchange (ETDEWEB)
Nakamura, K; Iwai, S
1982-01-01
In information retrieval system, it is necessary to grasp user's subject of interest in order to present appropriate documents to the user. In this paper, the authors propose a model of human ability of analogical inference based on association between key words and, using it, construct an information retrieval system in which the computer with the ability learns its user's subject of interest through question-answering with the user. In this system, the association between key words is represented by a semantic network, and a function of fuzzification of input information is introduced in the semantic network to implement the ability of analogical inference based on the association. Finally, the effect of analogical inference on the learning efficiency of the system is investigated. 5 references.
Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E.; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A.; Kellis, Manolis
2012-01-01
Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level. PMID:22456606
Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A; Kellis, Manolis
2012-07-01
Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein-protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level.
Network Inference and Maximum Entropy Estimation on Information Diagrams
Czech Academy of Sciences Publication Activity Database
Martin, E.A.; Hlinka, Jaroslav; Meinke, A.; Děchtěrenko, Filip; Tintěra, J.; Oliver, I.; Davidsen, J.
2017-01-01
Roč. 7, č. 1 (2017), č. článku 7062. ISSN 2045-2322 R&D Projects: GA ČR GA13-23940S; GA MZd(CZ) NV15-29835A Grant - others:GA MŠk(CZ) LO1611 Institutional support: RVO:67985807 Keywords : complex networks * mutual information * entropy maximization * fMRI Subject RIV: BD - Theory of Information OBOR OECD: Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8) Impact factor: 4.259, year: 2016
Protein Inference from the Integration of Tandem MS Data and Interactome Networks.
Zhong, Jiancheng; Wang, Jianxing; Ding, Xiaojun; Zhang, Zhen; Li, Min; Wu, Fang-Xiang; Pan, Yi
2017-01-01
Since proteins are digested into a mixture of peptides in the preprocessing step of tandem mass spectrometry (MS), it is difficult to determine which specific protein a shared peptide belongs to. In recent studies, besides tandem MS data and peptide identification information, some other information is exploited to infer proteins. Different from the methods which first use only tandem MS data to infer proteins and then use network information to refine them, this study proposes a protein inference method named TMSIN, which uses interactome networks directly. As two interacting proteins should co-exist, it is reasonable to assume that if one of the interacting proteins is confidently inferred in a sample, its interacting partners should have a high probability in the same sample, too. Therefore, we can use the neighborhood information of a protein in an interactome network to adjust the probability that the shared peptide belongs to the protein. In TMSIN, a multi-weighted graph is constructed by incorporating the bipartite graph with interactome network information, where the bipartite graph is built with the peptide identification information. Based on multi-weighted graphs, TMSIN adopts an iterative workflow to infer proteins. At each iterative step, the probability that a shared peptide belongs to a specific protein is calculated by using the Bayes' law based on the neighbor protein support scores of each protein which are mapped by the shared peptides. We carried out experiments on yeast data and human data to evaluate the performance of TMSIN in terms of ROC, q-value, and accuracy. The experimental results show that AUC scores yielded by TMSIN are 0.742 and 0.874 in yeast dataset and human dataset, respectively, and TMSIN yields the maximum number of true positives when q-value less than or equal to 0.05. The overlap analysis shows that TMSIN is an effective complementary approach for protein inference.
Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.
2009-01-01
Background: Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori
Pecevski, Dejan; Buesing, Lars; Maass, Wolfgang
2011-01-01
An important open problem of computational neuroscience is the generic organization of computations in networks of neurons in the brain. We show here through rigorous theoretical analysis that inherent stochastic features of spiking neurons, in combination with simple nonlinear computational operations in specific network motifs and dendritic arbors, enable networks of spiking neurons to carry out probabilistic inference through sampling in general graphical models. In particular, it enables them to carry out probabilistic inference in Bayesian networks with converging arrows (“explaining away”) and with undirected loops, that occur in many real-world tasks. Ubiquitous stochastic features of networks of spiking neurons, such as trial-to-trial variability and spontaneous activity, are necessary ingredients of the underlying computational organization. We demonstrate through computer simulations that this approach can be scaled up to neural emulations of probabilistic inference in fairly large graphical models, yielding some of the most complex computations that have been carried out so far in networks of spiking neurons. PMID:22219717
Directory of Open Access Journals (Sweden)
Dejan Pecevski
2011-12-01
Full Text Available An important open problem of computational neuroscience is the generic organization of computations in networks of neurons in the brain. We show here through rigorous theoretical analysis that inherent stochastic features of spiking neurons, in combination with simple nonlinear computational operations in specific network motifs and dendritic arbors, enable networks of spiking neurons to carry out probabilistic inference through sampling in general graphical models. In particular, it enables them to carry out probabilistic inference in Bayesian networks with converging arrows ("explaining away" and with undirected loops, that occur in many real-world tasks. Ubiquitous stochastic features of networks of spiking neurons, such as trial-to-trial variability and spontaneous activity, are necessary ingredients of the underlying computational organization. We demonstrate through computer simulations that this approach can be scaled up to neural emulations of probabilistic inference in fairly large graphical models, yielding some of the most complex computations that have been carried out so far in networks of spiking neurons.
Directory of Open Access Journals (Sweden)
Frank Emmert-Streib
2013-02-01
Full Text Available The inference of gene regulatory networks gained within recent years a considerable interest in the biology and biomedical community. The purpose of this paper is to investigate the influence that environmental conditions can exhibit on the inference performance of network inference algorithms. Specifically, we study five network inference methods, Aracne, BC3NET, CLR, C3NET and MRNET, and compare the results for three different conditions: (I observational gene expression data: normal environmental condition, (II interventional gene expression data: growth in rich media, (III interventional gene expression data: normal environmental condition interrupted by a positive spike-in stimulation. Overall, we find that different statistical inference methods lead to comparable, but condition-specific results. Further, our results suggest that non-steady-state data enhance the inferability of regulatory networks.
Inferring cultural regions from correlation networks of given baby names
Pomorski, Mateusz; Krawczyk, Małgorzata J.; Kułakowski, Krzysztof; Kwapień, Jarosław; Ausloos, Marcel
2016-03-01
We report investigations on the statistical characteristics of the baby names given between 1910 and 2010 in the United States of America. For each year, the 100 most frequent names in the USA are sorted out. For these names, the correlations between the names profiles are calculated for all pairs of states (minus Hawaii and Alaska). The correlations are used to form a weighted network which is found to vary mildly in time. In fact, the structure of communities in the network remains quite stable till about 1980. The goal is that the calculated structure approximately reproduces the usually accepted geopolitical regions: the Northeast, the South, and the "Midwest + West" as the third one. Furthermore, the dataset reveals that the name distribution satisfies the Zipf law, separately for each state and each year, i.e. the name frequency f ∝r-α, where r is the name rank. Between 1920 and 1980, the exponent α is the largest one for the set of states classified as 'the South', but the smallest one for the set of states classified as "Midwest + West". Our interpretation is that the pool of selected names was quite narrow in the Southern states. The data is compared with some related statistics of names in Belgium, a country also with different regions, but having quite a different scale than the USA. There, the Zipf exponent is low for young people and for the Brussels citizens.
Cascading Failures in Networks: Inference, Intervention and Robustness to WMDs
2016-08-01
has had impact in the community: 4 “Clustering sparse graphs,” a paper from the first year of this project, has already...This project has had a huge positive impact on the development of talent in the mathematics of networks. Specifically...connectivity of cellphone networks under attack etc. 7 What do you
Directory of Open Access Journals (Sweden)
Xiaodong Cai
Full Text Available Integrating genetic perturbations with gene expression data not only improves accuracy of regulatory network topology inference, but also enables learning of causal regulatory relations between genes. Although a number of methods have been developed to integrate both types of data, the desiderata of efficient and powerful algorithms still remains. In this paper, sparse structural equation models (SEMs are employed to integrate both gene expression data and cis-expression quantitative trait loci (cis-eQTL, for modeling gene regulatory networks in accordance with biological evidence about genes regulating or being regulated by a small number of genes. A systematic inference method named sparsity-aware maximum likelihood (SML is developed for SEM estimation. Using simulated directed acyclic or cyclic networks, the SML performance is compared with that of two state-of-the-art algorithms: the adaptive Lasso (AL based scheme, and the QTL-directed dependency graph (QDG method. Computer simulations demonstrate that the novel SML algorithm offers significantly better performance than the AL-based and QDG algorithms across all sample sizes from 100 to 1,000, in terms of detection power and false discovery rate, in all the cases tested that include acyclic or cyclic networks of 10, 30 and 300 genes. The SML method is further applied to infer a network of 39 human genes that are related to the immune function and are chosen to have a reliable eQTL per gene. The resulting network consists of 9 genes and 13 edges. Most of the edges represent interactions reasonably expected from experimental evidence, while the remaining may just indicate the emergence of new interactions. The sparse SEM and efficient SML algorithm provide an effective means of exploiting both gene expression and perturbation data to infer gene regulatory networks. An open-source computer program implementing the SML algorithm is freely available upon request.
Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks.
Deeter, Anthony; Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui
2017-01-01
The PubMed database offers an extensive set of publication data that can be useful, yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but has the potential to hypothesize new ones as well. We show our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS- Bcl-2 and RAS-AKT, and found significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways.
Inference of Transcription Regulatory Network in Low Phytic Acid Soybean Seeds
Directory of Open Access Journals (Sweden)
Neelam Redekar
2017-11-01
Full Text Available A dominant loss of function mutation in myo-inositol phosphate synthase (MIPS gene and recessive loss of function mutations in two multidrug resistant protein type-ABC transporter genes not only reduce the seed phytic acid levels in soybean, but also affect the pathways associated with seed development, ultimately resulting in low emergence. To understand the regulatory mechanisms and identify key genes that intervene in the seed development process in low phytic acid crops, we performed computational inference of gene regulatory networks in low and normal phytic acid soybeans using a time course transcriptomic data and multiple network inference algorithms. We identified a set of putative candidate transcription factors and their regulatory interactions with genes that have functions in myo-inositol biosynthesis, auxin-ABA signaling, and seed dormancy. We evaluated the performance of our unsupervised network inference method by comparing the predicted regulatory network with published regulatory interactions in Arabidopsis. Some contrasting regulatory interactions were observed in low phytic acid mutants compared to non-mutant lines. These findings provide important hypotheses on expression regulation of myo-inositol metabolism and phytohormone signaling in developing low phytic acid soybeans. The computational pipeline used for unsupervised network learning in this study is provided as open source software and is freely available at https://lilabatvt.github.io/LPANetwork/.
DEFF Research Database (Denmark)
Meng, Weizhi; Li, Wenjuan; Xiang, Yang
2017-01-01
and experience for both patients and healthcare workers, and the underlying network architecture to support such devices is also referred to as medical smartphone networks (MSNs). MSNs, similar to other networks, are subject to a wide range of attacks (e.g. leakage of sensitive patient information by a malicious...... insider). In this work, we focus on MSNs and present a compact but efficient trust-based approach using Bayesian inference to identify malicious nodes in such an environment. We then demonstrate the effectiveness of our approach in detecting malicious nodes by evaluating the deployment of our proposed...
Inference of Cancer-specific Gene Regulatory Networks Using Soft Computing Rules
Directory of Open Access Journals (Sweden)
Xiaosheng Wang
2010-03-01
Full Text Available Perturbations of gene regulatory networks are essentially responsible for oncogenesis. Therefore, inferring the gene regulatory networks is a key step to overcoming cancer. In this work, we propose a method for inferring directed gene regulatory networks based on soft computing rules, which can identify important cause-effect regulatory relations of gene expression. First, we identify important genes associated with a specific cancer (colon cancer using a supervised learning approach. Next, we reconstruct the gene regulatory networks by inferring the regulatory relations among the identified genes, and their regulated relations by other genes within the genome. We obtain two meaningful findings. One is that upregulated genes are regulated by more genes than downregulated ones, while downregulated genes regulate more genes than upregulated ones. The other one is that tumor suppressors suppress tumor activators and activate other tumor suppressors strongly, while tumor activators activate other tumor activators and suppress tumor suppressors weakly, indicating the robustness of biological systems. These findings provide valuable insights into the pathogenesis of cancer.
Ellwanger, Daniel Christian; Leonhardt, Jörn Florian; Mewes, Hans-Werner
2014-12-01
Understanding how regulatory networks globally coordinate the response of a cell to changing conditions, such as perturbations by shifting environments, is an elementary challenge in systems biology which has yet to be met. Genome-wide gene expression measurements are high dimensional as these are reflecting the condition-specific interplay of thousands of cellular components. The integration of prior biological knowledge into the modeling process of systems-wide gene regulation enables the large-scale interpretation of gene expression signals in the context of known regulatory relations. We developed COGERE (http://mips.helmholtz-muenchen.de/cogere), a method for the inference of condition-specific gene regulatory networks in human and mouse. We integrated existing knowledge of regulatory interactions from multiple sources to a comprehensive model of prior information. COGERE infers condition-specific regulation by evaluating the mutual dependency between regulator (transcription factor or miRNA) and target gene expression using prior information. This dependency is scored by the non-parametric, nonlinear correlation coefficient η(2) (eta squared) that is derived by a two-way analysis of variance. We show that COGERE significantly outperforms alternative methods in predicting condition-specific gene regulatory networks on simulated data sets. Furthermore, by inferring the cancer-specific gene regulatory network from the NCI-60 expression study, we demonstrate the utility of COGERE to promote hypothesis-driven clinical research. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Reveal, A General Reverse Engineering Algorithm for Inference of Genetic Network Architectures
Liang, Shoudan; Fuhrman, Stefanie; Somogyi, Roland
1998-01-01
Given the immanent gene expression mapping covering whole genomes during development, health and disease, we seek computational methods to maximize functional inference from such large data sets. Is it possible, in principle, to completely infer a complex regulatory network architecture from input/output patterns of its variables? We investigated this possibility using binary models of genetic networks. Trajectories, or state transition tables of Boolean nets, resemble time series of gene expression. By systematically analyzing the mutual information between input states and output states, one is able to infer the sets of input elements controlling each element or gene in the network. This process is unequivocal and exact for complete state transition tables. We implemented this REVerse Engineering ALgorithm (REVEAL) in a C program, and found the problem to be tractable within the conditions tested so far. For n = 50 (elements) and k = 3 (inputs per element), the analysis of incomplete state transition tables (100 state transition pairs out of a possible 10(exp 15)) reliably produced the original rule and wiring sets. While this study is limited to synchronous Boolean networks, the algorithm is generalizable to include multi-state models, essentially allowing direct application to realistic biological data sets. The ability to adequately solve the inverse problem may enable in-depth analysis of complex dynamic systems in biology and other fields.
Inference of cancer-specific gene regulatory networks using soft computing rules.
Wang, Xiaosheng; Gotoh, Osamu
2010-03-24
Perturbations of gene regulatory networks are essentially responsible for oncogenesis. Therefore, inferring the gene regulatory networks is a key step to overcoming cancer. In this work, we propose a method for inferring directed gene regulatory networks based on soft computing rules, which can identify important cause-effect regulatory relations of gene expression. First, we identify important genes associated with a specific cancer (colon cancer) using a supervised learning approach. Next, we reconstruct the gene regulatory networks by inferring the regulatory relations among the identified genes, and their regulated relations by other genes within the genome. We obtain two meaningful findings. One is that upregulated genes are regulated by more genes than downregulated ones, while downregulated genes regulate more genes than upregulated ones. The other one is that tumor suppressors suppress tumor activators and activate other tumor suppressors strongly, while tumor activators activate other tumor activators and suppress tumor suppressors weakly, indicating the robustness of biological systems. These findings provide valuable insights into the pathogenesis of cancer.
Inferring Gene Regulatory Networks Using Conditional Regulation Pattern to Guide Candidate Genes.
Directory of Open Access Journals (Sweden)
Fei Xiao
Full Text Available Combining path consistency (PC algorithms with conditional mutual information (CMI are widely used in reconstruction of gene regulatory networks. CMI has many advantages over Pearson correlation coefficient in measuring non-linear dependence to infer gene regulatory networks. It can also discriminate the direct regulations from indirect ones. However, it is still a challenge to select the conditional genes in an optimal way, which affects the performance and computation complexity of the PC algorithm. In this study, we develop a novel conditional mutual information-based algorithm, namely RPNI (Regulation Pattern based Network Inference, to infer gene regulatory networks. For conditional gene selection, we define the co-regulation pattern, indirect-regulation pattern and mixture-regulation pattern as three candidate patterns to guide the selection of candidate genes. To demonstrate the potential of our algorithm, we apply it to gene expression data from DREAM challenge. Experimental results show that RPNI outperforms existing conditional mutual information-based methods in both accuracy and time complexity for different sizes of gene samples. Furthermore, the robustness of our algorithm is demonstrated by noisy interference analysis using different types of noise.
Chockalingam, Sriram; Aluru, Maneesha; Aluru, Srinivas
2016-09-19
Pre-processing of microarray data is a well-studied problem. Furthermore, all popular platforms come with their own recommended best practices for differential analysis of genes. However, for genome-scale network inference using microarray data collected from large public repositories, these methods filter out a considerable number of genes. This is primarily due to the effects of aggregating a diverse array of experiments with different technical and biological scenarios. Here we introduce a pre-processing pipeline suitable for inferring genome-scale gene networks from large microarray datasets. We show that partitioning of the available microarray datasets according to biological relevance into tissue- and process-specific categories significantly extends the limits of downstream network construction. We demonstrate the effectiveness of our pre-processing pipeline by inferring genome-scale networks for the model plant Arabidopsis thaliana using two different construction methods and a collection of 11,760 Affymetrix ATH1 microarray chips. Our pre-processing pipeline and the datasets used in this paper are made available at http://alurulab.cc.gatech.edu/microarray-pp.
2018-01-01
Signaling pathways represent parts of the global biological molecular network which connects them into a seamless whole through complex direct and indirect (hidden) crosstalk whose structure can change during development or in pathological conditions. We suggest a novel methodology, called Googlomics, for the structural analysis of directed biological networks using spectral analysis of their Google matrices, using parallels with quantum scattering theory, developed for nuclear and mesoscopic physics and quantum chaos. We introduce analytical “reduced Google matrix” method for the analysis of biological network structure. The method allows inferring hidden causal relations between the members of a signaling pathway or a functionally related group of genes. We investigate how the structure of hidden causal relations can be reprogrammed as a result of changes in the transcriptional network layer during cancerogenesis. The suggested Googlomics approach rigorously characterizes complex systemic changes in the wiring of large causal biological networks in a computationally efficient way. PMID:29370181
Ma, Chuang; Chen, Han-Shuang; Lai, Ying-Cheng; Zhang, Hai-Feng
2018-02-01
Complex networks hosting binary-state dynamics arise in a variety of contexts. In spite of previous works, to fully reconstruct the network structure from observed binary data remains challenging. We articulate a statistical inference based approach to this problem. In particular, exploiting the expectation-maximization (EM) algorithm, we develop a method to ascertain the neighbors of any node in the network based solely on binary data, thereby recovering the full topology of the network. A key ingredient of our method is the maximum-likelihood estimation of the probabilities associated with actual or nonexistent links, and we show that the EM algorithm can distinguish the two kinds of probability values without any ambiguity, insofar as the length of the available binary time series is reasonably long. Our method does not require any a priori knowledge of the detailed dynamical processes, is parameter-free, and is capable of accurate reconstruction even in the presence of noise. We demonstrate the method using combinations of distinct types of binary dynamical processes and network topologies, and provide a physical understanding of the underlying reconstruction mechanism. Our statistical inference based reconstruction method contributes an additional piece to the rapidly expanding "toolbox" of data based reverse engineering of complex networked systems.
An empirical Bayesian approach for model-based inference of cellular signaling networks
Directory of Open Access Journals (Sweden)
Klinke David J
2009-11-01
Full Text Available Abstract Background A common challenge in systems biology is to infer mechanistic descriptions of biological process given limited observations of a biological system. Mathematical models are frequently used to represent a belief about the causal relationships among proteins within a signaling network. Bayesian methods provide an attractive framework for inferring the validity of those beliefs in the context of the available data. However, efficient sampling of high-dimensional parameter space and appropriate convergence criteria provide barriers for implementing an empirical Bayesian approach. The objective of this study was to apply an Adaptive Markov chain Monte Carlo technique to a typical study of cellular signaling pathways. Results As an illustrative example, a kinetic model for the early signaling events associated with the epidermal growth factor (EGF signaling network was calibrated against dynamic measurements observed in primary rat hepatocytes. A convergence criterion, based upon the Gelman-Rubin potential scale reduction factor, was applied to the model predictions. The posterior distributions of the parameters exhibited complicated structure, including significant covariance between specific parameters and a broad range of variance among the parameters. The model predictions, in contrast, were narrowly distributed and were used to identify areas of agreement among a collection of experimental studies. Conclusion In summary, an empirical Bayesian approach was developed for inferring the confidence that one can place in a particular model that describes signal transduction mechanisms and for inferring inconsistencies in experimental measurements.
Model-free inference of direct network interactions from nonlinear collective dynamics.
Casadiego, Jose; Nitzan, Mor; Hallerberg, Sarah; Timme, Marc
2017-12-19
The topology of interactions in network dynamical systems fundamentally underlies their function. Accelerating technological progress creates massively available data about collective nonlinear dynamics in physical, biological, and technological systems. Detecting direct interaction patterns from those dynamics still constitutes a major open problem. In particular, current nonlinear dynamics approaches mostly require to know a priori a model of the (often high dimensional) system dynamics. Here we develop a model-independent framework for inferring direct interactions solely from recording the nonlinear collective dynamics generated. Introducing an explicit dependency matrix in combination with a block-orthogonal regression algorithm, the approach works reliably across many dynamical regimes, including transient dynamics toward steady states, periodic and non-periodic dynamics, and chaos. Together with its capabilities to reveal network (two point) as well as hypernetwork (e.g., three point) interactions, this framework may thus open up nonlinear dynamics options of inferring direct interaction patterns across systems where no model is known.
Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.
2009-01-01
Abstract Background Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori assumptions about the interactions, which all simulate the observed patterns. It is important to analyze the properties of the circuits. Findings We have analyzed the simulated gene expression ...
Directory of Open Access Journals (Sweden)
Xiaobo Guo
Full Text Available Nonlinear dependence is general in regulation mechanism of gene regulatory networks (GRNs. It is vital to properly measure or test nonlinear dependence from real data for reconstructing GRNs and understanding the complex regulatory mechanisms within the cellular system. A recently developed measurement called the distance correlation (DC has been shown powerful and computationally effective in nonlinear dependence for many situations. In this work, we incorporate the DC into inferring GRNs from the gene expression data without any underling distribution assumptions. We propose three DC-based GRNs inference algorithms: CLR-DC, MRNET-DC and REL-DC, and then compare them with the mutual information (MI-based algorithms by analyzing two simulated data: benchmark GRNs from the DREAM challenge and GRNs generated by SynTReN network generator, and an experimentally determined SOS DNA repair network in Escherichia coli. According to both the receiver operator characteristic (ROC curve and the precision-recall (PR curve, our proposed algorithms significantly outperform the MI-based algorithms in GRNs inference.
Predictive minimum description length principle approach to inferring gene regulatory networks.
Chaitankar, Vijender; Zhang, Chaoyang; Ghosh, Preetam; Gong, Ping; Perkins, Edward J; Deng, Youping
2011-01-01
Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is to determine the threshold that defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we propose a new inference algorithm that incorporates mutual information (MI), conditional mutual information (CMI), and predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes and the PMDL principle method attempts to determine the best MI threshold without the need of a user-specified fine tuning parameter. The performance of the proposed algorithm is evaluated using both synthetic time series data sets and a biological time series data set (Saccharomyces cerevisiae). The results show that the proposed algorithm produced fewer false edges and significantly improved the precision when compared to existing MDL algorithm.
Cross-Dependency Inference in Multi-Layered Networks: A Collaborative Filtering Perspective.
Chen, Chen; Tong, Hanghang; Xie, Lei; Ying, Lei; He, Qing
2017-08-01
The increasingly connected world has catalyzed the fusion of networks from different domains, which facilitates the emergence of a new network model-multi-layered networks. Examples of such kind of network systems include critical infrastructure networks, biological systems, organization-level collaborations, cross-platform e-commerce, and so forth. One crucial structure that distances multi-layered network from other network models is its cross-layer dependency, which describes the associations between the nodes from different layers. Needless to say, the cross-layer dependency in the network plays an essential role in many data mining applications like system robustness analysis and complex network control. However, it remains a daunting task to know the exact dependency relationships due to noise, limited accessibility, and so forth. In this article, we tackle the cross-layer dependency inference problem by modeling it as a collective collaborative filtering problem. Based on this idea, we propose an effective algorithm Fascinate that can reveal unobserved dependencies with linear complexity. Moreover, we derive Fascinate-ZERO, an online variant of Fascinate that can respond to a newly added node timely by checking its neighborhood dependencies. We perform extensive evaluations on real datasets to substantiate the superiority of our proposed approaches.
White, H; Racine, J
2001-01-01
We propose tests for individual and joint irrelevance of network inputs. Such tests can be used to determine whether an input or group of inputs "belong" in a particular model, thus permitting valid statistical inference based on estimated feedforward neural-network models. The approaches employ well-known statistical resampling techniques. We conduct a small Monte Carlo experiment showing that our tests have reasonable level and power behavior, and we apply our methods to examine whether there are predictable regularities in foreign exchange rates. We find that exchange rates do appear to contain information that is exploitable for enhanced point prediction, but the nature of the predictive relations evolves through time.
Gong, Wuming; Koyano-Nakagawa, Naoko; Li, Tongbin; Garry, Daniel J
2015-03-07
Decoding the temporal control of gene expression patterns is key to the understanding of the complex mechanisms that govern developmental decisions during heart development. High-throughput methods have been employed to systematically study the dynamic and coordinated nature of cardiac differentiation at the global level with multiple dimensions. Therefore, there is a pressing need to develop a systems approach to integrate these data from individual studies and infer the dynamic regulatory networks in an unbiased fashion. We developed a two-step strategy to integrate data from (1) temporal RNA-seq, (2) temporal histone modification ChIP-seq, (3) transcription factor (TF) ChIP-seq and (4) gene perturbation experiments to reconstruct the dynamic network during heart development. First, we trained a logistic regression model to predict the probability (LR score) of any base being bound by 543 TFs with known positional weight matrices. Second, four dimensions of data were combined using a time-varying dynamic Bayesian network model to infer the dynamic networks at four developmental stages in the mouse [mouse embryonic stem cells (ESCs), mesoderm (MES), cardiac progenitors (CP) and cardiomyocytes (CM)]. Our method not only infers the time-varying networks between different stages of heart development, but it also identifies the TF binding sites associated with promoter or enhancers of downstream genes. The LR scores of experimentally verified ESCs and heart enhancers were significantly higher than random regions (p network inference model identified a region with an elevated LR score approximately -9400 bp upstream of the transcriptional start site of Nkx2-5, which overlapped with a previously reported enhancer region (-9435 to -8922 bp). TFs such as Tead1, Gata4, Msx2, and Tgif1 were predicted to bind to this region and participate in the regulation of Nkx2-5 gene expression. Our model also predicted the key regulatory networks for the ESC-MES, MES-CP and CP
Cytoprophet: a Cytoscape plug-in for protein and domain interaction networks inference.
Morcos, Faruck; Lamanna, Charles; Sikora, Marcin; Izaguirre, Jesús
2008-10-01
Cytoprophet is a software tool that allows prediction and visualization of protein and domain interaction networks. It is implemented as a plug-in of Cytoscape, an open source software framework for analysis and visualization of molecular networks. Cytoprophet implements three algorithms that predict new potential physical interactions using the domain composition of proteins and experimental assays. The algorithms for protein and domain interaction inference include maximum likelihood estimation (MLE) using expectation maximization (EM); the set cover approach maximum specificity set cover (MSSC) and the sum-product algorithm (SPA). After accepting an input set of proteins with Uniprot ID/Accession numbers and a selected prediction algorithm, Cytoprophet draws a network of potential interactions with probability scores and GO distances as edge attributes. A network of domain interactions between the domains of the initial protein list can also be generated. Cytoprophet was designed to take advantage of the visual capabilities of Cytoscape and be simple to use. An example of inference in a signaling network of myxobacterium Myxococcus xanthus is presented and available at Cytoprophet's website. http://cytoprophet.cse.nd.edu.
A Local Poisson Graphical Model for inferring networks from sequencing data.
Allen, Genevera I; Liu, Zhandong
2013-09-01
Gaussian graphical models, a class of undirected graphs or Markov Networks, are often used to infer gene networks based on microarray expression data. Many scientists, however, have begun using high-throughput sequencing technologies such as RNA-sequencing or next generation sequencing to measure gene expression. As the resulting data consists of counts of sequencing reads for each gene, Gaussian graphical models are not optimal for this discrete data. In this paper, we propose a novel method for inferring gene networks from sequencing data: the Local Poisson Graphical Model. Our model assumes a Local Markov property where each variable conditional on all other variables is Poisson distributed. We develop a neighborhood selection algorithm to fit our model locally by performing a series of l1 penalized Poisson, or log-linear, regressions. This yields a fast parallel algorithm for estimating networks from next generation sequencing data. In simulations, we illustrate the effectiveness of our methods for recovering network structure from count data. A case study on breast cancer microRNAs (miRNAs), a novel application of graphical models, finds known regulators of breast cancer genes and discovers novel miRNA clusters and hubs that are targets for future research.
Griffin: A Tool for Symbolic Inference of Synchronous Boolean Molecular Networks
Directory of Open Access Journals (Sweden)
Stalin Muñoz
2018-03-01
Full Text Available Boolean networks are important models of biochemical systems, located at the high end of the abstraction spectrum. A number of Boolean gene networks have been inferred following essentially the same method. Such a method first considers experimental data for a typically underdetermined “regulation” graph. Next, Boolean networks are inferred by using biological constraints to narrow the search space, such as a desired set of (fixed-point or cyclic attractors. We describe Griffin, a computer tool enhancing this method. Griffin incorporates a number of well-established algorithms, such as Dubrova and Teslenko's algorithm for finding attractors in synchronous Boolean networks. In addition, a formal definition of regulation allows Griffin to employ “symbolic” techniques, able to represent both large sets of network states and Boolean constraints. We observe that when the set of attractors is required to be an exact set, prohibiting additional attractors, a naive Boolean coding of this constraint may be unfeasible. Such cases may be intractable even with symbolic methods, as the number of Boolean constraints may be astronomically large. To overcome this problem, we employ an Artificial Intelligence technique known as “clause learning” considerably increasing Griffin's scalability. Without clause learning only toy examples prohibiting additional attractors are solvable: only one out of seven queries reported here is answered. With clause learning, by contrast, all seven queries are answered. We illustrate Griffin with three case studies drawn from the Arabidopsis thaliana literature. Griffin is available at: http://turing.iimas.unam.mx/griffin.
Matthews, Luke J.; DeWan, Peter; Rula, Elizabeth Y.
2013-01-01
Studies of social networks, mapped using self-reported contacts, have demonstrated the strong influence of social connections on the propensity for individuals to adopt or maintain healthy behaviors and on their likelihood to adopt health risks such as obesity. Social network analysis may prove useful for businesses and organizations that wish to improve the health of their populations by identifying key network positions. Health traits have been shown to correlate across friendship ties, but evaluating network effects in large coworker populations presents the challenge of obtaining sufficiently comprehensive network data. The purpose of this study was to evaluate methods for using online communication data to generate comprehensive network maps that reproduce the health-associated properties of an offline social network. In this study, we examined three techniques for inferring social relationships from email traffic data in an employee population using thresholds based on: (1) the absolute number of emails exchanged, (2) logistic regression probability of an offline relationship, and (3) the highest ranked email exchange partners. As a model of the offline social network in the same population, a network map was created using social ties reported in a survey instrument. The email networks were evaluated based on the proportion of survey ties captured, comparisons of common network metrics, and autocorrelation of body mass index (BMI) across social ties. Results demonstrated that logistic regression predicted the greatest proportion of offline social ties, thresholding on number of emails exchanged produced the best match to offline network metrics, and ranked email partners demonstrated the strongest autocorrelation of BMI. Since each method had unique strengths, researchers should choose a method based on the aspects of offline behavior of interest. Ranked email partners may be particularly useful for purposes related to health traits in a social network. PMID
Matthews, Luke J; DeWan, Peter; Rula, Elizabeth Y
2013-01-01
Studies of social networks, mapped using self-reported contacts, have demonstrated the strong influence of social connections on the propensity for individuals to adopt or maintain healthy behaviors and on their likelihood to adopt health risks such as obesity. Social network analysis may prove useful for businesses and organizations that wish to improve the health of their populations by identifying key network positions. Health traits have been shown to correlate across friendship ties, but evaluating network effects in large coworker populations presents the challenge of obtaining sufficiently comprehensive network data. The purpose of this study was to evaluate methods for using online communication data to generate comprehensive network maps that reproduce the health-associated properties of an offline social network. In this study, we examined three techniques for inferring social relationships from email traffic data in an employee population using thresholds based on: (1) the absolute number of emails exchanged, (2) logistic regression probability of an offline relationship, and (3) the highest ranked email exchange partners. As a model of the offline social network in the same population, a network map was created using social ties reported in a survey instrument. The email networks were evaluated based on the proportion of survey ties captured, comparisons of common network metrics, and autocorrelation of body mass index (BMI) across social ties. Results demonstrated that logistic regression predicted the greatest proportion of offline social ties, thresholding on number of emails exchanged produced the best match to offline network metrics, and ranked email partners demonstrated the strongest autocorrelation of BMI. Since each method had unique strengths, researchers should choose a method based on the aspects of offline behavior of interest. Ranked email partners may be particularly useful for purposes related to health traits in a social network.
Directory of Open Access Journals (Sweden)
Luke J Matthews
Full Text Available Studies of social networks, mapped using self-reported contacts, have demonstrated the strong influence of social connections on the propensity for individuals to adopt or maintain healthy behaviors and on their likelihood to adopt health risks such as obesity. Social network analysis may prove useful for businesses and organizations that wish to improve the health of their populations by identifying key network positions. Health traits have been shown to correlate across friendship ties, but evaluating network effects in large coworker populations presents the challenge of obtaining sufficiently comprehensive network data. The purpose of this study was to evaluate methods for using online communication data to generate comprehensive network maps that reproduce the health-associated properties of an offline social network. In this study, we examined three techniques for inferring social relationships from email traffic data in an employee population using thresholds based on: (1 the absolute number of emails exchanged, (2 logistic regression probability of an offline relationship, and (3 the highest ranked email exchange partners. As a model of the offline social network in the same population, a network map was created using social ties reported in a survey instrument. The email networks were evaluated based on the proportion of survey ties captured, comparisons of common network metrics, and autocorrelation of body mass index (BMI across social ties. Results demonstrated that logistic regression predicted the greatest proportion of offline social ties, thresholding on number of emails exchanged produced the best match to offline network metrics, and ranked email partners demonstrated the strongest autocorrelation of BMI. Since each method had unique strengths, researchers should choose a method based on the aspects of offline behavior of interest. Ranked email partners may be particularly useful for purposes related to health traits in a
Using neutral network to infer the hydrodynamic yield of aspherical sources
International Nuclear Information System (INIS)
Moran, B.; Glenn, L.A.
1993-07-01
We distinguish two kinds of difficulties with yield determination from aspherical sources. The first kind, the spoofing difficulty, occurs when a fraction of the energy of the explosion is channeled in such a way that it is not detected by the CORRTEX cable. In this case, neither neural networks nor any expert system can be expected to accurately estimate the yield without detailed information about device emplacement within the canister. Numerical simulations however, can provide an upper bound on the undetected fraction of the explosive energy. In the second instance, the interpretation difficulty, the data appear abnormal when analyzed using similar-explosion-scaling and the assumption of a spherical front. The inferred yield varies with time and the confidence in the yield estimate decreases. It is this kind of problem we address in this paper and for which neural networks can make a contribution. We used a back propagation neural network to infer the hydrodynamic yield of simulated aspherical sources. We trained the network using a subset of simulations from 3 different aspherical sources, with 3 different yields, and 3 satellite offset separations. The trained network was able to predict the yield within 15% in all cases and to identify the correct type of aspherical source in most cases. The predictive capability of the network increased with a larger training set. The neural network approach can easily incorporate information from new calculations or experiments and is therefore flexible and easy to maintain. We describe the potential capabilities and limitations in using such networks for yield estimations
Using neural networks to infer the hydrodynamic yield of aspherical sources
International Nuclear Information System (INIS)
Moran, B.; Glenn, L.
1993-01-01
We distinguish two kinds of difficulties with yield determination from aspherical sources. The first kind, the spoofing difficulty, occurs when a fraction of the energy of the explosion is channeled in such a way that it is not detected by the CORRTEX cable. In this case, neither neural networks nor any expert system can be expected to accurately estimate the yield without detailed information about device emplacement within the canister. Numerical simulations however, can provide an upper bound on the undetected fraction of the explosive energy. In the second instance, the interpretation difficulty, the data appear abnormal when analyzed using similar-explosion-scaling and the assumption of a spherical front. The inferred yield varies with time and the confidence in the yield estimate decreases. It is this kind of problem we address in this paper and for which neural networks can make a contribution. We used a back propagation neural network to infer the hydrodynamic yield of simulated aspherical sources. We trained the network using a subset of simulations from 3 different aspherical sources, with 3 different yield, and 3 satellite offset separations. The trained network was able to predict the yield within 15% in all cases and to identify the correct type of aspherical source in most cases. The predictive capability of the network increased with a larger training set. The neural network approach can easily incorporate information from new calculations or experiments and is therefore flexible and easy to maintain. We describe the potential capabilities and limitations in using such networks for yield estimations
A novel gene network inference algorithm using predictive minimum description length approach.
Chaitankar, Vijender; Ghosh, Preetam; Perkins, Edward J; Gong, Ping; Deng, Youping; Zhang, Chaoyang
2010-05-28
Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is to determine the threshold which defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we proposed a new inference algorithm which incorporated mutual information (MI), conditional mutual information (CMI) and predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes and the PMDL principle method attempts to determine the best MI threshold without the need of a user-specified fine tuning parameter. The performance of the proposed algorithm was evaluated using both synthetic time series data sets and a biological time series data set for the yeast Saccharomyces cerevisiae. The benchmark quantities precision and recall were used as performance measures. The results show that the proposed algorithm produced less false edges and significantly improved the precision, as compared to the existing algorithm. For further analysis the performance of the algorithms was observed over different sizes of data. We have proposed a new algorithm that implements the PMDL principle for inferring gene regulatory networks from time series DNA microarray data that eliminates the need of a fine tuning parameter. The evaluation results obtained from both synthetic and actual biological data sets show that the
Evaluating the Limits of Network Topology Inference Via Virtualized Network Emulation
2015-06-01
virtualized environment. First, we automatically build topological ground truth according to various network generation models and create emulated Cisco ...to various network generation models and create emulated Cisco router networks by leveraging and modifying existing emulation software. We then au... markets , to verifying compliance with policy, as in recent “network neutrality” rules established in the United States. The Internet is a network of
Inference and Analysis of Population Structure Using Genetic Data and Network Theory.
Greenbaum, Gili; Templeton, Alan R; Bar-David, Shirli
2016-04-01
Clustering individuals to subpopulations based on genetic data has become commonplace in many genetic studies. Inference about population structure is most often done by applying model-based approaches, aided by visualization using distance-based approaches such as multidimensional scaling. While existing distance-based approaches suffer from a lack of statistical rigor, model-based approaches entail assumptions of prior conditions such as that the subpopulations are at Hardy-Weinberg equilibria. Here we present a distance-based approach for inference about population structure using genetic data by defining population structure using network theory terminology and methods. A network is constructed from a pairwise genetic-similarity matrix of all sampled individuals. The community partition, a partition of a network to dense subgraphs, is equated with population structure, a partition of the population to genetically related groups. Community-detection algorithms are used to partition the network into communities, interpreted as a partition of the population to subpopulations. The statistical significance of the structure can be estimated by using permutation tests to evaluate the significance of the partition's modularity, a network theory measure indicating the quality of community partitions. To further characterize population structure, a new measure of the strength of association (SA) for an individual to its assigned community is presented. The strength of association distribution (SAD) of the communities is analyzed to provide additional population structure characteristics, such as the relative amount of gene flow experienced by the different subpopulations and identification of hybrid individuals. Human genetic data and simulations are used to demonstrate the applicability of the analyses. The approach presented here provides a novel, computationally efficient model-free method for inference about population structure that does not entail assumption of
A comparative study of covariance selection models for the inference of gene regulatory networks.
Stifanelli, Patrizia F; Creanza, Teresa M; Anglani, Roberto; Liuzzi, Vania C; Mukherjee, Sayan; Schena, Francesco P; Ancona, Nicola
2013-10-01
The inference, or 'reverse-engineering', of gene regulatory networks from expression data and the description of the complex dependency structures among genes are open issues in modern molecular biology. In this paper we compared three regularized methods of covariance selection for the inference of gene regulatory networks, developed to circumvent the problems raising when the number of observations n is smaller than the number of genes p. The examined approaches provided three alternative estimates of the inverse covariance matrix: (a) the 'PINV' method is based on the Moore-Penrose pseudoinverse, (b) the 'RCM' method performs correlation between regression residuals and (c) 'ℓ(2C)' method maximizes a properly regularized log-likelihood function. Our extensive simulation studies showed that ℓ(2C) outperformed the other two methods having the most predictive partial correlation estimates and the highest values of sensitivity to infer conditional dependencies between genes even when a few number of observations was available. The application of this method for inferring gene networks of the isoprenoid biosynthesis pathways in Arabidopsis thaliana allowed to enlighten a negative partial correlation coefficient between the two hubs in the two isoprenoid pathways and, more importantly, provided an evidence of cross-talk between genes in the plastidial and the cytosolic pathways. When applied to gene expression data relative to a signature of HRAS oncogene in human cell cultures, the method revealed 9 genes (p-value<0.0005) directly interacting with HRAS, sharing the same Ras-responsive binding site for the transcription factor RREB1. This result suggests that the transcriptional activation of these genes is mediated by a common transcription factor downstream of Ras signaling. Software implementing the methods in the form of Matlab scripts are available at: http://users.ba.cnr.it/issia/iesina18/CovSelModelsCodes.zip. Copyright © 2013 The Authors. Published by
Directory of Open Access Journals (Sweden)
Xinbo Ai
2014-11-01
Full Text Available Topological measures are crucial to describe, classify and understand complex networks. Lots of measures are proposed to characterize specific features of specific networks, but the relationships among these measures remain unclear. Taking into account that pulling networks from different domains together for statistical analysis might provide incorrect conclusions, we conduct our investigation with data observed from the same network in the form of simultaneously measured time series. We synthesize a transfer entropy-based framework to quantify the relationships among topological measures, and then to provide a holistic scenario of these measures by inferring a drive-response network. Techniques from Symbolic Transfer Entropy, Effective Transfer Entropy, and Partial Transfer Entropy are synthesized to deal with challenges such as time series being non-stationary, finite sample effects and indirect effects. We resort to kernel density estimation to assess significance of the results based on surrogate data. The framework is applied to study 20 measures across 2779 records in the Technology Exchange Network, and the results are consistent with some existing knowledge. With the drive-response network, we evaluate the influence of each measure by calculating its strength, and cluster them into three classes, i.e., driving measures, responding measures and standalone measures, according to the network communities.
Inference of the Genetic Network Regulating Lateral Root Initiation in Arabidopsis thaliana
Muraro, D.
2013-01-01
Regulation of gene expression is crucial for organism growth, and it is one of the challenges in systems biology to reconstruct the underlying regulatory biological networks from transcriptomic data. The formation of lateral roots in Arabidopsis thaliana is stimulated by a cascade of regulators of which only the interactions of its initial elements have been identified. Using simulated gene expression data with known network topology, we compare the performance of inference algorithms, based on different approaches, for which ready-to-use software is available. We show that their performance improves with the network size and the inclusion of mutants. We then analyze two sets of genes, whose activity is likely to be relevant to lateral root initiation in Arabidopsis, and assess causality of their regulatory interactions by integrating sequence analysis with the intersection of the results of the best performing methods on time series and mutants. The methods applied capture known interactions between genes that are candidate regulators at early stages of development. The network inferred from genes significantly expressed during lateral root formation exhibits distinct scale free, small world and hierarchical properties and the nodes with a high out-degree may warrant further investigation. © 2004-2012 IEEE.
McGeachie, Michael J; Chang, Hsun-Hsien; Weiss, Scott T
2014-06-01
Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com.
Takemoto, Kazuhiro; Aie, Kazuki
2017-05-25
Host-pathogen interactions are important in a wide range of research fields. Given the importance of metabolic crosstalk between hosts and pathogens, a metabolic network-based reverse ecology method was proposed to infer these interactions. However, the validity of this method remains unclear because of the various explanations presented and the influence of potentially confounding factors that have thus far been neglected. We re-evaluated the importance of the reverse ecology method for evaluating host-pathogen interactions while statistically controlling for confounding effects using oxygen requirement, genome, metabolic network, and phylogeny data. Our data analyses showed that host-pathogen interactions were more strongly influenced by genome size, primary network parameters (e.g., number of edges), oxygen requirement, and phylogeny than the reserve ecology-based measures. These results indicate the limitations of the reverse ecology method; however, they do not discount the importance of adopting reverse ecology approaches altogether. Rather, we highlight the need for developing more suitable methods for inferring host-pathogen interactions and conducting more careful examinations of the relationships between metabolic networks and host-pathogen interactions.
Opinion Dynamics on Networks with Inference of Unobservable States of Others
Fujie, Ryo
In most opinion formation models which have been proposed, the agents decide their states (i.e. opinions) by referring to the states of others. However, the referred states of others are not necessarily observable and may be inferred. To investigate the effect of an inference of the states of others on opinion dynamics, I propose an extended voter model on networks where observable and referable node sets are different. These sets for a node defined as the nearest to the mo-th neighbors for observable nodes and the nearest to the mr-th neighbors for referable nodes. The state of referable but unobservable node which is the m-th neighbor (mo pagerank'' is conserved. This conserved quantity coincides with the fixation probability. On the other hand, in the case of mo =mr = 1 , the model comes down to the standard voter model on networks and the conserved quantity is a degree-weighted superposition of the states. Thus, the introduction of the inference changes the important opinion spreaders from the high-degree nodes to the high-betweenness pagerank nodes. This work is supported by the Collaboration Research Program of IDEAS, Chubu University IDEAS2016233.
Directory of Open Access Journals (Sweden)
Daniel Lobo
2015-06-01
Full Text Available Transformative applications in biomedicine require the discovery of complex regulatory networks that explain the development and regeneration of anatomical structures, and reveal what external signals will trigger desired changes of large-scale pattern. Despite recent advances in bioinformatics, extracting mechanistic pathway models from experimental morphological data is a key open challenge that has resisted automation. The fundamental difficulty of manually predicting emergent behavior of even simple networks has limited the models invented by human scientists to pathway diagrams that show necessary subunit interactions but do not reveal the dynamics that are sufficient for complex, self-regulating pattern to emerge. To finally bridge the gap between high-resolution genetic data and the ability to understand and control patterning, it is critical to develop computational tools to efficiently extract regulatory pathways from the resultant experimental shape phenotypes. For example, planarian regeneration has been studied for over a century, but despite increasing insight into the pathways that control its stem cells, no constructive, mechanistic model has yet been found by human scientists that explains more than one or two key features of its remarkable ability to regenerate its correct anatomical pattern after drastic perturbations. We present a method to infer the molecular products, topology, and spatial and temporal non-linear dynamics of regulatory networks recapitulating in silico the rich dataset of morphological phenotypes resulting from genetic, surgical, and pharmacological experiments. We demonstrated our approach by inferring complete regulatory networks explaining the outcomes of the main functional regeneration experiments in the planarian literature; By analyzing all the datasets together, our system inferred the first systems-biology comprehensive dynamical model explaining patterning in planarian regeneration. This method
Inferring regulatory networks from expression data using tree-based methods.
Directory of Open Access Journals (Sweden)
Vân Anh Huynh-Thu
2010-09-01
Full Text Available One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs using high throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was best performer in the DREAM4 In Silico Multifactorial challenge. GENIE3 decomposes the prediction of a regulatory network between p genes into p different regression problems. In each of the regression problems, the expression pattern of one of the genes (target gene is predicted from the expression patterns of all the other genes (input genes, using tree-based ensemble methods Random Forests or Extra-Trees. The importance of an input gene in the prediction of the target gene expression pattern is taken as an indication of a putative regulatory link. Putative regulatory links are then aggregated over all genes to provide a ranking of interactions from which the whole network is reconstructed. In addition to performing well on the DREAM4 In Silico Multifactorial challenge simulated data, we show that GENIE3 compares favorably with existing algorithms to decipher the genetic regulatory network of Escherichia coli. It doesn't make any assumption about the nature of gene regulation, can deal with combinatorial and non-linear interactions, produces directed GRNs, and is fast and scalable. In conclusion, we propose a new algorithm for GRN inference that performs well on both synthetic and real gene expression data. The algorithm, based on feature selection with tree-based ensemble methods, is simple and generic, making it adaptable to other types of genomic data and interactions.
Directory of Open Access Journals (Sweden)
Yi-Jen Mon
2012-10-01
Full Text Available A supervisory Adaptive Network-based Fuzzy Inference System (SANFIS is proposed for the empirical control of a mobile robot. This controller includes an ANFIS controller and a supervisory controller. The ANFIS controller is off-line tuned by an adaptive fuzzy inference system, the supervisory controller is designed to compensate for the approximation error between the ANFIS controller and the ideal controller, and drive the trajectory of the system onto a specified surface (called the sliding surface or switching surface while maintaining the trajectory onto this switching surface continuously to guarantee the system stability. This SANFIS controller can achieve favourable empirical control performance of the mobile robot in the empirical tests of driving the mobile robot with a square path. Practical experimental results demonstrate that the proposed SANFIS can achieve better control performance than that achieved using an ANFIS controller for empirical control of the mobile robot.
State of the Art of Fuzzy Methods for Gene Regulatory Networks Inference
Directory of Open Access Journals (Sweden)
Tuqyah Abdullah Al Qazlan
2015-01-01
Full Text Available To address one of the most challenging issues at the cellular level, this paper surveys the fuzzy methods used in gene regulatory networks (GRNs inference. GRNs represent causal relationships between genes that have a direct influence, trough protein production, on the life and the development of living organisms and provide a useful contribution to the understanding of the cellular functions as well as the mechanisms of diseases. Fuzzy systems are based on handling imprecise knowledge, such as biological information. They provide viable computational tools for inferring GRNs from gene expression data, thus contributing to the discovery of gene interactions responsible for specific diseases and/or ad hoc correcting therapies. Increasing computational power and high throughput technologies have provided powerful means to manage these challenging digital ecosystems at different levels from cell to society globally. The main aim of this paper is to report, present, and discuss the main contributions of this multidisciplinary field in a coherent and structured framework.
Genetic Network Inference: From Co-Expression Clustering to Reverse Engineering
Dhaeseleer, Patrik; Liang, Shoudan; Somogyi, Roland
2000-01-01
Advances in molecular biological, analytical, and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review datamining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-duster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e., who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks, to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting, and bioengineering.
Directory of Open Access Journals (Sweden)
Benjamin W. Y. Lo
2013-01-01
Full Text Available Objective. The novel clinical prediction approach of Bayesian neural networks with fuzzy logic inferences is created and applied to derive prognostic decision rules in cerebral aneurysmal subarachnoid hemorrhage (aSAH. Methods. The approach of Bayesian neural networks with fuzzy logic inferences was applied to data from five trials of Tirilazad for aneurysmal subarachnoid hemorrhage (3551 patients. Results. Bayesian meta-analyses of observational studies on aSAH prognostic factors gave generalizable posterior distributions of population mean log odd ratios (ORs. Similar trends were noted in Bayesian and linear regression ORs. Significant outcome predictors include normal motor response, cerebral infarction, history of myocardial infarction, cerebral edema, history of diabetes mellitus, fever on day 8, prior subarachnoid hemorrhage, admission angiographic vasospasm, neurological grade, intraventricular hemorrhage, ruptured aneurysm size, history of hypertension, vasospasm day, age and mean arterial pressure. Heteroscedasticity was present in the nontransformed dataset. Artificial neural networks found nonlinear relationships with 11 hidden variables in 1 layer, using the multilayer perceptron model. Fuzzy logic decision rules (centroid defuzzification technique denoted cut-off points for poor prognosis at greater than 2.5 clusters. Discussion. This aSAH prognostic system makes use of existing knowledge, recognizes unknown areas, incorporates one's clinical reasoning, and compensates for uncertainty in prognostication.
Using Regularization to Infer Cell Line Specificity in Logical Network Models of Signaling Pathways
Directory of Open Access Journals (Sweden)
Sébastien De Landtsheer
2018-05-01
Full Text Available Understanding the functional properties of cells of different origins is a long-standing challenge of personalized medicine. Especially in cancer, the high heterogeneity observed in patients slows down the development of effective cures. The molecular differences between cell types or between healthy and diseased cellular states are usually determined by the wiring of regulatory networks. Understanding these molecular and cellular differences at the systems level would improve patient stratification and facilitate the design of rational intervention strategies. Models of cellular regulatory networks frequently make weak assumptions about the distribution of model parameters across cell types or patients. These assumptions are usually expressed in the form of regularization of the objective function of the optimization problem. We propose a new method of regularization for network models of signaling pathways based on the local density of the inferred parameter values within the parameter space. Our method reduces the complexity of models by creating groups of cell line-specific parameters which can then be optimized together. We demonstrate the use of our method by recovering the correct topology and inferring accurate values of the parameters of a small synthetic model. To show the value of our method in a realistic setting, we re-analyze a recently published phosphoproteomic dataset from a panel of 14 colon cancer cell lines. We conclude that our method efficiently reduces model complexity and helps recovering context-specific regulatory information.
Fast Inference of Deep Neural Networks in FPGAs for Particle Physics
Energy Technology Data Exchange (ETDEWEB)
Duarte, Javier [Fermilab; Han, Song [MIT; Harris, Philip [MIT; Jindariani, Sergo [Fermilab; Kreinar, Edward [EIS Intl., Herndon; Kreis, Benjamin [Fermilab; Ngadiuba, Jennifer [CERN; Pierini, Maurizio [CERN; Rivera, Ryan [Fermilab; Tran, Nhan [Fermilab; Wu, Zhenbin [Illinois U., Chicago
2018-04-16
Recent results at the Large Hadron Collider (LHC) have pointed to enhanced physics capabilities through the improvement of the real-time event processing techniques. Machine learning methods are ubiquitous and have proven to be very powerful in LHC physics, and particle physics as a whole. However, exploration of the use of such techniques in low-latency, low-power FPGA hardware has only just begun. FPGA-based trigger and data acquisition (DAQ) systems have extremely low, sub-microsecond latency requirements that are unique to particle physics. We present a case study for neural network inference in FPGAs focusing on a classifier for jet substructure which would enable, among many other physics scenarios, searches for new dark sector particles and novel measurements of the Higgs boson. While we focus on a specific example, the lessons are far-reaching. We develop a package based on High-Level Synthesis (HLS) called hls4ml to build machine learning models in FPGAs. The use of HLS increases accessibility across a broad user community and allows for a drastic decrease in firmware development time. We map out FPGA resource usage and latency versus neural network hyperparameters to identify the problems in particle physics that would benefit from performing neural network inference with FPGAs. For our example jet substructure model, we fit well within the available resources of modern FPGAs with a latency on the scale of 100 ns.
Van Landeghem, Sofie; Van Parys, Thomas; Dubois, Marieke; Inzé, Dirk; Van de Peer, Yves
2016-01-05
Differential networks have recently been introduced as a powerful way to study the dynamic rewiring capabilities of an interactome in response to changing environmental conditions or stimuli. Currently, such differential networks are generated and visualised using ad hoc methods, and are often limited to the analysis of only one condition-specific response or one interaction type at a time. In this work, we present a generic, ontology-driven framework to infer, visualise and analyse an arbitrary set of condition-specific responses against one reference network. To this end, we have implemented novel ontology-based algorithms that can process highly heterogeneous networks, accounting for both physical interactions and regulatory associations, symmetric and directed edges, edge weights and negation. We propose this integrative framework as a standardised methodology that allows a unified view on differential networks and promotes comparability between differential network studies. As an illustrative application, we demonstrate its usefulness on a plant abiotic stress study and we experimentally confirmed a predicted regulator. Diffany is freely available as open-source java library and Cytoscape plugin from http://bioinformatics.psb.ugent.be/supplementary_data/solan/diffany/.
An Efficient Forward-Reverse EM Algorithm for Statistical Inference in Stochastic Reaction Networks
Bayer, Christian
2016-01-06
In this work [1], we present an extension of the forward-reverse algorithm by Bayer and Schoenmakers [2] to the context of stochastic reaction networks (SRNs). We then apply this bridge-generation technique to the statistical inference problem of approximating the reaction coefficients based on discretely observed data. To this end, we introduce an efficient two-phase algorithm in which the first phase is deterministic and it is intended to provide a starting point for the second phase which is the Monte Carlo EM Algorithm.
Directory of Open Access Journals (Sweden)
Amol P. Bhondekar
2010-03-01
Full Text Available Sensor deployment scheme highly governs the effectiveness of distributed wireless sensor network. Issues such as energy conservation and clustering make the deployment problem much more complex. A multiobjective Fuzzy Inference System based strategy for mobile sensor deployment is presented in this paper. This strategy gives a synergistic combination of energy capacity, clustering and peer-to-peer deployment. Performance of our strategy is evaluated in terms of coverage, uniformity, speed and clustering. Our algorithm is compared against a modified distributed self-spreading algorithm to exhibit better performance.
Directory of Open Access Journals (Sweden)
Ingkasuwan Papapit
2012-08-01
Full Text Available Abstract Background Starch serves as a temporal storage of carbohydrates in plant leaves during day/night cycles. To study transcriptional regulatory modules of this dynamic metabolic process, we conducted gene regulation network analysis based on small-sample inference of graphical Gaussian model (GGM. Results Time-series significant analysis was applied for Arabidopsis leaf transcriptome data to obtain a set of genes that are highly regulated under a diurnal cycle. A total of 1,480 diurnally regulated genes included 21 starch metabolic enzymes, 6 clock-associated genes, and 106 transcription factors (TF. A starch-clock-TF gene regulation network comprising 117 nodes and 266 edges was constructed by GGM from these 133 significant genes that are potentially related to the diurnal control of starch metabolism. From this network, we found that β-amylase 3 (b-amy3: At4g17090, which participates in starch degradation in chloroplast, is the most frequently connected gene (a hub gene. The robustness of gene-to-gene regulatory network was further analyzed by TF binding site prediction and by evaluating global co-expression of TFs and target starch metabolic enzymes. As a result, two TFs, indeterminate domain 5 (AtIDD5: At2g02070 and constans-like (COL: At2g21320, were identified as positive regulators of starch synthase 4 (SS4: At4g18240. The inference model of AtIDD5-dependent positive regulation of SS4 gene expression was experimentally supported by decreased SS4 mRNA accumulation in Atidd5 mutant plants during the light period of both short and long day conditions. COL was also shown to positively control SS4 mRNA accumulation. Furthermore, the knockout of AtIDD5 and COL led to deformation of chloroplast and its contained starch granules. This deformity also affected the number of starch granules per chloroplast, which increased significantly in both knockout mutant lines. Conclusions In this study, we utilized a systematic approach of microarray
Network-assisted crop systems genetics: network inference and integrative analysis.
Lee, Tak; Kim, Hyojin; Lee, Insuk
2015-04-01
Although next-generation sequencing (NGS) technology has enabled the decoding of many crop species genomes, most of the underlying genetic components for economically important crop traits remain to be determined. Network approaches have proven useful for the study of the reference plant, Arabidopsis thaliana, and the success of network-based crop genetics will also require the availability of a genome-scale functional networks for crop species. In this review, we discuss how to construct functional networks and elucidate the holistic view of a crop system. The crop gene network then can be used for gene prioritization and the analysis of resequencing-based genome-wide association study (GWAS) data, the amount of which will rapidly grow in the field of crop science in the coming years. Copyright © 2015 Elsevier Ltd. All rights reserved.
In silico model-based inference: a contemporary approach for hypothesis testing in network biology.
Klinke, David J
2014-01-01
Inductive inference plays a central role in the study of biological systems where one aims to increase their understanding of the system by reasoning backwards from uncertain observations to identify causal relationships among components of the system. These causal relationships are postulated from prior knowledge as a hypothesis or simply a model. Experiments are designed to test the model. Inferential statistics are used to establish a level of confidence in how well our postulated model explains the acquired data. This iterative process, commonly referred to as the scientific method, either improves our confidence in a model or suggests that we revisit our prior knowledge to develop a new model. Advances in technology impact how we use prior knowledge and data to formulate models of biological networks and how we observe cellular behavior. However, the approach for model-based inference has remained largely unchanged since Fisher, Neyman and Pearson developed the ideas in the early 1900s that gave rise to what is now known as classical statistical hypothesis (model) testing. Here, I will summarize conventional methods for model-based inference and suggest a contemporary approach to aid in our quest to discover how cells dynamically interpret and transmit information for therapeutic aims that integrates ideas drawn from high performance computing, Bayesian statistics, and chemical kinetics. © 2014 American Institute of Chemical Engineers.
Inferring animal social networks and leadership: applications for passive monitoring arrays.
Jacoby, David M P; Papastamatiou, Yannis P; Freeman, Robin
2016-11-01
Analyses of animal social networks have frequently benefited from techniques derived from other disciplines. Recently, machine learning algorithms have been adopted to infer social associations from time-series data gathered using remote, telemetry systems situated at provisioning sites. We adapt and modify existing inference methods to reveal the underlying social structure of wide-ranging marine predators moving through spatial arrays of passive acoustic receivers. From six months of tracking data for grey reef sharks (Carcharhinus amblyrhynchos) at Palmyra atoll in the Pacific Ocean, we demonstrate that some individuals emerge as leaders within the population and that this behavioural coordination is predicted by both sex and the duration of co-occurrences between conspecifics. In doing so, we provide the first evidence of long-term, spatially extensive social processes in wild sharks. To achieve these results, we interrogate simulated and real tracking data with the explicit purpose of drawing attention to the key considerations in the use and interpretation of inference methods and their impact on resultant social structure. We provide a modified translation of the GMMEvents method for R, including new analyses quantifying the directionality and duration of social events with the aim of encouraging the careful use of these methods more widely in less tractable social animal systems but where passive telemetry is already widespread. © 2016 The Authors.
Directory of Open Access Journals (Sweden)
Nandkumar Wagh
2014-01-01
Full Text Available Continuity of power supply is of utmost importance to the consumers and is only possible by coordination and reliable operation of power system components. Power transformer is such a prime equipment of the transmission and distribution system and needs to be continuously monitored for its well-being. Since ratio methods cannot provide correct diagnosis due to the borderline problems and the probability of existence of multiple faults, artificial intelligence could be the best approach. Dissolved gas analysis (DGA interpretation may provide an insight into the developing incipient faults and is adopted as the preliminary diagnosis tool. In the proposed work, a comparison of the diagnosis ability of backpropagation (BP, radial basis function (RBF neural network, and adaptive neurofuzzy inference system (ANFIS has been investigated and the diagnosis results in terms of error measure, accuracy, network training time, and number of iterations are presented.
Using adaptive network based fuzzy inference system to forecast regional electricity loads
International Nuclear Information System (INIS)
Ying, L.-C.; Pan, M.-C.
2008-01-01
Since accurate regional load forecasting is very important for improvement of the management performance of the electric industry, various regional load forecasting methods have been developed. The purpose of this study is to apply the adaptive network based fuzzy inference system (ANFIS) model to forecast the regional electricity loads in Taiwan and demonstrate the forecasting performance of this model. Based on the mean absolute percentage errors and statistical results, we can see that the ANFIS model has better forecasting performance than the regression model, artificial neural network (ANN) model, support vector machines with genetic algorithms (SVMG) model, recurrent support vector machines with genetic algorithms (RSVMG) model and hybrid ellipsoidal fuzzy systems for time series forecasting (HEFST) model. Thus, the ANFIS model is a promising alternative for forecasting regional electricity loads
Using adaptive network based fuzzy inference system to forecast regional electricity loads
Energy Technology Data Exchange (ETDEWEB)
Ying, Li-Chih [Department of Marketing Management, Central Taiwan University of Science and Technology, 11, Pu-tzu Lane, Peitun, Taichung City 406 (China); Pan, Mei-Chiu [Graduate Institute of Management Sciences, Nanhua University, 32, Chung Keng Li, Dalin, Chiayi 622 (China)
2008-02-15
Since accurate regional load forecasting is very important for improvement of the management performance of the electric industry, various regional load forecasting methods have been developed. The purpose of this study is to apply the adaptive network based fuzzy inference system (ANFIS) model to forecast the regional electricity loads in Taiwan and demonstrate the forecasting performance of this model. Based on the mean absolute percentage errors and statistical results, we can see that the ANFIS model has better forecasting performance than the regression model, artificial neural network (ANN) model, support vector machines with genetic algorithms (SVMG) model, recurrent support vector machines with genetic algorithms (RSVMG) model and hybrid ellipsoidal fuzzy systems for time series forecasting (HEFST) model. Thus, the ANFIS model is a promising alternative for forecasting regional electricity loads. (author)
Suarez Diez, M.; Saccenti, E.
2015-01-01
We investigated the effect of sample size and dimensionality on the performance of four algorithms (ARACNE, CLR, CORR, and PCLRC) when they are used for the inference of metabolite association networks. We report that as many as 100-400 samples may be necessary to obtain stable network estimations,
A graphical user interface for a method to infer kinetics and network architecture (MIKANA).
Mourão, Márcio A; Srividhya, Jeyaraman; McSharry, Patrick E; Crampin, Edmund J; Schnell, Santiago
2011-01-01
One of the main challenges in the biomedical sciences is the determination of reaction mechanisms that constitute a biochemical pathway. During the last decades, advances have been made in building complex diagrams showing the static interactions of proteins. The challenge for systems biologists is to build realistic models of the dynamical behavior of reactants, intermediates and products. For this purpose, several methods have been recently proposed to deduce the reaction mechanisms or to estimate the kinetic parameters of the elementary reactions that constitute the pathway. One such method is MIKANA: Method to Infer Kinetics And Network Architecture. MIKANA is a computational method to infer both reaction mechanisms and estimate the kinetic parameters of biochemical pathways from time course data. To make it available to the scientific community, we developed a Graphical User Interface (GUI) for MIKANA. Among other features, the GUI validates and processes an input time course data, displays the inferred reactions, generates the differential equations for the chemical species in the pathway and plots the prediction curves on top of the input time course data. We also added a new feature to MIKANA that allows the user to exclude a priori known reactions from the inferred mechanism. This addition improves the performance of the method. In this article, we illustrate the GUI for MIKANA with three examples: an irreversible Michaelis-Menten reaction mechanism; the interaction map of chemical species of the muscle glycolytic pathway; and the glycolytic pathway of Lactococcus lactis. We also describe the code and methods in sufficient detail to allow researchers to further develop the code or reproduce the experiments described. The code for MIKANA is open source, free for academic and non-academic use and is available for download (Information S1).
Moraes, Alvaro
2015-01-01
Epidemics have shaped, sometimes more than wars and natural disasters, demo- graphic aspects of human populations around the world, their health habits and their economies. Ebola and the Middle East Respiratory Syndrome (MERS) are clear and current examples of potential hazards at planetary scale. During the spread of an epidemic disease, there are phenomena, like the sudden extinction of the epidemic, that can not be captured by deterministic models. As a consequence, stochastic models have been proposed during the last decades. A typical forward problem in the stochastic setting could be the approximation of the expected number of infected individuals found in one month from now. On the other hand, a typical inverse problem could be, given a discretely observed set of epidemiological data, infer the transmission rate of the epidemic or its basic reproduction number. Markovian epidemic models are stochastic models belonging to a wide class of pure jump processes known as Stochastic Reaction Networks (SRNs), that are intended to describe the time evolution of interacting particle systems where one particle interacts with the others through a finite set of reaction channels. SRNs have been mainly developed to model biochemical reactions but they also have applications in neural networks, virus kinetics, and dynamics of social networks, among others. 4 This PhD thesis is focused on novel fast simulation algorithms and statistical inference methods for SRNs. Our novel Multi-level Monte Carlo (MLMC) hybrid simulation algorithms provide accurate estimates of expected values of a given observable of SRNs at a prescribed final time. They are designed to control the global approximation error up to a user-selected accuracy and up to a certain confidence level, and with near optimal computational work. We also present novel dual-weighted residual expansions for fast estimation of weak and strong errors arising from the MLMC methodology. Regarding the statistical inference
A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.
Directory of Open Access Journals (Sweden)
Xiangyun Xiao
Full Text Available The reconstruction of gene regulatory networks (GRNs from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM, experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.
A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.
Xiao, Xiangyun; Zhang, Wei; Zou, Xiufen
2015-01-01
The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM), experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.
Inference of topology and the nature of synapses, and the flow of information in neuronal networks
Borges, F. S.; Lameu, E. L.; Iarosz, K. C.; Protachevicz, P. R.; Caldas, I. L.; Viana, R. L.; Macau, E. E. N.; Batista, A. M.; Baptista, M. S.
2018-02-01
The characterization of neuronal connectivity is one of the most important matters in neuroscience. In this work, we show that a recently proposed informational quantity, the causal mutual information, employed with an appropriate methodology, can be used not only to correctly infer the direction of the underlying physical synapses, but also to identify their excitatory or inhibitory nature, considering easy to handle and measure bivariate time series. The success of our approach relies on a surprising property found in neuronal networks by which nonadjacent neurons do "understand" each other (positive mutual information), however, this exchange of information is not capable of causing effect (zero transfer entropy). Remarkably, inhibitory connections, responsible for enhancing synchronization, transfer more information than excitatory connections, known to enhance entropy in the network. We also demonstrate that our methodology can be used to correctly infer directionality of synapses even in the presence of dynamic and observational Gaussian noise, and is also successful in providing the effective directionality of intermodular connectivity, when only mean fields can be measured.
Lachmann, Alexander; Giorgi, Federico M; Lopez, Gonzalo; Califano, Andrea
2016-07-15
The accurate reconstruction of gene regulatory networks from large scale molecular profile datasets represents one of the grand challenges of Systems Biology. The Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) represents one of the most effective tools to accomplish this goal. However, the initial Fixed Bandwidth (FB) implementation is both inefficient and unable to deal with sample sets providing largely uneven coverage of the probability density space. Here, we present a completely new implementation of the algorithm, based on an Adaptive Partitioning strategy (AP) for estimating the Mutual Information. The new AP implementation (ARACNe-AP) achieves a dramatic improvement in computational performance (200× on average) over the previous methodology, while preserving the Mutual Information estimator and the Network inference accuracy of the original algorithm. Given that the previous version of ARACNe is extremely demanding, the new version of the algorithm will allow even researchers with modest computational resources to build complex regulatory networks from hundreds of gene expression profiles. A JAVA cross-platform command line executable of ARACNe, together with all source code and a detailed usage guide are freely available on Sourceforge (http://sourceforge.net/projects/aracne-ap). JAVA version 8 or higher is required. califano@c2b2.columbia.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
LASSIM-A network inference toolbox for genome-wide mechanistic modeling.
Directory of Open Access Journals (Sweden)
Rasmus Magnusson
2017-06-01
Full Text Available Recent technological advancements have made time-resolved, quantitative, multi-omics data available for many model systems, which could be integrated for systems pharmacokinetic use. Here, we present large-scale simulation modeling (LASSIM, which is a novel mathematical tool for performing large-scale inference using mechanistically defined ordinary differential equations (ODE for gene regulatory networks (GRNs. LASSIM integrates structural knowledge about regulatory interactions and non-linear equations with multiple steady state and dynamic response expression datasets. The rationale behind LASSIM is that biological GRNs can be simplified using a limited subset of core genes that are assumed to regulate all other gene transcription events in the network. The LASSIM method is implemented as a general-purpose toolbox using the PyGMO Python package to make the most of multicore computers and high performance clusters, and is available at https://gitlab.com/Gustafsson-lab/lassim. As a method, LASSIM works in two steps, where it first infers a non-linear ODE system of the pre-specified core gene expression. Second, LASSIM in parallel optimizes the parameters that model the regulation of peripheral genes by core system genes. We showed the usefulness of this method by applying LASSIM to infer a large-scale non-linear model of naïve Th2 cell differentiation, made possible by integrating Th2 specific bindings, time-series together with six public and six novel siRNA-mediated knock-down experiments. ChIP-seq showed significant overlap for all tested transcription factors. Next, we performed novel time-series measurements of total T-cells during differentiation towards Th2 and verified that our LASSIM model could monitor those data significantly better than comparable models that used the same Th2 bindings. In summary, the LASSIM toolbox opens the door to a new type of model-based data analysis that combines the strengths of reliable mechanistic models
Inference of directed climate networks: role of instability of causality estimation methods
Hlinka, Jaroslav; Hartman, David; Vejmelka, Martin; Paluš, Milan
2013-04-01
Climate data are increasingly analyzed by complex network analysis methods, including graph-theoretical approaches [1]. For such analysis, links between localized nodes of climate network are typically quantified by some statistical measures of dependence (connectivity) between measured variables of interest. To obtain information on the directionality of the interactions in the networks, a wide range of methods exists. These can be broadly divided into linear and nonlinear methods, with some of the latter having the theoretical advantage of being model-free, and principally a generalization of the former [2]. However, as a trade-off, this generality comes together with lower accuracy - in particular if the system was close to linear. In an overall stationary system, this may potentially lead to higher variability in the nonlinear network estimates. Therefore, with the same control of false alarms, this may lead to lower sensitivity for detection of real changes in the network structure. These problems are discussed on the example of daily SAT and SLP data from the NCEP/NCAR reanalysis dataset. We first reduce the dimensionality of data using PCA with VARIMAX rotation to detect several dozens of components that together explain most of the data variability. We further construct directed climate networks applying a selection of most widely used methods - variants of linear Granger causality and conditional mutual information. Finally, we assess the stability of the detected directed climate networks by computing them in sliding time windows. To understand the origin of the observed instabilities and their range, we also apply the same procedure to two types of surrogate data: either with non-stationarity in network structure removed, or imposed in a controlled way. In general, the linear methods show stable results in terms of overall similarity of directed climate networks inferred. For instance, for different decades of SAT data, the Spearman correlation of edge
Challenges to inferring causality from viral information dispersion in dynamic social networks
Ternovski, John
2014-06-01
Understanding the mechanism behind large-scale information dispersion through complex networks has important implications for a variety of industries ranging from cyber-security to public health. With the unprecedented availability of public data from online social networks (OSNs) and the low cost nature of most OSN outreach, randomized controlled experiments, the "gold standard" of causal inference methodologies, have been used with increasing regularity to study viral information dispersion. And while these studies have dramatically furthered our understanding of how information disseminates through social networks by isolating causal mechanisms, there are still major methodological concerns that need to be addressed in future research. This paper delineates why modern OSNs are markedly different from traditional sociological social networks and why these differences present unique challenges to experimentalists and data scientists. The dynamic nature of OSNs is particularly troublesome for researchers implementing experimental designs, so this paper identifies major sources of bias arising from network mutability and suggests strategies to circumvent and adjust for these biases. This paper also discusses the practical considerations of data quality and collection, which may adversely impact the efficiency of the estimator. The major experimental methodologies used in the current literature on virality are assessed at length, and their strengths and limits identified. Other, as-yetunsolved threats to the efficiency and unbiasedness of causal estimators--such as missing data--are also discussed. This paper integrates methodologies and learnings from a variety of fields under an experimental and data science framework in order to systematically consolidate and identify current methodological limitations of randomized controlled experiments conducted in OSNs.
Trébulle, Pauline; Nicaud, Jean-Marc; Leplat, Christophe; Elati, Mohamed
2017-01-01
Complex phenotypes, such as lipid accumulation, result from cooperativity between regulators and the integration of multiscale information. However, the elucidation of such regulatory programs by experimental approaches may be challenging, particularly in context-specific conditions. In particular, we know very little about the regulators of lipid accumulation in the oleaginous yeast of industrial interest Yarrowia lipolytica . This lack of knowledge limits the development of this yeast as an industrial platform, due to the time-consuming and costly laboratory efforts required to design strains with the desired phenotypes. In this study, we aimed to identify context-specific regulators and mechanisms, to guide explorations of the regulation of lipid accumulation in Y. lipolytica . Using gene regulatory network inference, and considering the expression of 6539 genes over 26 time points from GSE35447 for biolipid production and a list of 151 transcription factors, we reconstructed a gene regulatory network comprising 111 transcription factors, 4451 target genes and 17048 regulatory interactions (YL-GRN-1) supported by evidence of protein-protein interactions. This study, based on network interrogation and wet laboratory validation (a) highlights the relevance of our proposed measure, the transcription factors influence, for identifying phases corresponding to changes in physiological state without prior knowledge (b) suggests new potential regulators and drivers of lipid accumulation and (c) experimentally validates the impact of six of the nine regulators identified on lipid accumulation, with variations in lipid content from +43.2% to -31.2% on glucose or glycerol.
Schmit, C. J.; Pritchard, J. R.
2018-03-01
Next generation radio experiments such as LOFAR, HERA, and SKA are expected to probe the Epoch of Reionization (EoR) and claim a first direct detection of the cosmic 21cm signal within the next decade. Data volumes will be enormous and can thus potentially revolutionize our understanding of the early Universe and galaxy formation. However, numerical modelling of the EoR can be prohibitively expensive for Bayesian parameter inference and how to optimally extract information from incoming data is currently unclear. Emulation techniques for fast model evaluations have recently been proposed as a way to bypass costly simulations. We consider the use of artificial neural networks as a blind emulation technique. We study the impact of training duration and training set size on the quality of the network prediction and the resulting best-fitting values of a parameter search. A direct comparison is drawn between our emulation technique and an equivalent analysis using 21CMMC. We find good predictive capabilities of our network using training sets of as low as 100 model evaluations, which is within the capabilities of fully numerical radiative transfer codes.
Energy Technology Data Exchange (ETDEWEB)
Karri, Vishy; Ho, Tien [School of Engineering, University of Tasmania, GPO Box 252-65, Hobart, Tasmania 7001 (Australia); Madsen, Ole [Department of Production, Aalborg University, Fibigerstraede 16, DK-9220 Aalborg (Denmark)
2008-06-15
Hydrogen is increasingly investigated as an alternative fuel to petroleum products in running internal combustion engines and as powering remote area power systems using generators. The safety issues related to hydrogen gas are further exasperated by expensive instrumentation required to measure the percentage of explosive limits, flow rates and production pressure. This paper investigates the use of model based virtual sensors (rather than expensive physical sensors) in connection with hydrogen production with a Hogen 20 electrolyzer system. The virtual sensors are used to predict relevant hydrogen safety parameters, such as the percentage of lower explosive limit, hydrogen pressure and hydrogen flow rate as a function of different input conditions of power supplied (voltage and current), the feed of de-ionized water and Hogen 20 electrolyzer system parameters. The virtual sensors are developed by means of the application of various Artificial Intelligent techniques. To train and appraise the neural network models as virtual sensors, the Hogen 20 electrolyzer is instrumented with necessary sensors to gather experimental data which together with MATLAB neural networks toolbox and tailor made adaptive neuro-fuzzy inference systems (ANFIS) were used as predictive tools to estimate hydrogen safety parameters. It was shown that using the neural networks hydrogen safety parameters were predicted to less than 3% of percentage average root mean square error. The most accurate prediction was achieved by using ANFIS. (author)
Nonlinear Inference in Partially Observed Physical Systems and Deep Neural Networks
Rozdeba, Paul J.
The problem of model state and parameter estimation is a significant challenge in nonlinear systems. Due to practical considerations of experimental design, it is often the case that physical systems are partially observed, meaning that data is only available for a subset of the degrees of freedom required to fully model the observed system's behaviors and, ultimately, predict future observations. Estimation in this context is highly complicated by the presence of chaos, stochasticity, and measurement noise in dynamical systems. One of the aims of this dissertation is to simultaneously analyze state and parameter estimation in as a regularized inverse problem, where the introduction of a model makes it possible to reverse the forward problem of partial, noisy observation; and as a statistical inference problem using data assimilation to transfer information from measurements to the model states and parameters. Ultimately these two formulations achieve the same goal. Similar aspects that appear in both are highlighted as a means for better understanding the structure of the nonlinear inference problem. An alternative approach to data assimilation that uses model reduction is then examined as a way to eliminate unresolved nonlinear gating variables from neuron models. In this formulation, only measured variables enter into the model, and the resulting errors are themselves modeled by nonlinear stochastic processes with memory. Finally, variational annealing, a data assimilation method previously applied to dynamical systems, is introduced as a potentially useful tool for understanding deep neural network training in machine learning by exploiting similarities between the two problems.
Bayer, Christian
2016-02-20
© 2016 Taylor & Francis Group, LLC. ABSTRACT: In this work, we present an extension of the forward–reverse representation introduced by Bayer and Schoenmakers (Annals of Applied Probability, 24(5):1994–2032, 2014) to the context of stochastic reaction networks (SRNs). We apply this stochastic representation to the computation of efficient approximations of expected values of functionals of SRN bridges, that is, SRNs conditional on their values in the extremes of given time intervals. We then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve a set of deterministic optimization problems where the SRNs are replaced by their reaction-rate ordinary differential equations approximation; then, during phase II, we apply the Monte Carlo version of the expectation-maximization algorithm to the phase I output. By selecting a set of overdispersed seeds as initial points in phase I, the output of parallel runs from our two-phase method is a cluster of approximate maximum likelihood estimates. Our results are supported by numerical examples.
Vilanova, Pedro
2016-01-07
In this work, we present an extension of the forward-reverse representation introduced in Simulation of forward-reverse stochastic representations for conditional diffusions , a 2014 paper by Bayer and Schoenmakers to the context of stochastic reaction networks (SRNs). We apply this stochastic representation to the computation of efficient approximations of expected values of functionals of SRN bridges, i.e., SRNs conditional on their values in the extremes of given time-intervals. We then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve a set of deterministic optimization problems where the SRNs are replaced by their reaction-rate ordinary differential equations approximation; then, during phase II, we apply the Monte Carlo version of the Expectation-Maximization algorithm to the phase I output. By selecting a set of over-dispersed seeds as initial points in phase I, the output of parallel runs from our two-phase method is a cluster of approximate maximum likelihood estimates. Our results are supported by numerical examples.
Mocapy++ - A toolkit for inference and learning in dynamic Bayesian networks
Directory of Open Access Journals (Sweden)
Hamelryck Thomas
2010-03-01
Full Text Available Abstract Background Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs. It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations. Results The program package is freely available under the GNU General Public Licence (GPL from SourceForge http://sourceforge.net/projects/mocapy. The package contains the source for building the Mocapy++ library, several usage examples and the user manual. Conclusions Mocapy++ is especially suitable for constructing probabilistic models of biomolecular structure, due to its support for directional statistics. In particular, it supports the Kent distribution on the sphere and the bivariate von Mises distribution on the torus. These distributions have proven useful to formulate probabilistic models of protein and RNA structure in atomic detail.
Inferring microbial interaction networks from metagenomic data using SgLV-EKF algorithm.
Alshawaqfeh, Mustafa; Serpedin, Erchin; Younes, Ahmad Bani
2017-03-27
Inferring the microbial interaction networks (MINs) and modeling their dynamics are critical in understanding the mechanisms of the bacterial ecosystem and designing antibiotic and/or probiotic therapies. Recently, several approaches were proposed to infer MINs using the generalized Lotka-Volterra (gLV) model. Main drawbacks of these models include the fact that these models only consider the measurement noise without taking into consideration the uncertainties in the underlying dynamics. Furthermore, inferring the MIN is characterized by the limited number of observations and nonlinearity in the regulatory mechanisms. Therefore, novel estimation techniques are needed to address these challenges. This work proposes SgLV-EKF: a stochastic gLV model that adopts the extended Kalman filter (EKF) algorithm to model the MIN dynamics. In particular, SgLV-EKF employs a stochastic modeling of the MIN by adding a noise term to the dynamical model to compensate for modeling uncertainties. This stochastic modeling is more realistic than the conventional gLV model which assumes that the MIN dynamics are perfectly governed by the gLV equations. After specifying the stochastic model structure, we propose the EKF to estimate the MIN. SgLV-EKF was compared with two similarity-based algorithms, one algorithm from the integral-based family and two regression-based algorithms, in terms of the achieved performance on two synthetic data-sets and two real data-sets. The first data-set models the randomness in measurement data, whereas, the second data-set incorporates uncertainties in the underlying dynamics. The real data-sets are provided by a recent study pertaining to an antibiotic-mediated Clostridium difficile infection. The experimental results demonstrate that SgLV-EKF outperforms the alternative methods in terms of robustness to measurement noise, modeling errors, and tracking the dynamics of the MIN. Performance analysis demonstrates that the proposed SgLV-EKF algorithm
Directory of Open Access Journals (Sweden)
Gaia Tavoni
2017-10-01
Full Text Available Functional coupling networks are widely used to characterize collective patterns of activity in neural populations. Here, we ask whether functional couplings reflect the subtle changes, such as in physiological interactions, believed to take place during learning. We infer functional network models reproducing the spiking activity of simultaneously recorded neurons in prefrontal cortex (PFC of rats, during the performance of a cross-modal rule shift task (task epoch, and during preceding and following sleep epochs. A large-scale study of the 96 recorded sessions allows us to detect, in about 20% of sessions, effective plasticity between the sleep epochs. These coupling modifications are correlated with the coupling values in the task epoch, and are supported by a small subset of the recorded neurons, which we identify by means of an automatized procedure. These potentiated groups increase their coativation frequency in the spiking data between the two sleep epochs, and, hence, participate to putative experience-related cell assemblies. Study of the reactivation dynamics of the potentiated groups suggests a possible connection with behavioral learning. Reactivation is largely driven by hippocampal ripple events when the rule is not yet learned, and may be much more autonomous, and presumably sustained by the potentiated PFC network, when learning is consolidated. Cell assemblies coding for memories are widely believed to emerge through synaptic modification resulting from learning, yet their identification from activity is very arduous. We propose a functional-connectivity-based approach to identify experience-related cell assemblies from multielectrode recordings in vivo, and apply it to the prefrontal cortex activity of rats recorded during a task epoch and the preceding and following sleep epochs. We infer functional couplings between the recorded cells in each epoch. Comparisons of the functional coupling networks across the epochs allow us
Wang, Yi Kan; Hurley, Daniel G; Schnell, Santiago; Print, Cristin G; Crampin, Edmund J
2013-01-01
We develop a new regression algorithm, cMIKANA, for inference of gene regulatory networks from combinations of steady-state and time-series gene expression data. Using simulated gene expression datasets to assess the accuracy of reconstructing gene regulatory networks, we show that steady-state and time-series data sets can successfully be combined to identify gene regulatory interactions using the new algorithm. Inferring gene networks from combined data sets was found to be advantageous when using noisy measurements collected with either lower sampling rates or a limited number of experimental replicates. We illustrate our method by applying it to a microarray gene expression dataset from human umbilical vein endothelial cells (HUVECs) which combines time series data from treatment with growth factor TNF and steady state data from siRNA knockdown treatments. Our results suggest that the combination of steady-state and time-series datasets may provide better prediction of RNA-to-RNA interactions, and may also reveal biological features that cannot be identified from dynamic or steady state information alone. Finally, we consider the experimental design of genomics experiments for gene regulatory network inference and show that network inference can be improved by incorporating steady-state measurements with time-series data.
Directory of Open Access Journals (Sweden)
Shohag Barman
Full Text Available Inferring a gene regulatory network from time-series gene expression data in systems biology is a challenging problem. Many methods have been suggested, most of which have a scalability limitation due to the combinatorial cost of searching a regulatory set of genes. In addition, they have focused on the accurate inference of a network structure only. Therefore, there is a pressing need to develop a network inference method to search regulatory genes efficiently and to predict the network dynamics accurately.In this study, we employed a Boolean network model with a restricted update rule scheme to capture coarse-grained dynamics, and propose a novel mutual information-based Boolean network inference (MIBNI method. Given time-series gene expression data as an input, the method first identifies a set of initial regulatory genes using mutual information-based feature selection, and then improves the dynamics prediction accuracy by iteratively swapping a pair of genes between sets of the selected regulatory genes and the other genes. Through extensive simulations with artificial datasets, MIBNI showed consistently better performance than six well-known existing methods, REVEAL, Best-Fit, RelNet, CST, CLR, and BIBN in terms of both structural and dynamics prediction accuracy. We further tested the proposed method with two real gene expression datasets for an Escherichia coli gene regulatory network and a fission yeast cell cycle network, and also observed better results using MIBNI compared to the six other methods.Taken together, MIBNI is a promising tool for predicting both the structure and the dynamics of a gene regulatory network.
A dynamic discretization method for reliability inference in Dynamic Bayesian Networks
International Nuclear Information System (INIS)
Zhu, Jiandao; Collette, Matthew
2015-01-01
The material and modeling parameters that drive structural reliability analysis for marine structures are subject to a significant uncertainty. This is especially true when time-dependent degradation mechanisms such as structural fatigue cracking are considered. Through inspection and monitoring, information such as crack location and size can be obtained to improve these parameters and the corresponding reliability estimates. Dynamic Bayesian Networks (DBNs) are a powerful and flexible tool to model dynamic system behavior and update reliability and uncertainty analysis with life cycle data for problems such as fatigue cracking. However, a central challenge in using DBNs is the need to discretize certain types of continuous random variables to perform network inference while still accurately tracking low-probability failure events. Most existing discretization methods focus on getting the overall shape of the distribution correct, with less emphasis on the tail region. Therefore, a novel scheme is presented specifically to estimate the likelihood of low-probability failure events. The scheme is an iterative algorithm which dynamically partitions the discretization intervals at each iteration. Through applications to two stochastic crack-growth example problems, the algorithm is shown to be robust and accurate. Comparisons are presented between the proposed approach and existing methods for the discretization problem. - Highlights: • A dynamic discretization method is developed for low-probability events in DBNs. • The method is compared to existing approaches on two crack growth problems. • The method is shown to improve on existing methods for low-probability events
Inference of the oxidative stress network in Anopheles stephensi upon Plasmodium infection.
Shrinet, Jatin; Nandal, Umesh Kumar; Adak, Tridibes; Bhatnagar, Raj K; Sunil, Sujatha
2014-01-01
Ookinete invasion of Anopheles midgut is a critical step for malaria transmission; the parasite numbers drop drastically and practically reach a minimum during the parasite's whole life cycle. At this stage, the parasite as well as the vector undergoes immense oxidative stress. Thereafter, the vector undergoes oxidative stress at different time points as the parasite invades its tissues during the parasite development. The present study was undertaken to reconstruct the network of differentially expressed genes involved in oxidative stress in Anopheles stephensi during Plasmodium development and maturation in the midgut. Using high throughput next generation sequencing methods, we generated the transcriptome of the An. stephensi midgut during Plasmodium vinckei petteri oocyst invasion of the midgut epithelium. Further, we utilized large datasets available on public domain on Anopheles during Plasmodium ookinete invasion and Drosophila datasets and arrived upon clusters of genes that may play a role in oxidative stress. Finally, we used support vector machines for the functional prediction of the un-annotated genes of An. stephensi. Integrating the results from all the different data analyses, we identified a total of 516 genes that were involved in oxidative stress in An. stephensi during Plasmodium development. The significantly regulated genes were further extracted from this gene cluster and used to infer an oxidative stress network of An. stephensi. Using system biology approaches, we have been able to ascertain the role of several putative genes in An. stephensi with respect to oxidative stress. Further experimental validations of these genes are underway.
Premature ventricular contraction detection combining deep neural networks and rules inference.
Zhou, Fei-Yan; Jin, Lin-Peng; Dong, Jun
2017-06-01
Premature ventricular contraction (PVC), which is a common form of cardiac arrhythmia caused by ectopic heartbeat, can lead to life-threatening cardiac conditions. Computer-aided PVC detection is of considerable importance in medical centers or outpatient ECG rooms. In this paper, we proposed a new approach that combined deep neural networks and rules inference for PVC detection. The detection performance and generalization were studied using publicly available databases: the MIT-BIH arrhythmia database (MIT-BIH-AR) and the Chinese Cardiovascular Disease Database (CCDD). The PVC detection accuracy on the MIT-BIH-AR database was 99.41%, with a sensitivity and specificity of 97.59% and 99.54%, respectively, which were better than the results from other existing methods. To test the generalization capability, the detection performance was also evaluated on the CCDD. The effectiveness of the proposed method was confirmed by the accuracy (98.03%), sensitivity (96.42%) and specificity (98.06%) with the dataset over 140,000 ECG recordings of the CCDD. Copyright © 2017 Elsevier B.V. All rights reserved.
Baur, Brittany; Bozdag, Serdar
2015-04-01
One of the challenging and important computational problems in systems biology is to infer gene regulatory networks (GRNs) of biological systems. Several methods that exploit gene expression data have been developed to tackle this problem. In this study, we propose the use of copy number and DNA methylation data to infer GRNs. We developed an algorithm that scores regulatory interactions between genes based on canonical correlation analysis. In this algorithm, copy number or DNA methylation variables are treated as potential regulator variables, and expression variables are treated as potential target variables. We first validated that the canonical correlation analysis method is able to infer true interactions in high accuracy. We showed that the use of DNA methylation or copy number datasets leads to improved inference over steady-state expression. Our results also showed that epigenetic and structural information could be used to infer directionality of regulatory interactions. Additional improvements in GRN inference can be gleaned from incorporating the result in an informative prior in a dynamic Bayesian algorithm. This is the first study that incorporates copy number and DNA methylation into an informative prior in dynamic Bayesian framework. By closely examining top-scoring interactions with different sources of epigenetic or structural information, we also identified potential novel regulatory interactions.
Duggento, Andrea; Stankovski, Tomislav; McClintock, Peter V. E.; Stefanovska, Aneta
2012-12-01
Living systems have time-evolving interactions that, until recently, could not be identified accurately from recorded time series in the presence of noise. Stankovski [Phys. Rev. Lett.PRLTAO0031-900710.1103/PhysRevLett.109.024101 109, 024101 (2012)] introduced a method based on dynamical Bayesian inference that facilitates the simultaneous detection of time-varying synchronization, directionality of influence, and coupling functions. It can distinguish unsynchronized dynamics from noise-induced phase slips. The method is based on phase dynamics, with Bayesian inference of the time-evolving parameters being achieved by shaping the prior densities to incorporate knowledge of previous samples. We now present the method in detail using numerically generated data, data from an analog electronic circuit, and cardiorespiratory data. We also generalize the method to encompass networks of interacting oscillators and thus demonstrate its applicability to small-scale networks.
Bayer, Christian; Moraes, Alvaro; Tempone, Raul; Vilanova, Pedro
2016-01-01
then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve
Directory of Open Access Journals (Sweden)
Sudip Misra
2010-04-01
Full Text Available The proposed mechanism for jamming attack detection for wireless sensor networks is novel in three respects: firstly, it upgrades the jammer to include versatile military jammers; secondly, it graduates from the existing node-centric detection system to the network-centric system making it robust and economical at the nodes, and thirdly, it tackles the problem through fuzzy inference system, as the decision regarding intensity of jamming is seldom crisp. The system with its high robustness, ability to grade nodes with jamming indices, and its true-detection rate as high as 99.8%, is worthy of consideration for information warfare defense purposes.
Directory of Open Access Journals (Sweden)
Yanzhu Hu
2016-09-01
Full Text Available Complex network methodology is very useful for complex system exploration. However, the relationships among variables in complex systems are usually not clear. Therefore, inferring association networks among variables from their observed data has been a popular research topic. We propose a method, named small-shuffle symbolic transfer entropy spectrum (SSSTES, for inferring association networks from multivariate time series. The method can solve four problems for inferring association networks, i.e., strong correlation identification, correlation quantification, direction identification and temporal relation identification. The method can be divided into four layers. The first layer is the so-called data layer. Data input and processing are the things to do in this layer. In the second layer, we symbolize the model data, original data and shuffled data, from the previous layer and calculate circularly transfer entropy with different time lags for each pair of time series variables. Thirdly, we compose transfer entropy spectrums for pairwise time series with the previous layer’s output, a list of transfer entropy matrix. We also identify the correlation level between variables in this layer. In the last layer, we build a weighted adjacency matrix, the value of each entry representing the correlation level between pairwise variables, and then get the weighted directed association network. Three sets of numerical simulated data from a linear system, a nonlinear system and a coupled Rossler system are used to show how the proposed approach works. Finally, we apply SSSTES to a real industrial system and get a better result than with two other methods.
Energy Technology Data Exchange (ETDEWEB)
Sadeh, Javad; Afradi, Hamid [Electrical Engineering Department, Faculty of Engineering, Ferdowsi University of Mashhad, P.O. Box: 91775-1111, Mashhad (Iran)
2009-11-15
This paper presents a new and accurate algorithm for locating faults in a combined overhead transmission line with underground power cable using Adaptive Network-Based Fuzzy Inference System (ANFIS). The proposed method uses 10 ANFIS networks and consists of 3 stages, including fault type classification, faulty section detection and exact fault location. In the first part, an ANFIS is used to determine the fault type, applying four inputs, i.e., fundamental component of three phase currents and zero sequence current. Another ANFIS network is used to detect the faulty section, whether the fault is on the overhead line or on the underground cable. Other eight ANFIS networks are utilized to pinpoint the faults (two for each fault type). Four inputs, i.e., the dc component of the current, fundamental frequency of the voltage and current and the angle between them, are used to train the neuro-fuzzy inference systems in order to accurately locate the faults on each part of the combined line. The proposed method is evaluated under different fault conditions such as different fault locations, different fault inception angles and different fault resistances. Simulation results confirm that the proposed method can be used as an efficient means for accurate fault location on the combined transmission lines. (author)
Directory of Open Access Journals (Sweden)
Dong Ling Tong
Full Text Available OBJECTIVE: To model the potential interaction between previously identified biomarkers in children sarcomas using artificial neural network inference (ANNI. METHOD: To concisely demonstrate the biological interactions between correlated genes in an interaction network map, only 2 types of sarcomas in the children small round blue cell tumors (SRBCTs dataset are discussed in this paper. A backpropagation neural network was used to model the potential interaction between genes. The prediction weights and signal directions were used to model the strengths of the interaction signals and the direction of the interaction link between genes. The ANN model was validated using Monte Carlo cross-validation to minimize the risk of over-fitting and to optimize generalization ability of the model. RESULTS: Strong connection links on certain genes (TNNT1 and FNDC5 in rhabdomyosarcoma (RMS; FCGRT and OLFM1 in Ewing's sarcoma (EWS suggested their potency as central hubs in the interconnection of genes with different functionalities. The results showed that the RMS patients in this dataset are likely to be congenital and at low risk of cardiomyopathy development. The EWS patients are likely to be complicated by EWS-FLI fusion and deficiency in various signaling pathways, including Wnt, Fas/Rho and intracellular oxygen. CONCLUSIONS: The ANN network inference approach and the examination of identified genes in the published literature within the context of the disease highlights the substantial influence of certain genes in sarcomas.
Gao, Zhong-Ke; Cai, Qing; Dong, Na; Zhang, Shan-Shan; Bo, Yun; Zhang, Jie
2016-10-01
Distinguishing brain cognitive behavior underlying disabled and able-bodied subjects constitutes a challenging problem of significant importance. Complex network has established itself as a powerful tool for exploring functional brain networks, which sheds light on the inner workings of the human brain. Most existing works in constructing brain network focus on phase-synchronization measures between regional neural activities. In contrast, we propose a novel approach for inferring functional networks from P300 event-related potentials by integrating time and frequency domain information extracted from each channel signal, which we show to be efficient in subsequent pattern recognition. In particular, we construct brain network by regarding each channel signal as a node and determining the edges in terms of correlation of the extracted feature vectors. A six-choice P300 paradigm with six different images is used in testing our new approach, involving one able-bodied subject and three disabled subjects suffering from multiple sclerosis, cerebral palsy, traumatic brain and spinal-cord injury, respectively. We then exploit global efficiency, local efficiency and small-world indices from the derived brain networks to assess the network topological structure associated with different target images. The findings suggest that our method allows identifying brain cognitive behaviors related to visual stimulus between able-bodied and disabled subjects.
Modeling of a 5-cell direct methanol fuel cell using adaptive-network-based fuzzy inference systems
Wang, Rongrong; Qi, Liang; Xie, Xiaofeng; Ding, Qingqing; Li, Chunwen; Ma, ChenChi M.
The methanol concentrations, temperature and current were considered as inputs, the cell voltage was taken as output, and the performance of a direct methanol fuel cell (DMFC) was modeled by adaptive-network-based fuzzy inference systems (ANFIS). The artificial neural network (ANN) and polynomial-based models were selected to be compared with the ANFIS in respect of quality and accuracy. Based on the ANFIS model obtained, the characteristics of the DMFC were studied. The results show that temperature and methanol concentration greatly affect the performance of the DMFC. Within a restricted current range, the methanol concentration does not greatly affect the stack voltage. In order to obtain higher fuel utilization efficiency, the methanol concentrations and temperatures should be adjusted according to the load on the system.
Energy Technology Data Exchange (ETDEWEB)
Li, Kangji [Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027 (China); School of Electricity Information Engineering, Jiangsu University, Zhenjiang 212013 (China); Su, Hongye [Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027 (China)
2010-11-15
There are several ways to forecast building energy consumption, varying from simple regression to models based on physical principles. In this paper, a new method, namely, the hybrid genetic algorithm-hierarchical adaptive network-based fuzzy inference system (GA-HANFIS) model is developed. In this model, hierarchical structure decreases the rule base dimension. Both clustering and rule base parameters are optimized by GAs and neural networks (NNs). The model is applied to predict a hotel's daily air conditioning consumption for a period over 3 months. The results obtained by the proposed model are presented and compared with regular method of NNs, which indicates that GA-HANFIS model possesses better performance than NNs in terms of their forecasting accuracy. (author)
Modeling of a 5-cell direct methanol fuel cell using adaptive-network-based fuzzy inference systems
Energy Technology Data Exchange (ETDEWEB)
Wang, Rongrong; Li, Chunwen [Department of Automation, Tsinghua University, Beijing 100084 (China); Qi, Liang; Xie, Xiaofeng [Institute of Nuclear and New Energy Technology, Tsinghua University, Beijing 100084 (China); Ding, Qingqing [Department of Electrical Engineering, Tsinghua University, Beijing 100084 (China); Ma, ChenChi M. [National Tsing Hua University, Hsinchu 300 (China)
2008-12-01
The methanol concentrations, temperature and current were considered as inputs, the cell voltage was taken as output, and the performance of a direct methanol fuel cell (DMFC) was modeled by adaptive-network-based fuzzy inference systems (ANFIS). The artificial neural network (ANN) and polynomial-based models were selected to be compared with the ANFIS in respect of quality and accuracy. Based on the ANFIS model obtained, the characteristics of the DMFC were studied. The results show that temperature and methanol concentration greatly affect the performance of the DMFC. Within a restricted current range, the methanol concentration does not greatly affect the stack voltage. In order to obtain higher fuel utilization efficiency, the methanol concentrations and temperatures should be adjusted according to the load on the system. (author)
Directory of Open Access Journals (Sweden)
Ignat Drozdov
Full Text Available Small intestinal (SI neuroendocrine tumors (NET are increasing in incidence, however little is known about their biology. High throughput techniques such as inference of gene regulatory networks from microarray experiments can objectively define signaling machinery in this disease. Genome-wide co-expression analysis was used to infer gene relevance network in SI-NETs. The network was confirmed to be non-random, scale-free, and highly modular. Functional analysis of gene co-expression modules revealed processes including 'Nervous system development', 'Immune response', and 'Cell-cycle'. Importantly, gene network topology and differential expression analysis identified over-expression of the GPCR signaling regulators, the cAMP synthetase, ADCY2, and the protein kinase A, PRKAR1A. Seven CREB response element (CRE transcripts associated with proliferation and secretion: BEX1, BICD1, CHGB, CPE, GABRB3, SCG2 and SCG3 as well as ADCY2 and PRKAR1A were measured in an independent SI dataset (n = 10 NETs; n = 8 normal preparations. All were up-regulated (p<0.035 with the exception of SCG3 which was not differently expressed. Forskolin (a direct cAMP activator, 10(-5 M significantly stimulated transcription of pCREB and 3/7 CREB targets, isoproterenol (a selective ß-adrenergic receptor agonist and cAMP activator, 10(-5 M stimulated pCREB and 4/7 targets while BIM-53061 (a dopamine D(2 and Serotonin [5-HT(2] receptor agonist, 10(-6 M stimulated 100% of targets as well as pCREB; CRE transcription correlated with the levels of cAMP accumulation and PKA activity; BIM-53061 stimulated the highest levels of cAMP and PKA (2.8-fold and 2.5-fold vs. 1.8-2-fold for isoproterenol and forskolin. Gene network inference and graph topology analysis in SI NETs suggests that SI NETs express neural GPCRs that activate different CRE targets associated with proliferation and secretion. In vitro studies, in a model NET cell system, confirmed that transcriptional
Inferring network structure in non-normal and mixed discrete-continuous genomic data.
Bhadra, Anindya; Rao, Arvind; Baladandayuthapani, Veerabhadran
2018-03-01
Inferring dependence structure through undirected graphs is crucial for uncovering the major modes of multivariate interaction among high-dimensional genomic markers that are potentially associated with cancer. Traditionally, conditional independence has been studied using sparse Gaussian graphical models for continuous data and sparse Ising models for discrete data. However, there are two clear situations when these approaches are inadequate. The first occurs when the data are continuous but display non-normal marginal behavior such as heavy tails or skewness, rendering an assumption of normality inappropriate. The second occurs when a part of the data is ordinal or discrete (e.g., presence or absence of a mutation) and the other part is continuous (e.g., expression levels of genes or proteins). In this case, the existing Bayesian approaches typically employ a latent variable framework for the discrete part that precludes inferring conditional independence among the data that are actually observed. The current article overcomes these two challenges in a unified framework using Gaussian scale mixtures. Our framework is able to handle continuous data that are not normal and data that are of mixed continuous and discrete nature, while still being able to infer a sparse conditional sign independence structure among the observed data. Extensive performance comparison in simulations with alternative techniques and an analysis of a real cancer genomics data set demonstrate the effectiveness of the proposed approach. © 2017, The International Biometric Society.
Directory of Open Access Journals (Sweden)
P. Bhattacharya
2007-11-01
Full Text Available To achieve an effective and safe operation on the machine system where the human interacts with the machine mutually, there is a need for the machine to understand the human state, especially cognitive state, when the human's operation task demands an intensive cognitive activity. Due to a well-known fact with the human being, a highly uncertain cognitive state and behavior as well as expressions or cues, the recent trend to infer the human state is to consider multimodality features of the human operator. In this paper, we present a method for multimodality inferring of human cognitive states by integrating neuro-fuzzy network and information fusion techniques. To demonstrate the effectiveness of this method, we take the driver fatigue detection as an example. The proposed method has, in particular, the following new features. First, human expressions are classified into four categories: (i casual or contextual feature, (ii contact feature, (iii contactless feature, and (iv performance feature. Second, the fuzzy neural network technique, in particular Takagi-Sugeno-Kang (TSK model, is employed to cope with uncertain behaviors. Third, the sensor fusion technique, in particular ordered weighted aggregation (OWA, is integrated with the TSK model in such a way that cues are taken as inputs to the TSK model, and then the outputs of the TSK are fused by the OWA which gives outputs corresponding to particular cognitive states under interest (e.g., fatigue. We call this method TSK-OWA. Validation of the TSK-OWA, performed in the Northeastern University vehicle drive simulator, has shown that the proposed method is promising to be a general tool for human cognitive state inferring and a special tool for the driver fatigue detection.
Inferring the physical connectivity of complex networks from their functional dynamics
Directory of Open Access Journals (Sweden)
Holm Liisa
2010-05-01
Full Text Available Abstract Background Biological networks, such as protein-protein interactions, metabolic, signalling, transcription-regulatory networks and neural synapses, are representations of large-scale dynamic systems. The relationship between the network structure and functions remains one of the central problems in current multidisciplinary research. Significant progress has been made toward understanding the implication of topological features for the network dynamics and functions, especially in biological networks. Given observations of a network system's behaviours or measurements of its functional dynamics, what can we conclude of the details of physical connectivity of the underlying structure? Results We modelled the network system by employing a scale-free network of coupled phase oscillators. Pairwise phase coherence (PPC was calculated for all the pairs of oscillators to present functional dynamics induced by the system. At the regime of global incoherence, we observed a Significant pairwise synchronization only between two nodes that are physically connected. Right after the onset of global synchronization, disconnected nodes begin to oscillate in a correlated fashion and the PPC of two nodes, either connected or disconnected, depends on their degrees. Based on the observation of PPCs, we built a weighted network of synchronization (WNS, an all-to-all functionally connected network where each link is weighted by the PPC of two oscillators at the ends of the link. In the regime of strong coupling, we observed a Significant similarity in the organization of WNSs induced by systems sharing the same substrate network but different configurations of initial phases and intrinsic frequencies of oscillators. We reconstruct physical network from the WNS by choosing the links whose weights are higher than a given threshold. We observed an optimal reconstruction just before the onset of global synchronization. Finally, we correlated the topology of the
Jin, Nana; Wu, Deng; Gong, Yonghui; Bi, Xiaoman; Jiang, Hong; Li, Kongning; Wang, Qianghu
2014-01-01
An increasing number of experiments have been designed to detect intracellular and intercellular molecular interactions. Based on these molecular interactions (especially protein interactions), molecular networks have been built for using in several typical applications, such as the discovery of new disease genes and the identification of drug targets and molecular complexes. Because the data are incomplete and a considerable number of false-positive interactions exist, protein interactions from different sources are commonly integrated in network analyses to build a stable molecular network. Although various types of integration strategies are being applied in current studies, the topological properties of the networks from these different integration strategies, especially typical applications based on these network integration strategies, have not been rigorously evaluated. In this paper, systematic analyses were performed to evaluate 11 frequently used methods using two types of integration strategies: empirical and machine learning methods. The topological properties of the networks of these different integration strategies were found to significantly differ. Moreover, these networks were found to dramatically affect the outcomes of typical applications, such as disease gene predictions, drug target detections, and molecular complex identifications. The analysis presented in this paper could provide an important basis for future network-based biological researches. PMID:25243127
The fidelity of Kepler eclipsing binary parameters inferred by the neural network
Holanda, N.; da Silva, J. R. P.
2018-04-01
This work aims to test the fidelity and efficiency of obtaining automatic orbital elements of eclipsing binary systems, from light curves using neural network models. We selected a random sample with 78 systems, from over 1400 eclipsing binary detached obtained from the Kepler Eclipsing Binaries Catalog, processed using the neural network approach. The orbital parameters of the sample systems were measured applying the traditional method of light curve adjustment with uncertainties calculated by the bootstrap method, employing the JKTEBOP code. These estimated parameters were compared with those obtained by the neural network approach for the same systems. The results reveal a good agreement between techniques for the sum of the fractional radii and moderate agreement for e cos ω and e sin ω, but orbital inclination is clearly underestimated in neural network tests.
Moraes, Alvaro
2015-01-01
Networks (SRNs), that are intended to describe the time evolution of interacting particle systems where one particle interacts with the others through a finite set of reaction channels. SRNs have been mainly developed to model biochemical reactions
Ghanat Bari, Mehrab; Ung, Choong Yong; Zhang, Cheng; Zhu, Shizhen; Li, Hu
2017-08-01
Emerging evidence indicates the existence of a new class of cancer genes that act as "signal linkers" coordinating oncogenic signals between mutated and differentially expressed genes. While frequently mutated oncogenes and differentially expressed genes, which we term Class I cancer genes, are readily detected by most analytical tools, the new class of cancer-related genes, i.e., Class II, escape detection because they are neither mutated nor differentially expressed. Given this hypothesis, we developed a Machine Learning-Assisted Network Inference (MALANI) algorithm, which assesses all genes regardless of expression or mutational status in the context of cancer etiology. We used 8807 expression arrays, corresponding to 9 cancer types, to build more than 2 × 10 8 Support Vector Machine (SVM) models for reconstructing a cancer network. We found that ~3% of ~19,000 not differentially expressed genes are Class II cancer gene candidates. Some Class II genes that we found, such as SLC19A1 and ATAD3B, have been recently reported to associate with cancer outcomes. To our knowledge, this is the first study that utilizes both machine learning and network biology approaches to uncover Class II cancer genes in coordinating functionality in cancer networks and will illuminate our understanding of how genes are modulated in a tissue-specific network contribute to tumorigenesis and therapy development.
Random Walks on Directed Networks: Inference and Respondent-Driven Sampling
Directory of Open Access Journals (Sweden)
Malmros Jens
2016-06-01
Full Text Available Respondent-driven sampling (RDS is often used to estimate population properties (e.g., sexual risk behavior in hard-to-reach populations. In RDS, already sampled individuals recruit population members to the sample from their social contacts in an efficient snowball-like sampling procedure. By assuming a Markov model for the recruitment of individuals, asymptotically unbiased estimates of population characteristics can be obtained. Current RDS estimation methodology assumes that the social network is undirected, that is, all edges are reciprocal. However, empirical social networks in general also include a substantial number of nonreciprocal edges. In this article, we develop an estimation method for RDS in populations connected by social networks that include reciprocal and nonreciprocal edges. We derive estimators of the selection probabilities of individuals as a function of the number of outgoing edges of sampled individuals. The proposed estimators are evaluated on artificial and empirical networks and are shown to generally perform better than existing estimators. This is the case in particular when the fraction of directed edges in the network is large.
dynGENIE3: dynamical GENIE3 for the inference of gene networks from time series expression data.
Huynh-Thu, Vân Anh; Geurts, Pierre
2018-02-21
The elucidation of gene regulatory networks is one of the major challenges of systems biology. Measurements about genes that are exploited by network inference methods are typically available either in the form of steady-state expression vectors or time series expression data. In our previous work, we proposed the GENIE3 method that exploits variable importance scores derived from Random forests to identify the regulators of each target gene. This method provided state-of-the-art performance on several benchmark datasets, but it could however not specifically be applied to time series expression data. We propose here an adaptation of the GENIE3 method, called dynamical GENIE3 (dynGENIE3), for handling both time series and steady-state expression data. The proposed method is evaluated extensively on the artificial DREAM4 benchmarks and on three real time series expression datasets. Although dynGENIE3 does not systematically yield the best performance on each and every network, it is competitive with diverse methods from the literature, while preserving the main advantages of GENIE3 in terms of scalability.
Czech Academy of Sciences Publication Activity Database
Djordjilović, V.; Chiogna, M.; Vomlel, Jiří
2017-01-01
Roč. 88, č. 1 (2017), s. 602-613 ISSN 0888-613X R&D Projects: GA ČR(CZ) GA16-12010S Institutional support: RVO:67985556 Keywords : Bayesian networks * Structure learning * Reverse engineering * Gene networks Subject RIV: JD - Computer Applications, Robotics OBOR OECD: Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8) Impact factor: 2.845, year: 2016 http://library.utia.cas.cz/separaty/2017/MTR/vomlel-0477168.pdf
Vilanova, Pedro
2016-01-01
reaction networks (SRNs). We apply this stochastic representation to the computation of efficient approximations of expected values of functionals of SRN bridges, i.e., SRNs conditional on their values in the extremes of given time-intervals. We then employ
Inferring the interplay between network structure and market effects in Bitcoin
International Nuclear Information System (INIS)
Kondor, Dániel; Csabai, István; Szüle, János; Pósfai, Márton; Vattay, Gábor
2014-01-01
A main focus in economics research is understanding the time series of prices of goods and assets. While statistical models using only the properties of the time series itself have been successful in many aspects, we expect to gain a better understanding of the phenomena involved if we can model the underlying system of interacting agents. In this article, we consider the history of Bitcoin, a novel digital currency system, for which the complete list of transactions is available for analysis. Using this dataset, we reconstruct the transaction network between users and analyze changes in the structure of the subgraph induced by the most active users. Our approach is based on the unsupervised identification of important features of the time variation of the network. Applying the widely used method of Principal Component Analysis to the matrix constructed from snapshots of the network at different times, we are able to show how structural changes in the network accompany significant changes in the exchange price of bitcoins. (paper)
Inferring the interplay between network structure and market effects in Bitcoin
Kondor, Dániel; Csabai, István; Szüle, János; Pósfai, Márton; Vattay, Gábor
2014-12-01
A main focus in economics research is understanding the time series of prices of goods and assets. While statistical models using only the properties of the time series itself have been successful in many aspects, we expect to gain a better understanding of the phenomena involved if we can model the underlying system of interacting agents. In this article, we consider the history of Bitcoin, a novel digital currency system, for which the complete list of transactions is available for analysis. Using this dataset, we reconstruct the transaction network between users and analyze changes in the structure of the subgraph induced by the most active users. Our approach is based on the unsupervised identification of important features of the time variation of the network. Applying the widely used method of Principal Component Analysis to the matrix constructed from snapshots of the network at different times, we are able to show how structural changes in the network accompany significant changes in the exchange price of bitcoins.
Reliability of Inference of Directed Climate Networks Using Conditional Mutual Information
Czech Academy of Sciences Publication Activity Database
Hlinka, Jaroslav; Hartman, David; Vejmelka, Martin; Runge, J.; Marwan, N.; Kurths, J.; Paluš, Milan
2013-01-01
Roč. 15, č. 6 (2013), s. 2023-2045 ISSN 1099-4300 R&D Projects: GA ČR GCP103/11/J068 Institutional support: RVO:67985807 Keywords : causality * climate * nonlinearity * transfer entropy * network * stability Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 1.564, year: 2013
Inference of directed climate networks: role of instability of causality estimation methods
Czech Academy of Sciences Publication Activity Database
Hlinka, Jaroslav; Hartman, David; Vejmelka, Martin; Paluš, Milan
2013-01-01
Roč. 15, - (2013), s. 12987 ISSN 1607-7962. [European Geosciences Union General Assembly 2013. 07.04.2013-12.04.2013, Vienna] R&D Projects: GA ČR GCP103/11/J068 Institutional support: RVO:67985807 Keywords : causality analysis * climate networks Subject RIV: BB - Applied Statistics, Operational Research
Inferring monopartite projections of bipartite networks: an entropy-based approach
Saracco, Fabio; Straka, Mika J.; Di Clemente, Riccardo; Gabrielli, Andrea; Caldarelli, Guido; Squartini, Tiziano
2017-05-01
Bipartite networks are currently regarded as providing a major insight into the organization of many real-world systems, unveiling the mechanisms driving the interactions occurring between distinct groups of nodes. One of the most important issues encountered when modeling bipartite networks is devising a way to obtain a (monopartite) projection on the layer of interest, which preserves as much as possible the information encoded into the original bipartite structure. In the present paper we propose an algorithm to obtain statistically-validated projections of bipartite networks, according to which any two nodes sharing a statistically-significant number of neighbors are linked. Since assessing the statistical significance of nodes similarity requires a proper statistical benchmark, here we consider a set of four null models, defined within the exponential random graph framework. Our algorithm outputs a matrix of link-specific p-values, from which a validated projection is straightforwardly obtainable, upon running a multiple hypothesis testing procedure. Finally, we test our method on an economic network (i.e. the countries-products World Trade Web representation) and a social network (i.e. MovieLens, collecting the users’ ratings of a list of movies). In both cases non-trivial communities are detected: while projecting the World Trade Web on the countries layer reveals modules of similarly-industrialized nations, projecting it on the products layer allows communities characterized by an increasing level of complexity to be detected; in the second case, projecting MovieLens on the films layer allows clusters of movies whose affinity cannot be fully accounted for by genre similarity to be individuated.
Inferring phylogenetic networks by the maximum parsimony criterion: a case study.
Jin, Guohua; Nakhleh, Luay; Snir, Sagi; Tuller, Tamir
2007-01-01
Horizontal gene transfer (HGT) may result in genes whose evolutionary histories disagree with each other, as well as with the species tree. In this case, reconciling the species and gene trees results in a network of relationships, known as the "phylogenetic network" of the set of species. A phylogenetic network that incorporates HGT consists of an underlying species tree that captures vertical inheritance and a set of edges which model the "horizontal" transfer of genetic material. In a series of papers, Nakhleh and colleagues have recently formulated a maximum parsimony (MP) criterion for phylogenetic networks, provided an array of computationally efficient algorithms and heuristics for computing it, and demonstrated its plausibility on simulated data. In this article, we study the performance and robustness of this criterion on biological data. Our findings indicate that MP is very promising when its application is extended to the domain of phylogenetic network reconstruction and HGT detection. In all cases we investigated, the MP criterion detected the correct number of HGT events required to map the evolutionary history of a gene data set onto the species phylogeny. Furthermore, our results indicate that the criterion is robust with respect to both incomplete taxon sampling and the use of different site substitution matrices. Finally, our results show that the MP criterion is very promising in detecting HGT in chimeric genes, whose evolutionary histories are a mix of vertical and horizontal evolution. Besides the performance analysis of MP, our findings offer new insights into the evolution of 4 biological data sets and new possible explanations of HGT scenarios in their evolutionary history.
Inferring pregnancy episodes and outcomes within a network of observational databases.
Directory of Open Access Journals (Sweden)
Amy Matcho
Full Text Available Administrative claims and electronic health records are valuable resources for evaluating pharmaceutical effects during pregnancy. However, direct measures of gestational age are generally not available. Establishing a reliable approach to infer the duration and outcome of a pregnancy could improve pharmacovigilance activities. We developed and applied an algorithm to define pregnancy episodes in four observational databases: three US-based claims databases: Truven MarketScan® Commercial Claims and Encounters (CCAE, Truven MarketScan® Multi-state Medicaid (MDCD, and the Optum ClinFormatics® (Optum database and one non-US database, the United Kingdom (UK based Clinical Practice Research Datalink (CPRD. Pregnancy outcomes were classified as live births, stillbirths, abortions and ectopic pregnancies. Start dates were estimated using a derived hierarchy of available pregnancy markers, including records such as last menstrual period and nuchal ultrasound dates. Validation included clinical adjudication of 700 electronic Optum and CPRD pregnancy episode profiles to assess the operating characteristics of the algorithm, and a comparison of the algorithm's Optum pregnancy start estimates to starts based on dates of assisted conception procedures. Distributions of pregnancy outcome types were similar across all four data sources and pregnancy episode lengths found were as expected for all outcomes, excepting term lengths in episodes that used amenorrhea and urine pregnancy tests for start estimation. Validation survey results found highest agreement between reviewer chosen and algorithm operating characteristics for questions assessing pregnancy status and accuracy of outcome category with 99-100% agreement for Optum and CPRD. Outcome date agreement within seven days in either direction ranged from 95-100%, while start date agreement within seven days in either direction ranged from 90-97%. In Optum validation sensitivity analysis, a total of 73% of
Inferring pregnancy episodes and outcomes within a network of observational databases
Ryan, Patrick; Fife, Daniel; Gifkins, Dina; Knoll, Chris; Friedman, Andrew
2018-01-01
Administrative claims and electronic health records are valuable resources for evaluating pharmaceutical effects during pregnancy. However, direct measures of gestational age are generally not available. Establishing a reliable approach to infer the duration and outcome of a pregnancy could improve pharmacovigilance activities. We developed and applied an algorithm to define pregnancy episodes in four observational databases: three US-based claims databases: Truven MarketScan® Commercial Claims and Encounters (CCAE), Truven MarketScan® Multi-state Medicaid (MDCD), and the Optum ClinFormatics® (Optum) database and one non-US database, the United Kingdom (UK) based Clinical Practice Research Datalink (CPRD). Pregnancy outcomes were classified as live births, stillbirths, abortions and ectopic pregnancies. Start dates were estimated using a derived hierarchy of available pregnancy markers, including records such as last menstrual period and nuchal ultrasound dates. Validation included clinical adjudication of 700 electronic Optum and CPRD pregnancy episode profiles to assess the operating characteristics of the algorithm, and a comparison of the algorithm’s Optum pregnancy start estimates to starts based on dates of assisted conception procedures. Distributions of pregnancy outcome types were similar across all four data sources and pregnancy episode lengths found were as expected for all outcomes, excepting term lengths in episodes that used amenorrhea and urine pregnancy tests for start estimation. Validation survey results found highest agreement between reviewer chosen and algorithm operating characteristics for questions assessing pregnancy status and accuracy of outcome category with 99–100% agreement for Optum and CPRD. Outcome date agreement within seven days in either direction ranged from 95–100%, while start date agreement within seven days in either direction ranged from 90–97%. In Optum validation sensitivity analysis, a total of 73% of
Inferring Social Isolation in Older Adults through Ambient Intelligence and Social Networking Sites
Campos, Wilfrido; Martinez, Alicia; Sanchez, Wendy; Estrada, Hugo; Favela, Jesus; Perez, Joaquin
2016-01-01
Abstract Early diagnosis of social isolation in older adults can prevent physical and cognitive impairment or further impoverishment of their social network. This diagnosis is usually performed by personal and periodic application of psychological assessment instruments. This situation encourages the development of novel approaches able to monitor risk situations in social interactions to obtain early diagnosis and implement appropriate measures. This paper presents the development of a predi...
Inferring social status and rich club effects in enterprise communication networks.
Dong, Yuxiao; Tang, Jie; Chawla, Nitesh V; Lou, Tiancheng; Yang, Yang; Wang, Bai
2015-01-01
Social status, defined as the relative rank or position that an individual holds in a social hierarchy, is known to be among the most important motivating forces in social behaviors. In this paper, we consider the notion of status from the perspective of a position or title held by a person in an enterprise. We study the intersection of social status and social networks in an enterprise. We study whether enterprise communication logs can help reveal how social interactions and individual status manifest themselves in social networks. To that end, we use two enterprise datasets with three communication channels--voice call, short message, and email--to demonstrate the social-behavioral differences among individuals with different status. We have several interesting findings and based on these findings we also develop a model to predict social status. On the individual level, high-status individuals are more likely to be spanned as structural holes by linking to people in parts of the enterprise networks that are otherwise not well connected to one another. On the community level, the principle of homophily, social balance and clique theory generally indicate a "rich club" maintained by high-status individuals, in the sense that this community is much more connected, balanced and dense. Our model can predict social status of individuals with 93% accuracy.
Mc Mahon, Siobhan S; Sim, Aaron; Filippi, Sarah; Johnson, Robert; Liepe, Juliane; Smith, Dominic; Stumpf, Michael P H
2014-11-01
Sensing and responding to the environment are two essential functions that all biological organisms need to master for survival and successful reproduction. Developmental processes are marshalled by a diverse set of signalling and control systems, ranging from systems with simple chemical inputs and outputs to complex molecular and cellular networks with non-linear dynamics. Information theory provides a powerful and convenient framework in which such systems can be studied; but it also provides the means to reconstruct the structure and dynamics of molecular interaction networks underlying physiological and developmental processes. Here we supply a brief description of its basic concepts and introduce some useful tools for systems and developmental biologists. Along with a brief but thorough theoretical primer, we demonstrate the wide applicability and biological application-specific nuances by way of different illustrative vignettes. In particular, we focus on the characterisation of biological information processing efficiency, examining cell-fate decision making processes, gene regulatory network reconstruction, and efficient signal transduction experimental design. Copyright © 2014 Elsevier Ltd. All rights reserved.
Directory of Open Access Journals (Sweden)
Jung-Shyr Wu
2012-01-01
Full Text Available CAC (Call Admission Control plays a significant role in providing QoS (Quality of Service in mobile wireless networks. In addition to much research that focuses on modified Mobile IP to get better efficient handover performance, CAC should be introduced to Mobile IP-based network to guarantee the QoS for users. In this paper, we propose a CAC scheme which incorporates multiple traffic types and adjusts the admission threshold dynamically using fuzzy control logic to achieve better usage of resources. The method can provide QoS in Mobile IPv6 networks with few modifications on MAP (Mobility Anchor Point functionality and slight change in BU (Binding Update message formats. According to the simulation results, the proposed scheme presents good performance of voice and video traffic at the expenses of poor performance on data traffic. It is evident that these CAC schemes can reduce the probability of the handoff dropping and the cell overload and limit the probability of the new call blocking.
Inferring the effective TOR-dependent network: a computational study in yeast.
Mohammadi, Shahin; Subramaniam, Shankar; Grama, Ananth
2013-08-30
Calorie restriction (CR) is one of the most conserved non-genetic interventions that extends healthspan in evolutionarily distant species, ranging from yeast to mammals. The target of rapamycin (TOR) has been shown to play a key role in mediating healthspan extension in response to CR by integrating different signals that monitor nutrient-availability and orchestrating various components of cellular machinery in response. Both genetic and pharmacological interventions that inhibit the TOR pathway exhibit a similar phenotype, which is not further amplified by CR. In this paper, we present the first comprehensive, computationally derived map of TOR downstream effectors, with the objective of discovering key lifespan mediators, their crosstalk, and high-level organization. We adopt a systematic approach for tracing information flow from the TOR complex and use it to identify relevant signaling elements. By constructing a high-level functional map of TOR downstream effectors, we show that our approach is not only capable of recapturing previously known pathways, but also suggests potential targets for future studies.Information flow scores provide an aggregate ranking of relevance of proteins with respect to the TOR signaling pathway. These rankings must be normalized for degree bias, appropriately interpreted, and mapped to associated roles in pathways. We propose a novel statistical framework for integrating information flow scores, the set of differentially expressed genes in response to rapamycin treatment, and the transcriptional regulatory network. We use this framework to identify the most relevant transcription factors in mediating the observed transcriptional response, and to construct the effective response network of the TOR pathway. This network is hypothesized to mediate life-span extension in response to TOR inhibition. Our approach, unlike experimental methods, is not limited to specific aspects of cellular response. Rather, it predicts transcriptional
Directory of Open Access Journals (Sweden)
Jonghoon Bae
Full Text Available This study examines whether the way that a person makes inferences about unknown events is associated with his or her social relations, more precisely, those characterized by ego network density that reflects the structure of a person's immediate social relation. From the analysis of individual predictions over the Go match between AlphaGo and Sedol Lee in March 2016 in Seoul, Korea, this study shows that the low-density group scored higher than the high-density group in the accuracy of the prediction over a future state of a social event, i.e., the outcome of the first game. We corroborated this finding with three replication tests that asked the participants to predict the following: film awards, President Park's impeachment in Korea, and the counterfactual assessment of the US presidential election. Taken together, this study suggests that network density is negatively associated with vision advantage, i.e., the ability to discover and forecast an unknown aspect of a social event.
International Nuclear Information System (INIS)
Azadeh, A.; Asadzadeh, S.M.; Ghanbari, A.
2010-01-01
Accurate short-term natural gas (NG) demand estimation and forecasting is vital for policy and decision-making process in energy sector. Moreover, conventional methods may not provide accurate results. This paper presents an adaptive network-based fuzzy inference system (ANFIS) for estimation of NG demand. Standard input variables are used which are day of the week, demand of the same day in previous year, demand of a day before and demand of 2 days before. The proposed ANFIS approach is equipped with pre-processing and post-processing concepts. Moreover, input data are pre-processed (scaled) and finally output data are post-processed (returned to its original scale). The superiority and applicability of the ANFIS approach is shown for Iranian NG consumption from 22/12/2007 to 30/6/2008. Results show that ANFIS provides more accurate results than artificial neural network (ANN) and conventional time series approach. The results of this study provide policy makers with an appropriate tool to make more accurate predictions on future short-term NG demand. This is because the proposed approach is capable of handling non-linearity, complexity as well as uncertainty that may exist in actual data sets due to erratic responses and measurement errors.
Brain networks for confidence weighting and hierarchical inference during probabilistic learning.
Meyniel, Florent; Dehaene, Stanislas
2017-05-09
Learning is difficult when the world fluctuates randomly and ceaselessly. Classical learning algorithms, such as the delta rule with constant learning rate, are not optimal. Mathematically, the optimal learning rule requires weighting prior knowledge and incoming evidence according to their respective reliabilities. This "confidence weighting" implies the maintenance of an accurate estimate of the reliability of what has been learned. Here, using fMRI and an ideal-observer analysis, we demonstrate that the brain's learning algorithm relies on confidence weighting. While in the fMRI scanner, human adults attempted to learn the transition probabilities underlying an auditory or visual sequence, and reported their confidence in those estimates. They knew that these transition probabilities could change simultaneously at unpredicted moments, and therefore that the learning problem was inherently hierarchical. Subjective confidence reports tightly followed the predictions derived from the ideal observer. In particular, subjects managed to attach distinct levels of confidence to each learned transition probability, as required by Bayes-optimal inference. Distinct brain areas tracked the likelihood of new observations given current predictions, and the confidence in those predictions. Both signals were combined in the right inferior frontal gyrus, where they operated in agreement with the confidence-weighting model. This brain region also presented signatures of a hierarchical process that disentangles distinct sources of uncertainty. Together, our results provide evidence that the sense of confidence is an essential ingredient of probabilistic learning in the human brain, and that the right inferior frontal gyrus hosts a confidence-based statistical learning algorithm for auditory and visual sequences.
Brain networks for confidence weighting and hierarchical inference during probabilistic learning
Meyniel, Florent; Dehaene, Stanislas
2017-01-01
Learning is difficult when the world fluctuates randomly and ceaselessly. Classical learning algorithms, such as the delta rule with constant learning rate, are not optimal. Mathematically, the optimal learning rule requires weighting prior knowledge and incoming evidence according to their respective reliabilities. This “confidence weighting” implies the maintenance of an accurate estimate of the reliability of what has been learned. Here, using fMRI and an ideal-observer analysis, we demonstrate that the brain’s learning algorithm relies on confidence weighting. While in the fMRI scanner, human adults attempted to learn the transition probabilities underlying an auditory or visual sequence, and reported their confidence in those estimates. They knew that these transition probabilities could change simultaneously at unpredicted moments, and therefore that the learning problem was inherently hierarchical. Subjective confidence reports tightly followed the predictions derived from the ideal observer. In particular, subjects managed to attach distinct levels of confidence to each learned transition probability, as required by Bayes-optimal inference. Distinct brain areas tracked the likelihood of new observations given current predictions, and the confidence in those predictions. Both signals were combined in the right inferior frontal gyrus, where they operated in agreement with the confidence-weighting model. This brain region also presented signatures of a hierarchical process that disentangles distinct sources of uncertainty. Together, our results provide evidence that the sense of confidence is an essential ingredient of probabilistic learning in the human brain, and that the right inferior frontal gyrus hosts a confidence-based statistical learning algorithm for auditory and visual sequences. PMID:28439014
Directory of Open Access Journals (Sweden)
Nicholas J Hudson
Full Text Available BACKGROUND: Despite modern technologies and novel computational approaches, decoding causal transcriptional regulation remains challenging. This is particularly true for less well studied organisms and when only gene expression data is available. In muscle a small number of well characterised transcription factors are proposed to regulate development. Therefore, muscle appears to be a tractable system for proposing new computational approaches. METHODOLOGY/PRINCIPAL FINDINGS: Here we report a simple algorithm that asks "which transcriptional regulator has the highest average absolute co-expression correlation to the genes in a co-expression module?" It correctly infers a number of known causal regulators of fundamental biological processes, including cell cycle activity (E2F1, glycolysis (HLF, mitochondrial transcription (TFB2M, adipogenesis (PIAS1, neuronal development (TLX3, immune function (IRF1 and vasculogenesis (SOX17, within a skeletal muscle context. However, none of the canonical pro-myogenic transcription factors (MYOD1, MYOG, MYF5, MYF6 and MEF2C were linked to muscle structural gene expression modules. Co-expression values were computed using developing bovine muscle from 60 days post conception (early foetal to 30 months post natal (adulthood for two breeds of cattle, in addition to a nutritional comparison with a third breed. A number of transcriptional landscapes were constructed and integrated into an always correlated landscape. One notable feature was a 'metabolic axis' formed from glycolysis genes at one end, nuclear-encoded mitochondrial protein genes at the other, and centrally tethered by mitochondrially-encoded mitochondrial protein genes. CONCLUSIONS/SIGNIFICANCE: The new module-to-regulator algorithm complements our recently described Regulatory Impact Factor analysis. Together with a simple examination of a co-expression module's contents, these three gene expression approaches are starting to illuminate the in vivo
Directory of Open Access Journals (Sweden)
Browne Fiona
2006-12-01
Full Text Available Protein-protein interactions (PPI play a key role in many biological systems. Over the past few years, an explosion in availability of functional biological data obtained from high-throughput technologies to infer PPI has been observed. However, results obtained from such experiments show high rates of false positives and false negatives predictions as well as systematic predictive bias. Recent research has revealed that several machine and statistical learning methods applied to integrate relatively weak, diverse sources of large-scale functional data may provide improved predictive accuracy and coverage of PPI. In this paper we describe the effects of applying different computational, integrative methods to predict PPI in Saccharomyces cerevisiae. We investigated the predictive ability of combining different sets of relatively strong and weak predictive datasets. We analysed several genomic datasets ranging from mRNA co-expression to marginal essentiality. Moreover, we expanded an existing multi-source dataset from S. cerevisiae by constructing a new set of putative interactions extracted from Gene Ontology (GO- driven annotations in the Saccharomyces Genome Database. Different classification techniques: Simple Naive Bayesian (SNB, Multilayer Perceptron (MLP and K-Nearest Neighbors (KNN were evaluated. Relatively simple classification methods (i.e. less computing intensive and mathematically complex, such as SNB, have been proven to be proficient at predicting PPI. SNB produced the “highest” predictive quality obtaining an area under Receiver Operating Characteristic (ROC curve (AUC value of 0.99. The lowest AUC value of 0.90 was obtained by the KNN classifier. This assessment also demonstrates the strong predictive power of GO-driven models, which offered predictive performance above 0.90 using the different machine learning and statistical techniques. As the predictive power of single-source datasets became weaker MLP and SNB performed
Tejedor, E.; Saz, M. A.; Esper, J.; Cuadrat, J. M.; de Luis, M.
2017-08-01
Drought recurrence in the Mediterranean is regarded as a fundamental factor for socioeconomic development and the resilience of natural systems in context of global change. However, knowledge of past droughts has been hampered by the absence of high-resolution proxies. We present a drought reconstruction for the northeast of the Iberian Peninsula based on a new dendrochronology network considering the Standardized Evapotranspiration Precipitation Index (SPEI). A total of 774 latewood width series from 387 trees of P. sylvestris and P. uncinata was combined in an interregional chronology. The new chronology, calibrated against gridded climate data, reveals a robust relationship with the SPEI representing drought conditions of July and August. We developed a summer drought reconstruction for the period 1734-2013 representative for the northeastern and central Iberian Peninsula. We identified 16 extremely dry and 17 extremely wet summers and four decadal scale dry and wet periods, including 2003-2013 as the driest episode of the reconstruction.
A Neural Network Approach to Infer Optical Depth of Thick Ice Clouds at Night
Minnis, P.; Hong, G.; Sun-Mack, S.; Chen, Yan; Smith, W. L., Jr.
2016-01-01
One of the roadblocks to continuously monitoring cloud properties is the tendency of clouds to become optically black at cloud optical depths (COD) of 6 or less. This constraint dramatically reduces the quantitative information content at night. A recent study found that because of their diffuse nature, ice clouds remain optically gray, to some extent, up to COD of 100 at certain wavelengths. Taking advantage of this weak dependency and the availability of COD retrievals from CloudSat, an artificial neural network algorithm was developed to estimate COD values up to 70 from common satellite imager infrared channels. The method was trained using matched 2007 CloudSat and Aqua MODIS data and is tested using similar data from 2008. The results show a significant improvement over the use of default values at night with high correlation. This paper summarizes the results and suggests paths for future improvement.
Functional equivalency inferred from "authoritative sources" in networks of homologous proteins.
Natarajan, Shreedhar; Jakobsson, Eric
2009-06-12
A one-on-one mapping of protein functionality across different species is a critical component of comparative analysis. This paper presents a heuristic algorithm for discovering the Most Likely Functional Counterparts (MoLFunCs) of a protein, based on simple concepts from network theory. A key feature of our algorithm is utilization of the user's knowledge to assign high confidence to selected functional identification. We show use of the algorithm to retrieve functional equivalents for 7 membrane proteins, from an exploration of almost 40 genomes form multiple online resources. We verify the functional equivalency of our dataset through a series of tests that include sequence, structure and function comparisons. Comparison is made to the OMA methodology, which also identifies one-on-one mapping between proteins from different species. Based on that comparison, we believe that incorporation of user's knowledge as a key aspect of the technique adds value to purely statistical formal methods.
Directory of Open Access Journals (Sweden)
Xin Wang
Full Text Available Combinatorial gene perturbations provide rich information for a systematic exploration of genetic interactions. Despite successful applications to bacteria and yeast, the scalability of this approach remains a major challenge for higher organisms such as humans. Here, we report a novel experimental and computational framework to efficiently address this challenge by limiting the 'search space' for important genetic interactions. We propose to integrate rich phenotypes of multiple single gene perturbations to robustly predict functional modules, which can subsequently be subjected to further experimental investigations such as combinatorial gene silencing. We present posterior association networks (PANs to predict functional interactions between genes estimated using a Bayesian mixture modelling approach. The major advantage of this approach over conventional hypothesis tests is that prior knowledge can be incorporated to enhance predictive power. We demonstrate in a simulation study and on biological data, that integrating complementary information greatly improves prediction accuracy. To search for significant modules, we perform hierarchical clustering with multiscale bootstrap resampling. We demonstrate the power of the proposed methodologies in applications to Ewing's sarcoma and human adult stem cells using publicly available and custom generated data, respectively. In the former application, we identify a gene module including many confirmed and highly promising therapeutic targets. Genes in the module are also significantly overrepresented in signalling pathways that are known to be critical for proliferation of Ewing's sarcoma cells. In the latter application, we predict a functional network of chromatin factors controlling epidermal stem cell fate. Further examinations using ChIP-seq, ChIP-qPCR and RT-qPCR reveal that the basis of their genetic interactions may arise from transcriptional cross regulation. A Bioconductor package
Directory of Open Access Journals (Sweden)
Tingting Zheng
2017-09-01
Full Text Available Background: A range of computational methods that rely on the analysis of genome-wide expression datasets have been developed and successfully used for drug repositioning. The success of these methods is based on the hypothesis that introducing a factor (in this case, a drug molecule that could reverse the disease gene expression signature will lead to a therapeutic effect. However, it has also been shown that globally reversing the disease expression signature is not a prerequisite for drug activity. On the other hand, the basic idea of significant anti-correlation in expression profiles could have great value for establishing diet-disease associations and could provide new insights into the role of dietary interventions in disease.Methods: We performed an integrated analysis of publicly available gene expression profiles for foods, diseases and drugs, by calculating pairwise similarity scores for diet and disease gene expression signatures and characterizing their topological features in protein-protein interaction networks.Results: We identified 485 diet-disease pairs where diet could positively influence disease development and 472 pairs where specific diets should be avoided in a disease state. Multiple evidence suggests that orange, whey and coconut fat could be beneficial for psoriasis, lung adenocarcinoma and macular degeneration, respectively. On the other hand, fructose-rich diet should be restricted in patients with chronic intermittent hypoxia and ovarian cancer. Since humans normally do not consume foods in isolation, we also applied different algorithms to predict synergism; as a result, 58 food pairs were predicted. Interestingly, the diets identified as anti-correlated with diseases showed a topological proximity to the disease proteins similar to that of the corresponding drugs.Conclusions: In conclusion, we provide a computational framework for establishing diet-disease associations and additional information on the role of
International Nuclear Information System (INIS)
Sasaki, Masao S.; Tachibana, Akira; Takeda, Shunichi
2014-01-01
Cancer risk at low doses of ionizing radiation remains poorly defined because of ambiguity in the quantitative link to doses below 0.2 Sv in atomic bomb survivors in Hiroshima and Nagasaki arising from limitations in the statistical power and information available on overall radiation dose. To deal with these difficulties, a novel nonparametric statistics based on the ‘integrate-and-fire’ algorithm of artificial neural networks was developed and tested in cancer databases established by the Radiation Effects Research Foundation. The analysis revealed unique features at low doses that could not be accounted for by nominal exposure dose, including (1) the presence of a threshold that varied with organ, gender and age at exposure, and (2) a small but significant bumping increase in cancer risk at low doses in Nagasaki that probably reflects internal exposure to 239 Pu. The threshold was distinct from the canonical definition of zero effect in that it was manifested as negative excess relative risk, or suppression of background cancer rates. Such a unique tissue response at low doses of radiation exposure has been implicated in the context of the molecular basis of radiation–environment interplay in favor of recently emerging experimental evidence on DNA double-strand break repair pathway choice and its epigenetic memory by histone marking. (author)
Gogoshin, Grigoriy; Boerwinkle, Eric; Rodin, Andrei S
2017-04-01
Bayesian network (BN) reconstruction is a prototypical systems biology data analysis approach that has been successfully used to reverse engineer and model networks reflecting different layers of biological organization (ranging from genetic to epigenetic to cellular pathway to metabolomic). It is especially relevant in the context of modern (ongoing and prospective) studies that generate heterogeneous high-throughput omics datasets. However, there are both theoretical and practical obstacles to the seamless application of BN modeling to such big data, including computational inefficiency of optimal BN structure search algorithms, ambiguity in data discretization, mixing data types, imputation and validation, and, in general, limited scalability in both reconstruction and visualization of BNs. To overcome these and other obstacles, we present BNOmics, an improved algorithm and software toolkit for inferring and analyzing BNs from omics datasets. BNOmics aims at comprehensive systems biology-type data exploration, including both generating new biological hypothesis and testing and validating the existing ones. Novel aspects of the algorithm center around increasing scalability and applicability to varying data types (with different explicit and implicit distributional assumptions) within the same analysis framework. An output and visualization interface to widely available graph-rendering software is also included. Three diverse applications are detailed. BNOmics was originally developed in the context of genetic epidemiology data and is being continuously optimized to keep pace with the ever-increasing inflow of available large-scale omics datasets. As such, the software scalability and usability on the less than exotic computer hardware are a priority, as well as the applicability of the algorithm and software to the heterogeneous datasets containing many data types-single-nucleotide polymorphisms and other genetic/epigenetic/transcriptome variables, metabolite
Directory of Open Access Journals (Sweden)
Hue-Yu Wang
Full Text Available BACKGROUND: An adaptive-network-based fuzzy inference system (ANFIS was compared with an artificial neural network (ANN in terms of accuracy in predicting the combined effects of temperature (10.5 to 24.5°C, pH level (5.5 to 7.5, sodium chloride level (0.25% to 6.25% and sodium nitrite level (0 to 200 ppm on the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. METHODS: THE ANFIS AND ANN MODELS WERE COMPARED IN TERMS OF SIX STATISTICAL INDICES CALCULATED BY COMPARING THEIR PREDICTION RESULTS WITH ACTUAL DATA: mean absolute percentage error (MAPE, root mean square error (RMSE, standard error of prediction percentage (SEP, bias factor (Bf, accuracy factor (Af, and absolute fraction of variance (R (2. Graphical plots were also used for model comparison. CONCLUSIONS: The learning-based systems obtained encouraging prediction results. Sensitivity analyses of the four environmental factors showed that temperature and, to a lesser extent, NaCl had the most influence on accuracy in predicting the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. The observed effectiveness of ANFIS for modeling microbial kinetic parameters confirms its potential use as a supplemental tool in predictive mycology. Comparisons between growth rates predicted by ANFIS and actual experimental data also confirmed the high accuracy of the Gaussian membership function in ANFIS. Comparisons of the six statistical indices under both aerobic and anaerobic conditions also showed that the ANFIS model was better than all ANN models in predicting the four kinetic parameters. Therefore, the ANFIS model is a valuable tool for quickly predicting the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions.
Directory of Open Access Journals (Sweden)
Ja’fari A.
2014-01-01
Full Text Available Image logs provide useful information for fracture study in naturally fractured reservoir. Fracture dip, azimuth, aperture and fracture density can be obtained from image logs and have great importance in naturally fractured reservoir characterization. Imaging all fractured parts of hydrocarbon reservoirs and interpreting the results is expensive and time consuming. In this study, an improved method to make a quantitative correlation between fracture densities obtained from image logs and conventional well log data by integration of different artificial intelligence systems was proposed. The proposed method combines the results of Adaptive Neuro-Fuzzy Inference System (ANFIS and Neural Networks (NN algorithms for overall estimation of fracture density from conventional well log data. A simple averaging method was used to obtain a better result by combining results of ANFIS and NN. The algorithm applied on other wells of the field to obtain fracture density. In order to model the fracture density in the reservoir, we used variography and sequential simulation algorithms like Sequential Indicator Simulation (SIS and Truncated Gaussian Simulation (TGS. The overall algorithm applied to Asmari reservoir one of the SW Iranian oil fields. Histogram analysis applied to control the quality of the obtained models. Results of this study show that for higher number of fracture facies the TGS algorithm works better than SIS but in small number of fracture facies both algorithms provide approximately same results.
Directory of Open Access Journals (Sweden)
Jing Li
2017-01-01
Full Text Available The goal of this study is to improve thermal comfort and indoor air quality with the adaptive network-based fuzzy inference system (ANFIS model and improved particle swarm optimization (PSO algorithm. A method to optimize air conditioning parameters and installation distance is proposed. The methodology is demonstrated through a prototype case, which corresponds to a typical laboratory in colleges and universities. A laboratory model is established, and simulated flow field information is obtained with the CFD software. Subsequently, the ANFIS model is employed instead of the CFD model to predict indoor flow parameters, and the CFD database is utilized to train ANN input-output “metamodels” for the subsequent optimization. With the improved PSO algorithm and the stratified sequence method, the objective functions are optimized. The functions comprise PMV, PPD, and mean age of air. The optimal installation distance is determined with the hemisphere model. Results show that most of the staff obtain a satisfactory degree of thermal comfort and that the proposed method can significantly reduce the cost of building an experimental device. The proposed methodology can be used to determine appropriate air supply parameters and air conditioner installation position for a pleasant and healthy indoor environment.
Energy Technology Data Exchange (ETDEWEB)
Metin Ertunc, H. [Department of Mechatronics Engineering, Kocaeli University, Umuttepe, 41380 Kocaeli (Turkey); Hosoz, Murat [Department of Mechanical Education, Kocaeli University, Umuttepe, 41380 Kocaeli (Turkey)
2008-12-15
This study deals with predicting the performance of an evaporative condenser using both artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS) techniques. For this aim, an experimental evaporative condenser consisting of a copper tube condensing coil along with air and water circuit elements was developed and equipped with instruments used for temperature, pressure and flow rate measurements. After the condenser was connected to an R134a vapour-compression refrigeration circuit, it was operated at steady state conditions, while varying both dry and wet bulb temperatures of the air stream entering the condenser, air and water flow rates as well as pressure, temperature and flow rate of the entering refrigerant. Using some of the experimental data for training, ANN and ANFIS models for the evaporative condenser were developed. These models were used for predicting the condenser heat rejection rate, refrigerant temperature leaving the condenser along with dry and wet bulb temperatures of the leaving air stream. Although it was observed that both ANN and ANFIS models yielded a good statistical prediction performance in terms of correlation coefficient, mean relative error, root mean square error and absolute fraction of variance, the accuracies of ANFIS predictions were usually slightly better than those of ANN predictions. This study reveals that, having an extended prediction capability compared to ANN, the ANFIS technique can also be used for predicting the performance of evaporative condensers. (author)
Energy Technology Data Exchange (ETDEWEB)
Djukanovic, M.B. [Inst. Nikola Tesla, Belgrade (Yugoslavia). Dept. of Power Systems; Calovic, M.S. [Univ. of Belgrade (Yugoslavia). Dept. of Electrical Engineering; Vesovic, B.V. [Inst. Mihajlo Pupin, Belgrade (Yugoslavia). Dept. of Automatic Control; Sobajic, D.J. [Electric Power Research Inst., Palo Alto, CA (United States)
1997-12-01
This paper presents an attempt of nonlinear, multivariable control of low-head hydropower plants, by using adaptive-network based fuzzy inference system (ANFIS). The new design technique enhances fuzzy controllers with self-learning capability for achieving prescribed control objectives in a near optimal manner. The controller has flexibility for accepting more sensory information, with the main goal to improve the generator unit transients, by adjusting the exciter input, the wicket gate and runner blade positions. The developed ANFIS controller whose control signals are adjusted by using incomplete on-line measurements, can offer better damping effects to generator oscillations over a wide range of operating conditions, than conventional controllers. Digital simulations of hydropower plant equipped with low-head Kaplan turbine are performed and the comparisons of conventional excitation-governor control, state-feedback optimal control and ANFIS based output feedback control are presented. To demonstrate the effectiveness of the proposed control scheme and the robustness of the acquired neuro-fuzzy controller, the controller has been implemented on a complex high-order non-linear hydrogenerator model.
Dewan, Mohammad W.; Huggett, Daniel J.; Liao, T. Warren; Wahab, Muhammad A.; Okeil, Ayman M.
2015-01-01
Friction-stir-welding (FSW) is a solid-state joining process where joint properties are dependent on welding process parameters. In the current study three critical process parameters including spindle speed (??), plunge force (????), and welding speed (??) are considered key factors in the determination of ultimate tensile strength (UTS) of welded aluminum alloy joints. A total of 73 weld schedules were welded and tensile properties were subsequently obtained experimentally. It is observed that all three process parameters have direct influence on UTS of the welded joints. Utilizing experimental data, an optimized adaptive neuro-fuzzy inference system (ANFIS) model has been developed to predict UTS of FSW joints. A total of 1200 models were developed by varying the number of membership functions (MFs), type of MFs, and combination of four input variables (??,??,????,??????) utilizing a MATLAB platform. Note EFI denotes an empirical force index derived from the three process parameters. For comparison, optimized artificial neural network (ANN) models were also developed to predict UTS from FSW process parameters. By comparing ANFIS and ANN predicted results, it was found that optimized ANFIS models provide better results than ANN. This newly developed best ANFIS model could be utilized for prediction of UTS of FSW joints.
Salehi, Mohammad Reza; Noori, Leila; Abiri, Ebrahim
2016-11-01
In this paper, a subsystem consisting of a microstrip bandpass filter and a microstrip low noise amplifier (LNA) is designed for WLAN applications. The proposed filter has a small implementation area (49 mm2), small insertion loss (0.08 dB) and wide fractional bandwidth (FBW) (61%). To design the proposed LNA, the compact microstrip cells, an field effect transistor, and only a lumped capacitor are used. It has a low supply voltage and a low return loss (-40 dB) at the operation frequency. The matching condition of the proposed subsystem is predicted using subsystem analysis, artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS). To design the proposed filter, the transmission matrix of the proposed resonator is obtained and analysed. The performance of the proposed ANN and ANFIS models is tested using the numerical data by four performance measures, namely the correlation coefficient (CC), the mean absolute error (MAE), the average percentage error (APE) and the root mean square error (RMSE). The obtained results show that these models are in good agreement with the numerical data, and a small error between the predicted values and numerical solution is obtained.
Directory of Open Access Journals (Sweden)
Foday Conteh
2017-09-01
Full Text Available In recent years, the use of renewable energy sources in micro-grids has become an effectivemeans of power decentralization especially in remote areas where the extension of the main power gridis an impediment. Despite the huge deposit of natural resources in Africa, the continent still remains inenergy poverty. Majority of the African countries could not meet the electricity demand of their people.Therefore, the power system is prone to frequent black out as a result of either excess load to the systemor generation failure. The imbalance of power generation and load demand has been a major factor inmaintaining the stability of the power systems and is usually responsible for the under frequency andunder voltage in power systems. Currently, load shedding is the most widely used method to balancebetween load and demand in order to prevent the system from collapsing. But the conventional methodof under frequency or under voltage load shedding faces many challenges and may not perform asexpected. This may lead to over shedding or under shedding, causing system blackout or equipmentdamage. To prevent system cascade or equipment damage, appropriate amount of load must beintentionally and automatically curtailed during instability. In this paper, an effective load sheddingtechnique for micro-grids using artificial neural network and adaptive neuro-fuzzy inference system isproposed. The combined techniques take into account the actual system state and the exact amount ofload needs to be curtailed at a faster rate as compared to the conventional method. Also, this methodis able to carry out optimal load shedding for any input range other than the trained data. Simulationresults obtained from this work, corroborate the merit of this algorithm.
IntNetLncSim: an integrative network analysis method to infer human lncRNA functional similarity.
Cheng, Liang; Shi, Hongbo; Wang, Zhenzhen; Hu, Yang; Yang, Haixiu; Zhou, Chen; Sun, Jie; Zhou, Meng
2016-07-26
Increasing evidence indicated that long non-coding RNAs (lncRNAs) were involved in various biological processes and complex diseases by communicating with mRNAs/miRNAs each other. Exploiting interactions between lncRNAs and mRNA/miRNAs to lncRNA functional similarity (LFS) is an effective method to explore function of lncRNAs and predict novel lncRNA-disease associations. In this article, we proposed an integrative framework, IntNetLncSim, to infer LFS by modeling the information flow in an integrated network that comprises both lncRNA-related transcriptional and post-transcriptional information. The performance of IntNetLncSim was evaluated by investigating the relationship of LFS with the similarity of lncRNA-related mRNA sets (LmRSets) and miRNA sets (LmiRSets). As a result, LFS by IntNetLncSim was significant positively correlated with the LmRSet (Pearson correlation γ2=0.8424) and LmiRSet (Pearson correlation γ2=0.2601). Particularly, the performance of IntNetLncSim is superior to several previous methods. In the case of applying the LFS to identify novel lncRNA-disease relationships, we achieved an area under the ROC curve (0.7300) in experimentally verified lncRNA-disease associations based on leave-one-out cross-validation. Furthermore, highly-ranked lncRNA-disease associations confirmed by literature mining demonstrated the excellent performance of IntNetLncSim. Finally, a web-accessible system was provided for querying LFS and potential lncRNA-disease relationships: http://www.bio-bigdata.com/IntNetLncSim.
Diehl, Martin; Groeber, Michael; Haase, Christian; Molodov, Dmitri A.; Roters, Franz; Raabe, Dierk
2017-05-01
Predicting, understanding, and controlling the mechanical behavior is the most important task when designing structural materials. Modern alloy systems—in which multiple deformation mechanisms, phases, and defects are introduced to overcome the inverse strength-ductility relationship—give raise to multiple possibilities for modifying the deformation behavior, rendering traditional, exclusively experimentally-based alloy development workflows inappropriate. For fast and efficient alloy design, it is therefore desirable to predict the mechanical performance of candidate alloys by simulation studies to replace time- and resource-consuming mechanical tests. Simulation tools suitable for this task need to correctly predict the mechanical behavior in dependence of alloy composition, microstructure, texture, phase fractions, and processing history. Here, an integrated computational materials engineering approach based on the open source software packages DREAM.3D and DAMASK (Düsseldorf Advanced Materials Simulation Kit) that enables such virtual material development is presented. More specific, our approach consists of the following three steps: (1) acquire statistical quantities that describe a microstructure, (2) build a representative volume element based on these quantities employing DREAM.3D, and (3) evaluate the representative volume using a predictive crystal plasticity material model provided by DAMASK. Exemplarily, these steps are here conducted for a high-manganese steel.
Serang, Oliver
2015-08-01
Observations depending on sums of random variables are common throughout many fields; however, no efficient solution is currently known for performing max-product inference on these sums of general discrete distributions (max-product inference can be used to obtain maximum a posteriori estimates). The limiting step to max-product inference is the max-convolution problem (sometimes presented in log-transformed form and denoted as "infimal convolution," "min-convolution," or "convolution on the tropical semiring"), for which no O(k log(k)) method is currently known. Presented here is an O(k log(k)) numerical method for estimating the max-convolution of two nonnegative vectors (e.g., two probability mass functions), where k is the length of the larger vector. This numerical max-convolution method is then demonstrated by performing fast max-product inference on a convolution tree, a data structure for performing fast inference given information on the sum of n discrete random variables in O(nk log(nk)log(n)) steps (where each random variable has an arbitrary prior distribution on k contiguous possible states). The numerical max-convolution method can be applied to specialized classes of hidden Markov models to reduce the runtime of computing the Viterbi path from nk(2) to nk log(k), and has potential application to the all-pairs shortest paths problem.
Directory of Open Access Journals (Sweden)
Yungang Xu
Full Text Available Cellular interactome, in which genes and/or their products interact on several levels, forming transcriptional regulatory-, protein interaction-, metabolic-, signal transduction networks, etc., has attracted decades of research focuses. However, such a specific type of network alone can hardly explain the various interactive activities among genes. These networks characterize different interaction relationships, implying their unique intrinsic properties and defects, and covering different slices of biological information. Functional gene network (FGN, a consolidated interaction network that models fuzzy and more generalized notion of gene-gene relations, have been proposed to combine heterogeneous networks with the goal of identifying functional modules supported by multiple interaction types. There are yet no successful precedents of FGNs on sparsely studied non-model organisms, such as soybean (Glycine max, due to the absence of sufficient heterogeneous interaction data. We present an alternative solution for inferring the FGNs of soybean (SoyFGNs, in a pioneering study on the soybean interactome, which is also applicable to other organisms. SoyFGNs exhibit the typical characteristics of biological networks: scale-free, small-world architecture and modularization. Verified by co-expression and KEGG pathways, SoyFGNs are more extensive and accurate than an orthology network derived from Arabidopsis. As a case study, network-guided disease-resistance gene discovery indicates that SoyFGNs can provide system-level studies on gene functions and interactions. This work suggests that inferring and modelling the interactome of a non-model plant are feasible. It will speed up the discovery and definition of the functions and interactions of other genes that control important functions, such as nitrogen fixation and protein or lipid synthesis. The efforts of the study are the basis of our further comprehensive studies on the soybean functional
Directory of Open Access Journals (Sweden)
Mohammad Taghi Dastorani
2012-01-01
Full Text Available During recent few decades, due to the importance of the availability of water, and therefore the necesity of predicting run off resulted from rain fall there has been an increase in developing and implementation of new suitable method for prediction of run off using precipitation data. One of these approaches that have been developed in several areas of sciences including water related fields, is soft computing techniques such as artificial neural networks and fuzzy logic systems. This research was designed to evaluate the applicability of artificial neural network and adaptive neuro –fuzzy inference system to model rainfall-runoff process in Zayandeh_rood dam basin. It must be mentioned that, data have been analysed using Wingamma software, to select appropriate type and number of training input data before they can be used in the models. Then, it has been tried to evaluated applicability of artificial neural networks and neuro-fuzzy techniques to predict runoff generated from daily rainfall. Finally, the accuracy of the results produced by these methods has been compared using statistical criterion. Results taken from this research show that artificial neural networks and neuro-fuzzy technique presented different outputs in different conditions in terms of type and number of inputs variables, but both method have been able to produce acceptable results when suitable input variables and network structures are used.
Zhou, Xionghui; Liu, Juan
2014-01-01
Although many methods have been proposed to reconstruct gene regulatory network, most of them, when applied in the sample-based data, can not reveal the gene regulatory relations underlying the phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, while former researches either neglected it or only used it to select the differentially expressed genes as the inputs to construct the gene regulatory network. To be specific, we integrate phenotype information with gene expression data to identify the gene dependency pairs by using the method of conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely gene dependency network. By this way, we have constructed gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network scale free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specially related to breast cancer, which suggests that our gene dependency network is meaningful. The validity has also been justified by literature investigation. From the network, we have selected 43 discriminative hubs as signature to build the classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms those classification models with published signatures. In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene signature for
Directory of Open Access Journals (Sweden)
Mika Gustafsson
Full Text Available BACKGROUND: To predict gene expressions is an important endeavour within computational systems biology. It can both be a way to explore how drugs affect the system, as well as providing a framework for finding which genes are interrelated in a certain process. A practical problem, however, is how to assess and discriminate among the various algorithms which have been developed for this purpose. Therefore, the DREAM project invited the year 2008 to a challenge for predicting gene expression values, and here we present the algorithm with best performance. METHODOLOGY/PRINCIPAL FINDINGS: We develop an algorithm by exploring various regression schemes with different model selection procedures. It turns out that the most effective scheme is based on least squares, with a penalty term of a recently developed form called the "elastic net". Key components in the algorithm are the integration of expression data from other experimental conditions than those presented for the challenge and the utilization of transcription factor binding data for guiding the inference process towards known interactions. Of importance is also a cross-validation procedure where each form of external data is used only to the extent it increases the expected performance. CONCLUSIONS/SIGNIFICANCE: Our algorithm proves both the possibility to extract information from large-scale expression data concerning prediction of gene levels, as well as the benefits of integrating different data sources for improving the inference. We believe the former is an important message to those still hesitating on the possibilities for computational approaches, while the latter is part of an important way forward for the future development of the field of computational systems biology.
Caticha, Ariel
2011-03-01
In this tutorial we review the essential arguments behing entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops—the Maximum Entropy and the Bayesian methods—into a single general inference scheme.
Kroese, A.H.; van der Meulen, E.A.; Poortema, Klaas; Schaafsma, W.
1995-01-01
The making of statistical inferences in distributional form is conceptionally complicated because the epistemic 'probabilities' assigned are mixtures of fact and fiction. In this respect they are essentially different from 'physical' or 'frequency-theoretic' probabilities. The distributional form is
Wu, Mengmeng; Lin, Zhixiang; Ma, Shining; Chen, Ting; Jiang, Rui; Wong, Wing Hung
2017-12-01
Although genome-wide association studies (GWAS) have successfully identified thousands of genomic loci associated with hundreds of complex traits in the past decade, the debate about such problems as missing heritability and weak interpretability has been appealing for effective computational methods to facilitate the advanced analysis of the vast volume of existing and anticipated genetic data. Towards this goal, gene-level integrative GWAS analysis with the assumption that genes associated with a phenotype tend to be enriched in biological gene sets or gene networks has recently attracted much attention, due to such advantages as straightforward interpretation, less multiple testing burdens, and robustness across studies. However, existing methods in this category usually exploit non-tissue-specific gene networks and thus lack the ability to utilize informative tissue-specific characteristics. To overcome this limitation, we proposed a Bayesian approach called SIGNET (Simultaneously Inference of GeNEs and Tissues) to integrate GWAS data and multiple tissue-specific gene networks for the simultaneous inference of phenotype-associated genes and relevant tissues. Through extensive simulation studies, we showed the effectiveness of our method in finding both associated genes and relevant tissues for a phenotype. In applications to real GWAS data of 14 complex phenotypes, we demonstrated the power of our method in both deciphering genetic basis and discovering biological insights of a phenotype. With this understanding, we expect to see SIGNET as a valuable tool for integrative GWAS analysis, thereby boosting the prevention, diagnosis, and treatment of human inherited diseases and eventually facilitating precision medicine.
Caticha, Ariel
2010-01-01
In this tutorial we review the essential arguments behing entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), includes as special cases both MaxEn...
The accurate reconstruction of gene regulatory networks from large scale molecular profile datasets represents one of the grand challenges of Systems Biology. The Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) represents one of the most effective tools to accomplish this goal. However, the initial Fixed Bandwidth (FB) implementation is both inefficient and unable to deal with sample sets providing largely uneven coverage of the probability density space.
Mohamed Salleh, Faridah Hani; Arif, Shereena Mohd; Zainudin, Suhaila; Firdaus-Raih, Mohd
2015-12-01
A gene regulatory network (GRN) is a large and complex network consisting of interacting elements that, over time, affect each other's state. The dynamics of complex gene regulatory processes are difficult to understand using intuitive approaches alone. To overcome this problem, we propose an algorithm for inferring the regulatory interactions from knock-out data using a Gaussian model combines with Pearson Correlation Coefficient (PCC). There are several problems relating to GRN construction that have been outlined in this paper. We demonstrated the ability of our proposed method to (1) predict the presence of regulatory interactions between genes, (2) their directionality and (3) their states (activation or suppression). The algorithm was applied to network sizes of 10 and 50 genes from DREAM3 datasets and network sizes of 10 from DREAM4 datasets. The predicted networks were evaluated based on AUROC and AUPR. We discovered that high false positive values were generated by our GRN prediction methods because the indirect regulations have been wrongly predicted as true relationships. We achieved satisfactory results as the majority of sub-networks achieved AUROC values above 0.5. Copyright © 2015 Elsevier Ltd. All rights reserved.
Aggelopoulos, Nikolaos C
2015-08-01
Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory, the response invariance in the responses of some neurons to variations in the stimulus, as well as from situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience. Copyright © 2015 Elsevier Ltd. All rights reserved.
Gao, Zhong-Ke; Yang, Yu-Xuan; Cai, Qing; Zhang, Shan-Shan; Jin, Ning-De
2016-06-01
Exploring the dynamical behaviors of high water cut and low velocity oil-water flows remains a contemporary and challenging problem of significant importance. This challenge stimulates us to design a high-speed cycle motivation conductance sensor to capture spatial local flow information. We systematically carry out experiments and acquire the multi-channel measurements from different oil-water flow patterns. Then we develop a novel multivariate weighted recurrence network for uncovering the flow behaviors from multi-channel measurements. In particular, we exploit graph energy and weighted clustering coefficient in combination with multivariate time-frequency analysis to characterize the derived complex networks. The results indicate that the network measures are very sensitive to the flow transitions and allow uncovering local dynamical behaviors associated with water cut and flow velocity. These properties render our method particularly useful for quantitatively characterizing dynamical behaviors governing the transition and evolution of different oil-water flow patterns.
A relative variation-based method to unraveling gene regulatory networks.
Directory of Open Access Journals (Sweden)
Yali Wang
Full Text Available Gene regulatory network (GRN reconstruction is essential in understanding the functioning and pathology of a biological system. Extensive models and algorithms have been developed to unravel a GRN. The DREAM project aims to clarify both advantages and disadvantages of these methods from an application viewpoint. An interesting yet surprising observation is that compared with complicated methods like those based on nonlinear differential equations, etc., methods based on a simple statistics, such as the so-called Z-score, usually perform better. A fundamental problem with the Z-score, however, is that direct and indirect regulations can not be easily distinguished. To overcome this drawback, a relative expression level variation (RELV based GRN inference algorithm is suggested in this paper, which consists of three major steps. Firstly, on the basis of wild type and single gene knockout/knockdown experimental data, the magnitude of RELV of a gene is estimated. Secondly, probability for the existence of a direct regulation from a perturbed gene to a measured gene is estimated, which is further utilized to estimate whether a gene can be regulated by other genes. Finally, the normalized RELVs are modified to make genes with an estimated zero in-degree have smaller RELVs in magnitude than the other genes, which is used afterwards in queuing possibilities of the existence of direct regulations among genes and therefore leads to an estimate on the GRN topology. This method can in principle avoid the so-called cascade errors under certain situations. Computational results with the Size 100 sub-challenges of DREAM3 and DREAM4 show that, compared with the Z-score based method, prediction performances can be substantially improved, especially the AUPR specification. Moreover, it can even outperform the best team of both DREAM3 and DREAM4. Furthermore, the high precision of the obtained most reliable predictions shows that the suggested algorithm may be
NetCooperate: a network-based tool for inferring host-microbe and microbe-microbe cooperation
Levy, Roie; Carr, Rogan; Kreimer, Anat; Freilich, Shiri; Borenstein, Elhanan
2015-01-01
Background Host-microbe and microbe-microbe interactions are often governed by the complex exchange of metabolites. Such interactions play a key role in determining the way pathogenic and commensal species impact their host and in the assembly of complex microbial communities. Recently, several studies have demonstrated how such interactions are reflected in the organization of the metabolic networks of the interacting species, and introduced various graph theory-based methods to predict host...
NetCooperate: a network-based tool for inferring host-microbe and microbe-microbe cooperation.
Levy, Roie; Carr, Rogan; Kreimer, Anat; Freilich, Shiri; Borenstein, Elhanan
2015-05-17
Host-microbe and microbe-microbe interactions are often governed by the complex exchange of metabolites. Such interactions play a key role in determining the way pathogenic and commensal species impact their host and in the assembly of complex microbial communities. Recently, several studies have demonstrated how such interactions are reflected in the organization of the metabolic networks of the interacting species, and introduced various graph theory-based methods to predict host-microbe and microbe-microbe interactions directly from network topology. Using these methods, such studies have revealed evolutionary and ecological processes that shape species interactions and community assembly, highlighting the potential of this reverse-ecology research paradigm. NetCooperate is a web-based tool and a software package for determining host-microbe and microbe-microbe cooperative potential. It specifically calculates two previously developed and validated metrics for species interaction: the Biosynthetic Support Score which quantifies the ability of a host species to supply the nutritional requirements of a parasitic or a commensal species, and the Metabolic Complementarity Index which quantifies the complementarity of a pair of microbial organisms' niches. NetCooperate takes as input a pair of metabolic networks, and returns the pairwise metrics as well as a list of potential syntrophic metabolic compounds. The Biosynthetic Support Score and Metabolic Complementarity Index provide insight into host-microbe and microbe-microbe metabolic interactions. NetCooperate determines these interaction indices from metabolic network topology, and can be used for small- or large-scale analyses. NetCooperate is provided as both a web-based tool and an open-source Python module; both are freely available online at http://elbo.gs.washington.edu/software_netcooperate.html.
An approach for reduction of false predictions in reverse engineering of gene regulatory networks.
Khan, Abhinandan; Saha, Goutam; Pal, Rajat Kumar
2018-05-14
A gene regulatory network discloses the regulatory interactions amongst genes, at a particular condition of the human body. The accurate reconstruction of such networks from time-series genetic expression data using computational tools offers a stiff challenge for contemporary computer scientists. This is crucial to facilitate the understanding of the proper functioning of a living organism. Unfortunately, the computational methods produce many false predictions along with the correct predictions, which is unwanted. Investigations in the domain focus on the identification of as many correct regulations as possible in the reverse engineering of gene regulatory networks to make it more reliable and biologically relevant. One way to achieve this is to reduce the number of incorrect predictions in the reconstructed networks. In the present investigation, we have proposed a novel scheme to decrease the number of false predictions by suitably combining several metaheuristic techniques. We have implemented the same using a dataset ensemble approach (i.e. combining multiple datasets) also. We have employed the proposed methodology on real-world experimental datasets of the SOS DNA Repair network of Escherichia coli and the IMRA network of Saccharomyces cerevisiae. Subsequently, we have experimented upon somewhat larger, in silico networks, namely, DREAM3 and DREAM4 Challenge networks, and 15-gene and 20-gene networks extracted from the GeneNetWeaver database. To study the effect of multiple datasets on the quality of the inferred networks, we have used four datasets in each experiment. The obtained results are encouraging enough as the proposed methodology can reduce the number of false predictions significantly, without using any supplementary prior biological information for larger gene regulatory networks. It is also observed that if a small amount of prior biological information is incorporated here, the results improve further w.r.t. the prediction of true positives
Gomez-Velez, J. D.; Harvey, J. W.
2014-12-01
Hyporheic exchange has been hypothesized to have basin-scale consequences; however, predictions throughout river networks are limited by available geomorphic and hydrogeologic data as well as models that can analyze and aggregate hyporheic exchange flows across large spatial scales. We developed a parsimonious but physically-based model of hyporheic flow for application in large river basins: Networks with EXchange and Subsurface Storage (NEXSS). At the core of NEXSS is a characterization of the channel geometry, geomorphic features, and related hydraulic drivers based on scaling equations from the literature and readily accessible information such as river discharge, bankfull width, median grain size, sinuosity, channel slope, and regional groundwater gradients. Multi-scale hyporheic flow is computed based on combining simple but powerful analytical and numerical expressions that have been previously published. We applied NEXSS across a broad range of geomorphic diversity in river reaches and synthetic river networks. NEXSS demonstrates that vertical exchange beneath submerged bedforms dominates hyporheic fluxes and turnover rates along the river corridor. Moreover, the hyporheic zone's potential for biogeochemical transformations is comparable across stream orders, but the abundance of lower-order channels results in a considerably higher cumulative effect for low-order streams. Thus, vertical exchange beneath submerged bedforms has more potential for biogeochemical transformations than lateral exchange beneath banks, although lateral exchange through meanders may be important in large rivers. These results have implications for predicting outcomes of river and basin management practices.
Directory of Open Access Journals (Sweden)
Parvin Jeffrey
2010-12-01
Full Text Available Abstract Background Global profiling of in vivo protein-DNA interactions using ChIP-based technologies has evolved rapidly in recent years. Although many genome-wide studies have identified thousands of ERα binding sites and have revealed the associated transcription factor (TF partners, such as AP1, FOXA1 and CEBP, little is known about ERα associated hierarchical transcriptional regulatory networks. Results In this study, we applied computational approaches to analyze three public available ChIP-based datasets: ChIP-seq, ChIP-PET and ChIP-chip, and to investigate the hierarchical regulatory network for ERα and ERα partner TFs regulation in estrogen-dependent breast cancer MCF7 cells. 16 common TFs and two common new TF partners (RORA and PITX2 were found among ChIP-seq, ChIP-chip and ChIP-PET datasets. The regulatory networks were constructed by scanning the ChIP-peak region with TF specific position weight matrix (PWM. A permutation test was performed to test the reliability of each connection of the network. We then used DREM software to perform gene ontology function analysis on the common genes. We found that FOS, PITX2, RORA and FOXA1 were involved in the up-regulated genes. We also conducted the ERα and Pol-II ChIP-seq experiments in tamoxifen resistance MCF7 cells (denoted as MCF7-T in this study and compared the difference between MCF7 and MCF7-T cells. The result showed very little overlap between these two cells in terms of targeted genes (21.2% of common genes and targeted TFs (25% of common TFs. The significant dissimilarity may indicate totally different transcriptional regulatory mechanisms between these two cancer cells. Conclusions Our study uncovers new estrogen-mediated regulatory networks by mining three ChIP-based data in MCF7 cells and ChIP-seq data in MCF7-T cells. We compared the different ChIP-based technologies as well as different breast cancer cells. Our computational analytical approach may guide biologists to
Garmroodi Asil, A.; Nakhaei Pour, A.; Mirzaei, Sh.
2018-04-01
In the present article, generalization performances of regularization network (RN) and optimize adaptive neuro-fuzzy inference system (ANFIS) are compared with a conventional software for prediction of heat transfer coefficient (HTC) as a function of superficial gas velocity (5-25 cm/s) and solid fraction (0-40 wt%) at different axial and radial locations. The networks were trained by resorting several sets of experimental data collected from a specific system of air/hydrocarbon liquid phase/silica particle in a slurry bubble column reactor (SBCR). A special convection HTC measurement probe was manufactured and positioned in an axial distance of 40 and 130 cm above the sparger at center and near the wall of SBCR. The simulation results show that both in-house RN and optimized ANFIS due to powerful noise filtering capabilities provide superior performances compared to the conventional software of MATLAB ANFIS and ANN toolbox. For the case of 40 and 130 cm axial distance from center of sparger, at constant superficial gas velocity of 25 cm/s, adding 40 wt% silica particles to liquid phase leads to about 66% and 69% increasing in HTC respectively. The HTC in the column center for all the cases studied are about 9-14% larger than those near the wall region.
Bae, Jonghoon; Cha, Young-Jae; Lee, Hyungsuk; Lee, Boyun; Baek, Sojung; Choi, Semin
2017-01-01
This study examines whether the way that a person makes inferences about unknown events is associated with his or her social relations, more precisely, those characterized by ego network density that reflects the structure of a person’s immediate social relation. From the analysis of individual predictions over the Go match between AlphaGo and Sedol Lee in March 2016 in Seoul, Korea, this study shows that the low-density group scored higher than the high-density group in the accuracy of the prediction over a future state of a social event, i.e., the outcome of the first game. We corroborated this finding with three replication tests that asked the participants to predict the following: film awards, President Park’s impeachment in Korea, and the counterfactual assessment of the US presidential election. Taken together, this study suggests that network density is negatively associated with vision advantage, i.e., the ability to discover and forecast an unknown aspect of a social event. PMID:28222114
Mohd Yunos, Zuriahati; Shamsuddin, Siti Mariyam; Ismail, Noriszura; Sallehuddin, Roselina
2013-04-01
Artificial neural network (ANN) with back propagation algorithm (BP) and ANFIS was chosen as an alternative technique in modeling motor insurance claims. In particular, an ANN and ANFIS technique is applied to model and forecast the Malaysian motor insurance data which is categorized into four claim types; third party property damage (TPPD), third party bodily injury (TPBI), own damage (OD) and theft. This study is to determine whether an ANN and ANFIS model is capable of accurately predicting motor insurance claim. There were changes made to the network structure as the number of input nodes, number of hidden nodes and pre-processing techniques are also examined and a cross-validation technique is used to improve the generalization ability of ANN and ANFIS models. Based on the empirical studies, the prediction performance of the ANN and ANFIS model is improved by using different number of input nodes and hidden nodes; and also various sizes of data. The experimental results reveal that the ANFIS model has outperformed the ANN model. Both models are capable of producing a reliable prediction for the Malaysian motor insurance claims and hence, the proposed method can be applied as an alternative to predict claim frequency and claim severity.
Directory of Open Access Journals (Sweden)
Zohdy Sarah
2012-03-01
provided insight into the previously unseen parasite movement between lemurs, but also allowed us to infer social interactions between them. As lice are known pathogen vectors, our method also allowed us to identify the lemurs most likely to facilitate louse-mediated epidemics. Our approach demonstrates the potential to uncover otherwise inaccessible parasite-host, and host social interaction data in any trappable species parasitized by sucking lice.
Strakova, Eva; Zikova, Alice; Vohradsky, Jiri
2014-01-01
A computational model of gene expression was applied to a novel test set of microarray time series measurements to reveal regulatory interactions between transcriptional regulators represented by 45 sigma factors and the genes expressed during germination of a prokaryote Streptomyces coelicolor. Using microarrays, the first 5.5 h of the process was recorded in 13 time points, which provided a database of gene expression time series on genome-wide scale. The computational modeling of the kinetic relations between the sigma factors, individual genes and genes clustered according to the similarity of their expression kinetics identified kinetically plausible sigma factor-controlled networks. Using genome sequence annotations, functional groups of genes that were predominantly controlled by specific sigma factors were identified. Using external binding data complementing the modeling approach, specific genes involved in the control of the studied process were identified and their function suggested.
Gaebler, P. J.; Ceranna, L.
2016-12-01
All nuclear explosions - on the Earth's surface, underground, underwater or in the atmosphere - are banned by the Comprehensive Nuclear-Test-Ban Treaty (CTBT). As part of this treaty, a verification regime was put into place to detect, locate and characterize nuclear explosion testings at any time, by anyone and everywhere on the Earth. The International Monitoring System (IMS) plays a key role in the verification regime of the CTBT. Out of the different monitoring techniques used in the IMS, the seismic waveform approach is the most effective technology for monitoring nuclear underground testing and to identify and characterize potential nuclear events. This study introduces a method of seismic threshold monitoring to assess an upper magnitude limit of a potential seismic event in a certain given geographical region. The method is based on ambient seismic background noise measurements at the individual IMS seismic stations as well as on global distance correction terms for body wave magnitudes, which are calculated using the seismic reflectivity method. From our investigations we conclude that a global detection threshold of around mb 4.0 can be achieved using only stations from the primary seismic network, a clear latitudinal dependence for the detection thresholdcan be observed between northern and southern hemisphere. Including the seismic stations being part of the auxiliary seismic IMS network results in a slight improvement of global detection capability. However, including wave arrivals from distances greater than 120 degrees, mainly PKP-wave arrivals, leads to a significant improvement in average global detection capability. In special this leads to an improvement of the detection threshold on the southern hemisphere. We further investigate the dependence of the detection capability on spatial (latitude and longitude) and temporal (time) parameters, as well as on parameters such as source type and percentage of operational IMS stations.
Goren, Liran; Castelltort, Sébastien; Klinger, Yann
2016-04-01
Partitioning of horizontal deformation between localized and distributed modes in regions of oblique tectonic convergence is, in many cases, hard to quantify. As a case study, we consider the Dead Sea Fault System that changes its orientation across Lebanon and forms a restraining bend. The oblique deformation along the Lebanese restraining bend is characterized by a complex suite of tectonic structures, among which, the Yammouneh fault, is believed to be the main strand that relays deformation from the southern section to the northern section of the Dead Sea Fault System. However, uncertainties regarding slip rates along the Yammouneh fault and strain partitioning in Lebanon still prevail. In the current work we use the geometry and topology of river basins together with numerical modeling to evaluate modes and rates of the horizontal deformation in Mount Lebanon that is associated with the Arabia-Sinai relative plate motion. We focus on river basins that drain Mount Lebanon to the Mediterranean and originate close to the Yammouneh fault. We quantify a systematic counterclockwise rotation of these basins and evaluate drainage area disequilibrium using an application of the χ mapping technique, which aims at estimating the degree of geometrical and topological disequilibrium in river networks. The analysis indicates a systematic spatial pattern whereby tributaries of the rotated basins appear to experience drainage area loss or gain with respect to channel length. A kinematic model that is informed by river basin geometry reveals that since the late Miocene, about a quarter of the relative plate motion parallel to the plate boundary has been distributed along a wide band of deformation to the west of the Yammouneh fault. Taken together with previous, shorter-term estimates, the model indicates little variation of slip rate along the Yammouneh fault since the late Miocene. Kinematic model results are compatible with late Miocene paleomagnetic rotations in western
González, D. L., II; Angus, M. P.; Tetteh, I. K.; Bello, G. A.; Padmanabhan, K.; Pendse, S. V.; Srinivas, S.; Yu, J.; Semazzi, F.; Kumar, V.; Samatova, N. F.
2015-01-01
Decades of hypothesis-driven and/or first-principles research have been applied towards the discovery and explanation of the mechanisms that drive climate phenomena, such as western African Sahel summer rainfall~variability. Although connections between various climate factors have been theorized, not all of the key relationships are fully understood. We propose a data-driven approach to identify candidate players in this climate system, which can help explain underlying mechanisms and/or even suggest new relationships, to facilitate building a more comprehensive and predictive model of the modulatory relationships influencing a climate phenomenon of interest. We applied coupled heterogeneous association rule mining (CHARM), Lasso multivariate regression, and dynamic Bayesian networks to find relationships within a complex system, and explored means with which to obtain a consensus result from the application of such varied methodologies. Using this fusion of approaches, we identified relationships among climate factors that modulate Sahel rainfall. These relationships fall into two categories: well-known associations from prior climate knowledge, such as the relationship with the El Niño-Southern Oscillation (ENSO) and putative links, such as North Atlantic Oscillation, that invite further research.
Chandler, D. G.; Yang, X.; Steward, D. R.; Gido, K.
2007-12-01
Stream networks in the Great Plains integrate fluxes from precipitation as surface runoff in discrete events and groundwater as base flow. Changes in land cover and agronomic practices and development of ground water resources to support irrigated agriculture have resulted in profound changes in the occurrence and magnitude of stream flows, especially near the Ogallala aquifer, where precipitation is low. These changes have demonstrably altered the aquatic habitat of western Kansas, with documented changes in fish populations, riparian communities and groundwater quality due to stream transmission losses. Forecasting future changes in aquatic and riparian ecology and groundwater quality requires a large scale spatially explicit model of groundwater- surface water interaction. In this study, we combine historical data on land use, stream flow, production well development and groundwater level observations with groundwater elevation modeling to support a geospatial framework for assessing changes in refugia for aquatic species in four rivers in western Kansas between 1965 and 2005. Decreased frequency and duration of streamflow occurred in all rivers, but the extent of change depended on the geomorphology of the river basin and the extent of groundwater development. In the absence of streamflow, refugia for aquatic species were defined as the stream reaches below the phreatic surface of the regional aquifer. Changes in extent, location and degree of fragmentation of gaining reaches was found to be a strong predictor of surface water occurrence during drought and a robust hydrological template for the analysis of changes in recharge to alluvial and regional aquifers and riparian and aquatic habitat.
Rohatgi, Vijay K
2003-01-01
Unified treatment of probability and statistics examines and analyzes the relationship between the two fields, exploring inferential issues. Numerous problems, examples, and diagrams--some with solutions--plus clear-cut, highlighted summaries of results. Advanced undergraduate to graduate level. Contents: 1. Introduction. 2. Probability Model. 3. Probability Distributions. 4. Introduction to Statistical Inference. 5. More on Mathematical Expectation. 6. Some Discrete Models. 7. Some Continuous Models. 8. Functions of Random Variables and Random Vectors. 9. Large-Sample Theory. 10. General Meth
Directory of Open Access Journals (Sweden)
Chien-Lin Huang
2015-11-01
Full Text Available This study applies Real-Time Recurrent Learning Neural Network (RTRLNN and Adaptive Network-based Fuzzy Inference System (ANFIS with novel heuristic techniques to develop an advanced prediction model of accumulated total inflow of a reservoir in order to solve the difficulties of future long lead-time highly varied uncertainty during typhoon attacks while using a real-time forecast. For promoting the temporal-spatial forecasted precision, the following original specialized heuristic inputs were coupled: observed-predicted inflow increase/decrease (OPIID rate, total precipitation, and duration from current time to the time of maximum precipitation and direct runoff ending (DRE. This study also investigated the temporal-spatial forecasted error feature to assess the feasibility of the developed models, and analyzed the output sensitivity of both single and combined heuristic inputs to determine whether the heuristic model is susceptible to the impact of future forecasted uncertainty/errors. Validation results showed that the long lead-time–predicted accuracy and stability of the RTRLNN-based accumulated total inflow model are better than that of the ANFIS-based model because of the real-time recurrent deterministic routing mechanism of RTRLNN. Simulations show that the RTRLNN-based model with coupled heuristic inputs (RTRLNN-CHI, average error percentage (AEP/average forecast lead-time (AFLT: 6.3%/49 h can achieve better prediction than the model with non-heuristic inputs (AEP of RTRLNN-NHI and ANFIS-NHI: 15.2%/31.8% because of the full consideration of real-time hydrological initial/boundary conditions. Besides, the RTRLNN-CHI model can promote the forecasted lead-time above 49 h with less than 10% of AEP which can overcome the previous forecasted limits of 6-h AFLT with above 20%–40% of AEP.
Gerstein, Mark
2016-01-01
Gene expression is controlled by the combinatorial effects of regulatory factors from different biological subsystems such as general transcription factors (TFs), cellular growth factors and microRNAs. A subsystem’s gene expression may be controlled by its internal regulatory factors, exclusively, or by external subsystems, or by both. It is thus useful to distinguish the degree to which a subsystem is regulated internally or externally–e.g., how non-conserved, species-specific TFs affect the expression of conserved, cross-species genes during evolution. We developed a computational method (DREISS, dreiss.gerteinlab.org) for analyzing the Dynamics of gene expression driven by Regulatory networks, both External and Internal based on State Space models. Given a subsystem, the “state” and “control” in the model refer to its own (internal) and another subsystem’s (external) gene expression levels. The state at a given time is determined by the state and control at a previous time. Because typical time-series data do not have enough samples to fully estimate the model’s parameters, DREISS uses dimensionality reduction, and identifies canonical temporal expression trajectories (e.g., degradation, growth and oscillation) representing the regulatory effects emanating from various subsystems. To demonstrate capabilities of DREISS, we study the regulatory effects of evolutionarily conserved vs. divergent TFs across distant species. In particular, we applied DREISS to the time-series gene expression datasets of C. elegans and D. melanogaster during their embryonic development. We analyzed the expression dynamics of the conserved, orthologous genes (orthologs), seeing the degree to which these can be accounted for by orthologous (internal) versus species-specific (external) TFs. We found that between two species, the orthologs have matched, internally driven expression patterns but very different externally driven ones. This is particularly true for genes with
Directory of Open Access Journals (Sweden)
Daifeng Wang
2016-10-01
Full Text Available Gene expression is controlled by the combinatorial effects of regulatory factors from different biological subsystems such as general transcription factors (TFs, cellular growth factors and microRNAs. A subsystem's gene expression may be controlled by its internal regulatory factors, exclusively, or by external subsystems, or by both. It is thus useful to distinguish the degree to which a subsystem is regulated internally or externally-e.g., how non-conserved, species-specific TFs affect the expression of conserved, cross-species genes during evolution. We developed a computational method (DREISS, dreiss.gerteinlab.org for analyzing the Dynamics of gene expression driven by Regulatory networks, both External and Internal based on State Space models. Given a subsystem, the "state" and "control" in the model refer to its own (internal and another subsystem's (external gene expression levels. The state at a given time is determined by the state and control at a previous time. Because typical time-series data do not have enough samples to fully estimate the model's parameters, DREISS uses dimensionality reduction, and identifies canonical temporal expression trajectories (e.g., degradation, growth and oscillation representing the regulatory effects emanating from various subsystems. To demonstrate capabilities of DREISS, we study the regulatory effects of evolutionarily conserved vs. divergent TFs across distant species. In particular, we applied DREISS to the time-series gene expression datasets of C. elegans and D. melanogaster during their embryonic development. We analyzed the expression dynamics of the conserved, orthologous genes (orthologs, seeing the degree to which these can be accounted for by orthologous (internal versus species-specific (external TFs. We found that between two species, the orthologs have matched, internally driven expression patterns but very different externally driven ones. This is particularly true for genes with
Energy Technology Data Exchange (ETDEWEB)
Crevoisier, Cyril; Gloor, Manuel; Gloaguen, Erwan; Sarmiento, Jorge L. [Princeton Univ., NJ (United States). Atmospheric and Oceanic Sciences; Horowitz, Larry W. [Princeton Univ., NJ (United States). Geophysical Fluid Dynamics Laboratory; Sweeney, Colm; Tans, Pieter P. [NOAA/ESRL Global Monitoring Division, Boulder, CO (United States)
2006-11-15
In order to exploit the upcoming regular measurements of vertical carbon dioxide (CO{sub 2}) profiles over North America implemented in the framework of the North American Carbon Program (NACP), we design a direct carbon budgeting approach to infer carbon sources and sinks over the continent using model simulations. Direct budgeting puts a control volume on top of North America, balances air mass in- and outflows into the volume and solves for the surface fluxes. The flows are derived from the observations through a geostatistical interpolation technique called Kriging combined with transport fields from weather analysis. The use of CO{sub 2} vertical profiles simulated by the atmospheric transport model MOZART-2 at the planned 19 stations of the NACP network has given an estimation of the error of 0.39 GtC/yr within the model world. Reducing this error may be achieved through a better estimation of mass fluxes associated with convective processes affecting North America. Complementary stations in the north-west and the north-east are also needed to resolve the variability of CO{sub 2} in these regions. For instance, the addition of a single station near 52 deg N; 110 deg W is shown to decrease the estimation error to 0.34 GtC/yr.
International Nuclear Information System (INIS)
Crevoisier, Cyril; Gloor, Manuel; Gloaguen, Erwan; Sarmiento, Jorge L.
2006-01-01
In order to exploit the upcoming regular measurements of vertical carbon dioxide (CO 2 ) profiles over North America implemented in the framework of the North American Carbon Program (NACP), we design a direct carbon budgeting approach to infer carbon sources and sinks over the continent using model simulations. Direct budgeting puts a control volume on top of North America, balances air mass in- and outflows into the volume and solves for the surface fluxes. The flows are derived from the observations through a geostatistical interpolation technique called Kriging combined with transport fields from weather analysis. The use of CO 2 vertical profiles simulated by the atmospheric transport model MOZART-2 at the planned 19 stations of the NACP network has given an estimation of the error of 0.39 GtC/yr within the model world. Reducing this error may be achieved through a better estimation of mass fluxes associated with convective processes affecting North America. Complementary stations in the north-west and the north-east are also needed to resolve the variability of CO 2 in these regions. For instance, the addition of a single station near 52 deg N; 110 deg W is shown to decrease the estimation error to 0.34 GtC/yr
Directory of Open Access Journals (Sweden)
Chien-Lin Huang
2015-01-01
Full Text Available This study aims to construct a typhoon precipitation forecast model providing forecasts one to six hours in advance using optimal model parameters and structures retrieved from a combination of the adaptive network-based fuzzy inference system (ANFIS and artificial intelligence. To enhance the accuracy of the precipitation forecast, two structures were then used to establish the precipitation forecast model for a specific lead-time: a single-model structure and a dual-model hybrid structure where the forecast models of higher and lower precipitation were integrated. In order to rapidly, automatically, and accurately retrieve the optimal parameters and structures of the ANFIS-based precipitation forecast model, a tabu search was applied to identify the adjacent radius in subtractive clustering when constructing the ANFIS structure. The coupled structure was also employed to establish a precipitation forecast model across short and long lead-times in order to improve the accuracy of long-term precipitation forecasts. The study area is the Shimen Reservoir, and the analyzed period is from 2001 to 2009. Results showed that the optimal initial ANFIS parameters selected by the tabu search, combined with the dual-model hybrid method and the coupled structure, provided the favors in computation efficiency and high-reliability predictions in typhoon precipitation forecasts regarding short to long lead-time forecasting horizons.
Directory of Open Access Journals (Sweden)
Zhongrong Zhang
2016-01-01
Full Text Available Wind energy has increasingly played a vital role in mitigating conventional resource shortages. Nevertheless, the stochastic nature of wind poses a great challenge when attempting to find an accurate forecasting model for wind power. Therefore, precise wind power forecasts are of primary importance to solve operational, planning and economic problems in the growing wind power scenario. Previous research has focused efforts on the deterministic forecast of wind power values, but less attention has been paid to providing information about wind energy. Based on an optimal Adaptive-Network-Based Fuzzy Inference System (ANFIS and Singular Spectrum Analysis (SSA, this paper develops a hybrid uncertainty forecasting model, IFASF (Interval Forecast-ANFIS-SSA-Firefly Alogorithm, to obtain the upper and lower bounds of daily average wind power, which is beneficial for the practical operation of both the grid company and independent power producers. To strengthen the practical ability of this developed model, this paper presents a comparison between IFASF and other benchmarks, which provides a general reference for this aspect for statistical or artificially intelligent interval forecast methods. The comparison results show that the developed model outperforms eight benchmarks and has a satisfactory forecasting effectiveness in three different wind farms with two time horizons.
Causal inference in econometrics
Kreinovich, Vladik; Sriboonchitta, Songsak
2016-01-01
This book is devoted to the analysis of causal inference which is one of the most difficult tasks in data analysis: when two phenomena are observed to be related, it is often difficult to decide whether one of them causally influences the other one, or whether these two phenomena have a common cause. This analysis is the main focus of this volume. To get a good understanding of the causal inference, it is important to have models of economic phenomena which are as accurate as possible. Because of this need, this volume also contains papers that use non-traditional economic models, such as fuzzy models and models obtained by using neural networks and data mining techniques. It also contains papers that apply different econometric models to analyze real-life economic dependencies.
Asnaashari, Maryam; Farhoosh, Reza; Farahmandfar, Reza
2016-10-01
As a result of concerns regarding possible health hazards of synthetic antioxidants, gallic acid and methyl gallate may be introduced as natural antioxidants to improve oxidative stability of marine oil. Since conventional modelling could not predict the oxidative parameters precisely, artificial neural network (ANN) and neuro-fuzzy inference system (ANFIS) modelling with three inputs, including type of antioxidant (gallic acid and methyl gallate), temperature (35, 45 and 55 °C) and concentration (0, 200, 400, 800 and 1600 mg L(-1) ) and four outputs containing induction period (IP), slope of initial stage of oxidation curve (k1 ) and slope of propagation stage of oxidation curve (k2 ) and peroxide value at the IP (PVIP ) were performed to predict the oxidation parameters of Kilka oil triacylglycerols and were compared to multiple linear regression (MLR). The results showed ANFIS was the best model with high coefficient of determination (R(2) = 0.99, 0.99, 0.92 and 0.77 for IP, k1 , k2 and PVIP , respectively). So, the RMSE and MAE values for IP were 7.49 and 4.92 in ANFIS model. However, they were to be 15.95 and 10.88 and 34.14 and 3.60 for the best MLP structure and MLR, respectively. So, MLR showed the minimum accuracy among the constructed models. Sensitivity analysis based on the ANFIS model suggested a high sensitivity of oxidation parameters, particularly the induction period on concentrations of gallic acid and methyl gallate due to their high antioxidant activity to retard oil oxidation and enhanced Kilka oil shelf life. © 2016 Society of Chemical Industry. © 2016 Society of Chemical Industry.
Kang, Jin Kyu; Hong, Hyung Gil; Park, Kang Ryoung
2017-07-08
A number of studies have been conducted to enhance the pedestrian detection accuracy of intelligent surveillance systems. However, detecting pedestrians under outdoor conditions is a challenging problem due to the varying lighting, shadows, and occlusions. In recent times, a growing number of studies have been performed on visible light camera-based pedestrian detection systems using a convolutional neural network (CNN) in order to make the pedestrian detection process more resilient to such conditions. However, visible light cameras still cannot detect pedestrians during nighttime, and are easily affected by shadows and lighting. There are many studies on CNN-based pedestrian detection through the use of far-infrared (FIR) light cameras (i.e., thermal cameras) to address such difficulties. However, when the solar radiation increases and the background temperature reaches the same level as the body temperature, it remains difficult for the FIR light camera to detect pedestrians due to the insignificant difference between the pedestrian and non-pedestrian features within the images. Researchers have been trying to solve this issue by inputting both the visible light and the FIR camera images into the CNN as the input. This, however, takes a longer time to process, and makes the system structure more complex as the CNN needs to process both camera images. This research adaptively selects a more appropriate candidate between two pedestrian images from visible light and FIR cameras based on a fuzzy inference system (FIS), and the selected candidate is verified with a CNN. Three types of databases were tested, taking into account various environmental factors using visible light and FIR cameras. The results showed that the proposed method performs better than the previously reported methods.
He, Zhibin; Wen, Xiaohu; Liu, Hu; Du, Jun
2014-02-01
Data driven models are very useful for river flow forecasting when the underlying physical relationships are not fully understand, but it is not clear whether these data driven models still have a good performance in the small river basin of semiarid mountain regions where have complicated topography. In this study, the potential of three different data driven methods, artificial neural network (ANN), adaptive neuro fuzzy inference system (ANFIS) and support vector machine (SVM) were used for forecasting river flow in the semiarid mountain region, northwestern China. The models analyzed different combinations of antecedent river flow values and the appropriate input vector has been selected based on the analysis of residuals. The performance of the ANN, ANFIS and SVM models in training and validation sets are compared with the observed data. The model which consists of three antecedent values of flow has been selected as the best fit model for river flow forecasting. To get more accurate evaluation of the results of ANN, ANFIS and SVM models, the four quantitative standard statistical performance evaluation measures, the coefficient of correlation (R), root mean squared error (RMSE), Nash-Sutcliffe efficiency coefficient (NS) and mean absolute relative error (MARE), were employed to evaluate the performances of various models developed. The results indicate that the performance obtained by ANN, ANFIS and SVM in terms of different evaluation criteria during the training and validation period does not vary substantially; the performance of the ANN, ANFIS and SVM models in river flow forecasting was satisfactory. A detailed comparison of the overall performance indicated that the SVM model performed better than ANN and ANFIS in river flow forecasting for the validation data sets. The results also suggest that ANN, ANFIS and SVM method can be successfully applied to establish river flow with complicated topography forecasting models in the semiarid mountain regions.
Energy Technology Data Exchange (ETDEWEB)
Entchev, Evgueniy; Yang, Libing [Integrated Energy Systems Laboratory, CANMET Energy Technology Centre, 1 Haanel Dr., Ottawa, Ontario (Canada)
2007-06-30
This study applies adaptive neuro-fuzzy inference system (ANFIS) techniques and artificial neural network (ANN) to predict solid oxide fuel cell (SOFC) performance while supplying both heat and power to a residence. A microgeneration 5 kW{sub el} SOFC system was installed at the Canadian Centre for Housing Technology (CCHT), integrated with existing mechanical systems and connected in parallel to the grid. SOFC performance data were collected during the winter heating season and used for training of both ANN and ANFIS models. The ANN model was built on back propagation algorithm as for ANFIS model a combination of least squares method and back propagation gradient decent method were developed and applied. Both models were trained with experimental data and used to predict selective SOFC performance parameters such as fuel cell stack current, stack voltage, etc. The study revealed that both ANN and ANFIS models' predictions agreed well with variety of experimental data sets representing steady-state, start-up and shut-down operations of the SOFC system. The initial data set was subjected to detailed sensitivity analysis and statistically insignificant parameters were excluded from the training set. As a result, significant reduction of computational time was achieved without affecting models' accuracy. The study showed that adaptive models can be applied with confidence during the design process and for performance optimization of existing and newly developed solid oxide fuel cell systems. It demonstrated that by using ANN and ANFIS techniques SOFC microgeneration system's performance could be modelled with minimum time demand and with a high degree of accuracy. (author)
Integrating distributed Bayesian inference and reinforcement learning for sensor management
Grappiolo, C.; Whiteson, S.; Pavlin, G.; Bakker, B.
2009-01-01
This paper introduces a sensor management approach that integrates distributed Bayesian inference (DBI) and reinforcement learning (RL). DBI is implemented using distributed perception networks (DPNs), a multiagent approach to performing efficient inference, while RL is used to automatically
Scalable inference for stochastic block models
Peng, Chengbin; Zhang, Zhihua; Wong, Ka-Chun; Zhang, Xiangliang; Keyes, David E.
2017-01-01
Community detection in graphs is widely used in social and biological networks, and the stochastic block model is a powerful probabilistic tool for describing graphs with community structures. However, in the era of "big data," traditional inference
Robson, B; Boray, S
2018-04-01
Theoretical and methodological principles are presented for the construction of very large inference nets for odds calculations, composed of hundreds or many thousands or more of elements, in this paper generated by structured data mining. It is argued that the usual small inference nets can sometimes represent rather simple, arbitrary estimates. Examples of applications in clinical and public health data analysis, medical claims data and detection of irregular entries, and bioinformatics data, are presented. Construction of large nets benefits from application of a theory of expected information for sparse data and the Dirac notation and algebra. The extent to which these are important here is briefly discussed. Purposes of the study include (a) exploration of the properties of large inference nets and a perturbation and tacit conditionality models, (b) using these to propose simpler models including one that a physician could use routinely, analogous to a "risk score", (c) examination of the merit of describing optimal performance in a single measure that combines accuracy, specificity, and sensitivity in place of a ROC curve, and (d) relationship to methods for detecting anomalous and potentially fraudulent data. Copyright © 2018 Elsevier Ltd. All rights reserved.
DEFF Research Database (Denmark)
Jensen, Finn Verner; Lauritzen, Steffen Lilholt
2001-01-01
This article describes the basic ideas and algorithms behind specification and inference in probabilistic networks based on directed acyclic graphs, undirected graphs, and chain graphs.......This article describes the basic ideas and algorithms behind specification and inference in probabilistic networks based on directed acyclic graphs, undirected graphs, and chain graphs....
Savari, Maryam; Moghaddam, Amin Hedayati; Amiri, Ahmad; Shanbedi, Mehdi; Ayub, Mohamad Nizam Bin
2017-10-01
Herein, artificial neural network and adaptive neuro-fuzzy inference system are employed for modeling the effects of important parameters on heat transfer and fluid flow characteristics of a car radiator and followed by comparing with those of the experimental results for testing data. To this end, two novel nanofluids (water/ethylene glycol-based graphene and nitrogen-doped graphene nanofluids) were experimentally synthesized. Then, Nusselt number was modeled with respect to the variation of inlet temperature, Reynolds number, Prandtl number and concentration, which were defined as the input (design) variables. To reach reliable results, we divided these data into train and test sections to accomplish modeling. Artificial networks were instructed by a major part of experimental data. The other part of primary data which had been considered for testing the appropriateness of the models was entered into artificial network models. Finally, predictad results were compared to the experimental data to evaluate validity. Confronted with high-level of validity confirmed that the proposed modeling procedure by BPNN with one hidden layer and five neurons is efficient and it can be expanded for all water/ethylene glycol-based carbon nanostructures nanofluids. Finally, we expanded our data collection from model and could present a fundamental correlation for calculating Nusselt number of the water/ethylene glycol-based nanofluids including graphene or nitrogen-doped graphene.
DEFF Research Database (Denmark)
Andersen, Jesper
2009-01-01
Collateral evolution the problem of updating several library-using programs in response to API changes in the used library. In this dissertation we address the issue of understanding collateral evolutions by automatically inferring a high-level specification of the changes evident in a given set ...... specifications inferred by spdiff in Linux are shown. We find that the inferred specifications concisely capture the actual collateral evolution performed in the examples....
Daniel J. Isaak; Russell F. Thurow
2006-01-01
Spatially continuous sampling designs, when temporally replicated, provide analytical flexibility and are unmatched in their ability to provide a dynamic system view. We have compiled such a data set by georeferencing the network-scale distribution of Chinook salmon (Oncorhynchus tshawytscha) redds across a large wilderness basin (7330 km2) in...
Directory of Open Access Journals (Sweden)
Harshman Lawrence G
2009-03-01
Full Text Available Abstract Background A molecular process based genotype-to-phenotype map will ultimately enable us to predict how genetic variation among individuals results in phenotypic alterations. Building such a map is, however, far from straightforward. It requires understanding how molecular variation re-shapes developmental and metabolic networks, and how the functional state of these networks modifies phenotypes in genotype specific way. We focus on the latter problem by describing genetic variation in transcript levels of genes in the InR/TOR pathway among 72 Drosophila melanogaster genotypes. Results We observe tight co-variance in transcript levels of genes not known to influence each other through direct transcriptional control. We summarize transcriptome variation with factor analyses, and observe strong co-variance of gene expression within the dFOXO-branch and within the TOR-branch of the pathway. Finally, we investigate whether major axes of transcriptome variation shape phenotypes expected to be influenced through the InR/TOR pathway. We find limited evidence that transcript levels of individual upstream genes in the InR/TOR pathway predict fly phenotypes in expected ways. However, there is no evidence that these effects are mediated through the major axes of downstream transcriptome variation. Conclusion In summary, our results question the assertion of the 'sparse' nature of genetic networks, while validating and extending candidate gene approaches in the analyses of complex traits.
Link prediction in weighted networks
DEFF Research Database (Denmark)
Wind, David Kofoed; Mørup, Morten
2012-01-01
Many complex networks feature relations with weight information. Some models utilize this information while other ignore the weight information when inferring the structure. In this paper we investigate if edge-weights when modeling real networks, carry important information about the network...... is to infer presence of edges, but that simpler models are better at inferring the actual weights....
Energy Technology Data Exchange (ETDEWEB)
Petrov, S.
1996-10-01
Languages with a solvable implication problem but without complete and consistent systems of inference rules (`poor` languages) are considered. The problem of existence of finite complete and consistent inference rule system for a ``poor`` language is stated independently of the language or rules syntax. Several properties of the problem arc proved. An application of results to the language of join dependencies is given.
Bayesian statistical inference
Directory of Open Access Journals (Sweden)
Bruno De Finetti
2017-04-01
Full Text Available This work was translated into English and published in the volume: Bruno De Finetti, Induction and Probability, Biblioteca di Statistica, eds. P. Monari, D. Cocchi, Clueb, Bologna, 1993.Bayesian statistical Inference is one of the last fundamental philosophical papers in which we can find the essential De Finetti's approach to the statistical inference.
Geometric statistical inference
International Nuclear Information System (INIS)
Periwal, Vipul
1999-01-01
A reparametrization-covariant formulation of the inverse problem of probability is explicitly solved for finite sample sizes. The inferred distribution is explicitly continuous for finite sample size. A geometric solution of the statistical inference problem in higher dimensions is outlined
Bailer-Jones, Coryn A. L.
2017-04-01
Preface; 1. Probability basics; 2. Estimation and uncertainty; 3. Statistical models and inference; 4. Linear models, least squares, and maximum likelihood; 5. Parameter estimation: single parameter; 6. Parameter estimation: multiple parameters; 7. Approximating distributions; 8. Monte Carlo methods for inference; 9. Parameter estimation: Markov chain Monte Carlo; 10. Frequentist hypothesis testing; 11. Model comparison; 12. Dealing with more complicated problems; References; Index.
Benaouda, D.; Wadge, G.; Whitmarsh, R. B.; Rothwell, R. G.; MacLeod, C.
1999-02-01
In boreholes with partial or no core recovery, interpretations of lithology in the remainder of the hole are routinely attempted using data from downhole geophysical sensors. We present a practical neural net-based technique that greatly enhances lithological interpretation in holes with partial core recovery by using downhole data to train classifiers to give a global classification scheme for those parts of the borehole for which no core was retrieved. We describe the system and its underlying methods of data exploration, selection and classification, and present a typical example of the system in use. Although the technique is equally applicable to oil industry boreholes, we apply it here to an Ocean Drilling Program (ODP) borehole (Hole 792E, Izu-Bonin forearc, a mixture of volcaniclastic sandstones, conglomerates and claystones). The quantitative benefits of quality-control measures and different subsampling strategies are shown. Direct comparisons between a number of discriminant analysis methods and the use of neural networks with back-propagation of error are presented. The neural networks perform better than the discriminant analysis techniques both in terms of performance rates with test data sets (2-3 per cent better) and in qualitative correlation with non-depth-matched core. We illustrate with the Hole 792E data how vital it is to have a system that permits the number and membership of training classes to be changed as analysis proceeds. The initial classification for Hole 792E evolved from a five-class to a three-class and then to a four-class scheme with resultant classification performance rates for the back-propagation neural network method of 83, 84 and 93 per cent respectively.
Nagao, Makoto
1990-01-01
Knowledge and Inference discusses an important problem for software systems: How do we treat knowledge and ideas on a computer and how do we use inference to solve problems on a computer? The book talks about the problems of knowledge and inference for the purpose of merging artificial intelligence and library science. The book begins by clarifying the concept of """"knowledge"""" from many points of view, followed by a chapter on the current state of library science and the place of artificial intelligence in library science. Subsequent chapters cover central topics in the artificial intellig
Logical inference and evaluation
International Nuclear Information System (INIS)
Perey, F.G.
1981-01-01
Most methodologies of evaluation currently used are based upon the theory of statistical inference. It is generally perceived that this theory is not capable of dealing satisfactorily with what are called systematic errors. Theories of logical inference should be capable of treating all of the information available, including that not involving frequency data. A theory of logical inference is presented as an extension of deductive logic via the concept of plausibility and the application of group theory. Some conclusions, based upon the application of this theory to evaluation of data, are also given
Probability and Statistical Inference
Prosper, Harrison B.
2006-01-01
These lectures introduce key concepts in probability and statistical inference at a level suitable for graduate students in particle physics. Our goal is to paint as vivid a picture as possible of the concepts covered.
On quantum statistical inference
Barndorff-Nielsen, O.E.; Gill, R.D.; Jupp, P.E.
2003-01-01
Interest in problems of statistical inference connected to measurements of quantum systems has recently increased substantially, in step with dramatic new developments in experimental techniques for studying small quantum systems. Furthermore, developments in the theory of quantum measurements have
2018-02-15
expressed a variety of inference techniques on discrete and continuous distributions: exact inference, importance sampling, Metropolis-Hastings (MH...without redoing any math or rewriting any code. And although our main goal is composable reuse, our performance is also good because we can use...control paths. • The Hakaru language can express mixtures of discrete and continuous distributions, but the current disintegration transformation
Introductory statistical inference
Mukhopadhyay, Nitis
2014-01-01
This gracefully organized text reveals the rigorous theory of probability and statistical inference in the style of a tutorial, using worked examples, exercises, figures, tables, and computer simulations to develop and illustrate concepts. Drills and boxed summaries emphasize and reinforce important ideas and special techniques.Beginning with a review of the basic concepts and methods in probability theory, moments, and moment generating functions, the author moves to more intricate topics. Introductory Statistical Inference studies multivariate random variables, exponential families of dist
Directory of Open Access Journals (Sweden)
Kim Clarke
2015-07-01
Full Text Available Gliomas are a highly heterogeneous group of brain tumours that are refractory to treatment, highly invasive and pro-angiogenic. Glioblastoma patients have an average survival time of less than 15 months. Understanding the molecular basis of different grades of glioma, from well differentiated, low-grade tumours to high-grade tumours, is a key step in defining new therapeutic targets. Here we use a data-driven approach to learn the structure of gene regulatory networks from observational data and use the resulting models to formulate hypothesis on the molecular determinants of glioma stage. Remarkably, integration of available knowledge with functional genomics datasets representing clinical and pre-clinical studies reveals important properties within the regulatory circuits controlling low and high-grade glioma. Our analyses first show that low and high-grade gliomas are characterised by a switch in activity of two subsets of Rho GTPases. The first one is involved in maintaining normal glial cell function, while the second is linked to the establishment of multiple hallmarks of cancer. Next, the development and application of a novel data integration methodology reveals novel functions of RND3 in controlling glioma cell migration, invasion, proliferation, angiogenesis and clinical outcome.
Song, Jiangning; Li, Fuyi; Takemoto, Kazuhiro; Haffari, Gholamreza; Akutsu, Tatsuya; Chou, Kuo-Chen; Webb, Geoffrey I
2018-04-14
Determining the catalytic residues in an enzyme is critical to our understanding the relationship between protein sequence, structure, function, and enhancing our ability to design novel enzymes and their inhibitors. Although many enzymes have been sequenced, and their primary and tertiary structures determined, experimental methods for enzyme functional characterization lag behind. Because experimental methods used for identifying catalytic residues are resource- and labor-intensive, computational approaches have considerable value and are highly desirable for their ability to complement experimental studies in identifying catalytic residues and helping to bridge the sequence-structure-function gap. In this study, we describe a new computational method called PREvaIL for predicting enzyme catalytic residues. This method was developed by leveraging a comprehensive set of informative features extracted from multiple levels, including sequence, structure, and residue-contact network, in a random forest machine-learning framework. Extensive benchmarking experiments on eight different datasets based on 10-fold cross-validation and independent tests, as well as side-by-side performance comparisons with seven modern sequence- and structure-based methods, showed that PREvaIL achieved competitive predictive performance, with an area under the receiver operating characteristic curve and area under the precision-recall curve ranging from 0.896 to 0.973 and from 0.294 to 0.523, respectively. We demonstrated that this method was able to capture useful signals arising from different levels, leveraging such differential but useful types of features and allowing us to significantly improve the performance of catalytic residue prediction. We believe that this new method can be utilized as a valuable tool for both understanding the complex sequence-structure-function relationships of proteins and facilitating the characterization of novel enzymes lacking functional annotations
Heidary, Saeed; Setayeshi, Saeed; Ghannadi-Maragheh, Mohammad
2014-09-01
The aim of this study is to compare the adaptive neuro-fuzzy inference system (ANFIS) and the artificial neural network (ANN) to estimate the cross-talk contamination of 99 m Tc / 201 Tl image acquisition in the 201 Tl energy window (77 ± 15% keV). GATE (Geant4 Application in Emission and Tomography) is employed due to its ability to simulate multiple radioactive sources concurrently. Two kinds of phantoms, including two digital and one physical phantom, are used. In the real and the simulation studies, data acquisition is carried out using eight energy windows. The ANN and the ANFIS are prepared in MATLAB, and the GATE results are used as a training data set. Three indications are evaluated and compared. The ANFIS method yields better outcomes for two indications (Spearman's rank correlation coefficient and contrast) and the two phantom results in each category. The maximum image biasing, which is the third indication, is found to be 6% more than that for the ANN.
Type Inference with Inequalities
DEFF Research Database (Denmark)
Schwartzbach, Michael Ignatieff
1991-01-01
of (monotonic) inequalities on the types of variables and expressions. A general result about systems of inequalities over semilattices yields a solvable form. We distinguish between deciding typability (the existence of solutions) and type inference (the computation of a minimal solution). In our case, both......Type inference can be phrased as constraint-solving over types. We consider an implicitly typed language equipped with recursive types, multiple inheritance, 1st order parametric polymorphism, and assignments. Type correctness is expressed as satisfiability of a possibly infinite collection...
Deep Learning for Population Genetic Inference.
Sheehan, Sara; Song, Yun S
2016-03-01
Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.
Deep Learning for Population Genetic Inference.
Directory of Open Access Journals (Sweden)
Sara Sheehan
2016-03-01
Full Text Available Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data to the output (e.g., population genetic parameters of interest. We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history. Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.
Deep Learning for Population Genetic Inference
Sheehan, Sara; Song, Yun S.
2016-01-01
Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme. PMID:27018908
Watson, Jane
2007-01-01
Inference, or decision making, is seen in curriculum documents as the final step in a statistical investigation. For a formal statistical enquiry this may be associated with sophisticated tests involving probability distributions. For young students without the mathematical background to perform such tests, it is still possible to draw informal…
Hybrid Optical Inference Machines
1991-09-27
with labels. Now, events. a set of facts cal be generated in the dyadic form "u, R 1,2" Eichmann and Caulfield (19] consider the same type of and can...these enceding-schemes. These architectures are-based pri- 19. G. Eichmann and H. J. Caulfield, "Optical Learning (Inference)marily on optical inner
Thermodynamics of statistical inference by cells.
Lang, Alex H; Fisher, Charles K; Mora, Thierry; Mehta, Pankaj
2014-10-03
The deep connection between thermodynamics, computation, and information is now well established both theoretically and experimentally. Here, we extend these ideas to show that thermodynamics also places fundamental constraints on statistical estimation and learning. To do so, we investigate the constraints placed by (nonequilibrium) thermodynamics on the ability of biochemical signaling networks to estimate the concentration of an external signal. We show that accuracy is limited by energy consumption, suggesting that there are fundamental thermodynamic constraints on statistical inference.
Inference rule and problem solving
Energy Technology Data Exchange (ETDEWEB)
Goto, S
1982-04-01
Intelligent information processing signifies an opportunity of having man's intellectual activity executed on the computer, in which inference, in place of ordinary calculation, is used as the basic operational mechanism for such an information processing. Many inference rules are derived from syllogisms in formal logic. The problem of programming this inference function is referred to as a problem solving. Although logically inference and problem-solving are in close relation, the calculation ability of current computers is on a low level for inferring. For clarifying the relation between inference and computers, nonmonotonic logic has been considered. The paper deals with the above topics. 16 references.
Stochastic processes inference theory
Rao, Malempati M
2014-01-01
This is the revised and enlarged 2nd edition of the authors’ original text, which was intended to be a modest complement to Grenander's fundamental memoir on stochastic processes and related inference theory. The present volume gives a substantial account of regression analysis, both for stochastic processes and measures, and includes recent material on Ridge regression with some unexpected applications, for example in econometrics. The first three chapters can be used for a quarter or semester graduate course on inference on stochastic processes. The remaining chapters provide more advanced material on stochastic analysis suitable for graduate seminars and discussions, leading to dissertation or research work. In general, the book will be of interest to researchers in probability theory, mathematical statistics and electrical and information theory.
Making Type Inference Practical
DEFF Research Database (Denmark)
Schwartzbach, Michael Ignatieff; Oxhøj, Nicholas; Palsberg, Jens
1992-01-01
We present the implementation of a type inference algorithm for untyped object-oriented programs with inheritance, assignments, and late binding. The algorithm significantly improves our previous one, presented at OOPSLA'91, since it can handle collection classes, such as List, in a useful way. Abo......, the complexity has been dramatically improved, from exponential time to low polynomial time. The implementation uses the techniques of incremental graph construction and constraint template instantiation to avoid representing intermediate results, doing superfluous work, and recomputing type information....... Experiments indicate that the implementation type checks as much as 100 lines pr. second. This results in a mature product, on which a number of tools can be based, for example a safety tool, an image compression tool, a code optimization tool, and an annotation tool. This may make type inference for object...
Directory of Open Access Journals (Sweden)
João Paulo Monteiro
2001-12-01
Full Text Available Russell's The Problems of Philosophy tries to establish a new theory of induction, at the same time that Hume is there accused of an irrational/ scepticism about induction". But a careful analysis of the theory of knowledge explicitly acknowledged by Hume reveals that, contrary to the standard interpretation in the XXth century, possibly influenced by Russell, Hume deals exclusively with causal inference (which he never classifies as "causal induction", although now we are entitled to do so, never with inductive inference in general, mainly generalizations about sensible qualities of objects ( whether, e.g., "all crows are black" or not is not among Hume's concerns. Russell's theories are thus only false alternatives to Hume's, in (1912 or in his (1948.
Active inference and learning.
Friston, Karl; FitzGerald, Thomas; Rigoli, Francesco; Schwartenbeck, Philipp; O Doherty, John; Pezzulo, Giovanni
2016-09-01
This paper offers an active inference account of choice behaviour and learning. It focuses on the distinction between goal-directed and habitual behaviour and how they contextualise each other. We show that habits emerge naturally (and autodidactically) from sequential policy optimisation when agents are equipped with state-action policies. In active inference, behaviour has explorative (epistemic) and exploitative (pragmatic) aspects that are sensitive to ambiguity and risk respectively, where epistemic (ambiguity-resolving) behaviour enables pragmatic (reward-seeking) behaviour and the subsequent emergence of habits. Although goal-directed and habitual policies are usually associated with model-based and model-free schemes, we find the more important distinction is between belief-free and belief-based schemes. The underlying (variational) belief updating provides a comprehensive (if metaphorical) process theory for several phenomena, including the transfer of dopamine responses, reversal learning, habit formation and devaluation. Finally, we show that active inference reduces to a classical (Bellman) scheme, in the absence of ambiguity. Copyright © 2016 The Authors. Published by Elsevier Ltd.. All rights reserved.
Learning Convex Inference of Marginals
Domke, Justin
2012-01-01
Graphical models trained using maximum likelihood are a common tool for probabilistic inference of marginal distributions. However, this approach suffers difficulties when either the inference process or the model is approximate. In this paper, the inference process is first defined to be the minimization of a convex function, inspired by free energy approximations. Learning is then done directly in terms of the performance of the inference process at univariate marginal prediction. The main ...
Probabilistic inductive inference: a survey
Ambainis, Andris
2001-01-01
Inductive inference is a recursion-theoretic theory of learning, first developed by E. M. Gold (1967). This paper surveys developments in probabilistic inductive inference. We mainly focus on finite inference of recursive functions, since this simple paradigm has produced the most interesting (and most complex) results.
Multimodel inference and adaptive management
Rehme, S.E.; Powell, L.A.; Allen, Craig R.
2011-01-01
Ecology is an inherently complex science coping with correlated variables, nonlinear interactions and multiple scales of pattern and process, making it difficult for experiments to result in clear, strong inference. Natural resource managers, policy makers, and stakeholders rely on science to provide timely and accurate management recommendations. However, the time necessary to untangle the complexities of interactions within ecosystems is often far greater than the time available to make management decisions. One method of coping with this problem is multimodel inference. Multimodel inference assesses uncertainty by calculating likelihoods among multiple competing hypotheses, but multimodel inference results are often equivocal. Despite this, there may be pressure for ecologists to provide management recommendations regardless of the strength of their study’s inference. We reviewed papers in the Journal of Wildlife Management (JWM) and the journal Conservation Biology (CB) to quantify the prevalence of multimodel inference approaches, the resulting inference (weak versus strong), and how authors dealt with the uncertainty. Thirty-eight percent and 14%, respectively, of articles in the JWM and CB used multimodel inference approaches. Strong inference was rarely observed, with only 7% of JWM and 20% of CB articles resulting in strong inference. We found the majority of weak inference papers in both journals (59%) gave specific management recommendations. Model selection uncertainty was ignored in most recommendations for management. We suggest that adaptive management is an ideal method to resolve uncertainty when research results in weak inference.
Rauno Lindholm, Daniel; Boisen Devantier, Lykke; Nyborg, Karoline Lykke; Høgsbro, Andreas; Fries, de; Skovlund, Louise
2016-01-01
The purpose of this project was to examine what influencing factor that has had an impact on the presumed increasement of the use of networking among academics on the labour market and how it is expressed. On the basis of the influence from globalization on the labour market it can be concluded that the globalization has transformed the labour market into a market based on the organization of networks. In this new organization there is a greater emphasis on employees having social qualificati...
Nonparametric statistical inference
Gibbons, Jean Dickinson
2010-01-01
Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference.-Eugenia Stoimenova, Journal of Applied Statistics, June 2012… one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students.-Biometrics, 67, September 2011This excellently presente
Emotional inferences by pragmatics
Iza-Miqueleiz, Mauricio
2017-01-01
It has for long been taken for granted that, along the course of reading a text, world knowledge is often required in order to establish coherent links between sentences (McKoon & Ratcliff 1992, Iza & Ezquerro 2000). The content grasped from a text turns out to be strongly dependent upon the reader’s additional knowledge that allows a coherent interpretation of the text as a whole. The world knowledge directing the inference may be of distinctive nature. Gygax et al. (2007) showed that m...
DEFF Research Database (Denmark)
Andersen, Jesper; Lawall, Julia
2010-01-01
A key issue in maintaining Linux device drivers is the need to keep them up to date with respect to evolutions in Linux internal libraries. Currently, there is little tool support for performing and documenting such changes. In this paper we present a tool, spdiff, that identifies common changes...... developers can use it to extract an abstract representation of the set of changes that others have made. Our experiments on recent changes in Linux show that the inferred generic patches are more concise than the corresponding patches found in commits to the Linux source tree while being safe with respect...
Ensemble stacking mitigates biases in inference of synaptic connectivity.
Chambers, Brendan; Levy, Maayan; Dechery, Joseph B; MacLean, Jason N
2018-01-01
A promising alternative to directly measuring the anatomical connections in a neuronal population is inferring the connections from the activity. We employ simulated spiking neuronal networks to compare and contrast commonly used inference methods that identify likely excitatory synaptic connections using statistical regularities in spike timing. We find that simple adjustments to standard algorithms improve inference accuracy: A signing procedure improves the power of unsigned mutual-information-based approaches and a correction that accounts for differences in mean and variance of background timing relationships, such as those expected to be induced by heterogeneous firing rates, increases the sensitivity of frequency-based methods. We also find that different inference methods reveal distinct subsets of the synaptic network and each method exhibits different biases in the accurate detection of reciprocity and local clustering. To correct for errors and biases specific to single inference algorithms, we combine methods into an ensemble. Ensemble predictions, generated as a linear combination of multiple inference algorithms, are more sensitive than the best individual measures alone, and are more faithful to ground-truth statistics of connectivity, mitigating biases specific to single inference methods. These weightings generalize across simulated datasets, emphasizing the potential for the broad utility of ensemble-based approaches.
Phylogenetic Inference of HIV Transmission Clusters
Directory of Open Access Journals (Sweden)
Vlad Novitsky
2017-10-01
Full Text Available Better understanding the structure and dynamics of HIV transmission networks is essential for designing the most efficient interventions to prevent new HIV transmissions, and ultimately for gaining control of the HIV epidemic. The inference of phylogenetic relationships and the interpretation of results rely on the definition of the HIV transmission cluster. The definition of the HIV cluster is complex and dependent on multiple factors, including the design of sampling, accuracy of sequencing, precision of sequence alignment, evolutionary models, the phylogenetic method of inference, and specified thresholds for cluster support. While the majority of studies focus on clusters, non-clustered cases could also be highly informative. A new dimension in the analysis of the global and local HIV epidemics is the concept of phylogenetically distinct HIV sub-epidemics. The identification of active HIV sub-epidemics reveals spreading viral lineages and may help in the design of targeted interventions.HIVclustering can also be affected by sampling density. Obtaining a proper sampling density may increase statistical power and reduce sampling bias, so sampling density should be taken into account in study design and in interpretation of phylogenetic results. Finally, recent advances in long-range genotyping may enable more accurate inference of HIV transmission networks. If performed in real time, it could both inform public-health strategies and be clinically relevant (e.g., drug-resistance testing.
On the Hardness of Topology Inference
Acharya, H. B.; Gouda, M. G.
Many systems require information about the topology of networks on the Internet, for purposes like management, efficiency, testing of new protocols and so on. However, ISPs usually do not share the actual topology maps with outsiders; thus, in order to obtain the topology of a network on the Internet, a system must reconstruct it from publicly observable data. The standard method employs traceroute to obtain paths between nodes; next, a topology is generated such that the observed paths occur in the graph. However, traceroute has the problem that some routers refuse to reveal their addresses, and appear as anonymous nodes in traces. Previous research on the problem of topology inference with anonymous nodes has demonstrated that it is at best NP-complete. In this paper, we improve upon this result. In our previous research, we showed that in the special case where nodes may be anonymous in some traces but not in all traces (so all node identifiers are known), there exist trace sets that are generable from multiple topologies. This paper extends our theory of network tracing to the general case (with strictly anonymous nodes), and shows that the problem of computing the network that generated a trace set, given the trace set, has no general solution. The weak version of the problem, which allows an algorithm to output a "small" set of networks- any one of which is the correct one- is also not solvable. Any algorithm guaranteed to output the correct topology outputs at least an exponential number of networks. Our results are surprisingly robust: they hold even when the network is known to have exactly two anonymous nodes, and every node as well as every edge in the network is guaranteed to occur in some trace. On the basis of this result, we suggest that exact reconstruction of network topology requires more powerful tools than traceroute.
Feature Inference Learning and Eyetracking
Rehder, Bob; Colner, Robert M.; Hoffman, Aaron B.
2009-01-01
Besides traditional supervised classification learning, people can learn categories by inferring the missing features of category members. It has been proposed that feature inference learning promotes learning a category's internal structure (e.g., its typical features and interfeature correlations) whereas classification promotes the learning of…
An Inference Language for Imaging
DEFF Research Database (Denmark)
Pedemonte, Stefano; Catana, Ciprian; Van Leemput, Koen
2014-01-01
We introduce iLang, a language and software framework for probabilistic inference. The iLang framework enables the definition of directed and undirected probabilistic graphical models and the automated synthesis of high performance inference algorithms for imaging applications. The iLang framewor...
Energy Technology Data Exchange (ETDEWEB)
Chertkov, Michael [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Ahn, Sungsoo [Korea Advanced Inst. Science and Technology (KAIST), Daejeon (Korea, Republic of); Shin, Jinwoo [Korea Advanced Inst. Science and Technology (KAIST), Daejeon (Korea, Republic of)
2017-05-25
Computing partition function is the most important statistical inference task arising in applications of Graphical Models (GM). Since it is computationally intractable, approximate methods have been used to resolve the issue in practice, where meanfield (MF) and belief propagation (BP) are arguably the most popular and successful approaches of a variational type. In this paper, we propose two new variational schemes, coined Gauged-MF (G-MF) and Gauged-BP (G-BP), improving MF and BP, respectively. Both provide lower bounds for the partition function by utilizing the so-called gauge transformation which modifies factors of GM while keeping the partition function invariant. Moreover, we prove that both G-MF and G-BP are exact for GMs with a single loop of a special structure, even though the bare MF and BP perform badly in this case. Our extensive experiments, on complete GMs of relatively small size and on large GM (up-to 300 variables) confirm that the newly proposed algorithms outperform and generalize MF and BP.
Social Inference Through Technology
Oulasvirta, Antti
Awareness cues are computer-mediated, real-time indicators of people’s undertakings, whereabouts, and intentions. Already in the mid-1970 s, UNIX users could use commands such as “finger” and “talk” to find out who was online and to chat. The small icons in instant messaging (IM) applications that indicate coconversants’ presence in the discussion space are the successors of “finger” output. Similar indicators can be found in online communities, media-sharing services, Internet relay chat (IRC), and location-based messaging applications. But presence and availability indicators are only the tip of the iceberg. Technological progress has enabled richer, more accurate, and more intimate indicators. For example, there are mobile services that allow friends to query and follow each other’s locations. Remote monitoring systems developed for health care allow relatives and doctors to assess the wellbeing of homebound patients (see, e.g., Tang and Venables 2000). But users also utilize cues that have not been deliberately designed for this purpose. For example, online gamers pay attention to other characters’ behavior to infer what the other players are like “in real life.” There is a common denominator underlying these examples: shared activities rely on the technology’s representation of the remote person. The other human being is not physically present but present only through a narrow technological channel.
Cortical information flow during inferences of agency
Directory of Open Access Journals (Sweden)
Myrthel eDogge
2014-08-01
Full Text Available Building on the recent finding that agency experiences do not merely rely on sensorimotor information but also on cognitive cues, this exploratory study uses electroencephalographic recordings to examine functional connectivity during agency inference processing in a setting where action and outcome are independent. Participants completed a computerized task in which they pressed a button followed by one of two color words (red or blue and rated their experienced agency over producing the color. Before executing the action, a matching or mismatching color word was pre-activated by explicitly instructing participants to produce the color (goal condition or by briefly presenting the color word (prime condition. In both conditions, experienced agency was higher in matching versus mismatching trials. Furthermore, increased electroencephalography (EEG-based connectivity strength was observed between parietal and frontal nodes and within the (prefrontal cortex when color-outcomes matched with goals and participants reported high agency. This pattern of increased connectivity was not identified in trials where outcomes were pre-activated through primes. These results suggest that different connections are involved in the experience and in the loss of agency, as well as in inferences of agency resulting from different types of pre-activation. Moreover, the findings provide novel support for the involvement of a fronto-parietal network in agency inferences.
Ensemble stacking mitigates biases in inference of synaptic connectivity
Directory of Open Access Journals (Sweden)
Brendan Chambers
2018-03-01
Full Text Available A promising alternative to directly measuring the anatomical connections in a neuronal population is inferring the connections from the activity. We employ simulated spiking neuronal networks to compare and contrast commonly used inference methods that identify likely excitatory synaptic connections using statistical regularities in spike timing. We find that simple adjustments to standard algorithms improve inference accuracy: A signing procedure improves the power of unsigned mutual-information-based approaches and a correction that accounts for differences in mean and variance of background timing relationships, such as those expected to be induced by heterogeneous firing rates, increases the sensitivity of frequency-based methods. We also find that different inference methods reveal distinct subsets of the synaptic network and each method exhibits different biases in the accurate detection of reciprocity and local clustering. To correct for errors and biases specific to single inference algorithms, we combine methods into an ensemble. Ensemble predictions, generated as a linear combination of multiple inference algorithms, are more sensitive than the best individual measures alone, and are more faithful to ground-truth statistics of connectivity, mitigating biases specific to single inference methods. These weightings generalize across simulated datasets, emphasizing the potential for the broad utility of ensemble-based approaches. Mapping the routing of spikes through local circuitry is crucial for understanding neocortical computation. Under appropriate experimental conditions, these maps can be used to infer likely patterns of synaptic recruitment, linking activity to underlying anatomical connections. Such inferences help to reveal the synaptic implementation of population dynamics and computation. We compare a number of standard functional measures to infer underlying connectivity. We find that regularization impacts measures
Optimization methods for logical inference
Chandru, Vijay
2011-01-01
Merging logic and mathematics in deductive inference-an innovative, cutting-edge approach. Optimization methods for logical inference? Absolutely, say Vijay Chandru and John Hooker, two major contributors to this rapidly expanding field. And even though ""solving logical inference problems with optimization methods may seem a bit like eating sauerkraut with chopsticks. . . it is the mathematical structure of a problem that determines whether an optimization model can help solve it, not the context in which the problem occurs."" Presenting powerful, proven optimization techniques for logic in
The effects of incomplete protein interaction data on structural and evolutionary inferences
DEFF Research Database (Denmark)
de Silva, E; Thorne, T; Ingram, P
2006-01-01
of the inherent noise in protein interaction data. The effects of the incomplete nature of network data become very noticeable, especially for so-called network motifs. We also consider the effect of incomplete network data on functional and evolutionary inferences. Conclusion Crucially, when only small, partial...
On principles of inductive inference
Kostecki, Ryszard Paweł
2011-01-01
We propose an intersubjective epistemic approach to foundations of probability theory and statistical inference, based on relative entropy and category theory, and aimed to bypass the mathematical and conceptual problems of existing foundational approaches.
Statistical inference via fiducial methods
Salomé, Diemer
1998-01-01
In this thesis the attention is restricted to inductive reasoning using a mathematical probability model. A statistical procedure prescribes, for every theoretically possible set of data, the inference about the unknown of interest. ... Zie: Summary
Statistical inference for stochastic processes
National Research Council Canada - National Science Library
Basawa, Ishwar V; Prakasa Rao, B. L. S
1980-01-01
The aim of this monograph is to attempt to reduce the gap between theory and applications in the area of stochastic modelling, by directing the interest of future researchers to the inference aspects...
Active inference, communication and hermeneutics.
Friston, Karl J; Frith, Christopher D
2015-07-01
Hermeneutics refers to interpretation and translation of text (typically ancient scriptures) but also applies to verbal and non-verbal communication. In a psychological setting it nicely frames the problem of inferring the intended content of a communication. In this paper, we offer a solution to the problem of neural hermeneutics based upon active inference. In active inference, action fulfils predictions about how we will behave (e.g., predicting we will speak). Crucially, these predictions can be used to predict both self and others--during speaking and listening respectively. Active inference mandates the suppression of prediction errors by updating an internal model that generates predictions--both at fast timescales (through perceptual inference) and slower timescales (through perceptual learning). If two agents adopt the same model, then--in principle--they can predict each other and minimise their mutual prediction errors. Heuristically, this ensures they are singing from the same hymn sheet. This paper builds upon recent work on active inference and communication to illustrate perceptual learning using simulated birdsongs. Our focus here is the neural hermeneutics implicit in learning, where communication facilitates long-term changes in generative models that are trying to predict each other. In other words, communication induces perceptual learning and enables others to (literally) change our minds and vice versa. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
Learning Probabilistic Inference through Spike-Timing-Dependent Plasticity.
Pecevski, Dejan; Maass, Wolfgang
2016-01-01
Numerous experimental data show that the brain is able to extract information from complex, uncertain, and often ambiguous experiences. Furthermore, it can use such learnt information for decision making through probabilistic inference. Several models have been proposed that aim at explaining how probabilistic inference could be performed by networks of neurons in the brain. We propose here a model that can also explain how such neural network could acquire the necessary information for that from examples. We show that spike-timing-dependent plasticity in combination with intrinsic plasticity generates in ensembles of pyramidal cells with lateral inhibition a fundamental building block for that: probabilistic associations between neurons that represent through their firing current values of random variables. Furthermore, by combining such adaptive network motifs in a recursive manner the resulting network is enabled to extract statistical information from complex input streams, and to build an internal model for the distribution p (*) that generates the examples it receives. This holds even if p (*) contains higher-order moments. The analysis of this learning process is supported by a rigorous theoretical foundation. Furthermore, we show that the network can use the learnt internal model immediately for prediction, decision making, and other types of probabilistic inference.
Learning Probabilistic Inference through Spike-Timing-Dependent Plasticity123
Pecevski, Dejan
2016-01-01
Abstract Numerous experimental data show that the brain is able to extract information from complex, uncertain, and often ambiguous experiences. Furthermore, it can use such learnt information for decision making through probabilistic inference. Several models have been proposed that aim at explaining how probabilistic inference could be performed by networks of neurons in the brain. We propose here a model that can also explain how such neural network could acquire the necessary information for that from examples. We show that spike-timing-dependent plasticity in combination with intrinsic plasticity generates in ensembles of pyramidal cells with lateral inhibition a fundamental building block for that: probabilistic associations between neurons that represent through their firing current values of random variables. Furthermore, by combining such adaptive network motifs in a recursive manner the resulting network is enabled to extract statistical information from complex input streams, and to build an internal model for the distribution p* that generates the examples it receives. This holds even if p* contains higher-order moments. The analysis of this learning process is supported by a rigorous theoretical foundation. Furthermore, we show that the network can use the learnt internal model immediately for prediction, decision making, and other types of probabilistic inference. PMID:27419214
Optimal inference with suboptimal models: Addiction and active Bayesian inference
Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl
2015-01-01
When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321
Interactive Instruction in Bayesian Inference
DEFF Research Database (Denmark)
Khan, Azam; Breslav, Simon; Hornbæk, Kasper
2018-01-01
An instructional approach is presented to improve human performance in solving Bayesian inference problems. Starting from the original text of the classic Mammography Problem, the textual expression is modified and visualizations are added according to Mayer’s principles of instruction. These pri......An instructional approach is presented to improve human performance in solving Bayesian inference problems. Starting from the original text of the classic Mammography Problem, the textual expression is modified and visualizations are added according to Mayer’s principles of instruction....... These principles concern coherence, personalization, signaling, segmenting, multimedia, spatial contiguity, and pretraining. Principles of self-explanation and interactivity are also applied. Four experiments on the Mammography Problem showed that these principles help participants answer the questions...... that an instructional approach to improving human performance in Bayesian inference is a promising direction....
On Maximum Entropy and Inference
Directory of Open Access Journals (Sweden)
Luigi Gresele
2017-11-01
Full Text Available Maximum entropy is a powerful concept that entails a sharp separation between relevant and irrelevant variables. It is typically invoked in inference, once an assumption is made on what the relevant variables are, in order to estimate a model from data, that affords predictions on all other (dependent variables. Conversely, maximum entropy can be invoked to retrieve the relevant variables (sufficient statistics directly from the data, once a model is identified by Bayesian model selection. We explore this approach in the case of spin models with interactions of arbitrary order, and we discuss how relevant interactions can be inferred. In this perspective, the dimensionality of the inference problem is not set by the number of parameters in the model, but by the frequency distribution of the data. We illustrate the method showing its ability to recover the correct model in a few prototype cases and discuss its application on a real dataset.
Baselines and test data for cross-lingual inference
DEFF Research Database (Denmark)
Agic, Zeljko; Schluter, Natalie
2018-01-01
The recent years have seen a revival of interest in textual entailment, sparked by i) the emergence of powerful deep neural network learners for natural language processing and ii) the timely development of large-scale evaluation datasets such as SNLI. Recast as natural language inference......, the problem now amounts to detecting the relation between pairs of statements: they either contradict or entail one another, or they are mutually neutral. Current research in natural language inference is effectively exclusive to English. In this paper, we propose to advance the research in SNLI-style natural...... language inference toward multilingual evaluation. To that end, we provide test data for four major languages: Arabic, French, Spanish, and Russian. We experiment with a set of baselines. Our systems are based on cross-lingual word embeddings and machine translation. While our best system scores an average...
Inferring Human Mobility from Sparse Low Accuracy Mobile Sensing Data
DEFF Research Database (Denmark)
Cuttone, Andrea; Jørgensen, Sune Lehmann; Larsen, Jakob Eg
2014-01-01
Understanding both collective and personal human mobility is a central topic in Computational Social Science. Smartphone sensing data is emerging as a promising source for studying human mobility. However, most literature focuses on high-precision GPS positioning and high-frequency sampling, which...... is not always feasible in a longitudinal study or for everyday applications because location sensing has a high battery cost. In this paper we study the feasibility of inferring human mobility from sparse, low accuracy mobile sensing data. We validate our results using participants' location diaries......, and analyze the inferred geographical networks, the time spent at different places, and the number of unique places over time. Our results suggest that low resolution data allows accurate inference of human mobility patterns....
Eight challenges in phylodynamic inference
Directory of Open Access Journals (Sweden)
Simon D.W. Frost
2015-03-01
Full Text Available The field of phylodynamics, which attempts to enhance our understanding of infectious disease dynamics using pathogen phylogenies, has made great strides in the past decade. Basic epidemiological and evolutionary models are now well characterized with inferential frameworks in place. However, significant challenges remain in extending phylodynamic inference to more complex systems. These challenges include accounting for evolutionary complexities such as changing mutation rates, selection, reassortment, and recombination, as well as epidemiological complexities such as stochastic population dynamics, host population structure, and different patterns at the within-host and between-host scales. An additional challenge exists in making efficient inferences from an ever increasing corpus of sequence data.
Problem solving and inference mechanisms
Energy Technology Data Exchange (ETDEWEB)
Furukawa, K; Nakajima, R; Yonezawa, A; Goto, S; Aoyama, A
1982-01-01
The heart of the fifth generation computer will be powerful mechanisms for problem solving and inference. A deduction-oriented language is to be designed, which will form the core of the whole computing system. The language is based on predicate logic with the extended features of structuring facilities, meta structures and relational data base interfaces. Parallel computation mechanisms and specialized hardware architectures are being investigated to make possible efficient realization of the language features. The project includes research into an intelligent programming system, a knowledge representation language and system, and a meta inference system to be built on the core. 30 references.
Directory of Open Access Journals (Sweden)
Saurabh Vashishtha
Full Text Available There is a growing appreciation for the network biology that regulates the coordinated expression of molecular and cellular markers however questions persist regarding the identifiability of these networks. Here we explore some of the issues relevant to recovering directed regulatory networks from time course data collected under experimental constraints typical of in vivo studies. NetSim simulations of sparsely connected biological networks were used to evaluate two simple feature selection techniques used in the construction of linear Ordinary Differential Equation (ODE models, namely truncation of terms versus latent vector projection. Performance was compared with ODE-based Time Series Network Identification (TSNI integral, and the information-theoretic Time-Delay ARACNE (TD-ARACNE. Projection-based techniques and TSNI integral outperformed truncation-based selection and TD-ARACNE on aggregate networks with edge densities of 10-30%, i.e. transcription factor, protein-protein cliques and immune signaling networks. All were more robust to noise than truncation-based feature selection. Performance was comparable on the in silico 10-node DREAM 3 network, a 5-node Yeast synthetic network designed for In vivo Reverse-engineering and Modeling Assessment (IRMA and a 9-node human HeLa cell cycle network of similar size and edge density. Performance was more sensitive to the number of time courses than to sample frequency and extrapolated better to larger networks by grouping experiments. In all cases performance declined rapidly in larger networks with lower edge density. Limited recovery and high false positive rates obtained overall bring into question our ability to generate informative time course data rather than the design of any particular reverse engineering algorithm.
Object-Oriented Type Inference
DEFF Research Database (Denmark)
Schwartzbach, Michael Ignatieff; Palsberg, Jens
1991-01-01
We present a new approach to inferring types in untyped object-oriented programs with inheritance, assignments, and late binding. It guarantees that all messages are understood, annotates the program with type information, allows polymorphic methods, and can be used as the basis of an op...
Mixed normal inference on multicointegration
Boswijk, H.P.
2009-01-01
Asymptotic likelihood analysis of cointegration in I(2) models, see Johansen (1997, 2006), Boswijk (2000) and Paruolo (2000), has shown that inference on most parameters is mixed normal, implying hypothesis test statistics with an asymptotic 2 null distribution. The asymptotic distribution of the
Statistical inference and Aristotle's Rhetoric.
Macdonald, Ranald R
2004-11-01
Formal logic operates in a closed system where all the information relevant to any conclusion is present, whereas this is not the case when one reasons about events and states of the world. Pollard and Richardson drew attention to the fact that the reasoning behind statistical tests does not lead to logically justifiable conclusions. In this paper statistical inferences are defended not by logic but by the standards of everyday reasoning. Aristotle invented formal logic, but argued that people mostly get at the truth with the aid of enthymemes--incomplete syllogisms which include arguing from examples, analogies and signs. It is proposed that statistical tests work in the same way--in that they are based on examples, invoke the analogy of a model and use the size of the effect under test as a sign that the chance hypothesis is unlikely. Of existing theories of statistical inference only a weak version of Fisher's takes this into account. Aristotle anticipated Fisher by producing an argument of the form that there were too many cases in which an outcome went in a particular direction for that direction to be plausibly attributed to chance. We can therefore conclude that Aristotle would have approved of statistical inference and there is a good reason for calling this form of statistical inference classical.
Perceptual learning as improved probabilistic inference in early sensory areas.
Bejjanki, Vikranth R; Beck, Jeffrey M; Lu, Zhong-Lin; Pouget, Alexandre
2011-05-01
Extensive training on simple tasks such as fine orientation discrimination results in large improvements in performance, a form of learning known as perceptual learning. Previous models have argued that perceptual learning is due to either sharpening and amplification of tuning curves in early visual areas or to improved probabilistic inference in later visual areas (at the decision stage). However, early theories are inconsistent with the conclusions of psychophysical experiments manipulating external noise, whereas late theories cannot explain the changes in neural responses that have been reported in cortical areas V1 and V4. Here we show that we can capture both the neurophysiological and behavioral aspects of perceptual learning by altering only the feedforward connectivity in a recurrent network of spiking neurons so as to improve probabilistic inference in early visual areas. The resulting network shows modest changes in tuning curves, in line with neurophysiological reports, along with a marked reduction in the amplitude of pairwise noise correlations.
Neuronal integration of dynamic sources: Bayesian learning and Bayesian inference.
Siegelmann, Hava T; Holzman, Lars E
2010-09-01
One of the brain's most basic functions is integrating sensory data from diverse sources. This ability causes us to question whether the neural system is computationally capable of intelligently integrating data, not only when sources have known, fixed relative dependencies but also when it must determine such relative weightings based on dynamic conditions, and then use these learned weightings to accurately infer information about the world. We suggest that the brain is, in fact, fully capable of computing this parallel task in a single network and describe a neural inspired circuit with this property. Our implementation suggests the possibility that evidence learning requires a more complex organization of the network than was previously assumed, where neurons have different specialties, whose emergence brings the desired adaptivity seen in human online inference.
Classical methods for interpreting objective function minimization as intelligent inference
Energy Technology Data Exchange (ETDEWEB)
Golden, R.M. [Univ. of Texas, Dallas, TX (United States)
1996-12-31
Most recognition algorithms and neural networks can be formally viewed as seeking a minimum value of an appropriate objective function during either classification or learning phases. The goal of this paper is to argue that in order to show a recognition algorithm is making intelligent inferences, it is not sufficient to show that the recognition algorithm is computing (or trying to compute) the global minimum of some objective function. One must explicitly define a {open_quotes}relational system{close_quotes} for the recognition algorithm or neural network which identifies the: (i) sample space, (ii) the relevant sigmafield of events generated by the sample space, and (iii) the {open_quotes}relation{close_quotes} for that relational system. Only when such a {open_quotes}relational system{close_quotes} is properly defined, is it possible to formally establish the sense in which computing the global minimum of an objective function is an intelligent, inference.
Inferring modules from human protein interactome classes
Directory of Open Access Journals (Sweden)
Chaurasia Gautam
2010-07-01
Full Text Available Abstract Background The integration of protein-protein interaction networks derived from high-throughput screening approaches and complementary sources is a key topic in systems biology. Although integration of protein interaction data is conventionally performed, the effects of this procedure on the result of network analyses has not been examined yet. In particular, in order to optimize the fusion of heterogeneous interaction datasets, it is crucial to consider not only their degree of coverage and accuracy, but also their mutual dependencies and additional salient features. Results We examined this issue based on the analysis of modules detected by network clustering methods applied to both integrated and individual (disaggregated data sources, which we call interactome classes. Due to class diversity, we deal with variable dependencies of data features arising from structural specificities and biases, but also from possible overlaps. Since highly connected regions of the human interactome may point to potential protein complexes, we have focused on the concept of modularity, and elucidated the detection power of module extraction algorithms by independent validations based on GO, MIPS and KEGG. From the combination of protein interactions with gene expressions, a confidence scoring scheme has been proposed before proceeding via GO with further classification in permanent and transient modules. Conclusions Disaggregated interactomes are shown to be informative for inferring modularity, thus contributing to perform an effective integrative analysis. Validation of the extracted modules by multiple annotation allows for the assessment of confidence measures assigned to the modules in a protein pathway context. Notably, the proposed multilayer confidence scheme can be used for network calibration by enabling a transition from unweighted to weighted interactomes based on biological evidence.
Enhancing Transparency and Control When Drawing Data-Driven Inferences About Individuals.
Chen, Daizhuo; Fraiberger, Samuel P; Moakler, Robert; Provost, Foster
2017-09-01
Recent studies show the remarkable power of fine-grained information disclosed by users on social network sites to infer users' personal characteristics via predictive modeling. Similar fine-grained data are being used successfully in other commercial applications. In response, attention is turning increasingly to the transparency that organizations provide to users as to what inferences are drawn and why, as well as to what sort of control users can be given over inferences that are drawn about them. In this article, we focus on inferences about personal characteristics based on information disclosed by users' online actions. As a use case, we explore personal inferences that are made possible from "Likes" on Facebook. We first present a means for providing transparency into the information responsible for inferences drawn by data-driven models. We then introduce the "cloaking device"-a mechanism for users to inhibit the use of particular pieces of information in inference. Using these analytical tools we ask two main questions: (1) How much information must users cloak to significantly affect inferences about their personal traits? We find that usually users must cloak only a small portion of their actions to inhibit inference. We also find that, encouragingly, false-positive inferences are significantly easier to cloak than true-positive inferences. (2) Can firms change their modeling behavior to make cloaking more difficult? The answer is a definitive yes. We demonstrate a simple modeling change that requires users to cloak substantially more information to affect the inferences drawn. The upshot is that organizations can provide transparency and control even into complicated, predictive model-driven inferences, but they also can make control easier or harder for their users.
Statistical network analysis for analyzing policy networks
DEFF Research Database (Denmark)
Robins, Garry; Lewis, Jenny; Wang, Peng
2012-01-01
and policy network methodology is the development of statistical modeling approaches that can accommodate such dependent data. In this article, we review three network statistical methods commonly used in the current literature: quadratic assignment procedures, exponential random graph models (ERGMs......To analyze social network data using standard statistical approaches is to risk incorrect inference. The dependencies among observations implied in a network conceptualization undermine standard assumptions of the usual general linear models. One of the most quickly expanding areas of social......), and stochastic actor-oriented models. We focus most attention on ERGMs by providing an illustrative example of a model for a strategic information network within a local government. We draw inferences about the structural role played by individuals recognized as key innovators and conclude that such an approach...
Numerical approximations for speeding up mcmc inference in the infinite relational model
DEFF Research Database (Denmark)
Schmidt, Mikkel Nørgaard; Albers, Kristoffer Jon
2015-01-01
The infinite relational model (IRM) is a powerful model for discovering clusters in complex networks; however, the computational speed of Markov chain Monte Carlo inference in the model can be a limiting factor when analyzing large networks. We investigate how using numerical approximations...
Statistical learning and selective inference.
Taylor, Jonathan; Tibshirani, Robert J
2015-06-23
We describe the problem of "selective inference." This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have "cherry-picked"--searched for the strongest associations--means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis.
Bayesian inference with ecological applications
Link, William A
2009-01-01
This text is written to provide a mathematically sound but accessible and engaging introduction to Bayesian inference specifically for environmental scientists, ecologists and wildlife biologists. It emphasizes the power and usefulness of Bayesian methods in an ecological context. The advent of fast personal computers and easily available software has simplified the use of Bayesian and hierarchical models . One obstacle remains for ecologists and wildlife biologists, namely the near absence of Bayesian texts written specifically for them. The book includes many relevant examples, is supported by software and examples on a companion website and will become an essential grounding in this approach for students and research ecologists. Engagingly written text specifically designed to demystify a complex subject Examples drawn from ecology and wildlife research An essential grounding for graduate and research ecologists in the increasingly prevalent Bayesian approach to inference Companion website with analyt...
Statistical inference an integrated approach
Migon, Helio S; Louzada, Francisco
2014-01-01
Introduction Information The concept of probability Assessing subjective probabilities An example Linear algebra and probability Notation Outline of the bookElements of Inference Common statistical modelsLikelihood-based functions Bayes theorem Exchangeability Sufficiency and exponential family Parameter elimination Prior Distribution Entirely subjective specification Specification through functional forms Conjugacy with the exponential family Non-informative priors Hierarchical priors Estimation Introduction to decision theoryBayesian point estimation Classical point estimation Empirical Bayes estimation Comparison of estimators Interval estimation Estimation in the Normal model Approximating Methods The general problem of inference Optimization techniquesAsymptotic theory Other analytical approximations Numerical integration methods Simulation methods Hypothesis Testing Introduction Classical hypothesis testingBayesian hypothesis testing Hypothesis testing and confidence intervalsAsymptotic tests Prediction...
Bayesian inference on proportional elections.
Directory of Open Access Journals (Sweden)
Gabriel Hideki Vatanabe Brunello
Full Text Available Polls for majoritarian voting systems usually show estimates of the percentage of votes for each candidate. However, proportional vote systems do not necessarily guarantee the candidate with the most percentage of votes will be elected. Thus, traditional methods used in majoritarian elections cannot be applied on proportional elections. In this context, the purpose of this paper was to perform a Bayesian inference on proportional elections considering the Brazilian system of seats distribution. More specifically, a methodology to answer the probability that a given party will have representation on the chamber of deputies was developed. Inferences were made on a Bayesian scenario using the Monte Carlo simulation technique, and the developed methodology was applied on data from the Brazilian elections for Members of the Legislative Assembly and Federal Chamber of Deputies in 2010. A performance rate was also presented to evaluate the efficiency of the methodology. Calculations and simulations were carried out using the free R statistical software.
Causal inference based on counterfactuals
Directory of Open Access Journals (Sweden)
Höfler M
2005-09-01
Full Text Available Abstract Background The counterfactual or potential outcome model has become increasingly standard for causal inference in epidemiological and medical studies. Discussion This paper provides an overview on the counterfactual and related approaches. A variety of conceptual as well as practical issues when estimating causal effects are reviewed. These include causal interactions, imperfect experiments, adjustment for confounding, time-varying exposures, competing risks and the probability of causation. It is argued that the counterfactual model of causal effects captures the main aspects of causality in health sciences and relates to many statistical procedures. Summary Counterfactuals are the basis of causal inference in medicine and epidemiology. Nevertheless, the estimation of counterfactual differences pose several difficulties, primarily in observational studies. These problems, however, reflect fundamental barriers only when learning from observations, and this does not invalidate the counterfactual concept.
System Support for Forensic Inference
Gehani, Ashish; Kirchner, Florent; Shankar, Natarajan
Digital evidence is playing an increasingly important role in prosecuting crimes. The reasons are manifold: financially lucrative targets are now connected online, systems are so complex that vulnerabilities abound and strong digital identities are being adopted, making audit trails more useful. If the discoveries of forensic analysts are to hold up to scrutiny in court, they must meet the standard for scientific evidence. Software systems are currently developed without consideration of this fact. This paper argues for the development of a formal framework for constructing “digital artifacts” that can serve as proxies for physical evidence; a system so imbued would facilitate sound digital forensic inference. A case study involving a filesystem augmentation that provides transparent support for forensic inference is described.
Probability biases as Bayesian inference
Directory of Open Access Journals (Sweden)
Andre; C. R. Martins
2006-11-01
Full Text Available In this article, I will show how several observed biases in human probabilistic reasoning can be partially explained as good heuristics for making inferences in an environment where probabilities have uncertainties associated to them. Previous results show that the weight functions and the observed violations of coalescing and stochastic dominance can be understood from a Bayesian point of view. We will review those results and see that Bayesian methods should also be used as part of the explanation behind other known biases. That means that, although the observed errors are still errors under the be understood as adaptations to the solution of real life problems. Heuristics that allow fast evaluations and mimic a Bayesian inference would be an evolutionary advantage, since they would give us an efficient way of making decisions. %XX In that sense, it should be no surprise that humans reason with % probability as it has been observed.
Statistical inference on residual life
Jeong, Jong-Hyeon
2014-01-01
This is a monograph on the concept of residual life, which is an alternative summary measure of time-to-event data, or survival data. The mean residual life has been used for many years under the name of life expectancy, so it is a natural concept for summarizing survival or reliability data. It is also more interpretable than the popular hazard function, especially for communications between patients and physicians regarding the efficacy of a new drug in the medical field. This book reviews existing statistical methods to infer the residual life distribution. The review and comparison includes existing inference methods for mean and median, or quantile, residual life analysis through medical data examples. The concept of the residual life is also extended to competing risks analysis. The targeted audience includes biostatisticians, graduate students, and PhD (bio)statisticians. Knowledge in survival analysis at an introductory graduate level is advisable prior to reading this book.
Nonparametric Bayesian inference in biostatistics
Müller, Peter
2015-01-01
As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
Statistical inference a short course
Panik, Michael J
2012-01-01
A concise, easily accessible introduction to descriptive and inferential techniques Statistical Inference: A Short Course offers a concise presentation of the essentials of basic statistics for readers seeking to acquire a working knowledge of statistical concepts, measures, and procedures. The author conducts tests on the assumption of randomness and normality, provides nonparametric methods when parametric approaches might not work. The book also explores how to determine a confidence interval for a population median while also providing coverage of ratio estimation, randomness, and causal
On Quantum Statistical Inference, II
Barndorff-Nielsen, O. E.; Gill, R. D.; Jupp, P. E.
2003-01-01
Interest in problems of statistical inference connected to measurements of quantum systems has recently increased substantially, in step with dramatic new developments in experimental techniques for studying small quantum systems. Furthermore, theoretical developments in the theory of quantum measurements have brought the basic mathematical framework for the probability calculations much closer to that of classical probability theory. The present paper reviews this field and proposes and inte...
Nonparametric predictive inference in reliability
International Nuclear Information System (INIS)
Coolen, F.P.A.; Coolen-Schrijner, P.; Yan, K.J.
2002-01-01
We introduce a recently developed statistical approach, called nonparametric predictive inference (NPI), to reliability. Bounds for the survival function for a future observation are presented. We illustrate how NPI can deal with right-censored data, and discuss aspects of competing risks. We present possible applications of NPI for Bernoulli data, and we briefly outline applications of NPI for replacement decisions. The emphasis is on introduction and illustration of NPI in reliability contexts, detailed mathematical justifications are presented elsewhere
Personalized recommendation via unbalance full-connectivity inference
Ma, Wenping; Ren, Chen; Wu, Yue; Wang, Shanfeng; Feng, Xiang
2017-10-01
Recommender systems play an important role to help us to find useful information. They are widely used by most e-commerce web sites to push the potential items to individual user according to purchase history. Network-based recommendation algorithms are popular and effective in recommendation, which use two types of elements to represent users and items respectively. In this paper, based on consistence-based inference (CBI) algorithm, we propose a novel network-based algorithm, in which users and items are recognized with no difference. The proposed algorithm also uses information diffusion to find the relationship between users and items. Different from traditional network-based recommendation algorithms, information diffusion initializes from users and items, respectively. Experiments show that the proposed algorithm is effective compared with traditional network-based recommendation algorithms.
Variational inference & deep learning : A new synthesis
Kingma, D.P.
2017-01-01
In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.
Variational inference & deep learning: A new synthesis
Kingma, D.P.
2017-01-01
In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.
Continuous Integrated Invariant Inference, Phase I
National Aeronautics and Space Administration — The proposed project will develop a new technique for invariant inference and embed this and other current invariant inference and checking techniques in an...
Variations on Bayesian Prediction and Inference
2016-05-09
inference 2.2.1 Background There are a number of statistical inference problems that are not generally formulated via a full probability model...problem of inference about an unknown parameter, the Bayesian approach requires a full probability 1. REPORT DATE (DD-MM-YYYY) 4. TITLE AND...the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood which can be an obstacle
Adaptive Inference on General Graphical Models
Acar, Umut A.; Ihler, Alexander T.; Mettu, Ramgopal; Sumer, Ozgur
2012-01-01
Many algorithms and applications involve repeatedly solving variations of the same inference problem; for example we may want to introduce new evidence to the model or perform updates to conditional dependencies. The goal of adaptive inference is to take advantage of what is preserved in the model and perform inference more rapidly than from scratch. In this paper, we describe techniques for adaptive inference on general graphs that support marginal computation and updates to the conditional ...
Network mapping and usage determination
CSIR Research Space (South Africa)
Senekal, FP
2007-07-01
Full Text Available detection. Based on this information, topology determination techniques can be applied to infer network structure from the information. Techniques to visualise the information are discussed. IP geolocation (the ability to associate a geographical coordinate...
Bayesian inference for hybrid discrete-continuous stochastic kinetic models
International Nuclear Information System (INIS)
Sherlock, Chris; Golightly, Andrew; Gillespie, Colin S
2014-01-01
We consider the problem of efficiently performing simulation and inference for stochastic kinetic models. Whilst it is possible to work directly with the resulting Markov jump process (MJP), computational cost can be prohibitive for networks of realistic size and complexity. In this paper, we consider an inference scheme based on a novel hybrid simulator that classifies reactions as either ‘fast’ or ‘slow’ with fast reactions evolving as a continuous Markov process whilst the remaining slow reaction occurrences are modelled through a MJP with time-dependent hazards. A linear noise approximation (LNA) of fast reaction dynamics is employed and slow reaction events are captured by exploiting the ability to solve the stochastic differential equation driving the LNA. This simulation procedure is used as a proposal mechanism inside a particle MCMC scheme, thus allowing Bayesian inference for the model parameters. We apply the scheme to a simple application and compare the output with an existing hybrid approach and also a scheme for performing inference for the underlying discrete stochastic model. (paper)
Sweller, Naomi; Hayes, Brett K
2010-08-01
Three studies examined how task demands that impact on attention to typical or atypical category features shape the category representations formed through classification learning and inference learning. During training categories were learned via exemplar classification or by inferring missing exemplar features. In the latter condition inferences were made about missing typical features alone (typical feature inference) or about both missing typical and atypical features (mixed feature inference). Classification and mixed feature inference led to the incorporation of typical and atypical features into category representations, with both kinds of features influencing inferences about familiar (Experiments 1 and 2) and novel (Experiment 3) test items. Those in the typical inference condition focused primarily on typical features. Together with formal modelling, these results challenge previous accounts that have characterized inference learning as producing a focus on typical category features. The results show that two different kinds of inference learning are possible and that these are subserved by different kinds of category representations.
Generative inference for cultural evolution.
Kandler, Anne; Powell, Adam
2018-04-05
One of the major challenges in cultural evolution is to understand why and how various forms of social learning are used in human populations, both now and in the past. To date, much of the theoretical work on social learning has been done in isolation of data, and consequently many insights focus on revealing the learning processes or the distributions of cultural variants that are expected to have evolved in human populations. In population genetics, recent methodological advances have allowed a greater understanding of the explicit demographic and/or selection mechanisms that underlie observed allele frequency distributions across the globe, and their change through time. In particular, generative frameworks-often using coalescent-based simulation coupled with approximate Bayesian computation (ABC)-have provided robust inferences on the human past, with no reliance on a priori assumptions of equilibrium. Here, we demonstrate the applicability and utility of generative inference approaches to the field of cultural evolution. The framework advocated here uses observed population-level frequency data directly to establish the likely presence or absence of particular hypothesized learning strategies. In this context, we discuss the problem of equifinality and argue that, in the light of sparse cultural data and the multiplicity of possible social learning processes, the exclusion of those processes inconsistent with the observed data might be the most instructive outcome. Finally, we summarize the findings of generative inference approaches applied to a number of case studies.This article is part of the theme issue 'Bridging cultural gaps: interdisciplinary studies in human cultural evolution'. © 2018 The Author(s).
sick: The Spectroscopic Inference Crank
Casey, Andrew R.
2016-03-01
There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal
SICK: THE SPECTROSCOPIC INFERENCE CRANK
Energy Technology Data Exchange (ETDEWEB)
Casey, Andrew R., E-mail: arc@ast.cam.ac.uk [Institute of Astronomy, University of Cambridge, Madingley Road, Cambdridge, CB3 0HA (United Kingdom)
2016-03-15
There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
2013-01-01
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
SICK: THE SPECTROSCOPIC INFERENCE CRANK
International Nuclear Information System (INIS)
Casey, Andrew R.
2016-01-01
There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal
Visual recognition and inference using dynamic overcomplete sparse learning.
Murray, Joseph F; Kreutz-Delgado, Kenneth
2007-09-01
We present a hierarchical architecture and learning algorithm for visual recognition and other visual inference tasks such as imagination, reconstruction of occluded images, and expectation-driven segmentation. Using properties of biological vision for guidance, we posit a stochastic generative world model and from it develop a simplified world model (SWM) based on a tractable variational approximation that is designed to enforce sparse coding. Recent developments in computational methods for learning overcomplete representations (Lewicki & Sejnowski, 2000; Teh, Welling, Osindero, & Hinton, 2003) suggest that overcompleteness can be useful for visual tasks, and we use an overcomplete dictionary learning algorithm (Kreutz-Delgado, et al., 2003) as a preprocessing stage to produce accurate, sparse codings of images. Inference is performed by constructing a dynamic multilayer network with feedforward, feedback, and lateral connections, which is trained to approximate the SWM. Learning is done with a variant of the back-propagation-through-time algorithm, which encourages convergence to desired states within a fixed number of iterations. Vision tasks require large networks, and to make learning efficient, we take advantage of the sparsity of each layer to update only a small subset of elements in a large weight matrix at each iteration. Experiments on a set of rotated objects demonstrate various types of visual inference and show that increasing the degree of overcompleteness improves recognition performance in difficult scenes with occluded objects in clutter.
Subjective randomness as statistical inference.
Griffiths, Thomas L; Daniels, Dylan; Austerweil, Joseph L; Tenenbaum, Joshua B
2018-06-01
Some events seem more random than others. For example, when tossing a coin, a sequence of eight heads in a row does not seem very random. Where do these intuitions about randomness come from? We argue that subjective randomness can be understood as the result of a statistical inference assessing the evidence that an event provides for having been produced by a random generating process. We show how this account provides a link to previous work relating randomness to algorithmic complexity, in which random events are those that cannot be described by short computer programs. Algorithmic complexity is both incomputable and too general to capture the regularities that people can recognize, but viewing randomness as statistical inference provides two paths to addressing these problems: considering regularities generated by simpler computing machines, and restricting the set of probability distributions that characterize regularity. Building on previous work exploring these different routes to a more restricted notion of randomness, we define strong quantitative models of human randomness judgments that apply not just to binary sequences - which have been the focus of much of the previous work on subjective randomness - but also to binary matrices and spatial clustering. Copyright © 2018 Elsevier Inc. All rights reserved.
IMNN: Information Maximizing Neural Networks
Charnock, Tom; Lavaux, Guilhem; Wandelt, Benjamin D.
2018-04-01
This software trains artificial neural networks to find non-linear functionals of data that maximize Fisher information: information maximizing neural networks (IMNNs). As compressing large data sets vastly simplifies both frequentist and Bayesian inference, important information may be inadvertently missed. Likelihood-free inference based on automatically derived IMNN summaries produces summaries that are good approximations to sufficient statistics. IMNNs are robustly capable of automatically finding optimal, non-linear summaries of the data even in cases where linear compression fails: inferring the variance of Gaussian signal in the presence of noise, inferring cosmological parameters from mock simulations of the Lyman-α forest in quasar spectra, and inferring frequency-domain parameters from LISA-like detections of gravitational waveforms. In this final case, the IMNN summary outperforms linear data compression by avoiding the introduction of spurious likelihood maxima.
Inference and learning in sparse systems with multiple states
International Nuclear Information System (INIS)
Braunstein, A.; Ramezanpour, A.; Zhang, P.; Zecchina, R.
2011-01-01
We discuss how inference can be performed when data are sampled from the nonergodic phase of systems with multiple attractors. We take as a model system the finite connectivity Hopfield model in the memory phase and suggest a cavity method approach to reconstruct the couplings when the data are separately sampled from few attractor states. We also show how the inference results can be converted into a learning protocol for neural networks in which patterns are presented through weak external fields. The protocol is simple and fully local, and is able to store patterns with a finite overlap with the input patterns without ever reaching a spin-glass phase where all memories are lost.
Scalable inference for stochastic block models
Peng, Chengbin
2017-12-08
Community detection in graphs is widely used in social and biological networks, and the stochastic block model is a powerful probabilistic tool for describing graphs with community structures. However, in the era of "big data," traditional inference algorithms for such a model are increasingly limited due to their high time complexity and poor scalability. In this paper, we propose a multi-stage maximum likelihood approach to recover the latent parameters of the stochastic block model, in time linear with respect to the number of edges. We also propose a parallel algorithm based on message passing. Our algorithm can overlap communication and computation, providing speedup without compromising accuracy as the number of processors grows. For example, to process a real-world graph with about 1.3 million nodes and 10 million edges, our algorithm requires about 6 seconds on 64 cores of a contemporary commodity Linux cluster. Experiments demonstrate that the algorithm can produce high quality results on both benchmark and real-world graphs. An example of finding more meaningful communities is illustrated consequently in comparison with a popular modularity maximization algorithm.
Inferring Molecular Processes Heterogeneity from Transcriptional Data.
Gogolewski, Krzysztof; Wronowska, Weronika; Lech, Agnieszka; Lesyng, Bogdan; Gambin, Anna
2017-01-01
RNA microarrays and RNA-seq are nowadays standard technologies to study the transcriptional activity of cells. Most studies focus on tracking transcriptional changes caused by specific experimental conditions. Information referring to genes up- and downregulation is evaluated analyzing the behaviour of relatively large population of cells by averaging its properties. However, even assuming perfect sample homogeneity, different subpopulations of cells can exhibit diverse transcriptomic profiles, as they may follow different regulatory/signaling pathways. The purpose of this study is to provide a novel methodological scheme to account for possible internal, functional heterogeneity in homogeneous cell lines, including cancer ones. We propose a novel computational method to infer the proportion between subpopulations of cells that manifest various functional behaviour in a given sample. Our method was validated using two datasets from RNA microarray experiments. Both experiments aimed to examine cell viability in specific experimental conditions. The presented methodology can be easily extended to RNA-seq data as well as other molecular processes. Moreover, it complements standard tools to indicate most important networks from transcriptomic data and in particular could be useful in the analysis of cancer cell lines affected by biologically active compounds or drugs.
A grammar inference approach for predicting kinase specific phosphorylation sites.
Datta, Sutapa; Mukhopadhyay, Subhasis
2015-01-01
Kinase mediated phosphorylation site detection is the key mechanism of post translational mechanism that plays an important role in regulating various cellular processes and phenotypes. Many diseases, like cancer are related with the signaling defects which are associated with protein phosphorylation. Characterizing the protein kinases and their substrates enhances our ability to understand the mechanism of protein phosphorylation and extends our knowledge of signaling network; thereby helping us to treat such diseases. Experimental methods for predicting phosphorylation sites are labour intensive and expensive. Also, manifold increase of protein sequences in the databanks over the years necessitates the improvement of high speed and accurate computational methods for predicting phosphorylation sites in protein sequences. Till date, a number of computational methods have been proposed by various researchers in predicting phosphorylation sites, but there remains much scope of improvement. In this communication, we present a simple and novel method based on Grammatical Inference (GI) approach to automate the prediction of kinase specific phosphorylation sites. In this regard, we have used a popular GI algorithm Alergia to infer Deterministic Stochastic Finite State Automata (DSFA) which equally represents the regular grammar corresponding to the phosphorylation sites. Extensive experiments on several datasets generated by us reveal that, our inferred grammar successfully predicts phosphorylation sites in a kinase specific manner. It performs significantly better when compared with the other existing phosphorylation site prediction methods. We have also compared our inferred DSFA with two other GI inference algorithms. The DSFA generated by our method performs superior which indicates that our method is robust and has a potential for predicting the phosphorylation sites in a kinase specific manner.
A Grammar Inference Approach for Predicting Kinase Specific Phosphorylation Sites
Datta, Sutapa; Mukhopadhyay, Subhasis
2015-01-01
Kinase mediated phosphorylation site detection is the key mechanism of post translational mechanism that plays an important role in regulating various cellular processes and phenotypes. Many diseases, like cancer are related with the signaling defects which are associated with protein phosphorylation. Characterizing the protein kinases and their substrates enhances our ability to understand the mechanism of protein phosphorylation and extends our knowledge of signaling network; thereby helping us to treat such diseases. Experimental methods for predicting phosphorylation sites are labour intensive and expensive. Also, manifold increase of protein sequences in the databanks over the years necessitates the improvement of high speed and accurate computational methods for predicting phosphorylation sites in protein sequences. Till date, a number of computational methods have been proposed by various researchers in predicting phosphorylation sites, but there remains much scope of improvement. In this communication, we present a simple and novel method based on Grammatical Inference (GI) approach to automate the prediction of kinase specific phosphorylation sites. In this regard, we have used a popular GI algorithm Alergia to infer Deterministic Stochastic Finite State Automata (DSFA) which equally represents the regular grammar corresponding to the phosphorylation sites. Extensive experiments on several datasets generated by us reveal that, our inferred grammar successfully predicts phosphorylation sites in a kinase specific manner. It performs significantly better when compared with the other existing phosphorylation site prediction methods. We have also compared our inferred DSFA with two other GI inference algorithms. The DSFA generated by our method performs superior which indicates that our method is robust and has a potential for predicting the phosphorylation sites in a kinase specific manner. PMID:25886273
Lower complexity bounds for lifted inference
DEFF Research Database (Denmark)
Jaeger, Manfred
2015-01-01
instances of the model. Numerous approaches for such “lifted inference” techniques have been proposed. While it has been demonstrated that these techniques will lead to significantly more efficient inference on some specific models, there are only very recent and still quite restricted results that show...... the feasibility of lifted inference on certain syntactically defined classes of models. Lower complexity bounds that imply some limitations for the feasibility of lifted inference on more expressive model classes were established earlier in Jaeger (2000; Jaeger, M. 2000. On the complexity of inference about...... that under the assumption that NETIME≠ETIME, there is no polynomial lifted inference algorithm for knowledge bases of weighted, quantifier-, and function-free formulas. Further strengthening earlier results, this is also shown to hold for approximate inference and for knowledge bases not containing...
Statistical inference for financial engineering
Taniguchi, Masanobu; Ogata, Hiroaki; Taniai, Hiroyuki
2014-01-01
This monograph provides the fundamentals of statistical inference for financial engineering and covers some selected methods suitable for analyzing financial time series data. In order to describe the actual financial data, various stochastic processes, e.g. non-Gaussian linear processes, non-linear processes, long-memory processes, locally stationary processes etc. are introduced and their optimal estimation is considered as well. This book also includes several statistical approaches, e.g., discriminant analysis, the empirical likelihood method, control variate method, quantile regression, realized volatility etc., which have been recently developed and are considered to be powerful tools for analyzing the financial data, establishing a new bridge between time series and financial engineering. This book is well suited as a professional reference book on finance, statistics and statistical financial engineering. Readers are expected to have an undergraduate-level knowledge of statistics.
Type inference for correspondence types
DEFF Research Database (Denmark)
Hüttel, Hans; Gordon, Andy; Hansen, Rene Rydhof
2009-01-01
We present a correspondence type/effect system for authenticity in a π-calculus with polarized channels, dependent pair types and effect terms and show how one may, given a process P and an a priori type environment E, generate constraints that are formulae in the Alternating Least Fixed......-Point (ALFP) logic. We then show how a reasonable model of the generated constraints yields a type/effect assignment such that P becomes well-typed with respect to E if and only if this is possible. The formulae generated satisfy a finite model property; a system of constraints is satisfiable if and only...... if it has a finite model. As a consequence, we obtain the result that type/effect inference in our system is polynomial-time decidable....
Causal inference in public health.
Glass, Thomas A; Goodman, Steven N; Hernán, Miguel A; Samet, Jonathan M
2013-01-01
Causal inference has a central role in public health; the determination that an association is causal indicates the possibility for intervention. We review and comment on the long-used guidelines for interpreting evidence as supporting a causal association and contrast them with the potential outcomes framework that encourages thinking in terms of causes that are interventions. We argue that in public health this framework is more suitable, providing an estimate of an action's consequences rather than the less precise notion of a risk factor's causal effect. A variety of modern statistical methods adopt this approach. When an intervention cannot be specified, causal relations can still exist, but how to intervene to change the outcome will be unclear. In application, the often-complex structure of causal processes needs to be acknowledged and appropriate data collected to study them. These newer approaches need to be brought to bear on the increasingly complex public health challenges of our globalized world.
Extracting information from multiplex networks
Iacovacci, Jacopo; Bianconi, Ginestra
2016-06-01
Multiplex networks are generalized network structures that are able to describe networks in which the same set of nodes are connected by links that have different connotations. Multiplex networks are ubiquitous since they describe social, financial, engineering, and biological networks as well. Extending our ability to analyze complex networks to multiplex network structures increases greatly the level of information that is possible to extract from big data. For these reasons, characterizing the centrality of nodes in multiplex networks and finding new ways to solve challenging inference problems defined on multiplex networks are fundamental questions of network science. In this paper, we discuss the relevance of the Multiplex PageRank algorithm for measuring the centrality of nodes in multilayer networks and we characterize the utility of the recently introduced indicator function Θ ˜ S for describing their mesoscale organization and community structure. As working examples for studying these measures, we consider three multiplex network datasets coming for social science.
Inference Attacks and Control on Database Structures
Directory of Open Access Journals (Sweden)
Muhamed Turkanovic
2015-02-01
Full Text Available Today’s databases store information with sensitivity levels that range from public to highly sensitive, hence ensuring confidentiality can be highly important, but also requires costly control. This paper focuses on the inference problem on different database structures. It presents possible treats on privacy with relation to the inference, and control methods for mitigating these treats. The paper shows that using only access control, without any inference control is inadequate, since these models are unable to protect against indirect data access. Furthermore, it covers new inference problems which rise from the dimensions of new technologies like XML, semantics, etc.
LAIT: a local ancestry inference toolkit.
Hui, Daniel; Fang, Zhou; Lin, Jerome; Duan, Qing; Li, Yun; Hu, Ming; Chen, Wei
2017-09-06
Inferring local ancestry in individuals of mixed ancestry has many applications, most notably in identifying disease-susceptible loci that vary among different ethnic groups. Many software packages are available for inferring local ancestry in admixed individuals. However, most of these existing software packages require specific formatted input files and generate output files in various types, yielding practical inconvenience. We developed a tool set, Local Ancestry Inference Toolkit (LAIT), which can convert standardized files into software-specific input file formats as well as standardize and summarize inference results for four popular local ancestry inference software: HAPMIX, LAMP, LAMP-LD, and ELAI. We tested LAIT using both simulated and real data sets and demonstrated that LAIT provides convenience to run multiple local ancestry inference software. In addition, we evaluated the performance of local ancestry software among different supported software packages, mainly focusing on inference accuracy and computational resources used. We provided a toolkit to facilitate the use of local ancestry inference software, especially for users with limited bioinformatics background.
Forward and backward inference in spatial cognition.
Directory of Open Access Journals (Sweden)
Will D Penny
Full Text Available This paper shows that the various computations underlying spatial cognition can be implemented using statistical inference in a single probabilistic model. Inference is implemented using a common set of 'lower-level' computations involving forward and backward inference over time. For example, to estimate where you are in a known environment, forward inference is used to optimally combine location estimates from path integration with those from sensory input. To decide which way to turn to reach a goal, forward inference is used to compute the likelihood of reaching that goal under each option. To work out which environment you are in, forward inference is used to compute the likelihood of sensory observations under the different hypotheses. For reaching sensory goals that require a chaining together of decisions, forward inference can be used to compute a state trajectory that will lead to that goal, and backward inference to refine the route and estimate control signals that produce the required trajectory. We propose that these computations are reflected in recent findings of pattern replay in the mammalian brain. Specifically, that theta sequences reflect decision making, theta flickering reflects model selection, and remote replay reflects route and motor planning. We also propose a mapping of the above computational processes onto lateral and medial entorhinal cortex and hippocampus.
Generative Inferences Based on Learned Relations
Chen, Dawn; Lu, Hongjing; Holyoak, Keith J.
2017-01-01
A key property of relational representations is their "generativity": From partial descriptions of relations between entities, additional inferences can be drawn about other entities. A major theoretical challenge is to demonstrate how the capacity to make generative inferences could arise as a result of learning relations from…
Inference in models with adaptive learning
Chevillon, G.; Massmann, M.; Mavroeidis, S.
2010-01-01
Identification of structural parameters in models with adaptive learning can be weak, causing standard inference procedures to become unreliable. Learning also induces persistent dynamics, and this makes the distribution of estimators and test statistics non-standard. Valid inference can be
Fiducial inference - A Neyman-Pearson interpretation
Salome, D; VonderLinden, W; Dose,; Fischer, R; Preuss, R
1999-01-01
Fisher's fiducial argument is a tool for deriving inferences in the form of a probability distribution on the parameter space, not based on Bayes's Theorem. Lindley established that in exceptional situations fiducial inferences coincide with posterior distributions; in the other situations fiducial
Uncertainty in prediction and in inference
Hilgevoord, J.; Uffink, J.
1991-01-01
The concepts of uncertainty in prediction and inference are introduced and illustrated using the diffraction of light as an example. The close re-lationship between the concepts of uncertainty in inference and resolving power is noted. A general quantitative measure of uncertainty in
Causal inference in economics and marketing.
Varian, Hal R
2016-07-05
This is an elementary introduction to causal inference in economics written for readers familiar with machine learning methods. The critical step in any causal analysis is estimating the counterfactual-a prediction of what would have happened in the absence of the treatment. The powerful techniques used in machine learning may be useful for developing better estimates of the counterfactual, potentially improving causal inference.
Nonparametric predictive inference in statistical process control
Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.
2000-01-01
New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities hased on
The Impact of Disablers on Predictive Inference
Cummins, Denise Dellarosa
2014-01-01
People consider alternative causes when deciding whether a cause is responsible for an effect (diagnostic inference) but appear to neglect them when deciding whether an effect will occur (predictive inference). Five experiments were conducted to test a 2-part explanation of this phenomenon: namely, (a) that people interpret standard predictive…
Inferring Pairwise Interactions from Biological Data Using Maximum-Entropy Probability Models.
Directory of Open Access Journals (Sweden)
Richard R Stein
2015-07-01
Full Text Available Maximum entropy-based inference methods have been successfully used to infer direct interactions from biological datasets such as gene expression data or sequence ensembles. Here, we review undirected pairwise maximum-entropy probability models in two categories of data types, those with continuous and categorical random variables. As a concrete example, we present recently developed inference methods from the field of protein contact prediction and show that a basic set of assumptions leads to similar solution strategies for inferring the model parameters in both variable types. These parameters reflect interactive couplings between observables, which can be used to predict global properties of the biological system. Such methods are applicable to the important problems of protein 3-D structure prediction and association of gene-gene networks, and they enable potential applications to the analysis of gene alteration patterns and to protein design.
Maua, Denis Deratani; Cozman, Fabio Gagli; Conaty, Diarmaid; de Campos, Cassio P.
2017-01-01
Sum-product networks are a relatively new and increasingly popular class of (precise) probabilistic graphical models that allow for marginal inference with polynomial effort. As with other probabilistic models, sum-product networks are often learned from data and used to perform classification.
Extended likelihood inference in reliability
International Nuclear Information System (INIS)
Martz, H.F. Jr.; Beckman, R.J.; Waller, R.A.
1978-10-01
Extended likelihood methods of inference are developed in which subjective information in the form of a prior distribution is combined with sampling results by means of an extended likelihood function. The extended likelihood function is standardized for use in obtaining extended likelihood intervals. Extended likelihood intervals are derived for the mean of a normal distribution with known variance, the failure-rate of an exponential distribution, and the parameter of a binomial distribution. Extended second-order likelihood methods are developed and used to solve several prediction problems associated with the exponential and binomial distributions. In particular, such quantities as the next failure-time, the number of failures in a given time period, and the time required to observe a given number of failures are predicted for the exponential model with a gamma prior distribution on the failure-rate. In addition, six types of life testing experiments are considered. For the binomial model with a beta prior distribution on the probability of nonsurvival, methods are obtained for predicting the number of nonsurvivors in a given sample size and for predicting the required sample size for observing a specified number of nonsurvivors. Examples illustrate each of the methods developed. Finally, comparisons are made with Bayesian intervals in those cases where these are known to exist
Reinforcement learning or active inference?
Friston, Karl J; Daunizeau, Jean; Kiebel, Stefan J
2009-07-29
This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.
Reinforcement learning or active inference?
Directory of Open Access Journals (Sweden)
Karl J Friston
2009-07-01
Full Text Available This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.
Active inference and epistemic value.
Friston, Karl; Rigoli, Francesco; Ognibene, Dimitri; Mathys, Christoph; Fitzgerald, Thomas; Pezzulo, Giovanni
2015-01-01
We offer a formal treatment of choice behavior based on the premise that agents minimize the expected free energy of future outcomes. Crucially, the negative free energy or quality of a policy can be decomposed into extrinsic and epistemic (or intrinsic) value. Minimizing expected free energy is therefore equivalent to maximizing extrinsic value or expected utility (defined in terms of prior preferences or goals), while maximizing information gain or intrinsic value (or reducing uncertainty about the causes of valuable outcomes). The resulting scheme resolves the exploration-exploitation dilemma: Epistemic value is maximized until there is no further information gain, after which exploitation is assured through maximization of extrinsic value. This is formally consistent with the Infomax principle, generalizing formulations of active vision based upon salience (Bayesian surprise) and optimal decisions based on expected utility and risk-sensitive (Kullback-Leibler) control. Furthermore, as with previous active inference formulations of discrete (Markovian) problems, ad hoc softmax parameters become the expected (Bayes-optimal) precision of beliefs about, or confidence in, policies. This article focuses on the basic theory, illustrating the ideas with simulations. A key aspect of these simulations is the similarity between precision updates and dopaminergic discharges observed in conditioning paradigms.
Ancient Biomolecules and Evolutionary Inference.
Cappellini, Enrico; Prohaska, Ana; Racimo, Fernando; Welker, Frido; Pedersen, Mikkel Winther; Allentoft, Morten E; de Barros Damgaard, Peter; Gutenbrunner, Petra; Dunne, Julie; Hammann, Simon; Roffet-Salque, Mélanie; Ilardo, Melissa; Moreno-Mayar, J Víctor; Wang, Yucheng; Sikora, Martin; Vinner, Lasse; Cox, Jürgen; Evershed, Richard P; Willerslev, Eske
2018-04-25
Over the last decade, studies of ancient biomolecules-particularly ancient DNA, proteins, and lipids-have revolutionized our understanding of evolutionary history. Though initially fraught with many challenges, the field now stands on firm foundations. Researchers now successfully retrieve nucleotide and amino acid sequences, as well as lipid signatures, from progressively older samples, originating from geographic areas and depositional environments that, until recently, were regarded as hostile to long-term preservation of biomolecules. Sampling frequencies and the spatial and temporal scope of studies have also increased markedly, and with them the size and quality of the data sets generated. This progress has been made possible by continuous technical innovations in analytical methods, enhanced criteria for the selection of ancient samples, integrated experimental methods, and advanced computational approaches. Here, we discuss the history and current state of ancient biomolecule research, its applications to evolutionary inference, and future directions for this young and exciting field. Expected final online publication date for the Annual Review of Biochemistry Volume 87 is June 20, 2018. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Inferring Social Influence of Anti-Tobacco Mass Media Campaign.
Zhan, Qianyi; Zhang, Jiawei; Yu, Philip S; Emery, Sherry; Xie, Junyuan
2017-07-01
Anti-tobacco mass media campaigns are designed to influence tobacco users. It has been proved that campaigns will produce users' changes in awareness, knowledge, and attitudes, and also produce meaningful behavior change of audience. Anti-smoking television advertising is the most important part in the campaign. Meanwhile, nowadays, successful online social networks are creating new media environment, however, little is known about the relation between social conversations and anti-tobacco campaigns. This paper aims to infer social influence of these campaigns, and the problem is formally referred to as the Social Influence inference of anti-Tobacco mass mEdia campaigns (Site) problem. To address the Site problem, a novel influence inference framework, TV advertising social influence estimation (Asie), is proposed based on our analysis of two real anti-tobacco campaigns. Asie divides audience attitudes toward TV ads into three distinct stages: 1) cognitive; 2) affective; and 3) conative. Audience online reactions at each of these three stages are depicted by Asie with specific probabilistic models based on the synergistic influences from both online social friends and offline TV ads. Extensive experiments demonstrate the effectiveness of Asie.
Inference algorithms and learning theory for Bayesian sparse factor analysis
International Nuclear Information System (INIS)
Rattray, Magnus; Sharp, Kevin; Stegle, Oliver; Winn, John
2009-01-01
Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
Inference algorithms and learning theory for Bayesian sparse factor analysis
Energy Technology Data Exchange (ETDEWEB)
Rattray, Magnus; Sharp, Kevin [School of Computer Science, University of Manchester, Manchester M13 9PL (United Kingdom); Stegle, Oliver [Max-Planck-Institute for Biological Cybernetics, Tuebingen (Germany); Winn, John, E-mail: magnus.rattray@manchester.ac.u [Microsoft Research Cambridge, Roger Needham Building, Cambridge, CB3 0FB (United Kingdom)
2009-12-01
Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
Evidence Accumulation and Change Rate Inference in Dynamic Environments.
Radillo, Adrian E; Veliz-Cuba, Alan; Josić, Krešimir; Kilpatrick, Zachary P
2017-06-01
In a constantly changing world, animals must account for environmental volatility when making decisions. To appropriately discount older, irrelevant information, they need to learn the rate at which the environment changes. We develop an ideal observer model capable of inferring the present state of the environment along with its rate of change. Key to this computation is an update of the posterior probability of all possible change point counts. This computation can be challenging, as the number of possibilities grows rapidly with time. However, we show how the computations can be simplified in the continuum limit by a moment closure approximation. The resulting low-dimensional system can be used to infer the environmental state and change rate with accuracy comparable to the ideal observer. The approximate computations can be performed by a neural network model via a rate-correlation-based plasticity rule. We thus show how optimal observers accumulate evidence in changing environments and map this computation to reduced models that perform inference using plausible neural mechanisms.
Bootstrap-based Support of HGT Inferred by Maximum Parsimony
Directory of Open Access Journals (Sweden)
Nakhleh Luay
2010-05-01
Full Text Available Abstract Background Maximum parsimony is one of the most commonly used criteria for reconstructing phylogenetic trees. Recently, Nakhleh and co-workers extended this criterion to enable reconstruction of phylogenetic networks, and demonstrated its application to detecting reticulate evolutionary relationships. However, one of the major problems with this extension has been that it favors more complex evolutionary relationships over simpler ones, thus having the potential for overestimating the amount of reticulation in the data. An ad hoc solution to this problem that has been used entails inspecting the improvement in the parsimony length as more reticulation events are added to the model, and stopping when the improvement is below a certain threshold. Results In this paper, we address this problem in a more systematic way, by proposing a nonparametric bootstrap-based measure of support of inferred reticulation events, and using it to determine the number of those events, as well as their placements. A number of samples is generated from the given sequence alignment, and reticulation events are inferred based on each sample. Finally, the support of each reticulation event is quantified based on the inferences made over all samples. Conclusions We have implemented our method in the NEPAL software tool (available publicly at http://bioinfo.cs.rice.edu/, and studied its performance on both biological and simulated data sets. While our studies show very promising results, they also highlight issues that are inherently challenging when applying the maximum parsimony criterion to detect reticulate evolution.
Bootstrap-based support of HGT inferred by maximum parsimony.
Park, Hyun Jung; Jin, Guohua; Nakhleh, Luay
2010-05-05
Maximum parsimony is one of the most commonly used criteria for reconstructing phylogenetic trees. Recently, Nakhleh and co-workers extended this criterion to enable reconstruction of phylogenetic networks, and demonstrated its application to detecting reticulate evolutionary relationships. However, one of the major problems with this extension has been that it favors more complex evolutionary relationships over simpler ones, thus having the potential for overestimating the amount of reticulation in the data. An ad hoc solution to this problem that has been used entails inspecting the improvement in the parsimony length as more reticulation events are added to the model, and stopping when the improvement is below a certain threshold. In this paper, we address this problem in a more systematic way, by proposing a nonparametric bootstrap-based measure of support of inferred reticulation events, and using it to determine the number of those events, as well as their placements. A number of samples is generated from the given sequence alignment, and reticulation events are inferred based on each sample. Finally, the support of each reticulation event is quantified based on the inferences made over all samples. We have implemented our method in the NEPAL software tool (available publicly at http://bioinfo.cs.rice.edu/), and studied its performance on both biological and simulated data sets. While our studies show very promising results, they also highlight issues that are inherently challenging when applying the maximum parsimony criterion to detect reticulate evolution.
Bayesian Inference Methods for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand
2013-01-01
This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development...... of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation...... analysis of the complex prior representation, where we show that the ability to induce sparse estimates of a given prior heavily depends on the inference method used and, interestingly, whether real or complex variables are inferred. We also show that the Bayesian estimators derived from the proposed...
EI: A Program for Ecological Inference
Directory of Open Access Journals (Sweden)
Gary King
2004-09-01
Full Text Available The program EI provides a method of inferring individual behavior from aggregate data. It implements the statistical procedures, diagnostics, and graphics from the book A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data (King 1997. Ecological inference, as traditionally defined, is the process of using aggregate (i.e., "ecological" data to infer discrete individual-level relationships of interest when individual-level data are not available. Ecological inferences are required in political science research when individual-level surveys are unavailable (e.g., local or comparative electoral politics, unreliable (racial politics, insufficient (political geography, or infeasible (political history. They are also required in numerous areas of ma jor significance in public policy (e.g., for applying the Voting Rights Act and other academic disciplines ranging from epidemiology and marketing to sociology and quantitative history.
Gao, F.
2017-01-01
The dissertation consists of research in three subjects in two themes—Bayes and networks: The first studies the posterior contraction rates for the Dirichlet-Laplace mixtures in a deconvolution setting (Chapter 1). The second subject regards the statistical inference in preferential attachment
Dam, van J.C.J.; Schaap, P.J.; Martins dos Santos, V.A.P.; Suarez Diez, M.
2014-01-01
Background: Different methods have been developed to infer regulatory networks from heterogeneous omics datasets and to construct co-expression networks. Each algorithm produces different networks and efforts have been devoted to automatically integrate them into consensus sets. However each
VINE: A Variational Inference -Based Bayesian Neural Network Engine
2018-01-01
word to be written back to the seed RAM. The results generated by every four LF-updaters are selected sequentially by 4 outputs with different...suitable for hardware implementation. Sequential mini-batch gradient descent (SMBGD) optimization is an update rule we propose to integrate with EASI...the layer computation block can perform forward propagation from at most eight input neurons to at most eight output neurons simultaneously . If a
Statistical Inferences from the Topology of Complex Networks
2016-10-04
stable, does not lose any information, has continuous and discrete versions, and obeys a strong law of large numbers and a central limit theorem. The...paper (with J.A. Scott) “Categorification of persistent homology” [7] in the journal Discrete and Computational Geome- try and the paper “Metrics for...Generalized Persistence Modules” (with J.A. Scott and V. de Silva) in the journal Foundations of Computational Math - ematics [5]. These papers develop
Inferring Network Controls from Topology Using the Chomp Database
2015-12-03
the-art methods in gene expres- sion analysis. We call this new method SW1PerS, which stands for Sliding Windows and 1-dimensional Persistence Scoring...protein compressibility. Jpn. J. Ind. Appl. Math . 32, 1 (2015), 1–17. [5] Goullet, A., Harker, S., Kalies, W., Kasti, D., and Mischaikow, K. Ef- ficient...computing homology of complexes and maps. Found. Comput. Math . 14, 1 (2014), 151–184. [9] Kalies, W. D., Mischaikow, K., and Vandervorst, R. C. Lattice
Detecting Hierarchical Structure in Networks
DEFF Research Database (Denmark)
Herlau, Tue; Mørup, Morten; Schmidt, Mikkel Nørgaard
2012-01-01
Many real-world networks exhibit hierarchical organization. Previous models of hierarchies within relational data has focused on binary trees; however, for many networks it is unknown whether there is hierarchical structure, and if there is, a binary tree might not account well for it. We propose...... a generative Bayesian model that is able to infer whether hierarchies are present or not from a hypothesis space encompassing all types of hierarchical tree structures. For efficient inference we propose a collapsed Gibbs sampling procedure that jointly infers a partition and its hierarchical structure....... On synthetic and real data we demonstrate that our model can detect hierarchical structure leading to better link-prediction than competing models. Our model can be used to detect if a network exhibits hierarchical structure, thereby leading to a better comprehension and statistical account the network....
Directory of Open Access Journals (Sweden)
Xiufang Lin
2016-08-01
Full Text Available Magnetorheological dampers have become prominent semi-active control devices for vibration mitigation of structures which are subjected to severe loads. However, the damping force cannot be controlled directly due to the inherent nonlinear characteristics of the magnetorheological dampers. Therefore, for fully exploiting the capabilities of the magnetorheological dampers, one of the challenging aspects is to develop an accurate inverse model which can appropriately predict the input voltage to control the damping force. In this article, a hybrid modeling strategy combining shuffled frog-leaping algorithm and adaptive-network-based fuzzy inference system is proposed to model the inverse dynamic characteristics of the magnetorheological dampers for improving the modeling accuracy. The shuffled frog-leaping algorithm is employed to optimize the premise parameters of the adaptive-network-based fuzzy inference system while the consequent parameters are tuned by a least square estimation method, here known as shuffled frog-leaping algorithm-based adaptive-network-based fuzzy inference system approach. To evaluate the effectiveness of the proposed approach, the inverse modeling results based on the shuffled frog-leaping algorithm-based adaptive-network-based fuzzy inference system approach are compared with those based on the adaptive-network-based fuzzy inference system and genetic algorithm–based adaptive-network-based fuzzy inference system approaches. Analysis of variance test is carried out to statistically compare the performance of the proposed methods and the results demonstrate that the shuffled frog-leaping algorithm-based adaptive-network-based fuzzy inference system strategy outperforms the other two methods in terms of modeling (training accuracy and checking accuracy.
Directory of Open Access Journals (Sweden)
Wang Woei-Fuh
2008-03-01
Full Text Available Abstract Background With the abundant information produced by microarray technology, various approaches have been proposed to infer transcriptional regulatory networks. However, few approaches have studied subtle and indirect interaction such as genetic compensation, the existence of which is widely recognized although its mechanism has yet to be clarified. Furthermore, when inferring gene networks most models include only observed variables whereas latent factors, such as proteins and mRNA degradation that are not measured by microarrays, do participate in networks in reality. Results Motivated by inferring transcriptional compensation (TC interactions in yeast, a stepwise structural equation modeling algorithm (SSEM is developed. In addition to observed variables, SSEM also incorporates hidden variables to capture interactions (or regulations from latent factors. Simulated gene networks are used to determine with which of six possible model selection criteria (MSC SSEM works best. SSEM with Bayesian information criterion (BIC results in the highest true positive rates, the largest percentage of correctly predicted interactions from all existing interactions, and the highest true negative (non-existing interactions rates. Next, we apply SSEM using real microarray data to infer TC interactions among (1 small groups of genes that are synthetic sick or lethal (SSL to SGS1, and (2 a group of SSL pairs of 51 yeast genes involved in DNA synthesis and repair that are of interest. For (1, SSEM with BIC is shown to outperform three Bayesian network algorithms and a multivariate autoregressive model, checked against the results of qRT-PCR experiments. The predictions for (2 are shown to coincide with several known pathways of Sgs1 and its partners that are involved in DNA replication, recombination and repair. In addition, experimentally testable interactions of Rad27 are predicted. Conclusion SSEM is a useful tool for inferring genetic networks, and the
Role of Temporal Diversity in Inferring Social Ties Based on Spatio-Temporal Data
Desai, Deshana; Nisar, Harsh; Bhardawaj, Rishab
2016-01-01
The last two decades have seen a tremendous surge in research on social networks and their implications. The studies includes inferring social relationships, which in turn have been used for target advertising, recommendations, search customization etc. However, the offline experiences of human, the conversations with people and face-to-face interactions that govern our lives interactions have received lesser attention. We introduce DAIICT Spatio-Temporal Network (DSSN), a spatiotemporal data...
The effect of nonstationarity on models inferred from neural data
International Nuclear Information System (INIS)
Tyrcha, Joanna; Roudi, Yasser; Marsili, Matteo; Hertz, John
2013-01-01
Neurons subject to a common nonstationary input may exhibit a correlated firing behavior. Correlations in the statistics of neural spike trains also arise as the effect of interaction between neurons. Here we show that these two situations can be distinguished with machine learning techniques, provided that the data are rich enough. In order to do this, we study the problem of inferring a kinetic Ising model, stationary or nonstationary, from the available data. We apply the inference procedure to two data sets: one from salamander retinal ganglion cells and the other from a realistic computational cortical network model. We show that many aspects of the concerted activity of the salamander retinal neurons can be traced simply to the external input. A model of non-interacting neurons subject to a nonstationary external field outperforms a model with stationary input with couplings between neurons, even accounting for the differences in the number of model parameters. When couplings are added to the nonstationary model, for the retinal data, little is gained: the inferred couplings are generally not significant. Likewise, the distribution of the sizes of sets of neurons that spike simultaneously and the frequency of spike patterns as a function of their rank (Zipf plots) are well explained by an independent-neuron model with time-dependent external input, and adding connections to such a model does not offer significant improvement. For the cortical model data, robust couplings, well correlated with the real connections, can be inferred using the nonstationary model. Adding connections to this model slightly improves the agreement with the data for the probability of synchronous spikes but hardly affects the Zipf plot. (paper)
The effect of nonstationarity on models inferred from neural data
Energy Technology Data Exchange (ETDEWEB)
Tyrcha, Joanna [Department of Mathematical Statistics, Stockholm University, SE-10691 Stockholm (Sweden); Roudi, Yasser [Kavli Institute for Systems Neuroscience, NTNU, NO-7010 Trondheim (Norway); Marsili, Matteo [The Abdus Salam ICTP, Strada Costiera 11, I-34151, Trieste (Italy); Hertz, John [Nordita, Royal Institute of Technology and Stockholm University, SE-106 91 Stockholm (Sweden)
2013-03-01
Neurons subject to a common nonstationary input may exhibit a correlated firing behavior. Correlations in the statistics of neural spike trains also arise as the effect of interaction between neurons. Here we show that these two situations can be distinguished with machine learning techniques, provided that the data are rich enough. In order to do this, we study the problem of inferring a kinetic Ising model, stationary or nonstationary, from the available data. We apply the inference procedure to two data sets: one from salamander retinal ganglion cells and the other from a realistic computational cortical network model. We show that many aspects of the concerted activity of the salamander retinal neurons can be traced simply to the external input. A model of non-interacting neurons subject to a nonstationary external field outperforms a model with stationary input with couplings between neurons, even accounting for the differences in the number of model parameters. When couplings are added to the nonstationary model, for the retinal data, little is gained: the inferred couplings are generally not significant. Likewise, the distribution of the sizes of sets of neurons that spike simultaneously and the frequency of spike patterns as a function of their rank (Zipf plots) are well explained by an independent-neuron model with time-dependent external input, and adding connections to such a model does not offer significant improvement. For the cortical model data, robust couplings, well correlated with the real connections, can be inferred using the nonstationary model. Adding connections to this model slightly improves the agreement with the data for the probability of synchronous spikes but hardly affects the Zipf plot. (paper)
Plug & Play object oriented Bayesian networks
DEFF Research Database (Denmark)
Bangsø, Olav; Flores, J.; Jensen, Finn Verner
2003-01-01
been shown to be quite suitable for dynamic domains as well. However, processing object oriented Bayesian networks in practice does not take advantage of their modular structure. Normally the object oriented Bayesian network is transformed into a Bayesian network and, inference is performed...... dynamic domains. The communication needed between instances is achieved by means of a fill-in propagation scheme....
Statistical inference an integrated Bayesianlikelihood approach
Aitkin, Murray
2010-01-01
Filling a gap in current Bayesian theory, Statistical Inference: An Integrated Bayesian/Likelihood Approach presents a unified Bayesian treatment of parameter inference and model comparisons that can be used with simple diffuse prior specifications. This novel approach provides new solutions to difficult model comparison problems and offers direct Bayesian counterparts of frequentist t-tests and other standard statistical methods for hypothesis testing.After an overview of the competing theories of statistical inference, the book introduces the Bayes/likelihood approach used throughout. It pre
Inferring Domain Plans in Question-Answering
National Research Council Canada - National Science Library
Pollack, Martha E
1986-01-01
The importance of plan inference in models of conversation has been widely noted in the computational-linguistics literature, and its incorporation in question-answering systems has enabled a range...
Efficient algorithms for conditional independence inference
Czech Academy of Sciences Publication Activity Database
Bouckaert, R.; Hemmecke, R.; Lindner, S.; Studený, Milan
2010-01-01
Roč. 11, č. 1 (2010), s. 3453-3479 ISSN 1532-4435 R&D Projects: GA ČR GA201/08/0539; GA MŠk 1M0572 Institutional research plan: CEZ:AV0Z10750506 Keywords : conditional independence inference * linear programming approach Subject RIV: BA - General Mathematics Impact factor: 2.949, year: 2010 http://library.utia.cas.cz/separaty/2010/MTR/studeny-efficient algorithms for conditional independence inference.pdf
Zheleva, Elena
2012-01-01
This synthesis lecture provides a survey of work on privacy in online social networks (OSNs). This work encompasses concerns of users as well as service providers and third parties. Our goal is to approach such concerns from a computer-science perspective, and building upon existing work on privacy, security, statistical modeling and databases to provide an overview of the technical and algorithmic issues related to privacy in OSNs. We start our survey by introducing a simple OSN data model and describe common statistical-inference techniques that can be used to infer potentially sensitive inf
On the criticality of inferred models
Mastromatteo, Iacopo; Marsili, Matteo
2011-10-01
Advanced inference techniques allow one to reconstruct a pattern of interaction from high dimensional data sets, from probing simultaneously thousands of units of extended systems—such as cells, neural tissues and financial markets. We focus here on the statistical properties of inferred models and argue that inference procedures are likely to yield models which are close to singular values of parameters, akin to critical points in physics where phase transitions occur. These are points where the response of physical systems to external perturbations, as measured by the susceptibility, is very large and diverges in the limit of infinite size. We show that the reparameterization invariant metrics in the space of probability distributions of these models (the Fisher information) are directly related to the susceptibility of the inferred model. As a result, distinguishable models tend to accumulate close to critical points, where the susceptibility diverges in infinite systems. This region is the one where the estimate of inferred parameters is most stable. In order to illustrate these points, we discuss inference of interacting point processes with application to financial data and show that sensible choices of observation time scales naturally yield models which are close to criticality.
On the criticality of inferred models
International Nuclear Information System (INIS)
Mastromatteo, Iacopo; Marsili, Matteo
2011-01-01
Advanced inference techniques allow one to reconstruct a pattern of interaction from high dimensional data sets, from probing simultaneously thousands of units of extended systems—such as cells, neural tissues and financial markets. We focus here on the statistical properties of inferred models and argue that inference procedures are likely to yield models which are close to singular values of parameters, akin to critical points in physics where phase transitions occur. These are points where the response of physical systems to external perturbations, as measured by the susceptibility, is very large and diverges in the limit of infinite size. We show that the reparameterization invariant metrics in the space of probability distributions of these models (the Fisher information) are directly related to the susceptibility of the inferred model. As a result, distinguishable models tend to accumulate close to critical points, where the susceptibility diverges in infinite systems. This region is the one where the estimate of inferred parameters is most stable. In order to illustrate these points, we discuss inference of interacting point processes with application to financial data and show that sensible choices of observation time scales naturally yield models which are close to criticality
Polynomial Chaos Surrogates for Bayesian Inference
Le Maitre, Olivier
2016-01-06
The Bayesian inference is a popular probabilistic method to solve inverse problems, such as the identification of field parameter in a PDE model. The inference rely on the Bayes rule to update the prior density of the sought field, from observations, and derive its posterior distribution. In most cases the posterior distribution has no explicit form and has to be sampled, for instance using a Markov-Chain Monte Carlo method. In practice the prior field parameter is decomposed and truncated (e.g. by means of Karhunen- Lo´eve decomposition) to recast the inference problem into the inference of a finite number of coordinates. Although proved effective in many situations, the Bayesian inference as sketched above faces several difficulties requiring improvements. First, sampling the posterior can be a extremely costly task as it requires multiple resolutions of the PDE model for different values of the field parameter. Second, when the observations are not very much informative, the inferred parameter field can highly depends on its prior which can be somehow arbitrary. These issues have motivated the introduction of reduced modeling or surrogates for the (approximate) determination of the parametrized PDE solution and hyperparameters in the description of the prior field. Our contribution focuses on recent developments in these two directions: the acceleration of the posterior sampling by means of Polynomial Chaos expansions and the efficient treatment of parametrized covariance functions for the prior field. We also discuss the possibility of making such approach adaptive to further improve its efficiency.
Prioritising sewerage maintenance using inferred sewer age: a case study for Edinburgh.
Arthur, S; Burkhard, R
2010-01-01
The reported research project focuses on using a database which contains details of customer contacts and CCTV data for a key Scottish catchment to construct a GIS based sewer condition model. Given the nature of the asset registry, a key research challenge was estimating the age of individual lengths of pipe. Within this context, asset age was inferred using the estimated age of surface developments-this involved overlaying the network in a GIS with historical digital maps. The paper illustrates that inferred asset age can reliably be used to highlight assets which are more likely to fail.
International Nuclear Information System (INIS)
Daldaban, Ferhat; Ustkoyuncu, Nurettin; Guney, Kerim
2006-01-01
A new method based on an adaptive neuro-fuzzy inference system (ANFIS) for estimating the phase inductance of switched reluctance motors (SRMs) is presented. The ANFIS has the advantages of expert knowledge of the fuzzy inference system and the learning capability of neural networks. A hybrid learning algorithm, which combines the least square method and the back propagation algorithm, is used to identify the parameters of the ANFIS. The rotor position and the phase current of the 6/4 pole SRM are used to predict the phase inductance. The phase inductance results predicted by the ANFIS are in excellent agreement with the results of the finite element method
Role of Utility and Inference in the Evolution of Functional Information
Sharov, Alexei A.
2009-01-01
Functional information means an encoded network of functions in living organisms from molecular signaling pathways to an organism’s behavior. It is represented by two components: code and an interpretation system, which together form a self-sustaining semantic closure. Semantic closure allows some freedom between components because small variations of the code are still interpretable. The interpretation system consists of inference rules that control the correspondence between the code and the function (phenotype) and determines the shape of the fitness landscape. The utility factor operates at multiple time scales: short-term selection drives evolution towards higher survival and reproduction rate within a given fitness landscape, and long-term selection favors those fitness landscapes that support adaptability and lead to evolutionary expansion of certain lineages. Inference rules make short-term selection possible by shaping the fitness landscape and defining possible directions of evolution, but they are under control of the long-term selection of lineages. Communication normally occurs within a set of agents with compatible interpretation systems, which I call communication system. Functional information cannot be directly transferred between communication systems with incompatible inference rules. Each biological species is a genetic communication system that carries unique functional information together with inference rules that determine evolutionary directions and constraints. This view of the relation between utility and inference can resolve the conflict between realism/positivism and pragmatism. Realism overemphasizes the role of inference in evolution of human knowledge because it assumes that logic is embedded in reality. Pragmatism substitutes usefulness for truth and therefore ignores the advantage of inference. The proposed concept of evolutionary pragmatism rejects the idea that logic is embedded in reality; instead, inference rules are
Revealing networks from dynamics: an introduction
International Nuclear Information System (INIS)
Timme, Marc; Casadiego, Jose
2014-01-01
What can we learn from the collective dynamics of a complex network about its interaction topology? Taking the perspective from nonlinear dynamics, we briefly review recent progress on how to infer structural connectivity (direct interactions) from accessing the dynamics of the units. Potential applications range from interaction networks in physics, to chemical and metabolic reactions, protein and gene regulatory networks as well as neural circuits in biology and electric power grids or wireless sensor networks in engineering. Moreover, we briefly mention some standard ways of inferring effective or functional connectivity. (topical review)
A formal model of interpersonal inference
Directory of Open Access Journals (Sweden)
Michael eMoutoussis
2014-03-01
Full Text Available Introduction: We propose that active Bayesian inference – a general framework for decision-making – can equally be applied to interpersonal exchanges. Social cognition, however, entails special challenges. We address these challenges through a novel formulation of a formal model and demonstrate its psychological significance. Method: We review relevant literature, especially with regards to interpersonal representations, formulate a mathematical model and present a simulation study. The model accommodates normative models from utility theory and places them within the broader setting of Bayesian inference. Crucially, we endow people's prior beliefs, into which utilities are absorbed, with preferences of self and others. The simulation illustrates the model's dynamics and furnishes elementary predictions of the theory. Results: 1. Because beliefs about self and others inform both the desirability and plausibility of outcomes, in this framework interpersonal representations become beliefs that have to be actively inferred. This inference, akin to 'mentalising' in the psychological literature, is based upon the outcomes of interpersonal exchanges. 2. We show how some well-known social-psychological phenomena (e.g. self-serving biases can be explained in terms of active interpersonal inference. 3. Mentalising naturally entails Bayesian updating of how people value social outcomes. Crucially this includes inference about one’s own qualities and preferences. Conclusion: We inaugurate a Bayes optimal framework for modelling intersubject variability in mentalising during interpersonal exchanges. Here, interpersonal representations are endowed with explicit functional and affective properties. We suggest the active inference framework lends itself to the study of psychiatric conditions where mentalising is distorted.
Fröhlich, H.; Klau, G.W.
2013-01-01
Bayesian Networks are an established computational approach for data driven network inference. However, experimental data is limited in its availability and corrupted by noise. This leads to an unavoidable uncertainty about the correct network structure. Thus sampling or bootstrap based strategies
Inferring Mathematical Equations Using Crowdsourcing.
Directory of Open Access Journals (Sweden)
Szymon Wasik
Full Text Available Crowdsourcing, understood as outsourcing work to a large network of people in the form of an open call, has been utilized successfully many times, including a very interesting concept involving the implementation of computer games with the objective of solving a scientific problem by employing users to play a game-so-called crowdsourced serious games. Our main objective was to verify whether such an approach could be successfully applied to the discovery of mathematical equations that explain experimental data gathered during the observation of a given dynamic system. Moreover, we wanted to compare it with an approach based on artificial intelligence that uses symbolic regression to find such formulae automatically. To achieve this, we designed and implemented an Internet game in which players attempt to design a spaceship representing an equation that models the observed system. The game was designed while considering that it should be easy to use for people without strong mathematical backgrounds. Moreover, we tried to make use of the collective intelligence observed in crowdsourced systems by enabling many players to collaborate on a single solution. The idea was tested on several hundred players playing almost 10,000 games and conducting a user opinion survey. The results prove that the proposed solution has very high potential. The function generated during weeklong tests was almost as precise as the analytical solution of the model of the system and, up to a certain complexity level of the formulae, it explained data better than the solution generated automatically by Eureqa, the leading software application for the implementation of symbolic regression. Moreover, we observed benefits of using crowdsourcing; the chain of consecutive solutions that led to the best solution was obtained by the continuous collaboration of several players.
Inferring Mathematical Equations Using Crowdsourcing.
Wasik, Szymon; Fratczak, Filip; Krzyskow, Jakub; Wulnikowski, Jaroslaw
2015-01-01
Crowdsourcing, understood as outsourcing work to a large network of people in the form of an open call, has been utilized successfully many times, including a very interesting concept involving the implementation of computer games with the objective of solving a scientific problem by employing users to play a game-so-called crowdsourced serious games. Our main objective was to verify whether such an approach could be successfully applied to the discovery of mathematical equations that explain experimental data gathered during the observation of a given dynamic system. Moreover, we wanted to compare it with an approach based on artificial intelligence that uses symbolic regression to find such formulae automatically. To achieve this, we designed and implemented an Internet game in which players attempt to design a spaceship representing an equation that models the observed system. The game was designed while considering that it should be easy to use for people without strong mathematical backgrounds. Moreover, we tried to make use of the collective intelligence observed in crowdsourced systems by enabling many players to collaborate on a single solution. The idea was tested on several hundred players playing almost 10,000 games and conducting a user opinion survey. The results prove that the proposed solution has very high potential. The function generated during weeklong tests was almost as precise as the analytical solution of the model of the system and, up to a certain complexity level of the formulae, it explained data better than the solution generated automatically by Eureqa, the leading software application for the implementation of symbolic regression. Moreover, we observed benefits of using crowdsourcing; the chain of consecutive solutions that led to the best solution was obtained by the continuous collaboration of several players.
Estimating uncertainty of inference for validation
Energy Technology Data Exchange (ETDEWEB)
Booker, Jane M [Los Alamos National Laboratory; Langenbrunner, James R [Los Alamos National Laboratory; Hemez, Francois M [Los Alamos National Laboratory; Ross, Timothy J [UNM
2010-09-30
We present a validation process based upon the concept that validation is an inference-making activity. This has always been true, but the association has not been as important before as it is now. Previously, theory had been confirmed by more data, and predictions were possible based on data. The process today is to infer from theory to code and from code to prediction, making the role of prediction somewhat automatic, and a machine function. Validation is defined as determining the degree to which a model and code is an accurate representation of experimental test data. Imbedded in validation is the intention to use the computer code to predict. To predict is to accept the conclusion that an observable final state will manifest; therefore, prediction is an inference whose goodness relies on the validity of the code. Quantifying the uncertainty of a prediction amounts to quantifying the uncertainty of validation, and this involves the characterization of uncertainties inherent in theory/models/codes and the corresponding data. An introduction to inference making and its associated uncertainty is provided as a foundation for the validation problem. A mathematical construction for estimating the uncertainty in the validation inference is then presented, including a possibility distribution constructed to represent the inference uncertainty for validation under uncertainty. The estimation of inference uncertainty for validation is illustrated using data and calculations from Inertial Confinement Fusion (ICF). The ICF measurements of neutron yield and ion temperature were obtained for direct-drive inertial fusion capsules at the Omega laser facility. The glass capsules, containing the fusion gas, were systematically selected with the intent of establishing a reproducible baseline of high-yield 10{sup 13}-10{sup 14} neutron output. The deuterium-tritium ratio in these experiments was varied to study its influence upon yield. This paper on validation inference is the
Comparison of Machine Learning Techniques in Inferring Phytoplankton Size Classes
Directory of Open Access Journals (Sweden)
Shuibo Hu
2018-03-01
Full Text Available The size of phytoplankton not only influences its physiology, metabolic rates and marine food web, but also serves as an indicator of phytoplankton functional roles in ecological and biogeochemical processes. Therefore, some algorithms have been developed to infer the synoptic distribution of phytoplankton cell size, denoted as phytoplankton size classes (PSCs, in surface ocean waters, by the means of remotely sensed variables. This study, using the NASA bio-Optical Marine Algorithm Data set (NOMAD high performance liquid chromatography (HPLC database, and satellite match-ups, aimed to compare the effectiveness of modeling techniques, including partial least square (PLS, artificial neural networks (ANN, support vector machine (SVM and random forests (RF, and feature selection techniques, including genetic algorithm (GA, successive projection algorithm (SPA and recursive feature elimination based on support vector machine (SVM-RFE, for inferring PSCs from remote sensing data. Results showed that: (1 SVM-RFE worked better in selecting sensitive features; (2 RF performed better than PLS, ANN and SVM in calibrating PSCs retrieval models; (3 machine learning techniques produced better performance than the chlorophyll-a based three-component method; (4 sea surface temperature, wind stress, and spectral curvature derived from the remote sensing reflectance at 490, 510, and 555 nm were among the most sensitive features to PSCs; and (5 the combination of SVM-RFE feature selection techniques and random forests regression was recommended for inferring PSCs. This study demonstrated the effectiveness of machine learning techniques in selecting sensitive features and calibrating models for PSCs estimations with remote sensing.
Goal inferences about robot behavior : goal inferences and human response behaviors
Broers, H.A.T.; Ham, J.R.C.; Broeders, R.; De Silva, P.; Okada, M.
2014-01-01
This explorative research focused on the goal inferences human observers draw based on a robot's behavior, and the extent to which those inferences predict people's behavior in response to that robot. Results show that different robot behaviors cause different response behavior from people.
Using Alien Coins to Test Whether Simple Inference Is Bayesian
Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.
2016-01-01
Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
DEFF Research Database (Denmark)
Andersen, Kasper Winther
Three main topics are presented in this thesis. The first and largest topic concerns network modelling of functional Magnetic Resonance Imaging (fMRI) and Diffusion Weighted Imaging (DWI). In particular nonparametric Bayesian methods are used to model brain networks derived from resting state f...... for their ability to reproduce node clustering and predict unseen data. Comparing the models on whole brain networks, BCD and IRM showed better reproducibility and predictability than IDM, suggesting that resting state networks exhibit community structure. This also points to the importance of using models, which...... allow for complex interactions between all pairs of clusters. In addition, it is demonstrated how the IRM can be used for segmenting brain structures into functionally coherent clusters. A new nonparametric Bayesian network model is presented. The model builds upon the IRM and can be used to infer...
An integrative approach to inferring biologically meaningful gene modules
Directory of Open Access Journals (Sweden)
Wang Kai
2011-07-01
Full Text Available Abstract Background The ability to construct biologically meaningful gene networks and modules is critical for contemporary systems biology. Though recent studies have demonstrated the power of using gene modules to shed light on the functioning of complex biological systems, most modules in these networks have shown little association with meaningful biological function. We have devised a method which directly incorporates gene ontology (GO annotation in construction of gene modules in order to gain better functional association. Results We have devised a method, Semantic Similarity-Integrated approach for Modularization (SSIM that integrates various gene-gene pairwise similarity values, including information obtained from gene expression, protein-protein interactions and GO annotations, in the construction of modules using affinity propagation clustering. We demonstrated the performance of the proposed method using data from two complex biological responses: 1. the osmotic shock response in Saccharomyces cerevisiae, and 2. the prion-induced pathogenic mouse model. In comparison with two previously reported algorithms, modules identified by SSIM showed significantly stronger association with biological functions. Conclusions The incorporation of semantic similarity based on GO annotation with gene expression and protein-protein interaction data can greatly enhance the functional relevance of inferred gene modules. In addition, the SSIM approach can also reveal the hierarchical structure of gene modules to gain a broader functional view of the biological system. Hence, the proposed method can facilitate comprehensive and in-depth analysis of high throughput experimental data at the gene network level.
Multivariate Statistical Inference of Lightning Occurrence, and Using Lightning Observations
Boccippio, Dennis
2004-01-01
Two classes of multivariate statistical inference using TRMM Lightning Imaging Sensor, Precipitation Radar, and Microwave Imager observation are studied, using nonlinear classification neural networks as inferential tools. The very large and globally representative data sample provided by TRMM allows both training and validation (without overfitting) of neural networks with many degrees of freedom. In the first study, the flashing / or flashing condition of storm complexes is diagnosed using radar, passive microwave and/or environmental observations as neural network inputs. The diagnostic skill of these simple lightning/no-lightning classifiers can be quite high, over land (above 80% Probability of Detection; below 20% False Alarm Rate). In the second, passive microwave and lightning observations are used to diagnose radar reflectivity vertical structure. A priori diagnosis of hydrometeor vertical structure is highly important for improved rainfall retrieval from either orbital radars (e.g., the future Global Precipitation Mission "mothership") or radiometers (e.g., operational SSM/I and future Global Precipitation Mission passive microwave constellation platforms), we explore the incremental benefit to such diagnosis provided by lightning observations.
Explanatory Preferences Shape Learning and Inference.
Lombrozo, Tania
2016-10-01
Explanations play an important role in learning and inference. People often learn by seeking explanations, and they assess the viability of hypotheses by considering how well they explain the data. An emerging body of work reveals that both children and adults have strong and systematic intuitions about what constitutes a good explanation, and that these explanatory preferences have a systematic impact on explanation-based processes. In particular, people favor explanations that are simple and broad, with the consequence that engaging in explanation can shape learning and inference by leading people to seek patterns and favor hypotheses that support broad and simple explanations. Given the prevalence of explanation in everyday cognition, understanding explanation is therefore crucial to understanding learning and inference. Copyright © 2016 Elsevier Ltd. All rights reserved.
Fuzzy logic controller using different inference methods
International Nuclear Information System (INIS)
Liu, Z.; De Keyser, R.
1994-01-01
In this paper the design of fuzzy controllers by using different inference methods is introduced. Configuration of the fuzzy controllers includes a general rule-base which is a collection of fuzzy PI or PD rules, the triangular fuzzy data model and a centre of gravity defuzzification algorithm. The generalized modus ponens (GMP) is used with the minimum operator of the triangular norm. Under the sup-min inference rule, six fuzzy implication operators are employed to calculate the fuzzy look-up tables for each rule base. The performance is tested in simulated systems with MATLAB/SIMULINK. Results show the effects of using the fuzzy controllers with different inference methods and applied to different test processes
Uncertainty in prediction and in inference
International Nuclear Information System (INIS)
Hilgevoord, J.; Uffink, J.
1991-01-01
The concepts of uncertainty in prediction and inference are introduced and illustrated using the diffraction of light as an example. The close relationship between the concepts of uncertainty in inference and resolving power is noted. A general quantitative measure of uncertainty in inference can be obtained by means of the so-called statistical distance between probability distributions. When applied to quantum mechanics, this distance leads to a measure of the distinguishability of quantum states, which essentially is the absolute value of the matrix element between the states. The importance of this result to the quantum mechanical uncertainty principle is noted. The second part of the paper provides a derivation of the statistical distance on the basis of the so-called method of support
A Learning Algorithm for Multimodal Grammar Inference.
D'Ulizia, A; Ferri, F; Grifoni, P
2011-12-01
The high costs of development and maintenance of multimodal grammars in integrating and understanding input in multimodal interfaces lead to the investigation of novel algorithmic solutions in automating grammar generation and in updating processes. Many algorithms for context-free grammar inference have been developed in the natural language processing literature. An extension of these algorithms toward the inference of multimodal grammars is necessary for multimodal input processing. In this paper, we propose a novel grammar inference mechanism that allows us to learn a multimodal grammar from its positive samples of multimodal sentences. The algorithm first generates the multimodal grammar that is able to parse the positive samples of sentences and, afterward, makes use of two learning operators and the minimum description length metrics in improving the grammar description and in avoiding the over-generalization problem. The experimental results highlight the acceptable performances of the algorithm proposed in this paper since it has a very high probability of parsing valid sentences.
Examples in parametric inference with R
Dixit, Ulhas Jayram
2016-01-01
This book discusses examples in parametric inference with R. Combining basic theory with modern approaches, it presents the latest developments and trends in statistical inference for students who do not have an advanced mathematical and statistical background. The topics discussed in the book are fundamental and common to many fields of statistical inference and thus serve as a point of departure for in-depth study. The book is divided into eight chapters: Chapter 1 provides an overview of topics on sufficiency and completeness, while Chapter 2 briefly discusses unbiased estimation. Chapter 3 focuses on the study of moments and maximum likelihood estimators, and Chapter 4 presents bounds for the variance. In Chapter 5, topics on consistent estimator are discussed. Chapter 6 discusses Bayes, while Chapter 7 studies some more powerful tests. Lastly, Chapter 8 examines unbiased and other tests. Senior undergraduate and graduate students in statistics and mathematics, and those who have taken an introductory cou...
Grammatical inference algorithms, routines and applications
Wieczorek, Wojciech
2017-01-01
This book focuses on grammatical inference, presenting classic and modern methods of grammatical inference from the perspective of practitioners. To do so, it employs the Python programming language to present all of the methods discussed. Grammatical inference is a field that lies at the intersection of multiple disciplines, with contributions from computational linguistics, pattern recognition, machine learning, computational biology, formal learning theory and many others. Though the book is largely practical, it also includes elements of learning theory, combinatorics on words, the theory of automata and formal languages, plus references to real-world problems. The listings presented here can be directly copied and pasted into other programs, thus making the book a valuable source of ready recipes for students, academic researchers, and programmers alike, as well as an inspiration for their further development.>.
Statistical inference based on divergence measures
Pardo, Leandro
2005-01-01
The idea of using functionals of Information Theory, such as entropies or divergences, in statistical inference is not new. However, in spite of the fact that divergence statistics have become a very good alternative to the classical likelihood ratio test and the Pearson-type statistic in discrete models, many statisticians remain unaware of this powerful approach.Statistical Inference Based on Divergence Measures explores classical problems of statistical inference, such as estimation and hypothesis testing, on the basis of measures of entropy and divergence. The first two chapters form an overview, from a statistical perspective, of the most important measures of entropy and divergence and study their properties. The author then examines the statistical analysis of discrete multivariate data with emphasis is on problems in contingency tables and loglinear models using phi-divergence test statistics as well as minimum phi-divergence estimators. The final chapter looks at testing in general populations, prese...
Spreading paths in partially observed social networks
Onnela, Jukka-Pekka; Christakis, Nicholas A.
2012-01-01
Understanding how and how far information, behaviors, or pathogens spread in social networks is an important problem, having implications for both predicting the size of epidemics, as well as for planning effective interventions. There are, however, two main challenges for inferring spreading paths in real-world networks. One is the practical difficulty of observing a dynamic process on a network, and the other is the typical constraint of only partially observing a network. Using a static, s...
ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM FOR END MILLING
Directory of Open Access Journals (Sweden)
ANGELOS P. MARKOPOULOS
2016-09-01
Full Text Available Soft computing is commonly used as a modelling method in various technological areas. Methods such as Artificial Neural Networks and Fuzzy Logic have found application in manufacturing technology as well. NeuroFuzzy systems, aimed to combine the benefits of both the aforementioned Artificial Intelligence methods, are a subject of research lately as have proven to be superior compared to other methods. In this paper an adaptive neuro-fuzzy inference system for the prediction of surface roughness in end milling is presented. Spindle speed, feed rate, depth of cut and vibrations were used as independent input variables, while roughness parameter Ra as dependent output variable. Several variations are tested and the results of the optimum system are presented. Final results indicate that the proposed model can accurately predict surface roughness, even for input that was not used in training.