A Robust Method for Inferring Network Structures.
Yang, Yang; Luo, Tingjin; Li, Zhoujun; Zhang, Xiaoming; Yu, Philip S
2017-07-12
Inferring the network structure from limited observable data is significant in molecular biology, communication and many other areas. It is challenging, primarily because the observable data are sparse, finite and noisy. The development of machine learning and network structure study provides a great chance to solve the problem. In this paper, we propose an iterative smoothing algorithm with structure sparsity (ISSS) method. The elastic penalty in the model is introduced for the sparse solution, identifying group features and avoiding over-fitting, and the total variation (TV) penalty in the model can effectively utilize the structure information to identify the neighborhood of the vertices. Because the elastic and structural TV penalties are non-smooth, an efficient algorithm based on Nesterov's smoothing optimization technique is proposed to solve the non-smooth problem. The experimental results on both synthetic and real-world networks show that the proposed model is robust against insufficient data and high noise. In addition, we investigate many factors that play important roles in determining the performance of ISSS.
An algebra-based method for inferring gene regulatory networks.
Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard
2014-03-26
The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating prior knowledge about the network into the inference process, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the
Towards a logic-based method to infer provenance-aware molecular networks
Aslaoui-Errafi, Zahira; Cohen-Boulakia, Sarah; Froidevaux, Christine; Gloaguen, Pauline; Poupon, Anne; Rougny, Adrien; Yahiaoui, Meriem
2012-01-01
Providing techniques to automatically infer molecular networks is particularly important to understand complex relationships between biological objects. We present a logic-based method to infer such networks and show how it allows inferring signalling networks from the design of a knowledge base. Provenance of inferred data has been carefully collected, allowing quality evaluation. More precisely, our method (i) takes into account various kinds of biological experiment...
Assessment of network inference methods: how to cope with an underdetermined problem.
Directory of Open Access Journals (Sweden)
Caroline Siegenthaler
The inference of biological networks is an active research area in the field of systems biology. The number of network inference algorithms has grown tremendously in the last decade, underlining the importance of a fair assessment and comparison among these methods. Current assessments of the performance of an inference method typically involve the application of the algorithm to benchmark datasets and the comparison of the network predictions against the gold standard or reference networks. While the network inference problem is often deemed underdetermined, implying that the inference problem does not have a (unique) solution, the consequences of such an attribute have not been rigorously taken into consideration. Here, we propose a new procedure for assessing the performance of gene regulatory network (GRN) inference methods. The procedure takes into account the underdetermined nature of the inference problem, in which gene regulatory interactions that are inferable or non-inferable are determined based on causal inference. The assessment relies on a new definition of the confusion matrix, which excludes errors associated with non-inferable gene regulations. For demonstration purposes, the proposed assessment procedure is applied to the DREAM 4 In Silico Network Challenge. The results show a marked change in the ranking of participating methods when taking network inferability into account.
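The idea of a confusion matrix that ignores non-inferable regulations can be illustrated with a minimal sketch. The abstract does not say how inferability is decided beyond "causal inference", so the `inferable` set below is simply supplied by hand, and the gene names, edge sets, and function name are hypothetical.

```python
def confusion_matrix(predicted, gold, candidates, inferable=None):
    """Count TP/FP/FN/TN over candidate edges.

    If `inferable` is given, edges outside it are skipped entirely, so
    errors on non-inferable regulations do not count against a method.
    """
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for edge in candidates:
        if inferable is not None and edge not in inferable:
            continue  # non-inferable edge: excluded from the assessment
        in_pred, in_gold = edge in predicted, edge in gold
        if in_pred and in_gold:
            counts["TP"] += 1
        elif in_pred:
            counts["FP"] += 1
        elif in_gold:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Hypothetical three-gene example; edges are (regulator, target) pairs.
genes = ["A", "B", "C"]
candidates = [(r, t) for r in genes for t in genes if r != t]
gold = {("A", "B"), ("B", "C")}
predicted = {("A", "B"), ("A", "C")}
# Assumption for illustration: these two edges cannot be inferred from the data.
inferable = set(candidates) - {("A", "C"), ("B", "C")}

standard = confusion_matrix(predicted, gold, candidates)
adjusted = confusion_matrix(predicted, gold, candidates, inferable)
```

In this toy case the standard matrix charges the method one false positive and one false negative, while the adjusted matrix charges neither, which is the kind of shift that could change a ranking of methods.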
Approximation Methods for Inference and Learning in Belief Networks: Progress and Future Directions
National Research Council Canada - National Science Library
Pazzan, Michael
1997-01-01
.... In this research project, we have investigated methods and implemented algorithms for efficiently making certain classes of inference in belief networks, and for automatically learning certain...
Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui
2017-10-06
Research on gene regulatory networks (GRNs) reveals complex life phenomena from the perspective of gene interaction and is an important field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. To make up for the shortcomings of these methods, this paper presents a novel hybrid learning method (DBNCS) based on the dynamic Bayesian network (DBN) to construct multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. The DBNCS algorithm first uses the CMI2NI (conditional mutual inclusive information-based network inference) algorithm to learn network structure profiles, namely to construct the search space. Redundant regulations are then removed using the recursive optimization (RO) algorithm, thereby reducing the false positive rate. Next, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, the DBN model is used to identify the direction of gene regulation within the cliques and to search for the optimal network structure. The performance of the DBNCS algorithm is evaluated on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in Escherichia coli, and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the method in constructing GRNs.
Directory of Open Access Journals (Sweden)
Ji Wei
2010-10-01
Background: Microarray data discretization is a basic preprocessing step for many algorithms of gene regulatory network inference. Some common discretization methods in informatics are used to discretize microarray data. Selection of the discretization method is often arbitrary, and no systematic comparison of different discretization methods has been conducted in the context of gene regulatory network inference from time series gene expression data. Results: In this study, we propose a new discretization method, "bikmeans", and compare its performance with four other widely used discretization methods using different datasets, modeling algorithms and numbers of intervals. Sensitivities, specificities and total accuracies were calculated and statistical analysis was carried out. The bikmeans method always gave high total accuracies. Conclusions: Our results indicate that proper discretization methods can consistently improve gene regulatory network inference independent of network modeling algorithms and datasets. Our new method, bikmeans, resulted in significantly better total accuracies than other methods.
Bayesian Computation Methods for Inferring Regulatory Network Models Using Biomedical Data.
Tian, Tianhai
2016-01-01
The rapid advancement of high-throughput technologies provides huge amounts of information on gene expression and protein activity at the genome-wide scale. The availability of genomics, transcriptomics, proteomics, and metabolomics datasets gives an unprecedented opportunity to study detailed molecular regulation, which is very important to precision medicine. However, it is still a significant challenge to design effective and efficient methods to infer the network structure and dynamic properties of regulatory networks. In recent years a number of computing methods have been designed to explore the regulatory mechanisms as well as to estimate unknown model parameters. Among them, the Bayesian inference method can combine both prior knowledge and experimental data to generate updated information regarding the regulatory mechanisms. This chapter gives a brief review of Bayesian statistical methods that are used to infer the network structure and estimate model parameters based on experimental data.
A symmetry-based method to infer structural brain networks from probabilistic tractography data
Directory of Open Access Journals (Sweden)
Kamal Shadi
2016-11-01
Recent progress in diffusion MRI and tractography algorithms as well as the launch of the Human Connectome Project (HCP) have provided brain research with an abundance of structural connectivity data. In this work, we describe and evaluate a method that can infer the structural brain network that interconnects a given set of Regions of Interest (ROIs) from probabilistic tractography data. The proposed method, referred to as Minimum Asymmetry Network Inference Algorithm (MANIA), does not determine the connectivity between two ROIs based on an arbitrary connectivity threshold. Instead, we exploit a basic limitation of the tractography process: the observed streamlines from a source to a target do not provide any information about the polarity of the underlying white matter, and so if there are some fibers connecting two voxels (or two ROIs) X and Y, tractography should be able in principle to follow this connection in both directions, from X to Y and from Y to X. We leverage this limitation to formulate the network inference process as an optimization problem that minimizes the (appropriately normalized) asymmetry of the observed network. We evaluate the proposed method using both the FiberCup dataset and a noise model that randomly corrupts the observed connectivity of synthetic networks. As a case study, we apply MANIA to diffusion MRI data from 28 healthy subjects to infer the structural network between 18 corticolimbic ROIs that are associated with various neuropsychiatric conditions including depression, anxiety and addiction.
Takemoto, Kazuhiro; Aie, Kazuki
2017-05-25
Host-pathogen interactions are important in a wide range of research fields. Given the importance of metabolic crosstalk between hosts and pathogens, a metabolic network-based reverse ecology method was proposed to infer these interactions. However, the validity of this method remains unclear because of the various explanations presented and the influence of potentially confounding factors that have thus far been neglected. We re-evaluated the importance of the reverse ecology method for evaluating host-pathogen interactions while statistically controlling for confounding effects using oxygen requirement, genome, metabolic network, and phylogeny data. Our data analyses showed that host-pathogen interactions were more strongly influenced by genome size, primary network parameters (e.g., number of edges), oxygen requirement, and phylogeny than the reverse ecology-based measures. These results indicate the limitations of the reverse ecology method; however, they do not discount the importance of adopting reverse ecology approaches altogether. Rather, we highlight the need for developing more suitable methods for inferring host-pathogen interactions and conducting more careful examinations of the relationships between metabolic networks and host-pathogen interactions.
Directory of Open Access Journals (Sweden)
Luke J Matthews
Studies of social networks, mapped using self-reported contacts, have demonstrated the strong influence of social connections on the propensity for individuals to adopt or maintain healthy behaviors and on their likelihood to adopt health risks such as obesity. Social network analysis may prove useful for businesses and organizations that wish to improve the health of their populations by identifying key network positions. Health traits have been shown to correlate across friendship ties, but evaluating network effects in large coworker populations presents the challenge of obtaining sufficiently comprehensive network data. The purpose of this study was to evaluate methods for using online communication data to generate comprehensive network maps that reproduce the health-associated properties of an offline social network. In this study, we examined three techniques for inferring social relationships from email traffic data in an employee population using thresholds based on: (1) the absolute number of emails exchanged, (2) logistic regression probability of an offline relationship, and (3) the highest ranked email exchange partners. As a model of the offline social network in the same population, a network map was created using social ties reported in a survey instrument. The email networks were evaluated based on the proportion of survey ties captured, comparisons of common network metrics, and autocorrelation of body mass index (BMI) across social ties. Results demonstrated that logistic regression predicted the greatest proportion of offline social ties, thresholding on number of emails exchanged produced the best match to offline network metrics, and ranked email partners demonstrated the strongest autocorrelation of BMI. Since each method had unique strengths, researchers should choose a method based on the aspects of offline behavior of interest. Ranked email partners may be particularly useful for purposes related to health traits in a
Matthews, Luke J.; DeWan, Peter; Rula, Elizabeth Y.
2013-01-01
Studies of social networks, mapped using self-reported contacts, have demonstrated the strong influence of social connections on the propensity for individuals to adopt or maintain healthy behaviors and on their likelihood to adopt health risks such as obesity. Social network analysis may prove useful for businesses and organizations that wish to improve the health of their populations by identifying key network positions. Health traits have been shown to correlate across friendship ties, but evaluating network effects in large coworker populations presents the challenge of obtaining sufficiently comprehensive network data. The purpose of this study was to evaluate methods for using online communication data to generate comprehensive network maps that reproduce the health-associated properties of an offline social network. In this study, we examined three techniques for inferring social relationships from email traffic data in an employee population using thresholds based on: (1) the absolute number of emails exchanged, (2) logistic regression probability of an offline relationship, and (3) the highest ranked email exchange partners. As a model of the offline social network in the same population, a network map was created using social ties reported in a survey instrument. The email networks were evaluated based on the proportion of survey ties captured, comparisons of common network metrics, and autocorrelation of body mass index (BMI) across social ties. Results demonstrated that logistic regression predicted the greatest proportion of offline social ties, thresholding on number of emails exchanged produced the best match to offline network metrics, and ranked email partners demonstrated the strongest autocorrelation of BMI. Since each method had unique strengths, researchers should choose a method based on the aspects of offline behavior of interest. Ranked email partners may be particularly useful for purposes related to health traits in a social network. PMID
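Two of the three thresholding techniques named in this abstract, the absolute email count and the highest ranked exchange partners, can be sketched in a few lines. This is only an illustration under stated assumptions: the names and counts are made up, the function names are hypothetical, a tie is kept when either person ranks the other among their top partners (the study's exact rule is not given in the abstract), and the logistic regression variant is omitted.

```python
from collections import defaultdict


def ties_by_count(email_counts, min_emails):
    """Keep a tie when the pair exchanged at least `min_emails` emails."""
    return {pair for pair, n in email_counts.items() if n >= min_emails}


def ties_by_rank(email_counts, top_k):
    """Keep a tie when either person ranks the other among their top_k partners."""
    partners = defaultdict(list)
    for pair, n in email_counts.items():
        a, b = tuple(pair)
        partners[a].append((n, b))
        partners[b].append((n, a))
    ties = set()
    for person, plist in partners.items():
        # Sort by descending count, then by name so that ties break deterministically.
        plist.sort(key=lambda item: (-item[0], item[1]))
        for _, other in plist[:top_k]:
            ties.add(frozenset((person, other)))
    return ties


# Hypothetical undirected email counts between four employees.
email_counts = {
    frozenset(("ann", "bob")): 40,
    frozenset(("ann", "cat")): 5,
    frozenset(("bob", "cat")): 12,
    frozenset(("cat", "dan")): 2,
}

count_network = ties_by_count(email_counts, min_emails=10)
rank_network = ties_by_rank(email_counts, top_k=1)
```

The rank-based network keeps the low-volume tie between cat and dan because cat is dan's only partner, whereas the count-based network drops it. That difference is one way the two rules can emphasize different aspects of the offline network.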
Chen, Min; Yin, Xuezhi
2011-07-01
This paper describes a new non-invasive method for the diagnosis of breathing disorders based on an adaptive-network-based fuzzy inference system (ANFIS). In this method, PetCO2, SpO2 and HR are chosen as inputs, and the breathing condition is selected as the output of ANFIS. The inputs and output are then classified into fuzzy subsets using experts' knowledge. Afterwards, the fuzzy IF-THEN rules are built according to the membership functions that correspond to the fuzzy subsets. Finally, the neural network is established, and the membership functions and fuzzy rules are optimized by training. The experimental results show that ANFIS is more effective than a BP network for the diagnosis of breathing disorders.
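The structure described here (fuzzy subsets, IF-THEN rules, trainable membership functions) matches the usual ANFIS forward pass, which can be sketched as below. This is a minimal sketch assuming Gaussian membership functions and first-order Sugeno-type rules; the abstract does not state which forms the authors used. All parameter values are hypothetical placeholders on an arbitrary scale, not clinical thresholds, and the training step that tunes them is not shown.

```python
import math


def gauss(x, center, width):
    """Gaussian membership degree of x in a fuzzy subset."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def anfis_forward(inputs, rules):
    """Forward pass of a first-order Sugeno-type fuzzy system.

    Each rule is (memberships, coeffs): `memberships` holds one
    (center, width) pair per input, and `coeffs` holds one linear weight
    per input followed by a bias term.
    """
    strengths, consequents = [], []
    for memberships, coeffs in rules:
        # Firing strength: product of the membership degrees of all inputs.
        w = 1.0
        for x, (center, width) in zip(inputs, memberships):
            w *= gauss(x, center, width)
        strengths.append(w)
        # Rule consequent: linear function of the inputs plus a bias.
        consequents.append(sum(p * x for p, x in zip(coeffs, inputs)) + coeffs[-1])
    total = sum(strengths)
    if total == 0.0:
        raise ValueError("no rule fires for these inputs")
    # Output: firing-strength-weighted average of the rule consequents.
    return sum(w * f for w, f in zip(strengths, consequents)) / total


# Two hypothetical rules over two inputs.
rules = [
    ([(0.0, 1.0), (0.0, 1.0)], [0.0, 0.0, 1.0]),    # "both inputs low"  -> output 1
    ([(10.0, 1.0), (10.0, 1.0)], [0.0, 0.0, 5.0]),  # "both inputs high" -> output 5
]
low = anfis_forward([0.0, 0.0], rules)
mid = anfis_forward([5.0, 5.0], rules)
high = anfis_forward([10.0, 10.0], rules)
```

Training would adjust the centers, widths, and consequent coefficients against labelled cases.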
Inferring regulatory networks from expression data using tree-based methods.
Directory of Open Access Journals (Sweden)
Vân Anh Huynh-Thu
One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs) using high throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was the best performer in the DREAM4 In Silico Multifactorial challenge. GENIE3 decomposes the prediction of a regulatory network between p genes into p different regression problems. In each of the regression problems, the expression pattern of one of the genes (target gene) is predicted from the expression patterns of all the other genes (input genes), using tree-based ensemble methods Random Forests or Extra-Trees. The importance of an input gene in the prediction of the target gene expression pattern is taken as an indication of a putative regulatory link. Putative regulatory links are then aggregated over all genes to provide a ranking of interactions from which the whole network is reconstructed. In addition to performing well on the DREAM4 In Silico Multifactorial challenge simulated data, we show that GENIE3 compares favorably with existing algorithms to decipher the genetic regulatory network of Escherichia coli. It doesn't make any assumption about the nature of gene regulation, can deal with combinatorial and non-linear interactions, produces directed GRNs, and is fast and scalable. In conclusion, we propose a new algorithm for GRN inference that performs well on both synthetic and real gene expression data. The algorithm, based on feature selection with tree-based ensemble methods, is simple and generic, making it adaptable to other types of genomic data and interactions.
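The decomposition into p regression problems, followed by a global ranking of links, can be sketched with the standard library alone. To stay short, the sketch below replaces Random Forests and Extra-Trees with a toy ensemble of depth-1 trees (stumps) that split at random thresholds, so it is not the GENIE3 implementation. The function names, the synthetic data, and the choice to express importance as a fraction of the target's variance are assumptions made for illustration.

```python
import random
from statistics import pvariance


def stump_importances(X, y, n_trees, rng):
    """Importance of each input column, from an ensemble of random stumps.

    Each stump tries one random threshold per input, keeps the split with
    the largest variance reduction, and credits that input with the gain.
    """
    n, p = len(X), len(X[0])
    columns = [[row[j] for row in X] for j in range(p)]
    total_var = pvariance(y)
    importances = [0.0] * p
    if total_var == 0.0:
        return importances
    for _ in range(n_trees):
        best_gain, best_j = 0.0, None
        for j, col in enumerate(columns):
            threshold = rng.uniform(min(col), max(col))
            left = [y[i] for i in range(n) if col[i] <= threshold]
            right = [y[i] for i in range(n) if col[i] > threshold]
            if not left or not right:
                continue
            within = (len(left) * pvariance(left) + len(right) * pvariance(right)) / n
            gain = (total_var - within) / total_var  # fraction of variance explained
            if gain > best_gain:
                best_gain, best_j = gain, j
        if best_j is not None:
            importances[best_j] += best_gain / n_trees
    return importances


def rank_links(expr, n_trees=100, seed=0):
    """Rank directed links (regulator, target, score) over all target genes."""
    rng = random.Random(seed)
    n_genes = len(expr[0])
    links = []
    for target in range(n_genes):
        inputs = [g for g in range(n_genes) if g != target]
        X = [[row[g] for g in inputs] for row in expr]
        y = [row[target] for row in expr]
        for g, score in zip(inputs, stump_importances(X, y, n_trees, rng)):
            links.append((score, g, target))
    links.sort(reverse=True)
    return [(g, t, s) for s, g, t in links]


# Synthetic expression matrix (rows: samples, columns: genes 0, 1, 2).
# Gene 1 is driven by gene 0; gene 2 is independent noise.
data_rng = random.Random(1)
expr = []
for _ in range(60):
    g0 = data_rng.gauss(0.0, 1.0)
    expr.append([g0, 2.0 * g0 + data_rng.gauss(0.0, 0.1), data_rng.gauss(0.0, 1.0)])

ranked = rank_links(expr)
scores = {(g, t): s for g, t, s in ranked}
```

One limitation shows up even in this toy data: a purely observational dependence between genes 0 and 1 scores highly in both directions, so the direction of a link has to come from asymmetries in the data rather than from the ranking procedure itself.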
Directory of Open Access Journals (Sweden)
Daniel Lobo
2015-06-01
Transformative applications in biomedicine require the discovery of complex regulatory networks that explain the development and regeneration of anatomical structures, and reveal what external signals will trigger desired changes of large-scale pattern. Despite recent advances in bioinformatics, extracting mechanistic pathway models from experimental morphological data is a key open challenge that has resisted automation. The fundamental difficulty of manually predicting emergent behavior of even simple networks has limited the models invented by human scientists to pathway diagrams that show necessary subunit interactions but do not reveal the dynamics that are sufficient for complex, self-regulating pattern to emerge. To finally bridge the gap between high-resolution genetic data and the ability to understand and control patterning, it is critical to develop computational tools to efficiently extract regulatory pathways from the resultant experimental shape phenotypes. For example, planarian regeneration has been studied for over a century, but despite increasing insight into the pathways that control its stem cells, no constructive, mechanistic model has yet been found by human scientists that explains more than one or two key features of its remarkable ability to regenerate its correct anatomical pattern after drastic perturbations. We present a method to infer the molecular products, topology, and spatial and temporal non-linear dynamics of regulatory networks recapitulating in silico the rich dataset of morphological phenotypes resulting from genetic, surgical, and pharmacological experiments. We demonstrated our approach by inferring complete regulatory networks explaining the outcomes of the main functional regeneration experiments in the planarian literature. By analyzing all the datasets together, our system inferred the first systems-biology comprehensive dynamical model explaining patterning in planarian regeneration. This method
State of the Art of Fuzzy Methods for Gene Regulatory Networks Inference
Al Qazlan, Tuqyah Abdullah; Kara-Mohamed, Chafia
2015-01-01
To address one of the most challenging issues at the cellular level, this paper surveys the fuzzy methods used in gene regulatory networks (GRNs) inference. GRNs represent causal relationships between genes that have a direct influence, through protein production, on the life and the development of living organisms and provide a useful contribution to the understanding of the cellular functions as well as the mechanisms of diseases. Fuzzy systems are based on handling imprecise knowledge, such as biological information. They provide viable computational tools for inferring GRNs from gene expression data, thus contributing to the discovery of gene interactions responsible for specific diseases and/or ad hoc correcting therapies. Increasing computational power and high throughput technologies have provided powerful means to manage these challenging digital ecosystems at different levels from cell to society globally. The main aim of this paper is to report, present, and discuss the main contributions of this multidisciplinary field in a coherent and structured framework. PMID:25879048
A graphical user interface for a method to infer kinetics and network architecture (MIKANA).
Directory of Open Access Journals (Sweden)
Márcio A Mourão
One of the main challenges in the biomedical sciences is the determination of reaction mechanisms that constitute a biochemical pathway. During the last decades, advances have been made in building complex diagrams showing the static interactions of proteins. The challenge for systems biologists is to build realistic models of the dynamical behavior of reactants, intermediates and products. For this purpose, several methods have been recently proposed to deduce the reaction mechanisms or to estimate the kinetic parameters of the elementary reactions that constitute the pathway. One such method is MIKANA: Method to Infer Kinetics And Network Architecture. MIKANA is a computational method to infer both reaction mechanisms and estimate the kinetic parameters of biochemical pathways from time course data. To make it available to the scientific community, we developed a Graphical User Interface (GUI) for MIKANA. Among other features, the GUI validates and processes input time course data, displays the inferred reactions, generates the differential equations for the chemical species in the pathway and plots the prediction curves on top of the input time course data. We also added a new feature to MIKANA that allows the user to exclude a priori known reactions from the inferred mechanism. This addition improves the performance of the method. In this article, we illustrate the GUI for MIKANA with three examples: an irreversible Michaelis-Menten reaction mechanism; the interaction map of chemical species of the muscle glycolytic pathway; and the glycolytic pathway of Lactococcus lactis. We also describe the code and methods in sufficient detail to allow researchers to further develop the code or reproduce the experiments described. The code for MIKANA is open source, free for academic and non-academic use and is available for download (Information S1).
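For readers unfamiliar with the first example, the sketch below writes out the mass-action differential equations of an irreversible Michaelis-Menten mechanism (E + S <-> ES -> E + P) and integrates them with a simple Euler scheme to produce the kind of time course data such a method takes as input. It does not use or reproduce MIKANA; the rate constants, initial concentrations, step size, and function name are hypothetical.

```python
def simulate_michaelis_menten(e0, s0, k1, k_minus1, k2, dt, steps):
    """Euler integration of E + S <-> ES -> E + P under mass-action kinetics.

    Returns a list of (time, E, S, ES, P) tuples.
    """
    e, s, es, p = e0, s0, 0.0, 0.0
    course = [(0.0, e, s, es, p)]
    for i in range(1, steps + 1):
        bind = k1 * e * s          # E + S -> ES
        unbind = k_minus1 * es     # ES -> E + S
        catalyse = k2 * es         # ES -> E + P (irreversible step)
        # dE/dt = -bind + unbind + catalyse, dS/dt = -bind + unbind,
        # dES/dt = bind - unbind - catalyse, dP/dt = catalyse
        e += dt * (-bind + unbind + catalyse)
        s += dt * (-bind + unbind)
        es += dt * (bind - unbind - catalyse)
        p += dt * catalyse
        course.append((i * dt, e, s, es, p))
    return course


# Hypothetical parameters in arbitrary units.
course = simulate_michaelis_menten(
    e0=1.0, s0=10.0, k1=1.0, k_minus1=0.5, k2=0.3, dt=0.001, steps=5000
)
t_end, e_end, s_end, es_end, p_end = course[-1]
```

Total enzyme (E + ES) and total substrate (S + ES + P) stay constant along the trajectory, which gives a quick sanity check on any set of equations generated for this mechanism.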
Inferring network structure from cascades
Ghonge, Sushrut; Vural, Dervis Can
2017-07-01
Many physical, biological, and social phenomena can be described by cascades taking place on a network. Often, the activity can be empirically observed, but not the underlying network of interactions. In this paper we offer three topological methods to infer the structure of any directed network given a set of cascade arrival times. Our formulas hold for a very general class of models where the activation probability of a node is a generic function of its degree and the number of its active neighbors. We report high success rates for synthetic and real networks, for several different cascade models.
Inference of directed climate networks: role of instability of causality estimation methods
Hlinka, Jaroslav; Hartman, David; Vejmelka, Martin; Paluš, Milan
2013-04-01
Climate data are increasingly analyzed by complex network analysis methods, including graph-theoretical approaches [1]. For such analysis, links between localized nodes of climate network are typically quantified by some statistical measures of dependence (connectivity) between measured variables of interest. To obtain information on the directionality of the interactions in the networks, a wide range of methods exists. These can be broadly divided into linear and nonlinear methods, with some of the latter having the theoretical advantage of being model-free, and principally a generalization of the former [2]. However, as a trade-off, this generality comes together with lower accuracy - in particular if the system was close to linear. In an overall stationary system, this may potentially lead to higher variability in the nonlinear network estimates. Therefore, with the same control of false alarms, this may lead to lower sensitivity for detection of real changes in the network structure. These problems are discussed on the example of daily SAT and SLP data from the NCEP/NCAR reanalysis dataset. We first reduce the dimensionality of data using PCA with VARIMAX rotation to detect several dozens of components that together explain most of the data variability. We further construct directed climate networks applying a selection of most widely used methods - variants of linear Granger causality and conditional mutual information. Finally, we assess the stability of the detected directed climate networks by computing them in sliding time windows. To understand the origin of the observed instabilities and their range, we also apply the same procedure to two types of surrogate data: either with non-stationarity in network structure removed, or imposed in a controlled way. In general, the linear methods show stable results in terms of overall similarity of directed climate networks inferred. For instance, for different decades of SAT data, the Spearman correlation of edge
Qin, Jing; Hu, Yaohua; Xu, Feng; Yalamanchili, Hari Krishna; Wang, Junwen
2014-06-01
Inferring gene regulatory networks from gene expression data at the whole-genome level is still an arduous challenge, especially in higher organisms where the number of genes is large but the number of experimental samples is small. It is reported that the accuracy of current methods at genome scale drops significantly from Escherichia coli to Saccharomyces cerevisiae due to the increase in the number of genes. This limits the applicability of current methods to more complex genomes, like human and mouse. The least absolute shrinkage and selection operator (LASSO) is widely used for gene regulatory network inference from gene expression profiles. However, the accuracy of LASSO on large genomes is not satisfactory. In this study, we apply two extended models of LASSO, the L0 and L1/2 regularization models, to infer gene regulatory networks from both high-throughput gene expression data and transcription factor binding data in mouse embryonic stem cells (mESCs). We find that both the L0 and L1/2 regularization models significantly outperform LASSO in network inference. Incorporating interactions between transcription factors and their targets remarkably improved the prediction accuracy. The current study demonstrates the efficiency and applicability of these two models for gene regulatory network inference from integrative omics data in large genomes. The applications of the two models will help biologists study the gene regulation of higher model organisms on a genome-wide scale.
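One way to see how the L1 (LASSO) and L0 penalties differ is through their scalar thresholding operators, which solve the one-dimensional problems min over b of 0.5*(b - z)^2 + lam*|b| and 0.5*(b - z)^2 + lam*[b != 0] respectively. The sketch below shows these two standard operators only; the L1/2 half-thresholding operator is omitted, and nothing here is taken from the solver used in the study, which the abstract does not describe. Function names are hypothetical.

```python
import math


def soft_threshold(z, lam):
    """L1 (LASSO) operator: shrink z toward zero by lam, clipping at zero."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def hard_threshold(z, lam):
    """L0 operator: keep z unchanged if |z| > sqrt(2 * lam), else set it to zero."""
    return z if abs(z) > math.sqrt(2.0 * lam) else 0.0


# Hypothetical least-squares coefficients for four candidate regulators.
coefficients = [3.0, -0.5, 1.0, -2.5]
lasso_like = [soft_threshold(z, 1.0) for z in coefficients]
l0_like = [hard_threshold(z, 1.0) for z in coefficients]
```

The soft operator shrinks every surviving coefficient, whereas the hard operator leaves survivors untouched and removes the rest. That reduced bias on large coefficients is a common motivation for non-convex penalties such as L0 and L1/2.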
Noma, Hisashi; Nagashima, Kengo; Maruo, Kazushi; Gosho, Masahiko; Furukawa, Toshi A
2017-12-18
In network meta-analyses that synthesize direct and indirect comparison evidence concerning multiple treatments, multivariate random effects models have been routinely used for addressing between-studies heterogeneities. Although their standard inference methods depend on large sample approximations (eg, restricted maximum likelihood estimation) for the number of trials synthesized, the numbers of trials are often moderate or small. In these situations, standard estimators cannot be expected to behave in accordance with asymptotic theory; in particular, confidence intervals cannot be assumed to exhibit their nominal coverage probabilities (also, the type I error probabilities of the corresponding tests cannot be retained). The invalidity issue may seriously influence the overall conclusions of network meta-analyses. In this article, we develop several improved inference methods for network meta-analyses to resolve these problems. We first introduce 2 efficient likelihood-based inference methods, the likelihood ratio test-based and efficient score test-based methods, in a general framework of network meta-analysis. Then, to improve the small-sample inferences, we develop improved higher-order asymptotic methods using Bartlett-type corrections and bootstrap adjustment methods. The proposed methods adopt Monte Carlo approaches using parametric bootstraps to effectively circumvent complicated analytical calculations of case-by-case analyses and to permit flexible application to various statistical models for network meta-analyses. These methods can also be straightforwardly applied to multivariate meta-regression analyses and to tests for the evaluation of inconsistency. In numerical evaluations via simulations, the proposed methods generally performed well compared with the ordinary restricted maximum likelihood-based inference method. Applications to 2 network meta-analysis datasets are provided. Copyright © 2017 John Wiley & Sons, Ltd.
A new method to infer causal phenotype networks using QTL and phenotypic information.
Wang, Huange; van Eeuwijk, Fred A
2014-01-01
In the context of genetics and breeding research on multiple phenotypic traits, reconstructing the directional or causal structure between phenotypic traits is a prerequisite for quantifying the effects of genetic interventions on the traits. Current approaches mainly exploit the genetic effects at quantitative trait loci (QTLs) to learn about causal relationships among phenotypic traits. A requirement for using these approaches is that at least one unique QTL has been identified for each trait studied. However, in practice, especially for molecular phenotypes such as metabolites, this prerequisite is often not met due to limited sample sizes, high noise levels and small QTL effects. Here, we present a novel heuristic search algorithm called the QTL+phenotype supervised orientation (QPSO) algorithm to infer causal directions for edges in undirected phenotype networks. The two main advantages of this algorithm are: first, it does not require QTLs for each and every trait; second, it takes into account associated phenotypic interactions in addition to detected QTLs when orienting undirected edges between traits. We evaluate and compare the performance of QPSO with another state-of-the-art approach, the QTL-directed dependency graph (QDG) algorithm. Simulation results show that our method has broader applicability and leads to more accurate overall orientations. We also illustrate our method with a real-life example involving 24 metabolites and a few major QTLs measured on an association panel of 93 tomato cultivars. Matlab source code implementing the proposed algorithm is freely available upon request.
Inferring network topology via the propagation process
Zeng, An
2013-01-01
Inferring the network topology from the dynamics is a fundamental problem with wide applications in geology, biology and even counter-terrorism. Based on the propagation process, we present a simple method to uncover the network topology. Numerical simulations on artificial networks show that our method achieves high accuracy in inferring the network topology. We find that the infection rate in the propagation process significantly influences the accuracy, and that each network has a corresponding optimal infection rate. Moreover, the method generally works better in large networks. These findings are confirmed in both real social and nonsocial networks. Finally, the method is extended to directed networks and a similarity measure specific to directed networks is designed.
Directory of Open Access Journals (Sweden)
Yanzhu Hu
2016-09-01
Complex network methodology is very useful for complex system exploration. However, the relationships among variables in complex systems are usually not clear. Therefore, inferring association networks among variables from their observed data has been a popular research topic. We propose a method, named small-shuffle symbolic transfer entropy spectrum (SSSTES), for inferring association networks from multivariate time series. The method can solve four problems in inferring association networks, i.e., strong correlation identification, correlation quantification, direction identification and temporal relation identification. The method can be divided into four layers. The first layer is the so-called data layer, in which data input and processing take place. In the second layer, we symbolize the model data, original data and shuffled data from the previous layer and repeatedly calculate transfer entropy with different time lags for each pair of time series variables. In the third layer, we compose transfer entropy spectra for pairwise time series from the previous layer's output, a list of transfer entropy matrices. We also identify the correlation level between variables in this layer. In the last layer, we build a weighted adjacency matrix, the value of each entry representing the correlation level between pairwise variables, and then obtain the weighted directed association network. Three sets of numerically simulated data from a linear system, a nonlinear system and a coupled Rossler system are used to show how the proposed approach works. Finally, we apply SSSTES to a real industrial system and get a better result than with two other methods.
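The pairwise quantity computed in the second layer can be sketched as a plug-in transfer entropy estimate on series that are already symbolized. This is a minimal illustration only: it omits the small-shuffle surrogates and the spectrum over lags, and the function name, the history length of one, and the test signal are assumptions rather than details from the abstract.

```python
import math
import random
from collections import Counter

def transfer_entropy(source, target, lag=1):
    """Plug-in estimate (in nats) of the transfer entropy from source to target.

    Uses one step of target history and the source symbol `lag` steps back:
    TE = sum p(y_t, y_{t-1}, x_{t-lag}) * log[p(y_t | y_{t-1}, x_{t-lag}) / p(y_t | y_{t-1})].
    """
    start = max(lag, 1)
    triples = [(target[t], target[t - 1], source[t - lag])
               for t in range(start, len(target))]
    n = len(triples)
    c_full = Counter(triples)
    c_hist_src = Counter((yp, xp) for _, yp, xp in triples)
    c_next_hist = Counter((yt, yp) for yt, yp, _ in triples)
    c_hist = Counter(yp for _, yp, _ in triples)
    te = 0.0
    for (yt, yp, xp), c in c_full.items():
        te += (c / n) * math.log(
            c * c_hist[yp] / (c_hist_src[(yp, xp)] * c_next_hist[(yt, yp)]))
    return te

# Hypothetical binary symbol series: y copies x with a delay of one step,
# so information should flow from x to y but not from y to x.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]

te_xy = transfer_entropy(x, y, lag=1)
te_yx = transfer_entropy(y, x, lag=1)
print(round(te_xy, 3), round(te_yx, 3))
```

Evaluating the function over a range of `lag` values for each ordered pair would give one row of a spectrum of the kind the abstract describes.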
National Research Council Canada - National Science Library
Matthews, Luke J; DeWan, Peter; Rula, Elizabeth Y
2013-01-01
.... Health traits have been shown to correlate across friendship ties, but evaluating network effects in large coworker populations presents the challenge of obtaining sufficiently comprehensive network data...
Inference in hybrid Bayesian networks
DEFF Research Database (Denmark)
Langseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael
2009-01-01
Since the 1980s, Bayesian Networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability-techniques (like fault trees and reliability block diagrams). However, limitations in the BNs' calculation engine have prevented BNs from becoming equally popular for domains containing mixtures of both discrete and continuous variables (so-called hybrid domains). In this paper we focus on these difficulties, and summarize some of the last decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability.
Incorporating existing network information into gene network inference.
Directory of Open Access Journals (Sweden)
Scott Christley
2009-08-01
One methodology that has been successful in inferring gene networks from gene expression data is based upon ordinary differential equations (ODEs). However, new types of data continue to be produced, so it is worthwhile to investigate how to integrate these new data types into the inference procedure. One such data type is physical interactions between transcription factors and the genes they regulate, as measured by ChIP-chip or ChIP-seq experiments. These interactions can be incorporated into the gene network inference procedure as a priori network information. In this article, we extend the ODE methodology into a general optimization framework that incorporates existing network information in combination with regularization parameters that encourage network sparsity. We provide theoretical results proving convergence of the estimator for our method and show that the corresponding probabilistic interpretation also converges. We demonstrate our method on simulated network data and show that existing network information improves performance, overcomes the lack of observations, and performs well even when some of the existing network information is incorrect. We further apply our method to the core regulatory network of embryonic stem cells, utilizing predicted interactions from two studies as existing network information. We show that including the prior network information yields a more closely representative regulatory network than when no information is provided.
Nonparametric inference of network structure and dynamics
Peixoto, Tiago P.
The network structure of complex systems determines their function and serves as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among
Addressing false discoveries in network inference.
Petri, Tobias; Altmann, Stefan; Geistlinger, Ludwig; Zimmer, Ralf; Küffner, Robert
2015-09-01
Experimentally determined gene regulatory networks can be enriched by computational inference from high-throughput expression profiles. However, the prediction of regulatory interactions is severely impaired by indirect and spurious effects, particularly for eukaryotes. Recently, published methods report improved predictions by exploiting the a priori known targets of a regulator (its local topology) in addition to expression profiles. We find that methods exploiting known targets show an unexpectedly high rate of false discoveries. This leads to inflated performance estimates and the prediction of an excessive number of new interactions for regulators with many known targets. These issues are hidden from common evaluation and cross-validation setups, which is due to Simpson's paradox. We suggest a confidence score recalibration method (CoRe) that reduces the false discovery rate and enables a reliable performance estimation. CoRe considerably improves the results of network inference methods that exploit known targets. Predictions then display the biological process specificity of regulators more correctly and enable the inference of accurate genome-wide regulatory networks in eukaryotes. For yeast, we propose a network with more than 22 000 confident interactions. We point out that machine learning approaches outside of the area of network inference may be affected as well. Results, executable code and networks are available via our website http://www.bio.ifi.lmu.de/forschung/CoRe. Contact: robert.kueffner@helmholtz-muenchen.de. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Directory of Open Access Journals (Sweden)
Ja’fari A.
2014-01-01
Image logs provide useful information for fracture study in naturally fractured reservoirs. Fracture dip, azimuth, aperture and fracture density can be obtained from image logs and are of great importance in naturally fractured reservoir characterization. Imaging all fractured parts of hydrocarbon reservoirs and interpreting the results is expensive and time consuming. In this study, an improved method is proposed to establish a quantitative correlation between fracture densities obtained from image logs and conventional well log data by integrating different artificial intelligence systems. The proposed method combines the results of Adaptive Neuro-Fuzzy Inference System (ANFIS) and Neural Network (NN) algorithms for overall estimation of fracture density from conventional well log data. A simple averaging method was used to obtain a better result by combining the results of ANFIS and NN. The algorithm was applied to other wells of the field to obtain fracture density. In order to model the fracture density in the reservoir, we used variography and sequential simulation algorithms such as Sequential Indicator Simulation (SIS) and Truncated Gaussian Simulation (TGS). The overall algorithm was applied to the Asmari reservoir in one of the SW Iranian oil fields. Histogram analysis was applied to control the quality of the obtained models. The results of this study show that for a higher number of fracture facies the TGS algorithm works better than SIS, but for a small number of fracture facies both algorithms provide approximately the same results.
Inference of Gene Regulatory Network Based on Local Bayesian Networks.
Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Wei, Ze-Gang; Chen, Luonan
2016-08-01
The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in E.coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduce
Inferring Phylogenetic Networks from Gene Order Data
Directory of Open Access Journals (Sweden)
Alexey Anatolievich Morozov
2013-01-01
Existing algorithms allow us to infer phylogenetic networks from sequences (DNA, protein or binary), sets of trees, and distance matrices, but there are no methods to build them using gene order data as an input. Here we describe several methods to build split networks from gene order data, perform simulation studies, and use our methods for analyzing and interpreting different real gene order datasets. All proposed methods are based on intermediate data, which can be generated from the genome structures under study and used as an input for network construction algorithms. Three intermediates are used: a set of jackknife trees, a distance matrix, and a binary encoding. According to simulations and case studies, the best intermediates are jackknife trees and the distance matrix (when used with the Neighbor-Net algorithm). Binary encoding can also be useful, but only when the methods mentioned above cannot be used.
Inferring network connectivity by delayed feedback control.
Directory of Open Access Journals (Sweden)
Dongchuan Yu
We suggest a control-based approach to topology estimation of networks with N elements. This method first drives the network to steady states by a delayed feedback control; then performs structural perturbations for shifting the steady states M times; and finally infers the connection topology from the steady states' shifts by a matrix inverse algorithm (M = N) or an l1-norm convex optimization strategy, which is applicable to estimating the topology of sparse networks from M << N perturbations. We also discuss some aspects important for applications, such as the topology reconstruction quality and error sources, advantages and disadvantages of the suggested method, and the influence of (control) perturbations, inhomogeneity, sparsity, coupling functions, and measurement noise. Some examples of networks with Chua's oscillators are presented to illustrate the reliability of the suggested technique.
Optimization methods for logical inference
Chandru, Vijay
2011-01-01
Merging logic and mathematics in deductive inference: an innovative, cutting-edge approach. Optimization methods for logical inference? Absolutely, say Vijay Chandru and John Hooker, two major contributors to this rapidly expanding field. And even though "solving logical inference problems with optimization methods may seem a bit like eating sauerkraut with chopsticks. . . it is the mathematical structure of a problem that determines whether an optimization model can help solve it, not the context in which the problem occurs." Presenting powerful, proven optimization techniques for logic in
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan
2004-01-01
We describe a system for exact inference with relational Bayesian networks as defined in the publicly available Primula tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating and ...
Quantum Enhanced Inference in Markov Logic Networks
Wittek, Peter; Gogolin, Christian
2017-04-01
Markov logic networks (MLNs) reconcile two opposing schools in machine learning and artificial intelligence: causal networks, which account for uncertainty extremely well, and first-order logic, which allows for formal deduction. An MLN is essentially a first-order logic template to generate Markov networks. Inference in MLNs is probabilistic and it is often performed by approximate methods such as Markov chain Monte Carlo (MCMC) Gibbs sampling. An MLN has many regular, symmetric structures that can be exploited at both first-order level and in the generated Markov network. We analyze the graph structures that are produced by various lifting methods and investigate the extent to which quantum protocols can be used to speed up Gibbs sampling with state preparation and measurement schemes. We review different such approaches, discuss their advantages, theoretical limitations, and their appeal to implementations. We find that a straightforward application of a recent result yields exponential speedup compared to classical heuristics in approximate probabilistic inference, thereby demonstrating another example where advanced quantum resources can potentially prove useful in machine learning.
Facility Activity Inference Using Radiation Networks
Energy Technology Data Exchange (ETDEWEB)
Rao, Nageswara S. [ORNL; Ramirez Aviles, Camila A. [ORNL
2017-11-01
We consider the problem of inferring the operational status of a reactor facility using measurements from a radiation sensor network deployed around the facility’s ventilation off-gas stack. The intensity of stack emissions decays with distance, and the sensor counts or measurements are inherently random with parameters determined by the intensity at the sensor’s location. We utilize the measurements to estimate the intensity at the stack, and use it in a one-sided Sequential Probability Ratio Test (SPRT) to infer on/off status of the reactor. We demonstrate the superior performance of this method over conventional majority fusers and individual sensors using (i) test measurements from a network of 21 NaI detectors, and (ii) effluence measurements collected at the stack of a reactor facility. We also analytically establish the superior detection performance of the network over individual sensors with fixed and adaptive thresholds by utilizing the Poisson distribution of the counts. We quantify the performance improvements of the network detection over individual sensors using the packing number of the intensity space.
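The sequential test on Poisson counts mentioned above can be sketched as follows. This is a standard two-threshold Wald SPRT, used here as a stand-in; the one-sided variant in the abstract may differ, and the rates, error levels, and function name are hypothetical.

```python
import math

def poisson_sprt(counts, rate_off, rate_on, alpha=0.01, beta=0.01):
    """Wald SPRT between Poisson rates rate_off (H0) and rate_on (H1).

    Returns (decision, number of counts consumed), where decision is
    "on", "off", or "undecided" if the data run out first.
    """
    upper = math.log((1 - beta) / alpha)   # cross this to accept H1 ("on")
    lower = math.log(beta / (1 - alpha))   # cross this to accept H0 ("off")
    llr = 0.0
    for n, k in enumerate(counts, start=1):
        # Log-likelihood ratio of one Poisson count k under H1 versus H0.
        llr += k * math.log(rate_on / rate_off) - (rate_on - rate_off)
        if llr >= upper:
            return "on", n
        if llr <= lower:
            return "off", n
    return "undecided", len(counts)

# Hypothetical counts per interval, with assumed rates of 5 (off) and 10 (on).
print(poisson_sprt([12, 11, 13, 12], rate_off=5.0, rate_on=10.0))
print(poisson_sprt([4, 5, 3, 4], rate_off=5.0, rate_on=10.0))
```

In a network setting, `counts` could be replaced by a fused intensity estimate at the stack rather than a single sensor's readings; that fusion step is not shown here.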
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Darwiche, Adnan; Chavira, Mark
2006-01-01
We describe in this paper a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating and differentiating these circuits in time linear in their size. We report on experimental results showing successful compilation and efficient inference on relational Bayesian networks, whose PRIMULA-generated propositional instances have thousands of variables, and whose jointrees have clusters...
Inferring gene networks from discrete expression data
Zhang, L.
2013-07-18
The modeling of gene networks from transcriptional expression data is an important tool in biomedical research to reveal signaling pathways and to identify treatment targets. Current gene network modeling is primarily based on the use of Gaussian graphical models applied to continuous data, which give a closed-form marginal likelihood. In this paper, we extend network modeling to discrete data, specifically data from serial analysis of gene expression and RNA-sequencing experiments, both of which generate counts of mRNA transcripts in cell samples. We propose a generalized linear model to fit the discrete gene expression data and assume that the log ratios of the mean expression levels follow a Gaussian distribution. We restrict the gene network structures to decomposable graphs and derive the graphs by selecting the covariance matrix of the Gaussian distribution with the hyper-inverse Wishart priors. Furthermore, we incorporate prior network models based on gene ontology information, which avails existing biological information on the genes of interest. We conduct simulation studies to examine the performance of our discrete graphical model and apply the method to two real datasets for gene network inference. © The Author 2013. Published by Oxford University Press. All rights reserved.
A full bayesian approach for boolean genetic network inference.
Directory of Open Access Journals (Sweden)
Shengtong Han
Boolean networks are a simple but efficient model for describing gene regulatory systems. A number of algorithms have been proposed to infer Boolean networks. However, these methods do not take full consideration of the effects of noise and model uncertainty. In this paper, we propose a full Bayesian approach to infer Boolean genetic networks. Markov chain Monte Carlo algorithms are used to obtain the posterior samples of both the network structure and the related parameters. In addition to regular link addition and removal moves, which can guarantee the irreducibility of the Markov chain for traversing the whole network space, carefully constructed mixture proposals are used to improve the Markov chain Monte Carlo convergence. Both simulations and a real application on cell-cycle data show that our method is more powerful than existing methods for the inference of both the topology and logic relations of the Boolean network from observed data.
Efficient inference of overlapping communities in complex networks
DEFF Research Database (Denmark)
Fruergaard, Bjarne Ørum; Herlau, Tue
2014-01-01
We discuss two views on extending existing methods for complex network modeling which we dub the communities first and the networks first view, respectively. Inspired by the networks first view that we attribute to White, Boorman, and Breiger (1976)[1], we formulate the multiple-networks stochastic blockmodel (MNSBM), which seeks to separate the observed network into subnetworks of different types and where the problem of inferring structure in each subnetwork becomes easier. We show how this model is specified in a generative Bayesian framework where parameters can be inferred efficiently using Gibbs...
Inferring gene regression networks with model trees
Directory of Open Access Journals (Sweden)
Aguilar-Ruiz Jesus S
2010-10-01
Background: Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes, building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful for determining whether two genes have a strong global similarity, but they do not detect local similarities. Results: We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph of all the relationships among output and input genes is built, taking into account whether each pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: the Saccharomyces cerevisiae and E. coli data sets. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as a control to compare the results with those of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods. Conclusions: REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques to estimate the numerical values for the target genes by linear regression functions. They are very often more precise than linear regression models because they can add just different linear
Information-Theoretic Inference of Large Transcriptional Regulatory Networks
Directory of Open Access Journals (Sweden)
Meyer Patrick
2007-01-01
The paper presents MRNET, an original method for inferring genetic networks from microarray data. The method is based on maximum relevance/minimum redundancy (MRMR), an effective information-theoretic technique for feature selection in supervised learning. The MRMR principle consists in selecting, among the least redundant variables, the ones that have the highest mutual information with the target. MRNET extends this feature selection principle to networks in order to infer gene-dependence relationships from microarray data. The paper assesses MRNET by benchmarking it against RELNET, CLR, and ARACNE, three state-of-the-art information-theoretic methods for large (up to several thousands of genes) network inference. Experimental results on thirty synthetically generated microarray datasets show that MRNET is competitive with these methods.
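The MRMR principle described above can be sketched as a greedy selection on discretized expression profiles. This illustrates only the feature selection step for a single target gene, not MRNET itself; the scoring rule (relevance minus mean redundancy), the function names, and the toy profiles are assumptions made for illustration.

```python
import math
from collections import Counter

def mutual_information(a, b):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())

def mrmr(candidates, target, k):
    """Greedily pick k variables with high relevance and low mean redundancy."""
    relevance = {name: mutual_information(values, target)
                 for name, values in candidates.items()}
    selected, remaining = [], set(candidates)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(mutual_information(candidates[name], candidates[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical discretized profiles: g2 is a near-copy of g1, while g3 is
# independent of g1 but still informative about the target.
candidates = {
    "g1": [0, 0, 0, 0, 1, 1, 1, 1],
    "g2": [0, 0, 0, 1, 1, 1, 1, 1],
    "g3": [0, 1, 0, 1, 0, 1, 0, 1],
}
target = [0, 0, 0, 0, 1, 2, 1, 2]

by_relevance = sorted(candidates,
                      key=lambda g: mutual_information(candidates[g], target),
                      reverse=True)[:2]
by_mrmr = mrmr(candidates, target, 2)
print(by_relevance, by_mrmr)
```

In this toy case ranking by relevance alone keeps the redundant pair g1 and g2, whereas the redundancy term makes the greedy selection prefer g3 as the second variable.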
Causal network inference using biochemical kinetics.
Oates, Chris J; Dondelinger, Frank; Bayani, Nora; Korkola, James; Gray, Joe W; Mukherjee, Sach
2014-09-01
Networks are widely used as structural summaries of biochemical systems. Statistical estimation of networks is usually based on linear or discrete models. However, the dynamics of biochemical systems are generally non-linear, suggesting that suitable non-linear formulations may offer gains with respect to causal network inference and aid in associated prediction problems. We present a general framework for network inference and dynamical prediction using time course data that is rooted in non-linear biochemical kinetics. This is achieved by considering a dynamical system based on a chemical reaction graph with associated kinetic parameters. Both the graph and kinetic parameters are treated as unknown; inference is carried out within a Bayesian framework. This allows prediction of dynamical behavior even when the underlying reaction graph itself is unknown or uncertain. Results, based on (i) data simulated from a mechanistic model of mitogen-activated protein kinase signaling and (ii) phosphoproteomic data from cancer cell lines, demonstrate that non-linear formulations can yield gains in causal network inference and permit dynamical prediction and uncertainty quantification in the challenging setting where the reaction graph is unknown. MATLAB R2014a software is available to download from warwick.ac.uk/chrisoates. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
Wisdom of crowds for robust gene network inference.
Marbach, Daniel; Costello, James C; Küffner, Robert; Vega, Nicole M; Prill, Robert J; Camacho, Diogo M; Allison, Kyle R; Kellis, Manolis; Collins, James J; Stolovitzky, Gustavo
2012-07-15
Reconstructing gene regulatory networks from high-throughput data is a long-standing challenge. Through the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project, we performed a comprehensive blind assessment of over 30 network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae and in silico microarray data. We characterize the performance, data requirements and inherent biases of different inference approaches, and we provide guidelines for algorithm application and development. We observed that no single inference method performs optimally across all data sets. In contrast, integration of predictions from multiple inference methods shows robust and high performance across diverse data sets. We thereby constructed high-confidence networks for E. coli and S. aureus, each comprising ~1,700 transcriptional interactions at a precision of ~50%. We experimentally tested 53 previously unobserved regulatory interactions in E. coli, of which 23 (43%) were supported. Our results establish community-based methods as a powerful and robust tool for the inference of transcriptional gene regulatory networks.
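One simple way to integrate predictions from several inference methods is to average, for every candidate edge, the rank that each method assigns to it. The sketch below assumes that scheme; the abstract itself does not spell out the integration rule, and the function name `average_rank_consensus` and the scores are hypothetical.

```python
import numpy as np

def average_rank_consensus(score_matrix):
    """score_matrix has shape (n_methods, n_edges); higher score = more confident.
    Returns the mean rank of each edge (rank 1 = most confident).
    Ties within a method are broken by edge order, not averaged."""
    scores = np.asarray(score_matrix, dtype=float)
    order = np.argsort(-scores, axis=1, kind="stable")  # best edge first, per method
    ranks = np.empty_like(order)
    rows = np.arange(scores.shape[0])[:, None]
    ranks[rows, order] = np.arange(1, scores.shape[1] + 1)
    return ranks.mean(axis=0)

# Three hypothetical methods scoring the same three candidate edges.
scores = [
    [0.9, 0.1, 0.5],
    [0.8, 0.3, 0.2],
    [0.2, 0.1, 0.7],
]
consensus = average_rank_consensus(scores)
print(consensus)              # lower mean rank = stronger consensus
print(np.argsort(consensus))  # edges ordered from most to least supported
```

Working on ranks rather than raw scores avoids having to put the confidence scales of different methods on a common footing.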
Inferring cell-scale signalling networks via compressive sensing.
Directory of Open Access Journals (Sweden)
Lei Nie
Signalling network inference is a central problem in systems biology. Previous studies investigate this problem by independently inferring local signalling networks and then linking them together via crosstalk. Since a cellular signalling system is in fact indivisible, this reductionistic approach may have an impact on the accuracy of the inference results. Preferably, a cell-scale signalling network should be inferred as a whole. However, the holistic approach suffers from three practical issues: scalability, measurement and overfitting. Here we make this approach feasible based on two key observations: (1) variations of concentrations are sparse due to separations of timescales; (2) several species can be measured together using cross-reactivity. We propose a method, CCELL, for cell-scale signalling network inference from time series generated by immunoprecipitation using Bayesian compressive sensing. A set of benchmark networks with varying numbers of time-variant species is used to demonstrate the effectiveness of our method. Instead of exhaustively measuring all individual species, high accuracy is achieved from relatively few measurements.
Classifying pairs with trees for supervised biological network inference.
Schrynemackers, Marie; Wehenkel, Louis; Babu, M Madan; Geurts, Pierre
2015-08-01
Networks are ubiquitous in biology, and computational approaches have been largely investigated for their inference. In particular, supervised machine learning methods can be used to complete a partially known network by integrating various measurements. Two main supervised frameworks have been proposed: the local approach, which trains a separate model for each network node, and the global approach, which trains a single model over pairs of nodes. Here, we systematically investigate, theoretically and empirically, the exploitation of tree-based ensemble methods in the context of these two approaches for biological network inference. We first formalize the problem of network inference as a classification of pairs, unifying in the process homogeneous and bipartite graphs and discussing two main sampling schemes. We then present the global and the local approaches, extending the latter for the prediction of interactions between two unseen network nodes, and discuss their specializations to tree-based ensemble methods, highlighting their interpretability and drawing links with clustering techniques. Extensive computational experiments are carried out with these methods on various biological networks that clearly highlight that these methods are competitive with existing methods.
Statistical inference via fiducial methods
Salomé, Diemer
1998-01-01
In this thesis the attention is restricted to inductive reasoning using a mathematical probability model. A statistical procedure prescribes, for every theoretically possible set of data, the inference about the unknown of interest. ...
Sparse and compositionally robust inference of microbial ecological networks.
Directory of Open Access Journals (Sweden)
Zachary D Kurtz
2015-05-01
16S ribosomal RNA (rRNA) gene and other environmental sequencing techniques provide snapshots of microbial communities, revealing phylogeny and the abundances of microbial populations across diverse ecosystems. While changes in microbial community structure are demonstrably associated with certain environmental conditions (from metabolic and immunological health in mammals to ecological stability in soils and oceans), identification of underlying mechanisms requires new statistical tools, as these datasets present several technical challenges. First, the abundances of microbial operational taxonomic units (OTUs) from amplicon-based datasets are compositional. Counts are normalized to the total number of counts in the sample. Thus, microbial abundances are not independent, and traditional statistical metrics (e.g., correlation) for the detection of OTU-OTU relationships can lead to spurious results. Secondly, microbial sequencing-based studies typically measure hundreds of OTUs on only tens to hundreds of samples; thus, inference of OTU-OTU association networks is severely under-powered, and additional information (or assumptions) are required for accurate inference. Here, we present SPIEC-EASI (SParse InversE Covariance Estimation for Ecological Association Inference), a statistical method for the inference of microbial ecological networks from amplicon sequencing datasets that addresses both of these issues. SPIEC-EASI combines data transformations developed for compositional data analysis with a graphical model inference framework that assumes the underlying ecological association network is sparse. To reconstruct the network, SPIEC-EASI relies on algorithms for sparse neighborhood and inverse covariance selection. To provide a synthetic benchmark in the absence of an experimentally validated gold-standard network, SPIEC-EASI is accompanied by a set of computational tools to generate OTU count data from a set of diverse underlying network topologies
de Luis Balaguer, Maria Angels; Sozzani, Rosangela
2017-01-01
Gene regulatory network (GRN) models have been shown to predict and represent interactions among sets of genes. Here, we first show the basic steps to implement a simple but computationally efficient algorithm to infer GRNs based on dynamic Bayesian networks (DBNs), and we then explain how to approximate DBN-based GRN models with continuous models. In addition, we show a MATLAB implementation of the key steps of this method, which we use to infer an Arabidopsis root GRN.
Functional network inference of the suprachiasmatic nucleus
Energy Technology Data Exchange (ETDEWEB)
Abel, John H.; Meeker, Kirsten; Granados-Fuentes, Daniel; St. John, Peter C.; Wang, Thomas J.; Bales, Benjamin B.; Doyle, Francis J.; Herzog, Erik D.; Petzold, Linda R.
2016-04-04
In the mammalian suprachiasmatic nucleus (SCN), noisy cellular oscillators communicate within a neuronal network to generate precise system-wide circadian rhythms. Although the intracellular genetic oscillator and intercellular biochemical coupling mechanisms have been examined previously, the network topology driving synchronization of the SCN has not been elucidated. This network has been particularly challenging to probe, due to its oscillatory components and slow coupling timescale. In this work, we investigated the SCN network at a single-cell resolution through a chemically induced desynchronization. We then inferred functional connections in the SCN by applying the maximal information coefficient statistic to bioluminescence reporter data from individual neurons while they resynchronized their circadian cycling. Our results demonstrate that the functional network of circadian cells associated with resynchronization has small-world characteristics, with a node degree distribution that is exponential. We show that hubs of this small-world network are preferentially located in the central SCN, with sparsely connected shells surrounding these cores. Finally, we used two computational models of circadian neurons to validate our predictions of network structure.
Inference of causality in epidemics on temporal contact networks
Braunstein, Alfredo; Ingrosso, Alessandro
2016-06-01
Investigating the past history of an epidemic outbreak is a paramount problem in epidemiology. Based on observations about the state of individuals, on the knowledge of the network of contacts and on a mathematical model for the epidemic process, the problem consists in describing some features of the posterior distribution of unobserved past events, such as the source, potential transmissions, and undetected positive cases. Several methods have been proposed for the study of these inference problems on discrete-time, synchronous epidemic models on networks, including naive Bayes, centrality measures, accelerated Monte-Carlo approaches and Belief Propagation. However, most traced real networks consist of short-time contacts on continuous time. A possibility that has been adopted is to discretize the time line into identical intervals, a method that becomes more and more precise as the length of the intervals vanishes. Unfortunately, the computational time of the inference methods increases with the number of intervals, often making a sufficiently precise inference procedure impractical. We show here an extension of the Belief Propagation method that is able to deal with a model of continuous-time events, without resorting to time discretization. We also investigate the effect of time discretization on the quality of the inference.
Adaptive moment closure for parameter inference of biochemical reaction networks.
Schilling, Christian; Bogomolov, Sergiy; Henzinger, Thomas A; Podelski, Andreas; Ruess, Jakob
2016-11-01
Continuous-time Markov chain (CTMC) models have become a central tool for understanding the dynamics of complex reaction networks and the importance of stochasticity in the underlying biochemical processes. When such models are employed to answer questions in applications, in order to ensure that the model provides a sufficiently accurate representation of the real system, it is of vital importance that the model parameters are inferred from real measured data. This, however, is often a formidable task and all of the existing methods fail in one case or the other, usually because the underlying CTMC model is high-dimensional and computationally difficult to analyze. The parameter inference methods that tend to scale best in the dimension of the CTMC are based on so-called moment closure approximations. However, there exists a large number of different moment closure approximations and it is typically hard to say a priori which of the approximations is the most suitable for the inference procedure. Here, we propose a moment-based parameter inference method that automatically chooses the most appropriate moment closure method. Accordingly, contrary to existing methods, the user is not required to be experienced in moment closure techniques. In addition to that, our method adaptively changes the approximation during the parameter inference to ensure that always the best approximation is used, even in cases where different approximations are best in different regions of the parameter space. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Inferring spatiotemporal network patterns from intracranial EEG data.
Ossadtchi, A; Greenblatt, R E; Towle, V L; Kohrman, M H; Kamada, K
2010-06-01
The characterization of spatial network dynamics is desirable for a better understanding of seizure physiology. The goal of this work is to develop a computational method for identifying transient spatial patterns from intracranial electroencephalographic (iEEG) data. Starting with bivariate synchrony measures, such as phase correlation, a two-step clustering procedure is used to identify statistically significant spatial network patterns, whose temporal evolution can be inferred. We refer to this as the composite synchrony profile (CSP) method. The CSP method was verified with simulated data and evaluated using ictal and interictal recordings from three patients with intractable epilepsy. Application of the CSP method to these clinical iEEG datasets revealed a set of distinct CSPs with topographies consistent with medial temporal/limbic and superior parietal/medial frontal networks thought to be involved in the seizure generation process. By combining relatively straightforward multivariate signal processing techniques, such as phase synchrony, with clustering and statistical hypothesis testing, the methods we describe may prove useful for network definition and identification. The network patterns we observe using the CSP method cannot be inferred from direct visual inspection of the raw time series data, nor are they apparent in voltage-based topographic map sequences. Copyright 2010 International Federation of Clinical Neurophysiology. All rights reserved.
Probabilistic logic networks a comprehensive framework for uncertain inference
Goertzel, Ben; Goertzel, Izabela Freire; Heljakka, Ari
2008-01-01
This comprehensive book describes Probabilistic Logic Networks (PLN), a novel conceptual, mathematical and computational approach to uncertain inference. A broad scope of reasoning types are considered.
fastBMA: scalable network inference and transitive reduction.
Hung, Ling-Hong; Shi, Kaiyuan; Wu, Migao; Young, William Chad; Raftery, Adrian E; Yeung, Ka Yee
2017-10-01
Inferring genetic networks from genome-wide expression data is extremely demanding computationally. We have developed fastBMA, a distributed, parallel, and scalable implementation of Bayesian model averaging (BMA) for this purpose. fastBMA also includes a computationally efficient module for eliminating redundant indirect edges in the network by mapping the transitive reduction to an easily solved shortest-path problem. We evaluated the performance of fastBMA on synthetic data and experimental genome-wide time series yeast and human datasets. When using a single CPU core, fastBMA is up to 100 times faster than the next fastest method, LASSO, with increased accuracy. It is a memory-efficient, parallel, and distributed application that scales to human genome-wide expression data. A 10 000-gene regulation network can be obtained in a matter of hours using a 32-core cloud cluster (2 nodes of 16 cores). fastBMA is a significant improvement over its predecessor ScanBMA. It is more accurate and orders of magnitude faster than other fast network inference methods such as the one based on LASSO. The improved scalability allows it to calculate networks from genome scale data in a reasonable time frame. The transitive reduction method can improve accuracy in denser networks. fastBMA is available as code (M.I.T. license) from GitHub (https://github.com/lhhunghimself/fastBMA), as part of the updated networkBMA Bioconductor package (https://www.bioconductor.org/packages/release/bioc/html/networkBMA.html) and as ready-to-deploy Docker images (https://hub.docker.com/r/biodepot/fastbma/). © The Authors 2017. Published by Oxford University Press.
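The idea of casting the removal of indirect edges as a shortest-path problem can be sketched as follows. This is an illustration under stated assumptions, not the fastBMA code: edge confidences are assumed to be probabilities, each is converted to a distance of -log(p), and a direct edge is dropped when some indirect route is strictly shorter. The function names and the toy network are hypothetical.

```python
import heapq
import math

def shortest_path_excluding(graph, source, target, skip_edge):
    """Dijkstra distance from source to target, ignoring the edge skip_edge=(u, v)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            if (u, v) == skip_edge:
                continue
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf

def prune_indirect_edges(edge_probs):
    """edge_probs maps (u, v) to a probability in (0, 1].
    An edge is kept unless an indirect path has a smaller total -log(p) distance.
    Every edge is tested against the ORIGINAL graph, not a progressively pruned one."""
    graph = {}
    for (u, v), p in edge_probs.items():
        graph.setdefault(u, {})[v] = -math.log(p)
    kept = {}
    for (u, v), p in edge_probs.items():
        indirect = shortest_path_excluding(graph, u, v, (u, v))
        if indirect >= -math.log(p):
            kept[(u, v)] = p
    return kept

# Hypothetical network: A->C is weaker than the chain A->B->C.
edges = {("A", "B"): 0.9, ("B", "C"): 0.9, ("A", "C"): 0.5}
print(sorted(prune_indirect_edges(edges)))
```

With -log(p) distances, a shorter path corresponds to a larger product of edge probabilities, so the weak direct edge A->C is removed because the chain through B explains it better.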
MIDER: network inference with mutual information distance and entropy reduction.
Directory of Open Access Journals (Sweden)
Alejandro F Villaverde
The prediction of links among variables from a given dataset is a task referred to as network inference or reverse engineering. It is an open problem in bioinformatics and systems biology, as well as in other areas of science. Information theory, which uses concepts such as mutual information, provides a rigorous framework for addressing it. While a number of information-theoretic methods are already available, most of them focus on a particular type of problem, introducing assumptions that limit their generality. Furthermore, many of these methods lack a publicly available implementation. Here we present MIDER, a method for inferring network structures with information theoretic concepts. It consists of two steps: first, it provides a representation of the network in which the distance among nodes indicates their statistical closeness. Second, it refines the prediction of the existing links to distinguish between direct and indirect interactions and to assign directionality. The method accepts as input time-series data related to some quantitative features of the network nodes (such as e.g. concentrations, if the nodes are chemical species). It takes into account time delays between variables, and allows choosing among several definitions and normalizations of mutual information. It is general purpose: it may be applied to any type of network, cellular or otherwise. A Matlab implementation including source code and data is freely available (http://www.iim.csic.es/~gingproc/mider.html). The performance of MIDER has been evaluated on seven different benchmark problems that cover the main types of cellular networks, including metabolic, gene regulatory, and signaling. Comparisons with state of the art information-theoretic methods have demonstrated the competitive performance of MIDER, as well as its versatility. Its use does not demand any a priori knowledge from the user; the default settings and the adaptive nature of the method provide good
MIDER: network inference with mutual information distance and entropy reduction.
Villaverde, Alejandro F; Ross, John; Morán, Federico; Banga, Julio R
2014-01-01
The prediction of links among variables from a given dataset is a task referred to as network inference or reverse engineering. It is an open problem in bioinformatics and systems biology, as well as in other areas of science. Information theory, which uses concepts such as mutual information, provides a rigorous framework for addressing it. While a number of information-theoretic methods are already available, most of them focus on a particular type of problem, introducing assumptions that limit their generality. Furthermore, many of these methods lack a publicly available implementation. Here we present MIDER, a method for inferring network structures with information theoretic concepts. It consists of two steps: first, it provides a representation of the network in which the distance among nodes indicates their statistical closeness. Second, it refines the prediction of the existing links to distinguish between direct and indirect interactions and to assign directionality. The method accepts as input time-series data related to some quantitative features of the network nodes (such as e.g. concentrations, if the nodes are chemical species). It takes into account time delays between variables, and allows choosing among several definitions and normalizations of mutual information. It is general purpose: it may be applied to any type of network, cellular or otherwise. A Matlab implementation including source code and data is freely available (http://www.iim.csic.es/~gingproc/mider.html). The performance of MIDER has been evaluated on seven different benchmark problems that cover the main types of cellular networks, including metabolic, gene regulatory, and signaling. Comparisons with state of the art information-theoretic methods have demonstrated the competitive performance of MIDER, as well as its versatility. Its use does not demand any a priori knowledge from the user; the default settings and the adaptive nature of the method provide good results for a wide
Inferring Directed Road Networks from GPS Traces by Track Alignment
Directory of Open Access Journals (Sweden)
Xingzhe Xie
2015-11-01
This paper proposes a method to infer road networks from GPS traces. These networks include intersections between roads, the connectivity between the intersections and the possible traffic directions between directly-connected intersections. These intersections are localized by detecting and clustering turning points, which are locations where the moving direction changes on GPS traces. We infer the structure of road networks by segmenting all of the GPS traces to identify these intersections. We can then form both a connectivity matrix of the intersections and a small representative GPS track for each road segment. The road segment between each pair of directly-connected intersections is represented using a series of geographical locations, which are averaged from all of the tracks on this road segment by aligning them using the dynamic time warping (DTW) algorithm. Our contribution is two-fold. First, we detect potential intersections by clustering the turning points on the GPS traces. Second, we infer the geometry of the road segments between intersections by aligning GPS tracks point by point using a “stretch and then compress” strategy based on the DTW algorithm. This approach not only allows road estimation by averaging the aligned tracks, but also a deeper statistical analysis based on the individual track’s time alignment, for example the variance of speed along a road segment.
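The point-by-point alignment step rests on dynamic time warping. The sketch below is the textbook DTW recurrence with backtracking, not the paper's "stretch and then compress" variant; the function name `dtw` and the two toy tracks are assumptions for illustration.

```python
import numpy as np

def dtw(a, b):
    """Align two point sequences; returns (total cost, warping path).
    The path is a list of index pairs (i, j) matching a[i] with b[j]."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # Euclidean point distance
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack from the end to recover which points were matched.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return cost[n, m], path

# Two hypothetical tracks along the same road, sampled at different rates.
track_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
track_b = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (2.0, 0.0)]
total, path = dtw(track_a, track_b)
print(total, path)
```

Once the path is known, the matched points can be averaged pair by pair to obtain a representative track, which is the use the abstract describes. Planar coordinates are assumed here; raw latitude and longitude would need a suitable distance function.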
Composite likelihood method for inferring local pedigrees
DEFF Research Database (Denmark)
Ko, Amy; Nielsen, Rasmus
2017-01-01
Pedigrees contain information about the genealogical relationships among individuals and are of fundamental importance in many areas of genetic studies. However, pedigrees are often unknown and must be inferred from genetic data. Despite the importance of pedigree inference, existing methods are limited to inferring only close relationships or analyzing a small number of individuals or loci. We present a simulated annealing method for estimating pedigrees in large samples of otherwise seemingly unrelated individuals using genome-wide SNP data. The method supports complex pedigree structures such as polygamous families, multi-generational families, and pedigrees in which many of the member individuals are missing. Computational speed is greatly enhanced by the use of a composite likelihood function which approximates the full likelihood. We validate our method on simulated data and show that it can...
Goyal, Ravi; De Gruttola, Victor
2018-01-30
Analysis of sexual history data intended to describe sexual networks presents many challenges arising from the fact that most surveys collect information on only a very small fraction of the population of interest. In addition, partners are rarely identified and responses are subject to reporting biases. Typically, each network statistic of interest, such as mean number of sexual partners for men or women, is estimated independently of other network statistics. There is, however, a complex relationship among network statistics; and knowledge of these relationships can aid in addressing concerns mentioned earlier. We develop a novel method that constrains a posterior predictive distribution of a collection of network statistics in order to leverage the relationships among network statistics in making inference about network properties of interest. The method ensures that inference on network properties is compatible with an actual network. Through extensive simulation studies, we also demonstrate that use of this method can improve estimates in settings where there is uncertainty that arises both from sampling and from systematic reporting bias compared with currently available approaches to estimation. To illustrate the method, we apply it to estimate network statistics using data from the Chicago Health and Social Life Survey. Copyright © 2017 John Wiley & Sons, Ltd.
Order statistics & inference estimation methods
Balakrishnan, N
1991-01-01
The literature on order statistics and inference is quite extensive and covers a large number of fields, but most of it is dispersed throughout numerous publications. This volume is the consolidation of the most important results and places an emphasis on estimation. Both theoretical and computational procedures are presented to meet the needs of researchers, professionals, and students. The methods of estimation discussed are well-illustrated with numerous practical examples from both the physical and life sciences, including sociology, psychology, and electrical and chemical engineering. A co
Inferring correlation networks from genomic survey data.
Directory of Open Access Journals (Sweden)
Jonathan Friedman
High-throughput sequencing based techniques, such as 16S rRNA gene profiling, have the potential to elucidate the complex inner workings of natural microbial communities - be they from the world's oceans or the human gut. A key step in exploring such data is the identification of dependencies between members of these communities, which is commonly achieved by correlation analysis. However, it has been known since the days of Karl Pearson that the analysis of the type of data generated by such techniques (referred to as compositional data) can produce unreliable results since the observed data take the form of relative fractions of genes or species, rather than their absolute abundances. Using simulated and real data from the Human Microbiome Project, we show that such compositional effects can be widespread and severe: in some real data sets many of the correlations among taxa can be artifactual, and true correlations may even appear with opposite sign. Additionally, we show that community diversity is the key factor that modulates the acuteness of such compositional effects, and develop a new approach, called SparCC (available at https://bitbucket.org/yonatanf/sparcc), which is capable of estimating correlation values from compositional data. To illustrate a potential application of SparCC, we infer a rich ecological network connecting hundreds of interacting species across 18 sites on the human body. Using the SparCC network as a reference, we estimated that the standard approach yields 3 spurious species-species interactions for each true interaction and misses 60% of the true interactions in the human microbiome data, and, as predicted, most of the erroneous links are found in the samples with the lowest diversity.
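The compositional effect is easiest to see in the extreme case of a community with only two taxa. The sketch below is a constructed toy example, not SparCC and not data from the paper: two absolute abundances are drawn independently, then converted to relative fractions, and the correlation is computed before and after. All variable names and distribution parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical absolute abundances of two taxa, drawn independently of each other.
absolute = rng.lognormal(mean=5.0, sigma=0.5, size=(1000, 2))

# Sequencing only reports relative fractions, so each sample (row) sums to 1.
fractions = absolute / absolute.sum(axis=1, keepdims=True)

corr_absolute = np.corrcoef(absolute[:, 0], absolute[:, 1])[0, 1]
corr_fractions = np.corrcoef(fractions[:, 0], fractions[:, 1])[0, 1]
print(f"absolute: {corr_absolute:+.3f}  fractions: {corr_fractions:+.3f}")
```

Because the two fractions must sum to one, the second is exactly one minus the first, so their correlation is -1 by construction even though the underlying abundances are independent. With more taxa the induced bias is weaker, which is consistent with the abstract's observation that low-diversity samples suffer most.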
Inferring the role of transcription factors in regulatory networks
Directory of Open Access Journals (Sweden)
Le Borgne Michel
2008-05-01
Background: Expression profiles obtained from multiple perturbation experiments are increasingly used to reconstruct transcriptional regulatory networks, from well studied, simple organisms up to higher eukaryotes. Admittedly, a key ingredient in developing a reconstruction method is its ability to integrate heterogeneous sources of information, as well as to comply with practical observability issues: measurements can be scarce or noisy. In this work, we show how to combine a network of genetic regulations with a set of expression profiles, in order to infer the functional effect of the regulations, as inducer or repressor. Our approach is based on a consistency rule between a network and the signs of variation given by expression arrays. Results: We evaluate our approach in several settings of increasing complexity. First, we generate artificial expression data on a transcriptional network of E. coli extracted from the literature (1529 nodes and 3802 edges), and we estimate that 30% of the regulations can be annotated with about 30 profiles. We additionally prove that at most 40.8% of the network can be inferred using our approach. Second, we use this network in order to validate the predictions obtained with a compendium of real expression profiles. We describe a filtering algorithm that generates particularly reliable predictions. Finally, we apply our inference approach to S. cerevisiae transcriptional network (2419 nodes and 4344 interactions), by combining ChIP-chip data and 15 expression profiles. We are able to detect and isolate inconsistencies between the expression profiles and a significant portion of the model (15% of all the interactions). In addition, we report predictions for 14.5% of all interactions. Conclusion: Our approach does not require accurate expression levels nor time series. Nevertheless, we show on both data, real and artificial, that a relatively small number of perturbation experiments are enough to determine
Netter: re-ranking gene network inference predictions using structural network properties.
Ruyssinck, Joeri; Demeester, Piet; Dhaene, Tom; Saeys, Yvan
2016-02-09
Many algorithms have been developed to infer the topology of gene regulatory networks from gene expression data. These methods typically produce a ranking of links between genes with associated confidence scores, after which a certain threshold is chosen to produce the inferred topology. However, the structural properties of the predicted network do not resemble those typical for a gene regulatory network, as most algorithms only take into account connections found in the data and do not include known graph properties in their inference process. This lowers the prediction accuracy of these methods, limiting their usability in practice. We propose a post-processing algorithm which is applicable to any confidence ranking of regulatory interactions obtained from a network inference method which can use, inter alia, graphlets and several graph-invariant properties to re-rank the links into a more accurate prediction. To demonstrate the potential of our approach, we re-rank predictions of six different state-of-the-art algorithms using three simple network properties as optimization criteria and show that Netter can improve the predictions made on both artificially generated data as well as the DREAM4 and DREAM5 benchmarks. Additionally, the DREAM5 E. coli community prediction inferred from real expression data is further improved. Furthermore, Netter compares favorably to other post-processing algorithms and is not restricted to correlation-like predictions. Lastly, we demonstrate that the performance increase is robust for a wide range of parameter settings. Netter is available at http://bioinformatics.intec.ugent.be. Network inference from high-throughput data is a long-standing challenge. In this work, we present Netter, which can further refine network predictions based on a set of user-defined graph properties. Netter is a flexible system which can be applied in unison with any method producing a ranking from omics data. It can be tailored to specific prior
Ecological Network Inference From Long-Term Presence-Absence Data.
Sander, Elizabeth L; Wootton, J Timothy; Allesina, Stefano
2017-08-02
Ecological communities are characterized by complex networks of trophic and nontrophic interactions, which shape the dynamics of the community. Machine learning and correlational methods are increasingly popular for inferring networks from co-occurrence and time series data, particularly in microbial systems. In this study, we test the suitability of these methods for inferring ecological interactions by constructing networks using Dynamic Bayesian Networks, Lasso regression, and Pearson's correlation coefficient, then comparing the model networks to empirical trophic and nontrophic webs in two ecological systems. We find that although each model significantly replicates the structure of at least one empirical network, no model significantly predicts network structure in both systems, and no model is clearly superior to the others. We also find that networks inferred for the Tatoosh intertidal match the nontrophic network much more closely than the trophic one, possibly due to the challenges of identifying trophic interactions from presence-absence data. Our findings suggest that although these methods hold some promise for ecological network inference, presence-absence data does not provide enough signal for models to consistently identify interactions, and networks inferred from these data should be interpreted with caution.
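As a rough illustration of the simplest of the three approaches named above, the following sketch builds a correlation network from presence-absence time series by thresholding Pearson's correlation coefficient. The species labels, the toy data, and the threshold of 0.5 are assumptions made for this example; they are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A species that is always present (or always absent) carries no signal.
        return 0.0
    return cov / sqrt(var_a * var_b)


def correlation_network(presence, threshold=0.5):
    """Return {(species_1, species_2): r} for pairs with |r| >= threshold."""
    edges = {}
    for s, t in combinations(sorted(presence), 2):
        r = pearson(presence[s], presence[t])
        if abs(r) >= threshold:
            edges[(s, t)] = r
    return edges


# Hypothetical presence (1) / absence (0) records over eight surveys.
presence = {
    "sp_a": [1, 1, 0, 0, 1, 1, 0, 0],
    "sp_b": [1, 1, 0, 0, 1, 1, 0, 0],
    "sp_c": [0, 0, 1, 1, 0, 0, 1, 1],
    "sp_d": [1, 0, 1, 0, 1, 0, 1, 0],
}
edges = correlation_network(presence)
for pair, r in sorted(edges.items()):
    print(pair, round(r, 3))
```

A correlation edge only says that two species tend to co-occur (or to exclude each other); it does not say whether the interaction is trophic or nontrophic, which is consistent with the caution the abstract recommends.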
Supervised dictionary learning for inferring concurrent brain networks.
Zhao, Shijie; Han, Junwei; Lv, Jinglei; Jiang, Xi; Hu, Xintao; Zhao, Yu; Ge, Bao; Guo, Lei; Liu, Tianming
2015-10-01
Task-based fMRI (tfMRI) has been widely used to explore functional brain networks via predefined stimulus paradigms in the fMRI scan. Traditionally, the general linear model (GLM) has been a dominant approach to detect task-evoked networks. However, GLM focuses on task-evoked or event-evoked brain responses and possibly ignores the intrinsic brain functions. In comparison, dictionary learning and sparse coding methods have attracted much attention recently, and these methods have shown the promise of automatically and systematically decomposing fMRI signals into meaningful task-evoked and intrinsic concurrent networks. Nevertheless, two notable limitations of current data-driven dictionary learning methods are that the prior knowledge of the task paradigm is not sufficiently utilized and that the establishment of correspondences among dictionary atoms in different brains has been challenging. In this paper, we propose a novel supervised dictionary learning and sparse coding method for inferring functional networks from tfMRI data, which combines the advantages of model-driven and data-driven methods. The basic idea is to fix the task stimulus curves as predefined model-driven dictionary atoms and only optimize the other portion of data-driven dictionary atoms. Application of this novel methodology to the publicly available human connectome project (HCP) tfMRI datasets has achieved promising results.
Bayesian Inference Methods for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand
2013-01-01
This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development...... of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation...... complex prior representation achieve improved sparsity representations in low signal-to-noise ratio as opposed to state-of-the-art sparse estimators. This result is of particular importance for the applicability of the algorithms in the field of channel estimation. We then derive various iterative...
Reasoning about causal relationships: Inferences on causal networks.
Rottman, Benjamin Margolin; Hastie, Reid
2014-01-01
Over the last decade, a normative framework for making causal inferences, Bayesian Probabilistic Causal Networks, has come to dominate psychological studies of inference based on causal relationships. The following causal networks-[X→Y→Z, X←Y→Z, X→Y←Z]-supply answers for questions like, "Suppose both X and Y occur, what is the probability Z occurs?" or "Suppose you intervene and make Y occur, what is the probability Z occurs?" In this review, we provide a tutorial for how normatively to calculate these inferences. Then, we systematically detail the results of behavioral studies comparing human qualitative and quantitative judgments to the normative calculations for many network structures and for several types of inferences on those networks. Overall, when the normative calculations imply that an inference should increase, judgments usually go up; when calculations imply a decrease, judgments usually go down. However, 2 systematic deviations appear. First, people's inferences violate the Markov assumption. For example, when inferring Z from the structure X→Y→Z, people think that X is relevant even when Y completely mediates the relationship between X and Z. Second, even when people's inferences are directionally consistent with the normative calculations, they are often not as sensitive to the parameters and the structure of the network as they should be. We conclude with a discussion of productive directions for future research. (PsycINFO Database Record (c) 2013 APA, all rights reserved).
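To make the normative calculation concrete, here is a minimal sketch for the chain X→Y→Z with binary variables, computed by brute-force enumeration of the joint distribution. The probability values are hypothetical and chosen only for illustration. The sketch shows the Markov property the review refers to: once Y is known, additionally knowing X does not change the probability of Z.

```python
from itertools import product

# Hypothetical parameters for the chain X -> Y -> Z (all variables binary).
P_X = 0.3                        # P(X=1)
P_Y_GIVEN_X = {0: 0.2, 1: 0.8}   # P(Y=1 | X=x)
P_Z_GIVEN_Y = {0: 0.1, 1: 0.7}   # P(Z=1 | Y=y)


def bern(p, value):
    """Probability that a Bernoulli(p) variable takes the given value."""
    return p if value == 1 else 1.0 - p


def joint(x, y, z):
    """Joint probability from the chain factorisation P(x) P(y|x) P(z|y)."""
    return bern(P_X, x) * bern(P_Y_GIVEN_X[x], y) * bern(P_Z_GIVEN_Y[y], z)


def prob(query, evidence):
    """P(query | evidence); both arguments are dicts such as {'Z': 1}."""
    numerator = denominator = 0.0
    for x, y, z in product((0, 1), repeat=3):
        world = {"X": x, "Y": y, "Z": z}
        if all(world[k] == v for k, v in evidence.items()):
            p = joint(x, y, z)
            denominator += p
            if all(world[k] == v for k, v in query.items()):
                numerator += p
    return numerator / denominator


p_z_given_xy = prob({"Z": 1}, {"X": 1, "Y": 1})  # "both X and Y occur"
p_z_given_y = prob({"Z": 1}, {"Y": 1})           # Y alone
p_z_given_x = prob({"Z": 1}, {"X": 1})           # X alone, Y unknown
print(round(p_z_given_xy, 3), round(p_z_given_y, 3), round(p_z_given_x, 3))
```

With these numbers the first two quantities coincide (Y screens off X), while knowing X alone gives a different value because Y is then uncertain. The first deviation reported in the review is that people's judgments treat the first two cases as different.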
Tracking cohesive subgroups over time in inferred social networks
Chin, Alvin; Chignell, Mark; Wang, Hao
2010-04-01
As a first step in the development of community trackers for large-scale online interaction, this paper shows how cohesive subgroup analysis using the Social Cohesion Analysis of Networks (SCAN; Chin and Chignell 2008) and Data-Intensive Socially Similar Evolving Community Tracker (DISSECT; Chin and Chignell 2010) methods can be applied to the problem of identifying cohesive subgroups and tracking them over time. Three case studies are reported, and the findings are used to evaluate how well the SCAN and DISSECT methods work for different types of data. In the largest of the case studies, variations in temporal cohesiveness are identified across a set of subgroups extracted from the inferred social network. Further modifications to the DISSECT methodology are suggested based on the results obtained. The paper concludes with recommendations concerning further research that would be beneficial in addressing the community tracking problem for online data.
Dimensionality reduction and network inference for sea surface temperature data
Falasca, Fabrizio; Bracco, Annalisa; Nenes, Athanasios; Dovrolis, Constantine; Fountalis, Ilias
2017-04-01
Earth's climate is a complex dynamical system. The underlying components of the system interact with each other (in a linear or nonlinear way) on several spatial and time scales. Network science provides a set of tools to study the structure and dynamics of such systems. Here we propose an application of a novel network inference method, δ-MAPS, to investigate sea surface temperature (SST) fields in reanalyses and models. δ-MAPS first identifies the underlying components (domains) of the system, modeling them as spatially contiguous, potentially overlapping regions of highly correlated temporal activity, and then infers the weighted and potentially lagged interactions between them. The SST network is represented as a weighted and directed graph. Edge direction captures the temporal ordering of events, while edge weights capture the magnitude of the interaction between the domains. We focus on two reanalysis datasets (HadISST and COBE) and on a dozen runs of the CESM model (extracted from the so-called large ensemble). The networks are built using 45 years of data every 3 years for the total dataset temporal coverage (from 1871 to 2015 for HadISST, from 1891 to 2015 for COBE and from 1920 to 2100 for CESM members). We then explore similarities and differences between reanalyses and models in terms of the domains identified, the networks inferred and their time evolution. The spatial extent and shape of the identified domains are consistent between observations and models. According to our analysis the largest SST domain always corresponds to the El Niño Southern Oscillation (ENSO) while most of the other domains correspond to known climate modes. However, the network structure shows significant differences. For example, the unique role played by the South Tropical Atlantic in the observed network is not captured by any model run. Regarding the time evolution of the system we focus on the strength of ENSO: while we observe a positive trend for observations and
Using Network Methodology to Infer Population Substructure.
Directory of Open Access Journals (Sweden)
Dmitry Prokopenko
One of the main caveats of association studies is possible bias due to population stratification. Existing methods rely on model-based approaches like structure and ADMIXTURE or on principal component analysis like EIGENSTRAT. Here we provide a novel visualization technique and describe the problem of population substructure from a graph-theoretical point of view. We group the sequenced individuals into triads, which depict the relational structure, on the basis of a predefined pairwise similarity measure. We then merge the triads into a network and apply community detection algorithms in order to identify homogeneous subgroups or communities, which can further be incorporated as covariates into logistic regression. We apply our method to populations from different continents in the 1000 Genomes Project and evaluate the type 1 error based on the empirical p-values. The application to 1000 Genomes data suggests that the network approach provides a very fine resolution of the underlying ancestral population structure. In addition, we show in simulations that, in the presence of discrete population structures, our approach maintains the type 1 error more precisely than existing approaches.
Comparison of evolutionary algorithms in gene regulatory network model inference.
LENUS (Irish Health Repository)
2010-01-01
ABSTRACT: BACKGROUND: The evolution of high throughput technologies that measure gene expression levels has created a data base for inferring GRNs (a process also known as reverse engineering of GRNs). However, the nature of these data has made this process very difficult. At the moment, several methods of discovering qualitative causal relationships between genes with high accuracy from microarray data exist, but large scale quantitative analysis on real biological datasets cannot be performed, to date, as existing approaches are not suitable for real microarray data which are noisy and insufficient. RESULTS: This paper performs an analysis of several existing evolutionary algorithms for quantitative gene regulatory network modelling. The aim is to present the techniques used and offer a comprehensive comparison of approaches, under a common framework. Algorithms are applied to both synthetic and real gene expression data from DNA microarrays, and ability to reproduce biological behaviour, scalability and robustness to noise are assessed and compared. CONCLUSIONS: Presented is a comparison framework for assessment of evolutionary algorithms, used to infer gene regulatory networks. Promising methods are identified and a platform for development of appropriate model formalisms is established.
gCoda: Conditional Dependence Network Inference for Compositional Data.
Fang, Huaying; Huang, Chengcheng; Zhao, Hongyu; Deng, Minghua
2017-07-01
The increasing quality and the reducing cost of high-throughput sequencing technologies for 16S rRNA gene profiling enable researchers to directly analyze microbe communities in natural environments. The direct interactions among microbial species of a given ecological system can help us understand the principles of community assembly and maintenance under various conditions. Compositionality and dimensionality of microbiome data are two main challenges for inferring the direct interaction network of microbes. In this article, we use the logistic normal distribution to model the background mechanism of microbiome data, which can appropriately deal with the compositional nature of the data. The direct interaction relationships are then modeled via the conditional dependence network under this logistic normal assumption. We then propose a novel penalized maximum likelihood method called gCoda to estimate the sparse structure of inverse covariance for latent normal variables to address the high dimensionality of the microbiome data. An effective Majorization-Minimization algorithm is proposed to solve the optimization problem in gCoda. Simulation studies show that gCoda outperforms existing methods (e.g., SPIEC-EASI) in edge recovery of inverse covariance for compositional data under a variety of scenarios. gCoda also performs better than SPIEC-EASI for inferring direct microbial interactions of mouse skin microbiome data.
Inferring dynamic gene networks under varying conditions for transcriptomic network comparison.
Shimamura, Teppei; Imoto, Seiya; Yamaguchi, Rui; Nagasaki, Masao; Miyano, Satoru
2010-04-15
Elucidating the differences between cellular responses to various biological conditions or external stimuli is an important challenge in systems biology. Many approaches have been developed to reverse engineer a cellular system, called gene network, from time series microarray data in order to understand a transcriptomic response under a condition of interest. Comparative topological analysis has also been applied based on the gene networks inferred independently from each of the multiple time series datasets under varying conditions to find critical differences between these networks. However, these comparisons often lead to misleading results, because each network contains considerable noise due to the limited length of the time series. We propose an integrated approach for inferring multiple gene networks from time series expression data under varying conditions. To the best of our knowledge, our approach is the first reverse-engineering method that is intended for transcriptomic network comparison between varying conditions. Furthermore, we propose a state-of-the-art parameter estimation method, relevance-weighted recursive elastic net, for providing higher precision and recall than existing reverse-engineering methods. We analyze experimental data of MCF-7 human breast cancer cells stimulated by epidermal growth factor or heregulin with several doses and provide novel biological hypotheses through network comparison. The software NETCOMP is available at http://bonsai.ims.u-tokyo.ac.jp/~shima/NETCOMP/.
Gene-network inference by message passing
Braunstein, A.; Pagnani, A.; Weigt, M.; Zecchina, R.
2008-01-01
The inference of gene-regulatory processes from gene-expression data belongs to the major challenges of computational systems biology. Here we address the problem from a statistical-physics perspective and develop a message-passing algorithm which is able to infer sparse, directed and combinatorial regulatory mechanisms. Using the replica technique, the algorithmic performance can be characterized analytically for artificially generated data. The algorithm is applied to genome-wide expression data of baker's yeast under various environmental conditions. We find clear cases of combinatorial control, and enrichment in common functional annotations of regulated genes and their regulators.
Inferring Social Relations from Online and Communication Networks
Nasim, Mehwish
2016-01-01
In this work, we analyzed the interplay between social relations in the form of friendship ties, attributes and interaction in online social networks. In this context we analyzed the composition of social circles in online social networks and showed that social circles are homophilous with respect to at least one node attribute. We showed that using the right combination of network and interaction features, links can be inferred in online covert networks. We also analyzed longitudinal dyadic interac...
Explaining Inference on a Population of Independent Agents Using Bayesian Networks
Sutovsky, Peter
2013-01-01
The main goal of this research is to design, implement, and evaluate a novel explanation method, the hierarchical explanation method (HEM), for explaining Bayesian network (BN) inference when the network is modeling a population of conditionally independent agents, each of which is modeled as a subnetwork. For example, consider disease-outbreak…
Directory of Open Access Journals (Sweden)
Moslem Yousefi
2015-12-01
Accurate wind speed forecasting plays a vital role in the efficient utilization of wind farms. Wind forecasting can be performed for long or short time horizons. Given the volatile nature of wind and its dependence on many geographical parameters, it is difficult for traditional methods to provide a reliable forecast of wind speed time series. In this study, an attempt is made to establish an efficient adaptive network-based fuzzy inference system (ANFIS) for short-term wind speed forecasting. Using the data sets available in the literature, the ANFIS network is constructed and tested, and the results are compared with those of a regular neural network that has been used to forecast the same data set in previous studies. To avoid a trial-and-error process for selecting the ANFIS input data, the results of the autocorrelation function (ACF) and partial autocorrelation function (PACF) on the historical wind speed data are employed. The available data set is divided into two parts: 50% for training and 50% for testing and validation. The testing part of the data set is used only for assessing the performance of the neural network, which guarantees that only unseen data are used to evaluate the forecasting performance of the network. Validation data, on the other hand, can be used for parameter setting of the network if required. The results indicate that ANFIS does not outperform the ANN in short-term wind speed forecasting, though its results are competitive. The two methods are hybridized, though simply by weighting, and the hybrid method shows a slight improvement compared to both the ANN and ANFIS results. Therefore, the goal of future studies could be to implement ANFIS and ANNs in a more comprehensive ensemble method, which could ultimately be more robust and accurate.
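The input-selection step described above can be sketched as follows: compute the sample autocorrelation of the historical series, keep the lags whose autocorrelation is clearly non-zero, and split the series chronologically into two halves. This is an illustrative sketch only. The function names, the approximate 95% significance band of 1.96/sqrt(n), and the toy series are assumptions, and the partial autocorrelation step is omitted for brevity.

```python
def autocorr(series, lag):
    """Sample autocorrelation of `series` at the given non-negative lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def select_lags(series, max_lag, threshold=None):
    """Lags 1..max_lag whose |ACF| exceeds the threshold.

    The default threshold is the usual approximate 95% band for white
    noise, 1.96 / sqrt(n); this choice is an assumption of the sketch.
    """
    if threshold is None:
        threshold = 1.96 / len(series) ** 0.5
    return [k for k in range(1, max_lag + 1) if abs(autocorr(series, k)) > threshold]


def chrono_split(series, train_fraction=0.5):
    """Chronological split: first part for training, remainder held out."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Toy series (not wind data): a strictly alternating signal.
toy = [1.0, -1.0] * 10
lags = select_lags(toy, max_lag=3)
train, held_out = chrono_split(toy)
print(lags, len(train), len(held_out))
```

The selected lags would then determine which past values are fed to the forecasting model as inputs. Splitting chronologically rather than at random keeps the held-out half strictly later in time than the training half.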
Energy-Efficient and Robust In-Network Inference in Wireless Sensor Networks.
Zhao, Wei; Liang, Yao
2015-10-01
Distributed in-network inference plays a significant role in large-scale wireless sensor networks (WSNs) in various applications for distributed detection and estimation. While belief propagation (BP) holds great potential for forming a powerful underlying mechanism for such distributed in-network inferences in WSNs, one major challenge is how to systematically improve the energy efficiency of BP-based in-network inference in WSNs. In this paper, we first propose a systematic and rigorous data-driven approach to building information models for WSN applications upon which BP-based in-network inference can be effectively and efficiently performed. We then present a wavelet-based BP framework for multiresolution inference, with respect to our WSN information modeling, to further reduce WSNs' energy. We empirically evaluate our proposed WSN information modeling and wavelet-based BP framework/multiresolution inference using real-world sensor network data. The results demonstrate the merits of our proposed approaches.
Congested Link Inference Algorithms in Dynamic Routing IP Network
Directory of Open Access Journals (Sweden)
Yu Chen
2017-01-01
The performance of current congested link inference algorithms, such as the classical CLINK algorithm, degrades noticeably in dynamic routing IP networks. To overcome this problem, based on the assumptions of the Markov property and time homogeneity, we build a simplified Variable Structure Discrete Dynamic Bayesian (VSDDB) network model of the dynamic routing IP network. Under the simplified VSDDB model, based on Bayesian Maximum A Posteriori (BMAP) and the Rest Bayesian Network Model (RBNM), we propose an Improved CLINK (ICLINK) algorithm. Because multiple links are often congested concurrently, we also propose the CLILRS algorithm (Congested Link Inference algorithm based on Lagrangian Relaxation Subgradient) to infer the set of congested links. We validate our results through analogy, simulation, and actual Internet experiments.
Perturbation Biology: Inferring Signaling Networks in Cellular Systems
Miller, Martin L.; Gauthier, Nicholas P.; Jing, Xiaohong; Kaushik, Poorvi; He, Qin; Mills, Gordon; Solit, David B.; Pratilas, Christine A.; Weigt, Martin; Braunstein, Alfredo; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris
2013-01-01
We present a powerful experimental-computational technology for inferring network models that predict the response of cells to perturbations, and that may be useful in the design of combinatorial therapy against cancer. The experiments are systematic series of perturbations of cancer cell lines by targeted drugs, singly or in combination. The response to perturbation is quantified in terms of relative changes in the measured levels of proteins, phospho-proteins and cellular phenotypes such as viability. Computational network models are derived de novo, i.e., without prior knowledge of signaling pathways, and are based on simple non-linear differential equations. The prohibitively large solution space of all possible network models is explored efficiently using a probabilistic algorithm, Belief Propagation (BP), which is three orders of magnitude faster than standard Monte Carlo methods. Explicit executable models are derived for a set of perturbation experiments in SKMEL-133 melanoma cell lines, which are resistant to the therapeutically important inhibitor of RAF kinase. The resulting network models reproduce and extend known pathway biology. They empower potential discoveries of new molecular interactions and predict efficacious novel drug perturbations, such as the inhibition of PLK1, which is verified experimentally. This technology is suitable for application to larger systems in diverse areas of molecular biology. PMID:24367245
MPE inference in conditional linear gaussian networks
DEFF Research Database (Denmark)
Salmerón, Antonio; Rumí, Rafael; Langseth, Helge
2015-01-01
Given evidence on a set of variables in a Bayesian network, the most probable explanation (MPE) is the problem of finding a configuration of the remaining variables with maximum posterior probability. This problem has previously been addressed for discrete Bayesian networks and can be solved usin...
Correlated measurement error hampers association network inference
Kaduk, M.; Hoefsloot, H.C.J.; Vis, D.J.; Reijmers, T.; Greef, J. van der; Smilde, A.K.; Hendriks, M.M.W.B.
2014-01-01
Modern chromatography-based metabolomics measurements generate large amounts of data in the form of abundances of metabolites. An increasingly popular way of representing and analyzing such data is by means of association networks. Ideally, such a network can be interpreted in terms of the
Network inference from multimodal data: A review of approaches from infectious disease transmission.
Ray, Bisakha; Ghedin, Elodie; Chunara, Rumi
2016-12-01
Network inference problems are commonly found in multiple biomedical subfields such as genomics, metagenomics, neuroscience, and epidemiology. Networks are useful for representing a wide range of complex interactions ranging from those between molecular biomarkers, neurons, and microbial communities, to those found in human or animal populations. Recent technological advances have resulted in an increasing amount of healthcare data in multiple modalities, increasing the preponderance of network inference problems. Multi-domain data can now be used to improve the robustness and reliability of recovered networks from unimodal data. For infectious diseases in particular, there is a body of knowledge that has been focused on combining multiple pieces of linked information. Combining or analyzing disparate modalities in concert has demonstrated greater insight into disease transmission than could be obtained from any single modality in isolation. This has been particularly helpful in understanding incidence and transmission at early stages of infections that have pandemic potential. Novel pieces of linked information in the form of spatial, temporal, and other covariates including high-throughput sequence data, clinical visits, social network information, pharmaceutical prescriptions, and clinical symptoms (reported as free-text data) also encourage further investigation of these methods. The purpose of this review is to provide an in-depth analysis of multimodal infectious disease transmission network inference methods with a specific focus on Bayesian inference. We focus on analytical Bayesian inference-based methods as this enables recovering multiple parameters simultaneously, for example, not just the disease transmission network, but also parameters of epidemic dynamics. Our review studies their assumptions, key inference parameters and limitations, and ultimately provides insights about improving future network inference methods in multiple applications
2014-01-01
Background To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it becomes a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: most popular is the method of MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. Results This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use a well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Conclusions Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. By coupling the parallel model population-based optimization method and the parallel
Quantifying the multi-scale performance of network inference algorithms.
Oates, Chris J; Amos, Richard; Spencer, Simon E F
2014-10-01
Graphical models are widely used to study complex multivariate biological systems. Network inference algorithms aim to reverse-engineer such models from noisy experimental data. It is common to assess such algorithms using techniques from classifier analysis. These metrics, based on ability to correctly infer individual edges, possess a number of appealing features including invariance to rank-preserving transformation. However, regulation in biological systems occurs on multiple scales and existing metrics do not take into account the correctness of higher-order network structure. In this paper novel performance scores are presented that share the appealing properties of existing scores, whilst capturing ability to uncover regulation on multiple scales. Theoretical results confirm that performance of a network inference algorithm depends crucially on the scale at which inferences are to be made; in particular strong local performance does not guarantee accurate reconstruction of higher-order topology. Applying these scores to a large corpus of data from the DREAM5 challenge, we undertake a data-driven assessment of estimator performance. We find that the "wisdom of crowds" network, that demonstrated superior local performance in the DREAM5 challenge, is also among the best performing methodologies for inference of regulation on multiple length scales.
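The invariance property mentioned above can be checked directly for one common classifier-analysis metric, the area under the ROC curve (AUROC). The sketch below computes AUROC as the fraction of (true edge, non-edge) pairs in which the true edge receives the higher confidence score, and then applies a strictly increasing transformation to the scores. The edge scores and labels are made up for the example.

```python
import math


def auroc(scores, labels):
    """AUROC as the probability that a true edge outranks a non-edge.

    `labels` holds 1 for a true edge and 0 for a non-edge; ties count 0.5.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one true edge and one non-edge")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical confidence scores for five candidate edges.
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 0, 1, 0, 0]

original = auroc(scores, labels)
# A strictly increasing (rank-preserving) transformation of the scores.
transformed = auroc([math.exp(3 * s) for s in scores], labels)
print(original, transformed)
```

Both calls return the same value because AUROC depends only on the ordering of the scores. As the abstract notes, a metric of this kind evaluates edges one at a time and says nothing about whether higher-order structure is recovered.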
King, Benjamin L; Davis, Allan Peter; Rosenstein, Michael C; Wiegers, Thomas C; Mattingly, Carolyn J
2012-01-01
Exposure to chemicals in the environment is believed to play a critical role in the etiology of many human diseases. To enhance understanding about environmental effects on human health, the Comparative Toxicogenomics Database (CTD; http://ctdbase.org) provides unique curated data that enable development of novel hypotheses about the relationships between chemicals and diseases. CTD biocurators read the literature and curate direct relationships between chemicals-genes, genes-diseases, and chemicals-diseases. These direct relationships are then computationally integrated to create additional inferred relationships; for example, a direct chemical-gene statement can be combined with a direct gene-disease statement to generate a chemical-disease inference (inferred via the shared gene). In CTD, the number of inferences has increased exponentially as the number of direct chemical, gene and disease interactions has grown. To help users navigate and prioritize these inferences for hypothesis development, we implemented a statistic to score and rank them based on the topology of the local network consisting of the chemical, disease and each of the genes used to make an inference. In this network, chemicals, diseases and genes are nodes connected by edges representing the curated interactions. Like other biological networks, node connectivity is an important consideration when evaluating the CTD network, as the connectivity of nodes follows the power-law distribution. Topological methods reduce the influence of highly connected nodes that are present in biological networks. We evaluated published methods that used local network topology to determine the reliability of protein-protein interactions derived from high-throughput assays. We developed a new metric that combines and weights two of these methods and uniquely takes into account the number of common neighbors and the connectivity of each entity involved. We present several CTD inferences as case studies to
Directory of Open Access Journals (Sweden)
Benjamin L King
Exposure to chemicals in the environment is believed to play a critical role in the etiology of many human diseases. To enhance understanding about environmental effects on human health, the Comparative Toxicogenomics Database (CTD; http://ctdbase.org) provides unique curated data that enable development of novel hypotheses about the relationships between chemicals and diseases. CTD biocurators read the literature and curate direct relationships between chemicals-genes, genes-diseases, and chemicals-diseases. These direct relationships are then computationally integrated to create additional inferred relationships; for example, a direct chemical-gene statement can be combined with a direct gene-disease statement to generate a chemical-disease inference (inferred via the shared gene). In CTD, the number of inferences has increased exponentially as the number of direct chemical, gene and disease interactions has grown. To help users navigate and prioritize these inferences for hypothesis development, we implemented a statistic to score and rank them based on the topology of the local network consisting of the chemical, disease and each of the genes used to make an inference. In this network, chemicals, diseases and genes are nodes connected by edges representing the curated interactions. Like other biological networks, node connectivity is an important consideration when evaluating the CTD network, as the connectivity of nodes follows the power-law distribution. Topological methods reduce the influence of highly connected nodes that are present in biological networks. We evaluated published methods that used local network topology to determine the reliability of protein-protein interactions derived from high-throughput assays. We developed a new metric that combines and weights two of these methods and uniquely takes into account the number of common neighbors and the connectivity of each entity involved. We present several CTD inferences as case
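The inference step described above, in which a chemical-gene statement is combined with a gene-disease statement through the shared gene, can be sketched as follows. The entity names and edges are placeholders, not CTD data. The score used here is a plain Jaccard index over gene neighborhoods, chosen as a simple stand-in for a common-neighbor statistic; it is not the metric developed by the authors.

```python
from collections import defaultdict

# Hypothetical curated statements (placeholders, not CTD records).
chem_gene = {("chemA", "g1"), ("chemA", "g2"), ("chemB", "g2")}
gene_disease = {("g1", "dX"), ("g2", "dX"), ("g2", "dY"), ("g3", "dY")}


def infer_chemical_disease(chem_gene, gene_disease):
    """Return {(chemical, disease): (shared_genes, jaccard_score)}.

    An inference exists whenever a chemical and a disease share at least
    one gene. The Jaccard score divides the number of shared genes by the
    number of genes linked to either entity, so highly connected entities
    are penalised (an illustrative choice for this sketch).
    """
    genes_of_chem = defaultdict(set)
    genes_of_disease = defaultdict(set)
    for chem, gene in chem_gene:
        genes_of_chem[chem].add(gene)
    for gene, disease in gene_disease:
        genes_of_disease[disease].add(gene)

    inferences = {}
    for chem, c_genes in genes_of_chem.items():
        for disease, d_genes in genes_of_disease.items():
            shared = c_genes & d_genes
            if shared:
                score = len(shared) / len(c_genes | d_genes)
                inferences[(chem, disease)] = (shared, score)
    return inferences


inferred = infer_chemical_disease(chem_gene, gene_disease)
ranked = sorted(inferred.items(), key=lambda item: (-item[1][1], item[0]))
for (chem, disease), (shared, score) in ranked:
    print(chem, disease, sorted(shared), round(score, 3))
```

Ranking by a neighborhood-normalized score rather than by the raw count of shared genes is one way to limit the influence of hub nodes, which is the concern the abstract raises about power-law connectivity.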
Data identification for improving gene network inference using computational algebra.
Dimitrova, Elena; Stigler, Brandilyn
2014-11-01
Identification of models of gene regulatory networks is sensitive to the amount of data used as input. Considering the substantial costs in conducting experiments, it is of value to have an estimate of the amount of data required to infer the network structure. To minimize wasted resources, it is also beneficial to know which data are necessary to identify the network. Knowledge of the data and knowledge of the terms in polynomial models are often required a priori in model identification. In applications, it is unlikely that the structure of a polynomial model will be known, which may force data sets to be unnecessarily large in order to identify a model. Furthermore, none of the known results provides any strategy for constructing data sets to uniquely identify a model. We provide a specialization of an existing criterion for deciding when a set of data points identifies a minimal polynomial model when its monomial terms have been specified. Then, we relax the requirement of the knowledge of the monomials and present results for model identification given only the data. Finally, we present a method for constructing data sets that identify minimal polynomial models.
Estimating uncertainty and reliability of social network data using Bayesian inference.
Farine, Damien R; Strandburg-Peshkin, Ariana
2015-09-01
Social network analysis provides a useful lens through which to view the structure of animal societies, and as a result its use is increasingly widespread. One challenge that many studies of animal social networks face is dealing with limited sample sizes, which introduces the potential for a high level of uncertainty in estimating the rates of association or interaction between individuals. We present a method based on Bayesian inference to incorporate uncertainty into network analyses. We test the reliability of this method at capturing both local and global properties of simulated networks, and compare it to a recently suggested method based on bootstrapping. Our results suggest that Bayesian inference can provide useful information about the underlying certainty in an observed network. When networks are well sampled, observed networks approach the real underlying social structure. However, when sampling is sparse, Bayesian inferred networks can provide realistic uncertainty estimates around edge weights. We also suggest a potential method for estimating the reliability of an observed network given the amount of sampling performed. This paper highlights how relatively simple procedures can be used to estimate uncertainty and reliability in studies using animal social network analysis.
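One simple way to attach uncertainty to an edge weight, in the spirit of this abstract, is a Beta-Binomial model of the association rate between two individuals. The abstract does not specify the authors' model, so the choice of a Beta prior, the uniform default and the function name `edge_posterior` are assumptions made only for illustration.

```python
import math

def edge_posterior(times_together, times_observed, prior_a=1.0, prior_b=1.0):
    """Posterior mean and standard deviation of an association rate.

    Assumes a Beta(prior_a, prior_b) prior and a binomial likelihood, so the
    posterior is Beta(prior_a + successes, prior_b + failures).
    """
    a = prior_a + times_together
    b = prior_b + (times_observed - times_together)
    mean = a / (a + b)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, sd

# Same observed association ratio (0.3) under sparse and dense sampling.
sparse = edge_posterior(3, 10)
dense = edge_posterior(30, 100)
```

Under this sketch the two edges have the same observed ratio, but the sparsely sampled one carries a wider posterior, which is the kind of edge-level uncertainty the abstract argues should be reported.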
Directory of Open Access Journals (Sweden)
Frank Emmert-Streib
2013-02-01
The inference of gene regulatory networks has gained considerable interest in the biology and biomedical community in recent years. The purpose of this paper is to investigate the influence that environmental conditions can have on the performance of network inference algorithms. Specifically, we study five network inference methods, Aracne, BC3NET, CLR, C3NET and MRNET, and compare the results for three different conditions: (I) observational gene expression data: normal environmental condition, (II) interventional gene expression data: growth in rich media, (III) interventional gene expression data: normal environmental condition interrupted by a positive spike-in stimulation. Overall, we find that different statistical inference methods lead to comparable, but condition-specific results. Further, our results suggest that non-steady-state data enhance the inferability of regulatory networks.
Social networks help to infer causality in the tumor microenvironment.
Crespo, Isaac; Doucey, Marie-Agnès; Xenarios, Ioannis
2016-03-15
Networks have become a popular way to conceptualize a system of interacting elements, such as electronic circuits, social communication, metabolism or gene regulation. Network inference, analysis, and modeling techniques have been developed in different areas of science and technology, such as computer science, mathematics, physics, and biology, with an active interdisciplinary exchange of concepts and approaches. However, some concepts seem to belong to a specific field without a clear transferability to other domains. At the same time, it is increasingly recognized that within some biological systems--such as the tumor microenvironment--where different types of resident and infiltrating cells interact to carry out their functions, the complexity of the system demands a theoretical framework, such as statistical inference, graph analysis and dynamical models, in order to assess and study the information derived from high-throughput experimental technologies. In this article we propose to adopt and adapt the concepts of influence and investment from the world of social network analysis to biological problems, and in particular to apply this approach to infer causality in the tumor microenvironment. We showed that constructing a bidirectional network of influence between cell and cell communication molecules allowed us to determine the direction of inferred regulations at the expression level and correctly recapitulate cause-effect relationships described in literature. This work constitutes an example of a transfer of knowledge and concepts from the world of social network analysis to biomedical research, in particular to infer network causality in biological networks. This causality elucidation is essential to model the homeostatic response of biological systems to internal and external factors, such as environmental conditions, pathogens or treatments.
DEFF Research Database (Denmark)
Møller, Jesper
2010-01-01
Chapter 9: This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods based on a maximum likelihood or Bayesian approach combined with Markov chain Monte Carlo (MCMC) techniques. Due to space limitations the focus is on spatial point processes.
DEFF Research Database (Denmark)
Møller, Jesper
(This text written by Jesper Møller, Aalborg University, is submitted for the collection 'Stochastic Geometry: Highlights, Interactions and New Perspectives', edited by Wilfrid S. Kendall and Ilya Molchanov, to be published by Clarendon Press, Oxford, and planned to appear as Section 4.1 with the title 'Inference'.) This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods using Markov chain Monte Carlo (MCMC) simulations. Due to space limitations the focus is on spatial point processes.
Directory of Open Access Journals (Sweden)
Shuhei Kimura
The inference of a genetic network is a problem in which mutual interactions among genes are inferred from time-series of gene expression levels. While a number of models have been proposed to describe genetic networks, this study focuses on a mathematical model proposed by Vohradský. Because of its advantageous features, several researchers have proposed inference methods based on Vohradský's model. When trying to analyze large-scale networks consisting of dozens of genes, however, these methods must solve high-dimensional non-linear function optimization problems. In order to resolve the difficulty of estimating the parameters of Vohradský's model, this study proposes a new method that defines the problem as several two-dimensional function optimization problems. Through numerical experiments on artificial genetic network inference problems, we showed that, although the computation time of the proposed method is not the shortest, the method has the ability to estimate parameters of Vohradský's models more effectively with sufficiently short computation times. This study then applied the proposed method to an actual inference problem of the bacterial SOS DNA repair system, and succeeded in finding several reasonable regulations.
Inferring biomolecular interaction networks based on convex optimization.
Han, Soohee; Yoon, Yeoin; Cho, Kwang-Hyun
2007-10-01
We present an optimization-based inference scheme to unravel the functional interaction structure of biomolecular components within a cell. The regulatory network of a cell is inferred from the data obtained by perturbation of adjustable parameters or initial concentrations of specific components. It turns out that the identification procedure leads to a convex optimization problem with regularization as we have to achieve the sparsity of a network and also reflect any a priori information on the network structure. Since the convex optimization has been well studied for a long time, a variety of efficient algorithms were developed and many numerical solvers are freely available. In order to estimate time derivatives from discrete-time samples, a cubic spline fitting is incorporated into the proposed optimization procedure. Throughout simulation studies on several examples, it is shown that the proposed convex optimization scheme can effectively uncover the functional interaction structure of a biomolecular regulatory network with reasonable accuracy.
Inferring gene regulatory networks by singular value decomposition and gravitation field algorithm.
Zheng, Ming; Wu, Jia-nan; Huang, Yan-xin; Liu, Gui-xia; Zhou, You; Zhou, Chun-guang
2012-01-01
Reconstruction of gene regulatory networks (GRNs) is of utmost interest and has become a challenging computational problem in systems biology. However, every existing algorithm for inference from gene expression profiles has its own advantages and disadvantages. In particular, the effectiveness and efficiency of previous algorithms are not high enough. In this work, we propose a novel algorithm for inference from gene expression data based on a differential equation model. The algorithm combines two methods for inferring GRNs. Before reconstructing GRNs, the singular value decomposition method is used to decompose the gene expression data, determine the algorithm solution space, and obtain all candidate solutions of GRNs. Within this generated family of candidate solutions, a modified gravitation field algorithm is used to infer GRNs by optimizing the criteria of the differential equation model and searching for the best network structure. The proposed algorithm is validated on both a simulated scale-free network and a real benchmark gene regulatory network from a networks database. Both the Bayesian method and the traditional differential equation model were also used to infer GRNs, and their results were compared with those of the proposed algorithm. Genetic algorithm and simulated annealing were also used to evaluate the gravitation field algorithm. The cross-validation results confirmed the effectiveness of our algorithm, which significantly outperforms previous algorithms.
Personalized microbial network inference via co-regularized spectral clustering
Imangaliyev, S.; Keijser, B.; Crielaard, W.; Tsivtsivadze, E.
2014-01-01
We use Human Microbiome Project (HMP) cohort [1] to infer personalized oral microbial networks of healthy individuals. To determine clustering of individuals with similar microbial profiles, co-regularized spectral clustering algorithm is applied to the dataset. For each cluster we discovered, we
Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks
Directory of Open Access Journals (Sweden)
Alina Sîrbu
2015-05-01
Microarray technologies have been the basis of numerous important findings regarding gene expression in the last few decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions). Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, as well as indicating which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.
Learning a Markov Logic network for supervised gene regulatory network inference.
Brouard, Céline; Vrain, Christel; Dubois, Julie; Castel, David; Debily, Marie-Anne; d'Alché-Buc, Florence
2013-09-12
Gene regulatory network inference remains a challenging problem in systems biology despite the numerous approaches that have been proposed. When substantial knowledge on a gene regulatory network is already available, supervised network inference is appropriate. Such a method builds a binary classifier able to assign a class (Regulation/No regulation) to an ordered pair of genes. Once learnt, the pairwise classifier can be used to predict new regulations. In this work, we explore the framework of Markov Logic Networks (MLN) that combine features of probabilistic graphical models with the expressivity of first-order logic rules. We propose to learn a Markov Logic network, e.g. a set of weighted rules that conclude on the predicate "regulates", starting from a known gene regulatory network involved in the switch proliferation/differentiation of keratinocyte cells, a set of experimental transcriptomic data and various descriptions of genes all encoded into first-order logic. As training data are unbalanced, we use asymmetric bagging to learn a set of MLNs. The prediction of a new regulation can then be obtained by averaging predictions of individual MLNs. As a side contribution, we propose three in silico tests to assess the performance of any pairwise classifier in various network inference tasks on real datasets. A first test consists of measuring the average performance on balanced edge prediction problem; a second one deals with the ability of the classifier, once enhanced by asymmetric bagging, to update a given network. Finally our main result concerns a third test that measures the ability of the method to predict regulations with a new set of genes. As expected, MLN, when provided with only numerical discretized gene expression data, does not perform as well as a pairwise SVM in terms of AUPR. However, when a more complete description of gene properties is provided by heterogeneous sources, MLN achieves the same performance as a black-box model such as a
NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.
Directory of Open Access Journals (Sweden)
Joeri Ruyssinck
One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made
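The rankwise averaging of predictions from several methods mentioned in this abstract can be sketched as follows. This is an illustrative reading of "rankwise averaged predictions", not the NIMEFI implementation: the function names, the tie-breaking rule and the toy importance scores are all assumptions.

```python
def to_ranks(scores):
    """Map each candidate edge to its rank, where rank 1 is the highest score."""
    ordered = sorted(scores, key=lambda edge: -scores[edge])
    return {edge: rank for rank, edge in enumerate(ordered, start=1)}

def rankwise_average(score_tables):
    """Combine per-method edge scores by averaging their ranks.

    Returns the edges ordered from most to least supported; ties in the
    average rank are broken by edge name (an arbitrary choice for this sketch).
    """
    rank_tables = [to_ranks(table) for table in score_tables]
    edges = set(score_tables[0])
    average = {
        edge: sum(ranks[edge] for ranks in rank_tables) / len(rank_tables)
        for edge in edges
    }
    return sorted(average, key=lambda edge: (average[edge], edge))

# Hypothetical importance scores from three methods for (regulator, target) edges.
method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_b = {("g1", "g2"): 0.6, ("g1", "g3"): 0.1, ("g2", "g3"): 0.8}
method_c = {("g1", "g2"): 0.7, ("g1", "g3"): 0.4, ("g2", "g3"): 0.3}
consensus = rankwise_average([method_a, method_b, method_c])
```

Averaging ranks rather than raw scores sidesteps the fact that importance values from different regression methods are not on a common scale.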
Inference of gene regulatory networks from time series by Tsallis entropy
Directory of Open Access Journals (Sweden)
de Oliveira Evaldo A
2011-05-01
Background: The inference of gene regulatory networks (GRNs) from large-scale expression profiles is one of the most challenging problems of Systems Biology nowadays. Many techniques and models have been proposed for this task. However, it is not generally possible to recover the original topology with great accuracy, mainly due to the short time series data in face of the high complexity of the networks and the intrinsic noise of the expression measurements. In order to improve the accuracy of GRNs inference methods based on entropy (mutual information), a new criterion function is here proposed. Results: In this paper we introduce the use of generalized entropy proposed by Tsallis, for the inference of GRNs from time series expression profiles. The inference process is based on a feature selection approach and the conditional entropy is applied as criterion function. In order to assess the proposed methodology, the algorithm is applied to recover the network topology from temporal expressions generated by an artificial gene network (AGN) model as well as from the DREAM challenge. The adopted AGN is based on theoretical models of complex networks and its gene transference function is obtained from random drawing on the set of possible Boolean functions, thus creating its dynamics. On the other hand, DREAM time series data presents variation of network size and its topologies are based on real networks. The dynamics are generated by continuous differential equations with noise and perturbation. By adopting both data sources, it is possible to estimate the average quality of the inference with respect to different network topologies, transfer functions and network sizes. Conclusions: A remarkable improvement of accuracy was observed in the experimental results by reducing the number of false connections in the inferred topology by the non-Shannon entropy. The obtained best free parameter of the Tsallis entropy was on average in the range 2.5
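For reference, the generalized entropy of Tsallis with free parameter q is S_q = (1 - sum_i p_i^q) / (q - 1), which recovers the Shannon entropy in the limit q -> 1. The sketch below computes only this basic quantity for a discrete distribution; the conditional-entropy criterion and feature selection procedure used in the paper are not reproduced, and the function name is an assumption.

```python
import math

def tsallis_entropy(probs, q):
    """Tsallis entropy S_q of a discrete distribution (constant k set to 1).

    For q == 1 the Shannon entropy in nats is returned, which is the limit
    of S_q as q approaches 1.
    """
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)

# A fair binary variable under the Shannon limit and under a non-Shannon setting.
shannon = tsallis_entropy([0.5, 0.5], 1.0)
non_shannon = tsallis_entropy([0.5, 0.5], 2.0)
```

With q = 2 the fair binary distribution gives S_q = (1 - 0.5) / 1 = 0.5, whereas the Shannon limit gives ln 2.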
Inferring personal economic status from social network location
Luo, Shaojun; Morone, Flaviano; Sarraute, Carlos; Travizano, Matías; Makse, Hernán A.
2017-05-01
It is commonly believed that patterns of social ties affect individuals' economic status. Here we translate this concept into an operational definition at the network level, which allows us to infer the economic well-being of individuals through a measure of their location and influence in the social network. We analyse two large-scale sources: telecommunications and financial data of a whole country's population. Our results show that an individual's location, measured as the optimal collective influence to the structural integrity of the social network, is highly correlated with personal economic status. The observed social network patterns of influence mimic the patterns of economic inequality. For pragmatic use and validation, we carry out a marketing campaign that shows a threefold increase in response rate by targeting individuals identified by our social network metrics as compared to random targeting. Our strategy can also be useful in maximizing the effects of large-scale economic stimulus policies.
Sign Inference for Dynamic Signed Networks via Dictionary Learning
Directory of Open Access Journals (Sweden)
Yi Cen
2013-01-01
Mobile online social network (mOSN) is a burgeoning research area. However, most existing works referring to mOSNs deal with static network structures and simply encode whether relationships among entities exist or not. In contrast, relationships in signed mOSNs can be positive or negative and may be changed with time and locations. Applying certain global characteristics of social balance, in this paper, we aim to infer the unknown relationships in dynamic signed mOSNs and formulate this sign inference problem as a low-rank matrix estimation problem. Specifically, motivated by the Singular Value Thresholding (SVT) algorithm, a compact dictionary is selected from the observed dataset. Based on this compact dictionary, the relationships in the dynamic signed mOSNs are estimated via solving the formulated problem. Furthermore, the estimation accuracy is improved by employing a dictionary self-updating mechanism.
Inferring influenza global transmission networks without complete phylogenetic information.
Aris-Brosou, Stéphane
2014-03-01
Influenza is one of the most severe respiratory infections affecting humans throughout the world, yet the dynamics of its global transmission network are still contentious. Here, I describe a novel combination of phylogenetics, time series, and graph theory to analyze 14.25 years of data stratified in space and in time, focusing on the main target of the human immune response, the hemagglutinin gene. While bypassing the complete phylogenetic inference of huge data sets, the method still extracts information suggesting that waves of genetic or of nucleotide diversity circulate continuously around the globe for subtypes that undergo sustained transmission over several seasons, such as H3N2 and pandemic H1N1/09, while diversity of prepandemic H1N1 viruses had until 2009 a noncontinuous transmission pattern consistent with a source/sink model. Irrespective of the shift in the structure of H1N1 diversity circulation with the emergence of the pandemic H1N1/09 strain, US prevalence peaks during the winter months when genetic diversity is at its lowest. This suggests that a dominant strain is generally responsible for epidemics and that monitoring genetic and/or nucleotide diversity in real time could provide public health agencies with an indirect estimate of prevalence.
Directory of Open Access Journals (Sweden)
Korf Ulrike
2011-07-01
Background: Network inference from high-throughput data has become an important means of current analysis of biological systems. For instance, in cancer research, the functional relationships of cancer related proteins, summarised into signalling networks, are of central interest for the identification of pathways that influence tumour development. Cancer cell lines can be used as model systems to study the cellular response to drug treatments in a time-resolved way. Based on this kind of data, modelling approaches for the signalling relationships are needed that allow hypotheses to be generated on potential interference points in the networks. Results: We present the R-package 'ddepn' that implements our recent approach on network reconstruction from longitudinal data generated after external perturbation of network components. We extend our approach by two novel methods: a Markov Chain Monte Carlo method for sampling network structures with two edge types (activation and inhibition) and an extension of a prior model that penalises deviances from a given reference network while incorporating these two types of edges. Further, as alternative prior we include a model that learns signalling networks with the scale-free property. Conclusions: The package 'ddepn' is freely available on R-Forge and CRAN (http://ddepn.r-forge.r-project.org, http://cran.r-project.org). It allows network inference from longitudinal high-throughput data to be performed conveniently using two different sampling based network structure search algorithms.
SiGNet: A signaling network data simulator to enable signaling network inference.
Directory of Open Access Journals (Sweden)
Elizabeth A Coker
Network models are widely used to describe complex signaling systems. Cellular wiring varies in different cellular contexts and numerous inference techniques have been developed to infer the structure of a network from experimental data of the network's behavior. To objectively identify which inference strategy is best suited to a specific network, a gold standard network and dataset are required. However, suitable datasets for benchmarking are difficult to find. Numerous tools exist that can simulate data for transcriptional networks, but these are of limited use for the study of signaling networks. Here, we describe SiGNet (Signal Generator for Networks): a Cytoscape app that simulates experimental data for a signaling network of known structure. SiGNet has been developed and tested against published experimental data, incorporating information on network architecture, and the directionality and strength of interactions to create biological data in silico. SiGNet is the first tool to simulate biological signaling data, enabling an accurate and systematic assessment of inference strategies. SiGNet can also be used to produce preliminary models of key biological pathways following perturbation.
Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks.
Deeter, Anthony; Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui
2017-01-01
The PubMed database offers an extensive set of publication data that can be useful, yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but also has the potential to hypothesize new ones. We show that our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS-Bcl-2 and RAS-AKT, and finds significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways.
Mathematical inference and control of molecular networks from perturbation experiments
Mohammed-Rasheed, Mohammed
One of the main challenges facing biologists and mathematicians in the post genomic era is to understand the behavior of molecular networks and harness this understanding into an educated intervention of the cell. The cell maintains its function via an elaborate network of interconnecting positive and negative feedback loops of genes, RNA and proteins that send different signals to a large number of pathways and molecules. These structures are referred to as genetic regulatory networks (GRNs) or molecular networks. GRNs can be viewed as dynamical systems with inherent properties and mechanisms, such as steady-state equilibriums and stability, that determine the behavior of the cell. The biological relevance of the mathematical concepts is important as they may predict the differentiation of a stem cell, the maintenance of a normal cell, the development of cancer and its aberrant behavior, and the design of drugs and response to therapy. Uncovering the underlying GRN structure from gene/protein expression data, e.g., microarrays or perturbation experiments, is called inference or reverse engineering of the molecular network. Because of the high cost and time consuming nature of biological experiments, the number of available measurements or experiments is very small compared to the number of molecules (genes, RNA and proteins). In addition, the observations are noisy, where the noise is due to measurement imperfections as well as the inherent stochasticity of genetic expression levels. Intra-cellular activities and extra-cellular environmental attributes are another source of variability. Thus, the inference of GRNs is, in general, an under-determined problem with a highly noisy set of observations. The ultimate goal of GRN inference and analysis is to be able to intervene within the network, in order to force it away from undesirable cellular states and into desirable ones. However, it remains a major challenge to design optimal intervention strategies
Inferring cellular regulatory networks with Bayesian model averaging for linear regression (BMALR).
Huang, Xun; Zi, Zhike
2014-08-01
Bayesian network and linear regression methods have been widely applied to reconstruct cellular regulatory networks. In this work, we propose a Bayesian model averaging for linear regression (BMALR) method to infer molecular interactions in biological systems. This method uses a new closed form solution to compute the posterior probabilities of the edges from regulators to the target gene within a hybrid framework of Bayesian model averaging and linear regression methods. We have assessed the performance of BMALR by benchmarking on both in silico DREAM datasets and real experimental datasets. The results show that BMALR achieves both high prediction accuracy and high computational efficiency across different benchmarks. A pre-processing of the datasets with the log transformation can further improve the performance of BMALR, leading to a new top overall performance. In addition, BMALR can achieve robust high performance in community predictions when it is combined with other competing methods. The proposed method BMALR is competitive compared to the existing network inference methods. Therefore, BMALR will be useful to infer regulatory interactions in biological networks. A free open source software tool for the BMALR algorithm is available at https://sites.google.com/site/bmalr4netinfer/.
Directory of Open Access Journals (Sweden)
Xiaodong Cai
Integrating genetic perturbations with gene expression data not only improves accuracy of regulatory network topology inference, but also enables learning of causal regulatory relations between genes. Although a number of methods have been developed to integrate both types of data, the need for efficient and powerful algorithms remains. In this paper, sparse structural equation models (SEMs) are employed to integrate both gene expression data and cis-expression quantitative trait loci (cis-eQTL), for modeling gene regulatory networks in accordance with biological evidence about genes regulating or being regulated by a small number of genes. A systematic inference method named sparsity-aware maximum likelihood (SML) is developed for SEM estimation. Using simulated directed acyclic or cyclic networks, the SML performance is compared with that of two state-of-the-art algorithms: the adaptive Lasso (AL) based scheme, and the QTL-directed dependency graph (QDG) method. Computer simulations demonstrate that the novel SML algorithm offers significantly better performance than the AL-based and QDG algorithms across all sample sizes from 100 to 1,000, in terms of detection power and false discovery rate, in all the cases tested that include acyclic or cyclic networks of 10, 30 and 300 genes. The SML method is further applied to infer a network of 39 human genes that are related to the immune function and are chosen to have a reliable eQTL per gene. The resulting network consists of 9 genes and 13 edges. Most of the edges represent interactions reasonably expected from experimental evidence, while the remaining ones may just indicate the emergence of new interactions. The sparse SEM and efficient SML algorithm provide an effective means of exploiting both gene expression and perturbation data to infer gene regulatory networks. An open-source computer program implementing the SML algorithm is freely available upon request.
Cascading Failures in Networks: Inference, Intervention and Robustness to WMDs
2016-08-01
8725 John J. Kingman Road, MS 6201 Fort Belvoir, VA 22060-6201 TECHNICAL REPORT DTRA-TR-16-83 Cascading Failures in Networks...Austin Project Title: Cascading Failures in Networks: Inference, Intervention and Robustness to WMDs What are the major goals of...networks are prone to creating cascading failures: events where the initial destruction/compromising of a few nodes results in the
Inference and Evolutionary Analysis of Genome-Scale Regulatory Networks in Large Phylogenies.
Koch, Christopher; Konieczka, Jay; Delorey, Toni; Lyons, Ana; Socha, Amanda; Davis, Kathleen; Knaack, Sara A; Thompson, Dawn; O'Shea, Erin K; Regev, Aviv; Roy, Sushmita
2017-05-24
Changes in transcriptional regulatory networks can significantly contribute to species evolution and adaptation. However, identification of genome-scale regulatory networks is an open challenge, especially in non-model organisms. Here, we introduce multi-species regulatory network learning (MRTLE), a computational approach that uses phylogenetic structure, sequence-specific motifs, and transcriptomic data, to infer the regulatory networks in different species. Using simulated data from known networks and transcriptomic data from six divergent yeasts, we demonstrate that MRTLE predicts networks with greater accuracy than existing methods because it incorporates phylogenetic information. We used MRTLE to infer the structure of the transcriptional networks that control the osmotic stress responses of divergent, non-model yeast species and then validated our predictions experimentally. Interrogating these networks reveals that gene duplication promotes network divergence across evolution. Taken together, our approach facilitates study of regulatory network evolutionary dynamics across multiple poorly studied species. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.
Bayesian inference for duplication-mutation with complementarity network models.
Jasra, Ajay; Persing, Adam; Beskos, Alexandros; Heine, Kari; De Iorio, Maria
2015-11-01
We observe an undirected graph G without multiple edges and self-loops, which is to represent a protein-protein interaction (PPI) network. We assume that G evolved under the duplication-mutation with complementarity (DMC) model from a seed graph, G0, and we also observe the binary forest Γ that represents the duplication history of G. A posterior density for the DMC model parameters is established, and we outline a sampling strategy by which one can perform Bayesian inference; that sampling strategy employs a particle marginal Metropolis-Hastings (PMMH) algorithm. We test our methodology on numerical examples to demonstrate a high accuracy and precision in the inference of the DMC model's mutation and homodimerization parameters.
Bayesian Inference for Duplication–Mutation with Complementarity Network Models
Persing, Adam; Beskos, Alexandros; Heine, Kari; De Iorio, Maria
2015-01-01
We observe an undirected graph G without multiple edges and self-loops, which is to represent a protein–protein interaction (PPI) network. We assume that G evolved under the duplication–mutation with complementarity (DMC) model from a seed graph, G0, and we also observe the binary forest Γ that represents the duplication history of G. A posterior density for the DMC model parameters is established, and we outline a sampling strategy by which one can perform Bayesian inference; that sampling strategy employs a particle marginal Metropolis–Hastings (PMMH) algorithm. We test our methodology on numerical examples to demonstrate a high accuracy and precision in the inference of the DMC model's mutation and homodimerization parameters. PMID:26355682
Li, Peng; Gong, Ping; Li, Haoni; Perkins, Edward J; Wang, Nan; Zhang, Chaoyang
2014-12-01
The Dialogue for Reverse Engineering Assessments and Methods (DREAM) project was initiated in 2006 as a community-wide effort for the development of network inference challenges for rigorous assessment of reverse engineering methods for biological networks. We participated in the in silico network inference challenge of DREAM3 in 2008. Here we report the details of our approach and its performance on the synthetic challenge datasets. In our methodology, we first developed a model called relative change ratio (RCR), which took advantage of the heterozygous knockdown data and null-mutant knockout data provided by the challenge, in order to identify the potential regulators for the genes. With this information, a time-delayed dynamic Bayesian network (TDBN) approach was then used to infer gene regulatory networks from time series trajectory datasets. Our approach considerably reduced the search space of TDBN; hence, it gained a much higher efficiency and accuracy. The networks predicted using our approach were evaluated comparatively along with 29 other submissions by two metrics (area under the ROC curve and area under the precision-recall curve). The overall performance of our approach ranked second among all participating teams.
Aghdam, Rosa; Ganjali, Mojtaba; Zhang, Xiujun; Eslahchi, Changiz
2015-03-01
Inferring Gene Regulatory Networks (GRNs) from gene expression data is a major challenge in systems biology. The Path Consistency (PC) algorithm is one of the popular methods in this field. However, as an order-dependent algorithm, the PC algorithm is not robust because it produces different network topologies if gene orders are permuted. In addition, the performance of this algorithm depends on the threshold value used for independence tests. Consequently, selecting a suitable sequential ordering of nodes and an appropriate threshold value for the inputs of the PC algorithm are challenges in inferring a good GRN. In this work, we propose a heuristic algorithm, namely SORDER, to find a suitable sequential ordering of nodes. Based on the SORDER algorithm and a suitable interval threshold for Conditional Mutual Information (CMI) tests, a network inference method, namely the Consensus Network (CN), has been developed. In the proposed method, for each edge of the complete graph, a weighted value is defined. This value is considered as the reliability value of dependency between two nodes. The final inferred network, obtained using the CN algorithm, contains edges with a reliability value of dependency of more than a defined threshold. The effectiveness of this method is benchmarked through several networks from the DREAM challenge and the widely used SOS DNA repair network in Escherichia coli. The results indicate that the CN algorithm is suitable for learning GRNs and it considerably improves the precision of network inference. The source of data sets and codes are available at .
Prediction of Drug-Target Interactions and Drug Repositioning via Network-Based Inference
Jiang, Jing; Lu, Weiqiang; Li, Weihua; Liu, Guixia; Zhou, Weixing; Huang, Jin; Tang, Yun
2012-01-01
Drug-target interaction (DTI) is the basis of drug discovery and design. It is time consuming and costly to determine DTI experimentally. Hence, it is necessary to develop computational methods for the prediction of potential DTI. Based on complex network theory, three supervised inference methods were developed here to predict DTI and used for drug repositioning, namely drug-based similarity inference (DBSI), target-based similarity inference (TBSI) and network-based inference (NBI). Among them, NBI performed best on four benchmark data sets. Then a drug-target network was created with NBI based on 12,483 FDA-approved and experimental drug-target binary links, and some new DTIs were further predicted. In vitro assays confirmed that five old drugs, namely montelukast, diclofenac, simvastatin, ketoconazole, and itraconazole, showed polypharmacological features on estrogen receptors or dipeptidyl peptidase-IV with half maximal inhibitory or effective concentrations ranging from 0.2 to 10 µM. Moreover, simvastatin and ketoconazole showed potent antiproliferative activities on human MDA-MB-231 breast cancer cell line in MTT assays. The results indicated that these methods could be powerful tools in prediction of DTIs and drug repositioning. PMID:22589709
Prediction of drug-target interactions and drug repositioning via network-based inference.
Directory of Open Access Journals (Sweden)
Feixiong Cheng
Drug-target interaction (DTI) is the basis of drug discovery and design. It is time consuming and costly to determine DTI experimentally. Hence, it is necessary to develop computational methods for the prediction of potential DTI. Based on complex network theory, three supervised inference methods were developed here to predict DTI and used for drug repositioning, namely drug-based similarity inference (DBSI), target-based similarity inference (TBSI) and network-based inference (NBI). Among them, NBI performed best on four benchmark data sets. Then a drug-target network was created with NBI based on 12,483 FDA-approved and experimental drug-target binary links, and some new DTIs were further predicted. In vitro assays confirmed that five old drugs, namely montelukast, diclofenac, simvastatin, ketoconazole, and itraconazole, showed polypharmacological features on estrogen receptors or dipeptidyl peptidase-IV with half maximal inhibitory or effective concentrations ranging from 0.2 to 10 µM. Moreover, simvastatin and ketoconazole showed potent antiproliferative activities on human MDA-MB-231 breast cancer cell line in MTT assays. The results indicated that these methods could be powerful tools in prediction of DTIs and drug repositioning.
Bayesian methods for hackers probabilistic programming and Bayesian inference
Davidson-Pilon, Cameron
2016-01-01
Bayesian methods of inference are deeply natural and extremely powerful. However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice–freeing you to get results using computing power. Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples a...
The Network Completion Problem: Inferring Missing Nodes and Edges in Networks
Energy Technology Data Exchange (ETDEWEB)
Kim, M; Leskovec, J
2011-11-14
Network structures, such as social networks, web graphs and networks from systems biology, play important roles in many areas of science and our everyday lives. In order to study the networks one needs to first collect reliable large scale network data. While the social and information networks have become ubiquitous, the challenge of collecting complete network data still persists. Many times the collected network data is incomplete with nodes and edges missing. Commonly, only a part of the network can be observed and we would like to infer the unobserved part of the network. We address this issue by studying the Network Completion Problem: Given a network with missing nodes and edges, can we complete the missing part? We cast the problem in the Expectation Maximization (EM) framework where we use the observed part of the network to fit a model of network structure, and then we estimate the missing part of the network using the model, re-estimate the parameters and so on. We combine the EM with the Kronecker graphs model and design a scalable Metropolized Gibbs sampling approach that allows for the estimation of the model parameters as well as the inference about missing nodes and edges of the network. Experiments on synthetic and several real-world networks show that our approach can effectively recover the network even when about half of the nodes in the network are missing. Our algorithm outperforms not only classical link-prediction approaches but also the state of the art Stochastic block modeling approach. Furthermore, our algorithm easily scales to networks with tens of thousands of nodes.
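As a rough illustration of the Kronecker graphs model named in the abstract above, the following sketch computes the probability of one edge in a stochastic Kronecker graph from a small initiator matrix. The function name `kronecker_edge_prob` and the 2x2 initiator values are assumptions chosen for illustration; the EM loop and the Metropolized Gibbs sampler described in the abstract are not reproduced here.

```python
def kronecker_edge_prob(theta, k, u, v):
    """Probability of edge (u, v) in the k-th Kronecker power of `theta`.

    `theta` is a square initiator matrix of probabilities. The entry of the
    k-th Kronecker power at (u, v) is the product of initiator entries
    indexed by the base-n0 digits of u and v.
    """
    n0 = len(theta)
    p = 1.0
    for _ in range(k):
        p *= theta[u % n0][v % n0]
        u //= n0
        v //= n0
    return p


# Hypothetical initiator matrix; two levels give a 4-node graph.
theta = [[0.9, 0.5],
         [0.5, 0.1]]
p_00 = kronecker_edge_prob(theta, 2, 0, 0)  # 0.9 * 0.9
p_03 = kronecker_edge_prob(theta, 2, 0, 3)  # 0.5 * 0.5
p_33 = kronecker_edge_prob(theta, 2, 3, 3)  # 0.1 * 0.1
```

Because each edge probability depends only on the digit expansions of the two node indices, the model is described by very few parameters, which is one reason it is convenient to fit from a partially observed network.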
2018-01-01
Signaling pathways represent parts of the global biological molecular network which connects them into a seamless whole through complex direct and indirect (hidden) crosstalk whose structure can change during development or in pathological conditions. We suggest a novel methodology, called Googlomics, for the structural analysis of directed biological networks using spectral analysis of their Google matrices, using parallels with quantum scattering theory, developed for nuclear and mesoscopic physics and quantum chaos. We introduce an analytical “reduced Google matrix” method for the analysis of biological network structure. The method allows inferring hidden causal relations between the members of a signaling pathway or a functionally related group of genes. We investigate how the structure of hidden causal relations can be reprogrammed as a result of changes in the transcriptional network layer during cancerogenesis. The suggested Googlomics approach rigorously characterizes complex systemic changes in the wiring of large causal biological networks in a computationally efficient way. PMID:29370181
Galán, Severino F; Mengshoel, Ole J
2009-01-01
Constraints occur in many application areas of interest to evolutionary computation. The area considered here is Bayesian networks (BNs), which is a probability-based method for representing and reasoning with uncertain knowledge. This work deals with constraints in BNs and investigates how tournament selection can be adapted to better process such constraints in the context of abductive inference. Abductive inference in BNs consists of finding the most probable explanation given some evidence. Since exact abductive inference is NP-hard, several approximate approaches to this inference task have been developed. One of them applies evolutionary techniques in order to find optimal or close-to-optimal explanations. A problem with the traditional evolutionary approach is this: As the number of constraints determined by the zeros in the conditional probability tables grows, performance deteriorates because the number of explanations whose probability is greater than zero decreases. To minimize this problem, this paper presents and analyzes a new evolutionary approach to abductive inference in BNs. By considering abductive inference as a constraint optimization problem, the novel approach improves performance dramatically when a BN's conditional probability tables contain a significant number of zeros. Experimental results are presented comparing the performances of the traditional evolutionary approach and the approach introduced in this work. The results show that the new approach significantly outperforms the traditional one.
Inferring Transition Rates of Networks from Populations in Continuous-Time Markov Processes.
Dixit, Purushottam D; Jain, Abhinav; Stock, Gerhard; Dill, Ken A
2015-11-10
We are interested in inferring rate processes on networks. In particular, given a network's topology, the stationary populations on its nodes, and a few global dynamical observables, can we infer all the transition rates between nodes? We draw inferences using the principle of maximum caliber (maximum path entropy). We have previously derived results for discrete-time Markov processes. Here, we treat continuous-time processes, such as dynamics among metastable states of proteins. The present work leads to a particularly important analytical result: namely, that when the network is constrained only by a mean jump rate, the rate matrix is given by a square-root dependence of the rate, $k_{ab} \propto (\pi_b/\pi_a)^{1/2}$, on $\pi_a$ and $\pi_b$, the stationary-state populations at nodes a and b. This leads to a fast way to estimate all of the microscopic rates in the system. As an illustration, we show that the method accurately predicts the nonequilibrium transition rates in an in silico gene expression network and transition probabilities among the metastable states of a small peptide at equilibrium. We note also that the method makes sensible predictions for so-called extra-thermodynamic relationships, such as those of Bronsted, Hammond, and others.
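The square-root rule stated in the abstract above can be tried on a toy network. The sketch below is a minimal illustration, not the authors' code: the function name `sqrt_rates`, the `scale` constant (standing in for whatever the mean jump rate fixes), and the three-node path graph are all assumptions.

```python
import math


def sqrt_rates(adjacency, pi, scale=1.0):
    """Build a rate matrix with k_ab = scale * sqrt(pi_b / pi_a) on each edge.

    `adjacency[a][b]` is truthy when a -> b is an allowed transition and
    `pi` holds the stationary populations. Diagonal entries are set so that
    every row sums to zero, as required for a continuous-time rate matrix.
    """
    n = len(pi)
    k = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            if a != b and adjacency[a][b]:
                k[a][b] = scale * math.sqrt(pi[b] / pi[a])
        k[a][a] = -sum(k[a])  # diagonal is still 0.0 when the sum is taken
    return k


# Hypothetical 3-node path graph 0 - 1 - 2 with assumed populations.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
pi = [0.5, 0.3, 0.2]
k = sqrt_rates(adjacency, pi)

# Net probability flow into each node; zero means pi is stationary.
net_flow = [sum(pi[a] * k[a][b] for a in range(3)) for b in range(3)]
```

For a symmetric adjacency, these rates satisfy $\pi_a k_{ab} = \pi_b k_{ba}$, so the supplied populations are stationary under the constructed rate matrix by design.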
NetDiff - Bayesian model selection for differential gene regulatory network inference.
Thorne, Thomas
2016-12-16
Differential networks allow us to better understand the changes in cellular processes that are exhibited in conditions of interest, identifying variations in gene regulation or protein interaction between, for example, cases and controls, or in response to external stimuli. Here we present a novel methodology for the inference of differential gene regulatory networks from gene expression microarray data. Specifically we apply a Bayesian model selection approach to compare models of conserved and varying network structure, and use Gaussian graphical models to represent the network structures. We apply a variational inference approach to the learning of Gaussian graphical models of gene regulatory networks, that enables us to perform Bayesian model selection that is significantly more computationally efficient than Markov Chain Monte Carlo approaches. Our method is demonstrated to be more robust than independent analysis of data from multiple conditions when applied to synthetic network data, generating fewer false positive predictions of differential edges. We demonstrate the utility of our approach on real world gene expression microarray data by applying it to existing data from amyotrophic lateral sclerosis cases with and without mutations in C9orf72, and controls, where we are able to identify differential network interactions for further investigation.
Inference of biological pathway from gene expression profiles by time delay boolean networks.
Directory of Open Access Journals (Sweden)
Tung-Hung Chueh
One great challenge of genomic research is to efficiently and accurately identify complex gene regulatory networks. The development of high-throughput technologies provides abundant experimental data, such as DNA sequences, protein sequences, and RNA expression profiles, which makes it possible to study interactions and regulation among genes or other substances in an organism. However, it is crucial to make inference of genetic regulatory networks from gene expression profiles and protein interaction data for systems biology. This study will develop a new approach to reconstruct time delay boolean networks as a tool for exploring biological pathways. In the inference strategy, we will compare all pairs of input genes in those basic relationships by their corresponding p-scores for every output gene. Then, we will combine those consistent relationships to reveal the most probable relationship and reconstruct the genetic network. Specifically, we will prove that O(log n) state transition pairs are sufficient and necessary to reconstruct the time delay boolean network of n nodes with high accuracy if the number of input genes to each gene is bounded. We have also implemented this method on simulated and empirical yeast gene expression data sets. The test results show that this proposed method is extensible for realistic networks.
The feasibility of genome-scale biological network inference using Graphics Processing Units.
Thiagarajan, Raghuram; Alavi, Amir; Podichetty, Jagdeep T; Bazil, Jason N; Beard, Daniel A
2017-01-01
Systems research spanning fields from biology to finance involves the identification of models to represent the underpinnings of complex systems. Formal approaches for data-driven identification of network interactions include statistical inference-based approaches and methods to identify dynamical systems models that are capable of fitting multivariate data. Availability of large data sets and so-called 'big data' applications in biology present great opportunities as well as major challenges for systems identification/reverse engineering applications. For example, both inverse identification and forward simulations of genome-scale gene regulatory network models pose compute-intensive problems. This issue is addressed here by combining the processing power of Graphics Processing Units (GPUs) and a parallel reverse engineering algorithm for inference of regulatory networks. It is shown that, given an appropriate data set, information on genome-scale networks (systems of 1000 or more state variables) can be inferred using a reverse-engineering algorithm in a matter of days on a small-scale modern GPU cluster.
Inferring Gene Regulatory Networks Using Conditional Regulation Pattern to Guide Candidate Genes.
Xiao, Fei; Gao, Lin; Ye, Yusen; Hu, Yuxuan; He, Ruijie
2016-01-01
Combining path consistency (PC) algorithms with conditional mutual information (CMI) is widely used in the reconstruction of gene regulatory networks. CMI has many advantages over the Pearson correlation coefficient in measuring non-linear dependence to infer gene regulatory networks. It can also discriminate direct regulations from indirect ones. However, it is still a challenge to select the conditional genes in an optimal way, which affects the performance and computational complexity of the PC algorithm. In this study, we develop a novel conditional mutual information-based algorithm, namely RPNI (Regulation Pattern based Network Inference), to infer gene regulatory networks. For conditional gene selection, we define the co-regulation pattern, indirect-regulation pattern and mixture-regulation pattern as three candidate patterns to guide the selection of candidate genes. To demonstrate the potential of our algorithm, we apply it to gene expression data from DREAM challenge. Experimental results show that RPNI outperforms existing conditional mutual information-based methods in both accuracy and time complexity for different sizes of gene samples. Furthermore, the robustness of our algorithm is demonstrated by noisy interference analysis using different types of noise.
Directory of Open Access Journals (Sweden)
Saito Shigeru
2007-01-01
Hepatocellular carcinoma (HCC) in a liver with advanced-stage chronic hepatitis C (CHC) is induced by hepatitis C virus, which chronically infects about 170 million people worldwide. To elucidate the associations between gene groups in hepatocellular carcinogenesis, we analyzed the profiles of the genes characteristically expressed in the CHC and HCC cell stages by a statistical method for inferring the network between gene systems based on the graphical Gaussian model. A systematic evaluation of the inferred network in terms of the biological knowledge revealed that the inferred network was strongly involved in the known gene-gene interactions with high significance, and that the clusters characterized by different cancer-related responses were associated with those of the gene groups related to metabolic pathways and morphological events. Although some relationships in the network remain to be interpreted, the analyses revealed a snapshot of the orchestrated expression of cancer-related groups and some pathways related to metabolism and morphological events in hepatocellular carcinogenesis, and thus provide possible clues on the disease mechanism and insights that address the gap between molecular and clinical assessments.
Reveal, A General Reverse Engineering Algorithm for Inference of Genetic Network Architectures
Liang, Shoudan; Fuhrman, Stefanie; Somogyi, Roland
1998-01-01
Given the imminent gene expression mapping covering whole genomes during development, health and disease, we seek computational methods to maximize functional inference from such large data sets. Is it possible, in principle, to completely infer a complex regulatory network architecture from input/output patterns of its variables? We investigated this possibility using binary models of genetic networks. Trajectories, or state transition tables of Boolean nets, resemble time series of gene expression. By systematically analyzing the mutual information between input states and output states, one is able to infer the sets of input elements controlling each element or gene in the network. This process is unequivocal and exact for complete state transition tables. We implemented this REVerse Engineering ALgorithm (REVEAL) in a C program, and found the problem to be tractable within the conditions tested so far. For n = 50 (elements) and k = 3 (inputs per element), the analysis of incomplete state transition tables (100 state transition pairs out of a possible $10^{15}$) reliably produced the original rule and wiring sets. While this study is limited to synchronous Boolean networks, the algorithm is generalizable to include multi-state models, essentially allowing direct application to realistic biological data sets. The ability to adequately solve the inverse problem may enable in-depth analysis of complex dynamic systems in biology and other fields.
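The mutual-information test described in the abstract above can be sketched on a three-gene toy network. This is an illustrative Python reconstruction under stated assumptions, not the authors' C program: the helper names `entropy` and `find_inputs`, the example update rules, and the `max_k` cutoff are all hypothetical. The idea is that a set of inputs fully determines a gene's next state when the mutual information between those inputs and the output equals the output's entropy.

```python
import math
from collections import Counter
from itertools import combinations, product


def entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def find_inputs(transitions, target, max_k=3):
    """Smallest input set whose mutual information with the target's next
    state equals the entropy of that next state, or None if none is found.

    `transitions` is a list of (state, next_state) tuples of 0/1 values.
    """
    n = len(transitions[0][0])
    outputs = [nxt[target] for _, nxt in transitions]
    h_out = entropy(outputs)
    for k in range(1, max_k + 1):
        for subset in combinations(range(n), k):
            inputs = [tuple(state[i] for i in subset) for state, _ in transitions]
            joint = list(zip(inputs, outputs))
            mutual = entropy(inputs) + h_out - entropy(joint)
            if abs(mutual - h_out) < 1e-9:
                return subset
    return None


# Hypothetical rules: g0' = g1 AND g2, g1' = NOT g0, g2' = g0 XOR g1.
states = list(product((0, 1), repeat=3))
table = [(s, (s[1] & s[2], 1 - s[0], s[0] ^ s[1])) for s in states]
wiring = [find_inputs(table, gene) for gene in range(3)]
```

The example uses a complete state transition table, the case the abstract calls unequivocal and exact. A gene whose next state is constant has zero output entropy, so this sketch would return the first single-gene subset for it; a real implementation would need to treat that case separately.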
Approximation and inference methods for stochastic biochemical kinetics—a tutorial review
Schnoerr, David; Sanguinetti, Guido; Grima, Ramon
2017-03-01
Stochastic fluctuations of molecule numbers are ubiquitous in biological systems. Important examples include gene expression and enzymatic processes in living cells. Such systems are typically modelled as chemical reaction networks whose dynamics are governed by the chemical master equation. Despite its simple structure, no analytic solutions to the chemical master equation are known for most systems. Moreover, stochastic simulations are computationally expensive, making systematic analysis and statistical inference a challenging task. Consequently, significant effort has been spent in recent decades on the development of efficient approximation and inference methods. This article gives an introduction to basic modelling concepts as well as an overview of state of the art methods. First, we motivate and introduce deterministic and stochastic methods for modelling chemical networks, and give an overview of simulation and exact solution methods. Next, we discuss several approximation methods, including the chemical Langevin equation, the system size expansion, moment closure approximations, time-scale separation approximations and hybrid methods. We discuss their various properties and review recent advances and remaining challenges for these methods. We present a comparison of several of these methods by means of a numerical case study and highlight some of their respective advantages and disadvantages. Finally, we discuss the problem of inference from experimental data in the Bayesian framework and review recent methods developed in the literature. In summary, this review gives a self-contained introduction to modelling, approximations and inference methods for stochastic chemical kinetics.
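To make the stochastic simulation of a chemical reaction network concrete, the sketch below applies Gillespie's stochastic simulation algorithm, a standard exact method chosen here for illustration rather than taken from the abstract, to a birth-death process with constant production and first-order degradation. The function names, rate constants, and seed are assumptions.

```python
import random


def gillespie_birth_death(k_prod, k_deg, x0, t_end, rng):
    """Simulate 0 -> X (rate k_prod) and X -> 0 (rate k_deg * x) up to t_end.

    Returns the jump times and the molecule count after each jump.
    """
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod          # propensity of production
        a_deg = k_deg * x        # propensity of degradation
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break                # no reaction can fire
        t += rng.expovariate(a_total)  # waiting time to the next reaction
        if t > t_end:
            break
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


def time_average(times, counts, t_end):
    """Time-weighted mean of a piecewise-constant trajectory on [0, t_end]."""
    total = 0.0
    for i, x in enumerate(counts):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        total += x * (t_next - times[i])
    return total / t_end


rng = random.Random(1)  # fixed seed so the run is reproducible
times, counts = gillespie_birth_death(10.0, 0.5, 20, 2000.0, rng)
mean_count = time_average(times, counts, 2000.0)
```

For this process the stationary distribution is Poisson with mean `k_prod / k_deg`, so the time average should lie close to 20. Every reaction event is simulated individually, which illustrates why the abstract calls stochastic simulation computationally expensive.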
Classical methods for interpreting objective function minimization as intelligent inference
Energy Technology Data Exchange (ETDEWEB)
Golden, R.M. [Univ. of Texas, Dallas, TX (United States)
1996-12-31
Most recognition algorithms and neural networks can be formally viewed as seeking a minimum value of an appropriate objective function during either classification or learning phases. The goal of this paper is to argue that in order to show a recognition algorithm is making intelligent inferences, it is not sufficient to show that the recognition algorithm is computing (or trying to compute) the global minimum of some objective function. One must explicitly define a "relational system" for the recognition algorithm or neural network which identifies: (i) the sample space, (ii) the relevant sigma-field of events generated by the sample space, and (iii) the "relation" for that relational system. Only when such a "relational system" is properly defined is it possible to formally establish the sense in which computing the global minimum of an objective function is an intelligent inference.
Inferring the mesoscale structure of layered, edge-valued, and time-varying networks
Peixoto, Tiago P.
2015-10-01
Many network systems are composed of interdependent but distinct types of interactions, which cannot be fully understood in isolation. These different types of interactions are often represented as layers, attributes on the edges, or as a time dependence of the network structure. Although they are crucial for a more comprehensive scientific understanding, these representations offer substantial challenges. Namely, it is an open problem how to precisely characterize the large or mesoscale structure of network systems in relation to these additional aspects. Furthermore, the direct incorporation of these features invariably increases the effective dimension of the network description, and hence aggravates the problem of overfitting, i.e., the use of overly complex characterizations that mistake purely random fluctuations for actual structure. In this work, we propose a robust and principled method to tackle these problems, by constructing generative models of modular network structure, incorporating layered, attributed and time-varying properties, as well as a nonparametric Bayesian methodology to infer the parameters from data and select the most appropriate model according to statistical evidence. We show that the method is capable of revealing hidden structure in layered, edge-valued, and time-varying networks, and that the most appropriate level of granularity with respect to the additional dimensions can be reliably identified. We illustrate our approach on a variety of empirical systems, including a social network of physicians, the voting correlations of deputies in the Brazilian national congress, the global airport network, and a proximity network of high-school students.
Perkins, Sarah E; Cagnacci, Francesca; Stradiotto, Anna; Arnoldi, Daniele; Hudson, Peter J
2009-09-01
1. Social network analyses tend to focus on human interactions. However, there is a burgeoning interest in applying graph theory to ecological data from animal populations. Here we show how radio-tracking and capture-mark-recapture data collated from wild rodent populations can be used to generate contact networks. 2. Both radio-tracking and capture-mark-recapture were undertaken simultaneously. Contact networks were derived and the following statistics estimated: mean-contact rate, edge distribution, connectance and centrality. 3. Capture-mark-recapture networks produced more informative and complete networks when the rodent density was high and radio-tracking produced more informative networks when the density was low. Different data collection methods provide more data when certain ecological characteristics of the population prevail. 4. Both sets of data produced networks with comparable edge (contact) distributions that were best described by a negative binomial distribution. Connectance and closeness were statistically different between the two data sets. Only betweenness was comparable. The differences between the networks have important consequences for the transmission of infectious diseases. Care should be taken when extrapolating social networks to transmission networks for inferring disease dynamics.
Kimura, Shuhei; Sato, Masanao; Okada-Hatakeyama, Mariko
2013-01-01
The inference of a genetic network is a problem in which mutual interactions among genes are inferred from time-series of gene expression levels. While a number of models have been proposed to describe genetic networks, this study focuses on a mathematical model proposed by Vohradský. Because of its advantageous features, several researchers have proposed inference methods based on Vohradský's model. When trying to analyze large-scale networks consisting of dozens of genes, however, these methods must solve high-dimensional non-linear function optimization problems. In order to resolve the difficulty of estimating the parameters of Vohradský's model, this study proposes a new method that defines the problem as several two-dimensional function optimization problems. Through numerical experiments on artificial genetic network inference problems, we showed that, although the computation time of the proposed method is not the shortest, the method has the ability to estimate parameters of Vohradský's models more effectively with sufficiently short computation times. This study then applied the proposed method to an actual inference problem of the bacterial SOS DNA repair system, and succeeded in finding several reasonable regulations.
Instrumental variable methods for causal inference.
Baiocchi, Michael; Cheng, Jing; Small, Dylan S
2014-06-15
A goal of many health studies is to determine the causal effect of a treatment or intervention on health outcomes. Often, it is not ethically or practically possible to conduct a perfectly randomized experiment, and instead, an observational study must be used. A major challenge to the validity of observational studies is the possibility of unmeasured confounding (i.e., unmeasured ways in which the treatment and control groups differ before treatment administration, which also affect the outcome). Instrumental variables analysis is a method for controlling for unmeasured confounding. This type of analysis requires the measurement of a valid instrumental variable, which is a variable that (i) is independent of the unmeasured confounding; (ii) affects the treatment; and (iii) affects the outcome only indirectly through its effect on the treatment. This tutorial discusses the types of causal effects that can be estimated by instrumental variables analysis; the assumptions needed for instrumental variables analysis to provide valid estimates of causal effects and sensitivity analysis for those assumptions; methods of estimation of causal effects using instrumental variables; and sources of instrumental variables in health studies. Copyright © 2014 John Wiley & Sons, Ltd.
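The three instrument conditions listed above can be made concrete with a small sketch. It assumes a single binary instrument and a linear outcome model, and uses the ratio (Wald) estimator cov(Z, Y) / cov(Z, X); the toy data, the variable names and the effect sizes are illustrative assumptions, not drawn from the tutorial.

```python
def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)


def iv_estimate(z, x, y):
    """Ratio (Wald) estimator: effect of x on y using instrument z."""
    return covariance(z, y) / covariance(z, x)


def ols_estimate(x, y):
    """Naive regression slope of y on x, ignoring confounding."""
    return covariance(x, y) / covariance(x, x)


# Toy data (assumed): u is an unmeasured confounder, z is an instrument
# that is independent of u and affects y only through the treatment x.
z = [0, 0, 1, 1] * 25
u = [0, 1, 0, 1] * 25
x = [zi + ui for zi, ui in zip(z, u)]          # treatment depends on z and u
y = [2 * xi + 3 * ui for xi, ui in zip(x, u)]  # true causal effect of x is 2

iv_slope = iv_estimate(z, x, y)
ols_slope = ols_estimate(x, y)
```

In this constructed example the naive slope is inflated by the confounder, while the instrument-based ratio recovers the effect built into the data.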
Knowledge-guided fuzzy logic modeling to infer cellular signaling networks from proteomic data
Liu, Hui; Zhang, Fan; Mishra, Shital Kumar; Zhou, Shuigeng; Zheng, Jie
2016-10-01
Modeling of signaling pathways is crucial for understanding and predicting cellular responses to drug treatments. However, canonical signaling pathways curated from literature are seldom context-specific and thus can hardly predict cell type-specific response to external perturbations; purely data-driven methods also have drawbacks such as limited biological interpretability. Therefore, hybrid methods that can integrate prior knowledge and real data for network inference are highly desirable. In this paper, we propose a knowledge-guided fuzzy logic network model to infer signaling pathways by exploiting both prior knowledge and time-series data. In particular, the dynamic time warping algorithm is employed to measure the goodness of fit between experimental and predicted data, so that our method can model temporally-ordered experimental observations. We evaluated the proposed method on a synthetic dataset and two real phosphoproteomic datasets. The experimental results demonstrate that our model can uncover drug-induced alterations in signaling pathways in cancer cells. Compared with existing hybrid models, our method can model feedback loops so that the dynamical mechanisms of signaling networks can be uncovered from time-series data. By calibrating generic models of signaling pathways against real data, our method supports precise predictions of context-specific anticancer drug effects, which is an important step towards precision medicine.
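The abstract above uses dynamic time warping to score the fit between experimental and predicted time series. A minimal sketch of the textbook DTW recurrence follows, assuming scalar observations and an absolute-difference local cost; it is a generic illustration, not the authors' implementation.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# A series and a time-stretched copy of it align with zero cost.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

Because the warping path may repeat samples, a delayed or stretched response is not penalised the way a pointwise squared error would penalise it.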
Penfold, Christopher A; Shifaz, Ahmed; Brown, Paul E; Nicholson, Ann; Wild, David L
2015-06-01
Here we introduce the causal structure identification (CSI) package, a Gaussian process-based approach to inferring gene regulatory networks (GRNs) from multiple time series data. The standard CSI approach infers a single GRN via joint learning from multiple time series datasets; the hierarchical approach (HCSI) infers a separate GRN for each dataset, albeit with the networks constrained to favor similar structures, allowing for the identification of context-specific networks. The software is implemented in MATLAB and includes a graphical user interface (GUI) for user-friendly inference. Finally, the GUI can be connected to high performance computer clusters to facilitate analysis of large genomic datasets.
Inferring low-dimensional microstructure representations using convolutional neural networks
Lubbers, Nicholas; Lookman, Turab; Barros, Kipton
2017-11-01
We apply recent advances in machine learning and computer vision to a central problem in materials informatics: the statistical representation of microstructural images. We use activations in a pretrained convolutional neural network to provide a high-dimensional characterization of a set of synthetic microstructural images. Next, we use manifold learning to obtain a low-dimensional embedding of this statistical characterization. We show that the low-dimensional embedding extracts the parameters used to generate the images. According to a variety of metrics, the convolutional neural network method yields dramatically better embeddings than the analogous method derived from two-point correlations alone.
Predictive minimum description length principle approach to inferring gene regulatory networks.
Chaitankar, Vijender; Zhang, Chaoyang; Ghosh, Preetam; Gong, Ping; Perkins, Edward J; Deng, Youping
2011-01-01
Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is to determine the threshold that defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as a control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we propose a new inference algorithm that incorporates mutual information (MI), conditional mutual information (CMI), and the predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes, and the PMDL principle method attempts to determine the best MI threshold without the need for a user-specified fine tuning parameter. The performance of the proposed algorithm is evaluated using both synthetic time series data sets and a biological time series data set (Saccharomyces cerevisiae). The results show that the proposed algorithm produced fewer false edges and significantly improved the precision when compared to the existing MDL algorithm.
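The two information-theoretic quantities used in the abstract above, MI and CMI, can be sketched as follows. The sketch assumes expression values have already been discretised and uses plug-in (empirical) entropies in nats; the gene vectors are hypothetical toy data, and the PMDL threshold selection itself is not reproduced here.

```python
from collections import Counter
from math import log


def entropy(*columns):
    """Plug-in joint entropy (nats) of one or more discrete sequences."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log(c / n) for c in Counter(rows).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


# Toy chain (assumed): gene_a influences gene_b, and gene_c copies gene_b.
gene_a = [0, 0, 0, 1, 1, 1, 0, 1]
gene_b = [0, 0, 0, 1, 1, 1, 1, 0]
gene_c = list(gene_b)

mi_ac = mutual_information(gene_a, gene_c)
cmi_ac_given_b = conditional_mutual_information(gene_a, gene_c, gene_b)
```

Here gene_a and gene_c share information, yet the CMI given gene_b vanishes, which is the kind of evidence used to discard an indirect edge once the intermediate gene is accounted for.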
Recursive regularization for inferring gene networks from time-course gene expression profiles
Directory of Open Access Journals (Sweden)
Nagasaki Masao
2009-04-01
Background: Inferring gene networks from time-course microarray experiments with a vector autoregressive (VAR) model is the process of identifying functional associations between genes through multivariate time series. This problem can be cast as a variable selection problem in statistics. One of the promising methods for variable selection is the elastic net proposed by Zou and Hastie (2005). However, VAR modeling with the elastic net succeeds in increasing the number of true positives while it also results in increasing the number of false positives. Results: By incorporating relative importance of the VAR coefficients into the elastic net, we propose a new class of regularization, called recursive elastic net, to increase the capability of the elastic net and estimate gene networks based on the VAR model. The recursive elastic net can reduce the number of false positives gradually by updating the importance. Numerical simulations and comparisons demonstrate that the proposed method succeeds in reducing the number of false positives drastically while keeping the high number of true positives in the network inference and achieves two or more times higher true discovery rate (the proportion of true positives among the selected edges) than the competing methods even when the number of time points is small. We also compared our method with various reverse-engineering algorithms on experimental data of MCF-7 breast cancer cells stimulated with two ErbB ligands, EGF and HRG. Conclusion: The recursive elastic net is a powerful tool for inferring gene networks from time-course gene expression profiles.
Wang, Zhuo; Danziger, Samuel A; Heavner, Benjamin D; Ma, Shuyi; Smith, Jennifer J; Li, Song; Herricks, Thurston; Simeonidis, Evangelos; Baliga, Nitin S; Aitchison, John D; Price, Nathan D
2017-05-01
Gene regulatory and metabolic network models have been used successfully in many organisms, but inherent differences between them make networks difficult to integrate. Probabilistic Regulation Of Metabolism (PROM) provides a partial solution, but it does not incorporate network inference and underperforms in eukaryotes. We present an Integrated Deduced And Metabolism (IDREAM) method that combines statistically inferred Environment and Gene Regulatory Influence Network (EGRIN) models with the PROM framework to create enhanced metabolic-regulatory network models. We used IDREAM to predict phenotypes and genetic interactions between transcription factors and genes encoding metabolic activities in the eukaryote, Saccharomyces cerevisiae. IDREAM models contain many fewer interactions than PROM and yet produce significantly more accurate growth predictions. IDREAM consistently outperformed PROM using any of three popular yeast metabolic models and across three experimental growth conditions. Importantly, IDREAM's enhanced accuracy makes it possible to identify subtle synthetic growth defects. With experimental validation, these novel genetic interactions involving the pyruvate dehydrogenase complex suggested a new role for fatty acid-responsive factor Oaf1 in regulating acetyl-CoA production in glucose grown cells.
Approximation and inference methods for stochastic biochemical kinetics - a tutorial review
Schnoerr, David; Grima, Ramon
2016-01-01
Stochastic fluctuations of molecule numbers are ubiquitous in biological systems. Important examples include gene expression and enzymatic processes in living cells. Such systems are typically modelled as chemical reaction networks whose dynamics are governed by the Chemical Master Equation. Despite its simple structure, no analytic solutions to the Chemical Master Equation are known for most systems. Moreover, stochastic simulations are computationally expensive, making systematic analysis and statistical inference a challenging task. Consequently, significant effort has been spent in recent decades on the development of efficient approximation and inference methods. This article gives an introduction to basic modelling concepts as well as an overview of state of the art methods. First, we motivate and introduce deterministic and stochastic models for chemical networks, and give an overview of simulation and exact solution methods. Next, we discuss several approximation methods, including the chemical Langev...
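The stochastic simulations mentioned in the abstract above are commonly carried out with Gillespie's stochastic simulation algorithm, which samples exact trajectories of the Chemical Master Equation. A minimal sketch for a birth-death process follows; the reaction system, the rate constants and the function name are illustrative assumptions rather than material from the review.

```python
import random


def gillespie_birth_death(k_birth, k_death, n0, t_end, rng):
    """Simulate 0 -> X (rate k_birth) and X -> 0 (rate k_death * n)."""
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_birth = k_birth        # propensity of the birth reaction
        a_death = k_death * n    # propensity of the death reaction
        a_total = a_birth + a_death
        if a_total == 0:
            break                # no reaction can fire any more
        # The waiting time to the next reaction is exponential.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_birth:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


times, counts = gillespie_birth_death(
    k_birth=5.0, k_death=0.5, n0=0, t_end=10.0, rng=random.Random(0)
)
```

Each run yields one sample path, so estimating a distribution requires many repetitions, which is the computational cost that motivates the approximation methods the review surveys.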
Connectivity in the yeast cell cycle transcription network: inferences from neural networks.
Directory of Open Access Journals (Sweden)
Christopher E Hart
2006-12-01
A current challenge is to develop computational approaches to infer gene network regulatory relationships based on multiple types of large-scale functional genomic data. We find that single-layer feed-forward artificial neural network (ANN) models can effectively discover gene network structure by integrating global in vivo protein:DNA interaction data (ChIP/Array) with genome-wide microarray RNA data. We test this on the yeast cell cycle transcription network, which is composed of several hundred genes with phase-specific RNA outputs. These ANNs were robust to noise in data and to a variety of perturbations. They reliably identified and ranked 10 of 12 known major cell cycle factors at the top of a set of 204, based on a sum-of-squared weights metric. Comparative analysis of motif occurrences among multiple yeast species independently confirmed relationships inferred from ANN weights analysis. ANN models can capitalize on properties of biological gene networks that other kinds of models do not. ANNs naturally take advantage of patterns of absence, as well as presence, of factor binding associated with specific expression output; they are easily subjected to in silico "mutation" to uncover biological redundancies; and they can use the full range of factor binding values. A prominent feature of cell cycle ANNs suggested an analogous property might exist in the biological network. This postulated that "network-local discrimination" occurs when regulatory connections (here between MBF and target genes) are explicitly disfavored in one network module (G2), relative to others and to the class of genes outside the mitotic network. If correct, this predicts that MBF motifs will be significantly depleted from the discriminated class and that the discrimination will persist through evolution. Analysis of distantly related Schizosaccharomyces pombe confirmed this, suggesting that network-local discrimination is real and complements well-known enrichment of
Inference of the genetic network regulating lateral root initiation in Arabidopsis thaliana.
Muraro, Daniele; Voß, Ute; Wilson, Michael; Bennett, Malcolm; Byrne, Helen; De Smet, Ive; Hodgman, Charlie; King, John
2013-01-01
Regulation of gene expression is crucial for organism growth, and it is one of the challenges in systems biology to reconstruct the underlying regulatory biological networks from transcriptomic data. The formation of lateral roots in Arabidopsis thaliana is stimulated by a cascade of regulators of which only the interactions of its initial elements have been identified. Using simulated gene expression data with known network topology, we compare the performance of inference algorithms, based on different approaches, for which ready-to-use software is available. We show that their performance improves with the network size and the inclusion of mutants. We then analyze two sets of genes, whose activity is likely to be relevant to lateral root initiation in Arabidopsis, and assess causality of their regulatory interactions by integrating sequence analysis with the intersection of the results of the best performing methods on time series and mutants. The methods applied capture known interactions between genes that are candidate regulators at early stages of development. The network inferred from genes significantly expressed during lateral root formation exhibits distinct scale free, small world and hierarchical properties and the nodes with a high out-degree may warrant further investigation.
Inference of the Genetic Network Regulating Lateral Root Initiation in Arabidopsis thaliana
Muraro, D.
2013-01-01
Regulation of gene expression is crucial for organism growth, and it is one of the challenges in systems biology to reconstruct the underlying regulatory biological networks from transcriptomic data. The formation of lateral roots in Arabidopsis thaliana is stimulated by a cascade of regulators of which only the interactions of its initial elements have been identified. Using simulated gene expression data with known network topology, we compare the performance of inference algorithms, based on different approaches, for which ready-to-use software is available. We show that their performance improves with the network size and the inclusion of mutants. We then analyze two sets of genes, whose activity is likely to be relevant to lateral root initiation in Arabidopsis, and assess causality of their regulatory interactions by integrating sequence analysis with the intersection of the results of the best performing methods on time series and mutants. The methods applied capture known interactions between genes that are candidate regulators at early stages of development. The network inferred from genes significantly expressed during lateral root formation exhibits distinct scale free, small world and hierarchical properties and the nodes with a high out-degree may warrant further investigation. © 2004-2012 IEEE.
Jeon, Gyuhyeon
2016-01-01
A common form of competition is one where judges grade contestants' performances, which are then compiled to determine the final ranking of the contestants. Unlike in another common form of competition where two contestants play a head-to-head match to produce a winner, as in football or basketball, the objectivity of judges is prone to be questioned, potentially undermining the public's trust in the fairness of the competition. In this work we show, by modeling the judge-contestant competition as a weighted bipartite network, how we can identify biased scores and measure their impact on our inference of the network structure. Analyzing the prestigious International Chopin Piano Competition of 2015 with a well-publicized scoring controversy as an example, we show that even a single statistically uncharacteristic score can be enough to gravely distort our inference of the community structure, demonstrating the importance of detecting and eliminating biases. In the process we also find that there does not exist...
Comparative Study of Inference Methods for Bayesian Nonnegative Matrix Factorisation
DEFF Research Database (Denmark)
Brouwer, Thomas; Frellsen, Jes; Liò, Pietro
2017-01-01
In this paper, we study the trade-offs of different inference approaches for Bayesian matrix factorisation methods, which are commonly used for predicting missing values, and for finding patterns in the data. In particular, we consider Bayesian nonnegative variants of matrix factorisation and tri...
Explanation in causal inference methods for mediation and interaction
VanderWeele, Tyler
2015-01-01
A comprehensive examination of methods for mediation and interaction, VanderWeele's book is the first to approach this topic from the perspective of causal inference. Numerous software tools are provided, and the text is both accessible and easy to read, with examples drawn from diverse fields. The result is an essential reference for anyone conducting empirical research in the biomedical or social sciences.
Inferring phage-bacteria infection networks from time-series data.
Jover, Luis F; Romberg, Justin; Weitz, Joshua S
2016-11-01
In communities with bacterial viruses (phage) and bacteria, the phage-bacteria infection network establishes which virus types infect which host types. The structure of the infection network is a key element in understanding community dynamics. Yet, this infection network is often difficult to ascertain. Introduced over 60 years ago, the plaque assay remains the gold standard for establishing who infects whom in a community. This culture-based approach does not scale to environmental samples with increased levels of phage and bacterial diversity, much of which is currently unculturable. Here, we propose an alternative method of inferring phage-bacteria infection networks. This method uses time-series data of fluctuating population densities to estimate the complete interaction network without having to test each phage-bacteria pair individually. We use in silico experiments to analyse the factors affecting the quality of network reconstruction and find robust regimes where accurate reconstructions are possible. In addition, we present a multi-experiment approach where time series from different experiments are combined to improve estimates of the infection network. This approach also mitigates against the possibility of evolutionary changes to relevant phenotypes during the time course of measurement.
Inferring phage–bacteria infection networks from time-series data
Jover, Luis F.; Romberg, Justin
2016-01-01
In communities with bacterial viruses (phage) and bacteria, the phage–bacteria infection network establishes which virus types infect which host types. The structure of the infection network is a key element in understanding community dynamics. Yet, this infection network is often difficult to ascertain. Introduced over 60 years ago, the plaque assay remains the gold standard for establishing who infects whom in a community. This culture-based approach does not scale to environmental samples with increased levels of phage and bacterial diversity, much of which is currently unculturable. Here, we propose an alternative method of inferring phage–bacteria infection networks. This method uses time-series data of fluctuating population densities to estimate the complete interaction network without having to test each phage–bacteria pair individually. We use in silico experiments to analyse the factors affecting the quality of network reconstruction and find robust regimes where accurate reconstructions are possible. In addition, we present a multi-experiment approach where time series from different experiments are combined to improve estimates of the infection network. This approach also mitigates against the possibility of evolutionary changes to relevant phenotypes during the time course of measurement. PMID:28018655
A novel gene network inference algorithm using predictive minimum description length approach.
Chaitankar, Vijender; Ghosh, Preetam; Perkins, Edward J; Gong, Ping; Deng, Youping; Zhang, Chaoyang
2010-05-28
Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is to determine the threshold which defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as a control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we proposed a new inference algorithm which incorporated mutual information (MI), conditional mutual information (CMI) and the predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes, and the PMDL principle method attempts to determine the best MI threshold without the need for a user-specified fine tuning parameter. The performance of the proposed algorithm was evaluated using both synthetic time series data sets and a biological time series data set for the yeast Saccharomyces cerevisiae. The benchmark quantities precision and recall were used as performance measures. The results show that the proposed algorithm produced fewer false edges and significantly improved the precision, as compared to the existing algorithm. For further analysis, the performance of the algorithms was observed over different sizes of data. We have proposed a new algorithm that implements the PMDL principle for inferring gene regulatory networks from time series DNA microarray data and that eliminates the need for a fine tuning parameter. The evaluation results obtained from both synthetic and actual biological data sets show that the
Energy Technology Data Exchange (ETDEWEB)
Castro, Adriana R. Garcez; Miranda, Vladimiro [Instituto de Engenharia de Sistemas e Computadores do Porto, INESC Porto (Portugal)
2005-12-01
An artificial neural network concept has been developed for transformer fault diagnosis using dissolved gas-in-oil analysis (DGA). A new methodology for mapping the neural network into a rule-based inference system is described. This mapping makes explicit the knowledge implicitly captured by the neural network during the learning stage, by transforming it into a Fuzzy Inference System. Some studies are reported, illustrating the good results obtained. (author)
Guo, Wenbin; Calixto, Cristiane P G; Tzioutziou, Nikoleta; Lin, Ping; Waugh, Robbie; Brown, John W S; Zhang, Runxuan
2017-06-19
Co-expression has been widely used to identify novel regulatory relationships using high throughput measurements, such as microarray and RNA-seq data. Evaluation studies on co-expression network analysis methods mostly focus on networks of small or medium size of up to a few hundred nodes. For large networks, simulated expression data usually consist of hundreds or thousands of profiles with different perturbations or knock-outs, which is uncommon in real experiments due to their cost and the amount of work required. Thus, the performances of co-expression network analysis methods on large co-expression networks consisting of a few thousand nodes, with only a small number of profiles with a single perturbation, which more accurately reflect normal experimental conditions, are generally uncharacterized and unknown. We proposed a novel network inference method based on Relevance Low order Partial Correlation (RLowPC). The RLowPC method uses a two-step approach to select the high-confidence edges: it first reduces the search space by picking only the top ranked genes from an initial partial correlation analysis, and then computes the partial correlations in the confined search space by removing only the linear dependencies from the shared neighbours, largely ignoring the genes showing lower association. We selected six co-expression-based methods with good performance in evaluation studies from the literature: Partial correlation, PCIT, ARACNE, MRNET, MRNETB and CLR. The evaluation of these methods was carried out on simulated time-series data with various network sizes ranging from 100 to 3000 nodes. Simulation results show low precision and recall for all of the above methods for large networks with a small number of expression profiles. We improved the inference significantly by refinement of the top weighted edges in the pre-inferred partial correlation networks using RLowPC. We found improved performance by partitioning large networks into smaller co
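Low-order partial correlation, the building block named in the abstract above, can be illustrated with the standard first-order formula r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The sketch below is a generic illustration on hypothetical toy profiles; it is not the RLowPC procedure, which additionally ranks genes and restricts the conditioning set to shared neighbours.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy profiles (assumed): a shared regulator drives two targets, each of
# which also carries its own independent variation.
regulator = [-1, -1, -1, -1, 1, 1, 1, 1]
noise_1 = [-1, -1, 1, 1, -1, -1, 1, 1]
noise_2 = [-1, 1, -1, 1, -1, 1, -1, 1]
target_1 = [r + e for r, e in zip(regulator, noise_1)]
target_2 = [r + e for r, e in zip(regulator, noise_2)]

raw = pearson(target_1, target_2)
controlled = partial_correlation(target_1, target_2, regulator)
```

The two targets are correlated in the raw data, but the association disappears once the shared regulator is removed, so no direct edge between them would be inferred.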
Inferring animal social networks and leadership: applications for passive monitoring arrays.
Jacoby, David M P; Papastamatiou, Yannis P; Freeman, Robin
2016-11-01
Analyses of animal social networks have frequently benefited from techniques derived from other disciplines. Recently, machine learning algorithms have been adopted to infer social associations from time-series data gathered using remote, telemetry systems situated at provisioning sites. We adapt and modify existing inference methods to reveal the underlying social structure of wide-ranging marine predators moving through spatial arrays of passive acoustic receivers. From six months of tracking data for grey reef sharks (Carcharhinus amblyrhynchos) at Palmyra atoll in the Pacific Ocean, we demonstrate that some individuals emerge as leaders within the population and that this behavioural coordination is predicted by both sex and the duration of co-occurrences between conspecifics. In doing so, we provide the first evidence of long-term, spatially extensive social processes in wild sharks. To achieve these results, we interrogate simulated and real tracking data with the explicit purpose of drawing attention to the key considerations in the use and interpretation of inference methods and their impact on resultant social structure. We provide a modified translation of the GMMEvents method for R, including new analyses quantifying the directionality and duration of social events with the aim of encouraging the careful use of these methods more widely in less tractable social animal systems but where passive telemetry is already widespread. © 2016 The Authors.
Probabilistic reasoning in intelligent systems networks of plausible inference
Pearl, Judea
1988-01-01
Probabilistic Reasoning in Intelligent Systems is a complete and accessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty. The author provides a coherent explication of probability as a language for reasoning with partial belief and offers a unifying perspective on other AI approaches to uncertainty, such as the Dempster-Shafer formalism, truth maintenance systems, and nonmonotonic logic. The author distinguishes syntactic and semantic approaches to uncertainty--and offers techniques, based on belief networks, that provid
Wang, Chen; Xuan, Jianhua; Shih, Ie-Ming; Clarke, Robert; Wang, Yue
2012-08-01
With the advent of high-throughput biotechnology capable of monitoring genomic signals, it becomes increasingly promising to understand molecular cellular mechanisms through systems biology approaches. One of the active research topics in systems biology is to infer gene transcriptional regulatory networks using various genomic data; this inference problem can be formulated as a linear model with latent signals associated with some regulatory proteins called transcription factors (TFs). As common statistical assumptions may not hold for genomic signals, typical latent variable algorithms such as independent component analysis (ICA) are incapable of revealing the true underlying regulatory signals. Liao et al. [1] proposed to perform inference using an approach named network component analysis (NCA), the optimization of which is achieved by a least-squares fitting approach with biological knowledge constraints. However, the incompleteness of biological knowledge and its inconsistency with gene expression data are not considered in the original NCA solution, which could greatly affect the inference accuracy. To overcome these limitations, we propose a linear extraction scheme, namely regulatory component analysis (RCA), to infer underlying regulatory signals even with partial biological knowledge. Numerical simulations show a significant improvement of our proposed RCA over NCA, not only when the signal-to-noise ratio (SNR) is low, but also when the given biological knowledge is incomplete and inconsistent with gene expression data. Furthermore, real biological experiments on E. coli are performed for regulatory network inference in comparison with several typical linear latent variable methods, which again demonstrates the effectiveness and improved performance of the proposed algorithm.
Inferring causal phenotype networks using structural equation models
Directory of Open Access Journals (Sweden)
de los Campos Gustavo
2011-02-01
Phenotypic traits may exert causal effects between them. For example, on the one hand, high yield in dairy cows may increase the liability to certain diseases and, on the other hand, the incidence of a disease may affect yield negatively. Likewise, the transcriptome may be a function of the reproductive status in mammals and the latter may depend on other physiological variables. Knowledge of phenotype networks describing such interrelationships can be used to predict the behavior of complex systems, e.g. biological pathways underlying complex traits such as diseases, growth and reproduction. Structural Equation Models (SEM) can be used to study recursive and simultaneous relationships among phenotypes in multivariate systems such as genetical genomics, system biology, and multiple trait models in quantitative genetics. Hence, SEM can produce an interpretation of relationships among traits which differs from that obtained with traditional multiple trait models, in which all relationships are represented by symmetric linear associations among random variables, such as covariances and correlations. In this review, we discuss the application of SEM and related techniques for the study of multiple phenotypes. Two basic scenarios are considered, one pertaining to genetical genomics studies, in which QTL or molecular marker information is used to facilitate causal inference, and another related to quantitative genetic analysis in livestock, in which only phenotypic and pedigree information is available. Advantages and limitations of SEM compared to traditional approaches commonly used for the analysis of multiple traits, as well as some indication of future research in this area are presented in a concluding section.
Directory of Open Access Journals (Sweden)
Ricardo de Matos Simoes
The inference of gene regulatory networks from gene expression data is a difficult problem because the performance of the inference algorithms depends on a multitude of different factors. In this paper we study two of these. First, we investigate the influence of discrete mutual information (MI) estimators on the global and local network inference performance of the C3NET algorithm. More precisely, we study 4 different MI estimators (Empirical, Miller-Madow, Shrink and Schürmann-Grassberger) in combination with 3 discretization methods (equal frequency, equal width and global equal width discretization). We observe the best global and local inference performance of C3NET for the Miller-Madow estimator with an equal width discretization. Second, our numerical analysis can be considered as a systems approach because we simulate gene expression data from an underlying gene regulatory network, instead of making a distributional assumption to sample thereof. We demonstrate that despite the popularity of the latter approach, which is the traditional way of studying MI estimators, this is in fact not supported by simulated and biological expression data because of their heterogeneity. Hence, our study provides guidance for an efficient design of a simulation study in the context of network inference, supporting a systems approach.
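The combination reported as best in the abstract above, a Miller-Madow estimator on equal-width bins, can be sketched in a few lines. This is a minimal illustration, not the C3NET implementation; the function names and the default of four bins are assumptions made for the example.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each value to a bin index using n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy_miller_madow(labels):
    """Plug-in entropy (nats) plus the Miller-Madow correction (m - 1) / (2N)."""
    n = len(labels)
    counts = Counter(labels)
    h_plugin = -sum((c / n) * math.log(c / n) for c in counts.values())
    # m is the number of non-empty bins, N the number of samples.
    return h_plugin + (len(counts) - 1) / (2 * n)


def mutual_information_mm(x, y, n_bins=4):
    """Estimate I(X;Y) = H(X) + H(Y) - H(X,Y) on equal-width bins."""
    bx = equal_width_bins(x, n_bins)
    by = equal_width_bins(y, n_bins)
    joint = list(zip(bx, by))
    return (entropy_miller_madow(bx) + entropy_miller_madow(by)
            - entropy_miller_madow(joint))
```

In a network inference setting, `mutual_information_mm` would be evaluated for every gene pair to fill the MI matrix that an algorithm such as C3NET takes as input.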
Multi-Agent Inference in Social Networks: A Finite Population Learning Approach.
Fan, Jianqing; Tong, Xin; Zeng, Yao
When people in a society want to make inference about some parameter, each person may want to use data collected by other people. Information (data) exchange in social networks is usually costly, so to make reliable statistical decisions, people need to trade off the benefits and costs of information acquisition. Conflicts of interests and coordination problems will arise in the process. Classical statistics does not consider people's incentives and interactions in the data collection process. To address this imperfection, this work explores multi-agent Bayesian inference problems with a game theoretic social network model. Motivated by our interest in aggregate inference at the societal level, we propose a new concept, finite population learning, to address whether with high probability, a large fraction of people in a given finite population network can make "good" inference. Serving as a foundation, this concept enables us to study the long run trend of aggregate inference quality as population grows.
RulNet: A Web-Oriented Platform for Regulatory Network Inference, Application to Wheat -Omics Data.
Vincent, Jonathan; Martre, Pierre; Gouriou, Benjamin; Ravel, Catherine; Dai, Zhanwu; Petit, Jean-Marc; Pailloux, Marie
2015-01-01
With the increasing amount of -omics data available, a particular effort has to be made to provide suitable analysis tools. A major challenge is that of unraveling the molecular regulatory networks from massive and heterogeneous datasets. Here we describe RulNet, a web-oriented platform dedicated to the inference and analysis of regulatory networks from qualitative and quantitative -omics data by means of rule discovery. Queries for rule discovery can be written in an extended form of the RQL query language, which has a syntax similar to SQL. RulNet also offers users interactive features that progressively adjust and refine the inferred networks. In this paper, we present a functional characterization of RulNet and compare inferred networks with correlation-based approaches. The performance of RulNet has been evaluated using the three benchmark datasets used for the transcriptional network inference challenge DREAM5. Overall, RulNet performed as well as the best methods that participated in this challenge and it was shown to behave more consistently when compared across the three datasets. Finally, we assessed the suitability of RulNet to analyze experimental -omics data and to infer regulatory networks involved in the response to nitrogen and sulfur supply in wheat (Triticum aestivum L.) grains. The results highlight putative actors governing the response to nitrogen and sulfur supply in wheat grains. We evaluate the main characteristics and features of RulNet as an all-in-one solution for RN inference, visualization and editing. Using simple yet powerful RulNet queries allowed RNs involved in the adaptation of wheat grain to N and S supply to be discovered. We demonstrate the effectiveness and suitability of RulNet as a platform for the analysis of RNs involving different types of -omics data. The results are promising since they are consistent with what was previously established by the scientific community.
Inferring Rationales from Choice : Identification for Rational Shortlist Methods
Rohan Dutta; Sean Horan
2013-01-01
A wide variety of choice behavior inconsistent with preference maximization can be explained by Manzini and Mariotti's Rational Shortlist Methods. Choices are made by sequentially applying a pair of asymmetric binary relations (rationales) to eliminate inferior alternatives. Manzini and Mariotti's axiomatic treatment elegantly describes which behavior can be explained by this model. However, it leaves unanswered what can be inferred, from observed behavior, about the underlying rationales. Es...
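The two-stage procedure described in the abstract above, in which two rationales are applied in sequence to eliminate alternatives, can be sketched as follows. This is only an illustration of the choice rule; the representation of a rationale as a set of (better, worse) pairs and the function names are assumptions made for the example.

```python
def maximal(alternatives, dominates):
    """Keep the alternatives that no other member of the set dominates."""
    return {a for a in alternatives
            if not any((b, a) in dominates for b in alternatives)}


def rational_shortlist_choice(menu, p1, p2):
    """Apply rationale p1 to form a shortlist, then rationale p2 to the shortlist."""
    shortlist = maximal(set(menu), p1)
    return maximal(shortlist, p2)


# Hypothetical rationales over three alternatives.
P1 = {("c", "a")}              # first stage: c eliminates a
P2 = {("a", "b"), ("b", "c")}  # second stage: a beats b, b beats c
```

With these rationales, pairwise choices cycle (a from {a, b}, b from {b, c}, c from {a, c}), which no single preference ordering can produce, yet the two-stage rule accounts for them.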
Directory of Open Access Journals (Sweden)
Markku O. Kuismin
2017-10-01
Estimation of genetic population structure based on molecular markers is a common task in population genetics and ecology. We apply a generalized linear model with LASSO regularization to infer relationships between individuals and populations from molecular marker data. Specifically, we apply a neighborhood selection algorithm to infer population genetic structure and gene flow between populations. The resulting relationships are used to construct an individual-level population graph. Different network substructures known as communities are then dissociated from each other using a community detection algorithm. Inference of population structure using networks combines the good properties of: (i) network theory (broad collection of tools, including aesthetically pleasing visualization), (ii) principal component analysis (dimension reduction together with simple visual inspection), and (iii) model-based methods (e.g., ancestry coefficient estimates). We have named our process CONE (for community oriented network estimation). CONE has fewer restrictions than conventional assignment methods in that properties such as the number of subpopulations need not be fixed before the analysis and the sample may include close relatives or involve uneven sampling. Applying CONE on simulated data sets resulted in more accurate estimates of the true number of subpopulations than model-based methods, and provided comparable ancestry coefficient estimates. Inference of empirical data sets of teosinte single nucleotide polymorphism, bacterial disease outbreak, and the human genome diversity panel illustrate that population structures estimated with CONE are consistent with the earlier findings.
Kuismin, Markku O; Ahlinder, Jon; Sillanpää, Mikko J
2017-10-05
Estimation of genetic population structure based on molecular markers is a common task in population genetics and ecology. We apply a generalized linear model with LASSO regularization to infer relationships between individuals and populations from molecular marker data. Specifically, we apply a neighborhood selection algorithm to infer population genetic structure and gene flow between populations. The resulting relationships are used to construct an individual-level population graph. Different network substructures known as communities are then dissociated from each other using a community detection algorithm. Inference of population structure using networks combines the good properties of: (i) network theory (broad collection of tools, including aesthetically pleasing visualization), (ii) principal component analysis (dimension reduction together with simple visual inspection), and (iii) model-based methods (e.g., ancestry coefficient estimates). We have named our process CONE (for community oriented network estimation). CONE has fewer restrictions than conventional assignment methods in that properties such as the number of subpopulations need not be fixed before the analysis and the sample may include close relatives or involve uneven sampling. Applying CONE on simulated data sets resulted in more accurate estimates of the true number of subpopulations than model-based methods, and provided comparable ancestry coefficient estimates. Inference of empirical data sets of teosinte single nucleotide polymorphism, bacterial disease outbreak, and the human genome diversity panel illustrate that population structures estimated with CONE are consistent with the earlier findings. Copyright © 2017 Kuismin et al.
Directory of Open Access Journals (Sweden)
Benjamin W. Y. Lo
2013-01-01
Objective. The novel clinical prediction approach of Bayesian neural networks with fuzzy logic inferences is created and applied to derive prognostic decision rules in cerebral aneurysmal subarachnoid hemorrhage (aSAH). Methods. The approach of Bayesian neural networks with fuzzy logic inferences was applied to data from five trials of Tirilazad for aneurysmal subarachnoid hemorrhage (3551 patients). Results. Bayesian meta-analyses of observational studies on aSAH prognostic factors gave generalizable posterior distributions of population mean log odds ratios (ORs). Similar trends were noted in Bayesian and linear regression ORs. Significant outcome predictors include normal motor response, cerebral infarction, history of myocardial infarction, cerebral edema, history of diabetes mellitus, fever on day 8, prior subarachnoid hemorrhage, admission angiographic vasospasm, neurological grade, intraventricular hemorrhage, ruptured aneurysm size, history of hypertension, vasospasm day, age and mean arterial pressure. Heteroscedasticity was present in the nontransformed dataset. Artificial neural networks found nonlinear relationships with 11 hidden variables in 1 layer, using the multilayer perceptron model. Fuzzy logic decision rules (centroid defuzzification technique) denoted cut-off points for poor prognosis at greater than 2.5 clusters. Discussion. This aSAH prognostic system makes use of existing knowledge, recognizes unknown areas, incorporates one's clinical reasoning, and compensates for uncertainty in prognostication.
Efficient design and inference in distributed Bayesian networks: an overview
de Oude, P.; Groen, F.C.A.; Pavlin, G.; Bezhanishvili, N.; Löbner, S.; Schwabe, K.; Spada, L.
2011-01-01
This paper discusses an approach to distributed Bayesian modeling and inference, which is relevant for an important class of contemporary real world situation assessment applications. By explicitly considering the locality of causal relations, the presented approach (i) supports coherent distributed
Directory of Open Access Journals (Sweden)
Oliver Ratmann
2007-11-01
Gene duplication with subsequent interaction divergence is one of the primary driving forces in the evolution of genetic systems. Yet little is known about the precise mechanisms and the role of duplication divergence in the evolution of protein networks from the prokaryote and eukaryote domains. We developed a novel, model-based approach for Bayesian inference on biological network data that centres on approximate Bayesian computation, or likelihood-free inference. Instead of computing the intractable likelihood of the protein network topology, our method summarizes key features of the network and, based on these, uses an MCMC algorithm to approximate the posterior distribution of the model parameters. This allowed us to reliably fit a flexible mixture model that captures hallmarks of evolution by gene duplication and subfunctionalization to protein interaction network data of Helicobacter pylori and Plasmodium falciparum. The 80% credible intervals for the duplication-divergence component are [0.64, 0.98] for H. pylori and [0.87, 0.99] for P. falciparum. The remaining parameter estimates are not inconsistent with sequence data. An extensive sensitivity analysis showed that incompleteness of PIN data does not largely affect the analysis of models of protein network evolution, and that the degree sequence alone barely captures the evolutionary footprints of protein networks relative to other statistics. Our likelihood-free inference approach enables a fully Bayesian analysis of a complex and highly stochastic system that is otherwise intractable at present. Modelling the evolutionary history of PIN data, it transpires that only the simultaneous analysis of several global aspects of protein networks enables credible and consistent inference to be made from available datasets. Our results indicate that gene duplication has played a larger part in the network evolution of the eukaryote than in the prokaryote, and suggests that single gene
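The core idea of likelihood-free inference in the abstract above, comparing simulated summary statistics with observed ones instead of evaluating a likelihood, can be shown on a toy problem. The sketch below uses plain rejection sampling rather than the MCMC variant the abstract describes, and the Bernoulli model, the uniform prior and all names are assumptions chosen only to keep the example small.

```python
import random


def simulate_successes(p, n, rng):
    """Summary statistic: number of successes in n Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n))


def abc_rejection(observed, n, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary lies within tolerance of the data."""
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # draw a candidate from a Uniform(0, 1) prior
        simulated = simulate_successes(p, n, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(p)
    return accepted


def posterior_mean(samples):
    """Average of the accepted parameter values."""
    return sum(samples) / len(samples)
```

For network data the simulator would grow a network under a candidate evolutionary model and the summary would be a vector of network statistics, but the accept-or-reject logic stays the same.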
Directory of Open Access Journals (Sweden)
Yi Kan Wang
We develop a new regression algorithm, cMIKANA, for inference of gene regulatory networks from combinations of steady-state and time-series gene expression data. Using simulated gene expression datasets to assess the accuracy of reconstructing gene regulatory networks, we show that steady-state and time-series data sets can successfully be combined to identify gene regulatory interactions using the new algorithm. Inferring gene networks from combined data sets was found to be advantageous when using noisy measurements collected with either lower sampling rates or a limited number of experimental replicates. We illustrate our method by applying it to a microarray gene expression dataset from human umbilical vein endothelial cells (HUVECs), which combines time series data from treatment with growth factor TNF and steady state data from siRNA knockdown treatments. Our results suggest that the combination of steady-state and time-series datasets may provide better prediction of RNA-to-RNA interactions, and may also reveal biological features that cannot be identified from dynamic or steady state information alone. Finally, we consider the experimental design of genomics experiments for gene regulatory network inference and show that network inference can be improved by incorporating steady-state measurements with time-series data.
Integrated Inference and Analysis of Regulatory Networks from Multi-Level Measurements
Poultney, Christopher S.; Greenfield, Alex; Bonneau, Richard
2017-01-01
Regulatory and signaling networks coordinate the enormously complex interactions and processes that control cellular processes (such as metabolism and cell division), coordinate response to the environment, and carry out multiple cell decisions (such as development and quorum sensing). Regulatory network inference is the process of inferring these networks, traditionally from microarray data but increasingly incorporating other measurement types such as proteomics, ChIP-seq, metabolomics, and mass cytometry. We discuss existing techniques for network inference. We review in detail our pipeline, which consists of an initial biclustering step, designed to estimate co-regulated groups; a network inference step, designed to select and parameterize likely regulatory models for the control of the co-regulated groups from the biclustering step; and a visualization and analysis step, designed to find and communicate key features of the network. Learning biological networks from even the most complete data sets is challenging; we argue that integrating new data types into the inference pipeline produces networks of increased accuracy, validity, and biological relevance. PMID:22482944
Evaluation of a Bayesian inference network for ligand-based virtual screening
Directory of Open Access Journals (Sweden)
Chen Beining
2009-04-01
Background: Bayesian inference networks enable the computation of the probability that an event will occur. They have been used previously to rank textual documents in order of decreasing relevance to a user-defined query. Here, we modify the approach to enable a Bayesian inference network to be used for chemical similarity searching, where a database is ranked in order of decreasing probability of bioactivity. Results: Bayesian inference networks were implemented using two different types of network and four different types of belief function. Experiments with the MDDR and WOMBAT databases show that a Bayesian inference network can be used to provide effective ligand-based screening, especially when the active molecules being sought have a high degree of structural homogeneity; in such cases, the network substantially out-performs a conventional, Tanimoto-based similarity searching system. However, the effectiveness of the network is much less when structurally heterogeneous sets of actives are being sought. Conclusion: A Bayesian inference network provides an interesting alternative to existing tools for ligand-based virtual screening.
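The conventional baseline mentioned in the abstract above, Tanimoto-based similarity searching, ranks database molecules by the overlap between their fingerprints and that of a query. A minimal sketch follows; representing a fingerprint as a set of on-bit positions and the function names are assumptions made for the example, and the Bayesian inference network itself is not reproduced here.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_by_similarity(query, database):
    """Return (name, score) pairs sorted by decreasing similarity to the query."""
    scores = [(name, tanimoto(query, fp)) for name, fp in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints for three database molecules.
DATABASE = {"mol_1": {1, 2, 3, 4}, "mol_2": {1, 2, 5}, "mol_3": {7, 8}}
```

Calling `rank_by_similarity({1, 2, 3, 4}, DATABASE)` orders the molecules from most to least similar; an inference network replaces the similarity score with an estimated probability of bioactivity but produces the same kind of ranked list.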
Orhan, A Emin; Ma, Wei Ji
2017-07-26
Animals perform near-optimal probabilistic inference in a wide range of psychophysical tasks. Probabilistic inference requires trial-to-trial representation of the uncertainties associated with task variables and subsequent use of this representation. Previous work has implemented such computations using neural networks with hand-crafted and task-dependent operations. We show that generic neural networks trained with a simple error-based learning rule perform near-optimal probabilistic inference in nine common psychophysical tasks. In a probabilistic categorization task, error-based learning in a generic network simultaneously explains a monkey's learning curve and the evolution of qualitative aspects of its choice behavior. In all tasks, the number of neurons required for a given level of performance grows sublinearly with the input population size, a substantial improvement on previous implementations of probabilistic inference. The trained networks develop a novel sparsity-based probabilistic population code. Our results suggest that probabilistic inference emerges naturally in generic neural networks trained with error-based learning rules. Behavioural tasks often require probability distributions to be inferred about task specific variables. Here, the authors demonstrate that generic neural networks can be trained using a simple error-based learning rule to perform such probabilistic computations efficiently without any need for task specific operations.
Directory of Open Access Journals (Sweden)
Nandkumar Wagh
2014-01-01
Continuity of power supply is of utmost importance to consumers and is only possible through the coordination and reliable operation of power system components. The power transformer is one such prime piece of equipment in the transmission and distribution system and needs to be continuously monitored for its well-being. Since ratio methods cannot provide correct diagnosis due to the borderline problems and the probability of existence of multiple faults, artificial intelligence could be the best approach. Dissolved gas analysis (DGA) interpretation may provide an insight into the developing incipient faults and is adopted as the preliminary diagnosis tool. In the proposed work, the diagnosis ability of the backpropagation (BP) neural network, the radial basis function (RBF) neural network, and the adaptive neurofuzzy inference system (ANFIS) is compared, and the diagnosis results in terms of error measure, accuracy, network training time, and number of iterations are presented.
Directory of Open Access Journals (Sweden)
A. A. Zolotin
2015-01-01
We consider the task of describing local posteriori inference by means of matrix-vector equations in algebraic Bayesian networks, which represent a class of probabilistic graphical models. Such equations were presented in general form in previous publications, but they contained normalizing factors that were provided with algorithmic descriptions of their calculation instead of the desired matrix-vector interpretation. To eliminate this gap, the normalizing factors were first represented as scalar products. Then, it was shown that one of the components in each scalar product can be expressed as a Kronecker power of a constant two-dimensional vector. Later on, non-normalized posteriori inference matrix-operator transplantation and further transfer within each scalar product yielded a representation of one of the scalar product components as a sequence of tensor products of two-dimensional vectors. The latter vectors have only two possible values in one case and three values in the other. The choice among those values is determined by the structure of the input evidence. The second component of each scalar product is the vector with the original data. The calculations performed made it possible to construct the corresponding vectors; the paper contains a table with examples for some of them. Representing local posteriori inference by matrix-vector equations simplifies the development of local posteriori inference algorithms, their verification and further implementation based on available libraries. These equations also make it possible to apply classical mathematical techniques to the analysis of the obtained results. Finally, the results obtained make it possible to apply the method of postponed calculations. This method helps avoid constructing large vectors; instead, the vector components can be calculated just when they are needed by means of bitwise operations.
Network transfer entropy and metric space for causality inference.
Banerji, Christopher R S; Severini, Simone; Teschendorff, Andrew E
2013-05-01
A measure is derived to quantify directed information transfer between pairs of vertices in a weighted network, over paths of a specified maximal length. Our approach employs a general, probabilistic model of network traffic, from which the informational distance between dynamics on two weighted networks can be naturally expressed as a Jensen Shannon divergence. Our network transfer entropy measure is shown to be able to distinguish and quantify causal relationships between network elements, in applications to simple synthetic networks and a biological signaling network. We conclude with a theoretical extension of our framework, in which the square root of the Jensen Shannon Divergence induces a metric on the space of dynamics on weighted networks. We prove a convergence criterion, demonstrating that a form of convergence in the structure of weighted networks in a family of matrix metric spaces implies convergence of their dynamics with respect to the square root Jensen Shannon divergence metric.
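The distance used in the abstract above is built on the Jensen-Shannon divergence, whose square root is a metric between probability distributions. The sketch below computes it for two discrete distributions given as lists of probabilities; it illustrates the divergence only, not the authors' network transfer entropy, and the function names are assumptions.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL divergence to the midpoint mixture."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence, which satisfies the metric axioms."""
    # max() guards against a tiny negative value caused by rounding.
    return math.sqrt(max(0.0, js_divergence(p, q)))
```

Because the mixture m is positive wherever p or q is, the divergence stays finite even for distributions with disjoint support, where it reaches its maximum of one bit.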
Empirically determining the sample size for large-scale gene network inference algorithms.
Altay, G
2012-04-01
The performance of genome-wide gene regulatory network inference algorithms depends on the sample size. It is generally considered that the larger the sample size, the better the gene network inference performance. Nevertheless, there is not adequate information on determining the sample size for optimal performance. In this study, the author systematically demonstrates the effect of sample size on information-theory-based gene network inference algorithms with an ensemble approach. The empirical results showed that the inference performances of the considered algorithms tend to converge after a particular sample size region. As a specific example, a sample size of around 64 is sufficient to obtain most of the inference performance with respect to precision using the representative algorithm C3NET on the synthetic steady-state data sets of Escherichia coli and also on a time-series data set of a Homo sapiens subnetwork. The author verified the convergence result on a large, real data set of E. coli as well. The results give evidence to biologists to better design experiments to infer gene networks. Further, the effect of cutoff on inference performances over various sample sizes is considered. [Includes supplementary material].
Grear, Daniel A; Luong, Lien T; Hudson, Peter J
2013-12-01
incorporating the parasite life cycle, relative to the way that exposure is measured, is key to inferring transmission and can be empirically quantified using network techniques. In addition, appropriately defining and measuring contacts according to the life history of the parasite and relevant behaviors of the host is a crucial step in applying network analyses to empirical systems.
Zeng, Y; Zhang, J; Yin, H; Pan, Y
2007-01-01
Visual evoked potentials (VEPs) are time-varying signals typically buried in relatively large background noise known as the electroencephalogram (EEG). In this paper, an adaptive noise cancellation with neural network-based fuzzy inference system (NNFIS) was used and the NNFIS was carefully designed to model the VEP signal. It is assumed that VEP responses can be modelled by NNFIS with the centres of its membership functions evenly distributed over time. The weights of NNFIS are adaptively determined by minimizing the variance of the error signal using the least mean squares (LMS) algorithm. As the NNFIS is dynamic to any change of VEP, the non-stationary characteristics of VEP can be tracked. Thus, this method should be able to track the VEP. Four sets of simulated data indicate that the proposed method is appropriate to estimate VEP. A total of 150 trials are processed to demonstrate the superior performance of the proposed method.
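The adaptive step in the abstract above, weights updated by the least mean squares (LMS) rule on membership functions spaced evenly over time, can be sketched as below. This is not the authors' NNFIS: Gaussian membership functions, the step size `mu`, the sine-shaped test signal and every name are assumptions chosen to give a small runnable example.

```python
import math
import random


def basis(t, centres, width):
    """Gaussian membership values of time t for evenly spaced centres."""
    return [math.exp(-((t - c) ** 2) / (2 * width ** 2)) for c in centres]


def lms_fit(trials, n_centres=10, width=0.08, mu=0.05, epochs=20):
    """Adapt weights so that sum_k w[k] * phi_k(t) tracks the noisy trials."""
    n = len(trials[0])
    times = [i / (n - 1) for i in range(n)]
    centres = [k / (n_centres - 1) for k in range(n_centres)]
    phis = [basis(t, centres, width) for t in times]
    w = [0.0] * n_centres
    for _ in range(epochs):
        for trial in trials:
            for phi, target in zip(phis, trial):
                estimate = sum(wk * pk for wk, pk in zip(w, phi))
                error = target - estimate
                # LMS step: move the weights along the error-weighted input.
                w = [wk + mu * error * pk for wk, pk in zip(w, phi)]
    return [sum(wk * pk for wk, pk in zip(w, phi)) for phi in phis]


def simulate_trials(n_trials=30, n=50, noise_sd=0.3, seed=0):
    """Hypothetical data: a smooth response buried in Gaussian noise."""
    rng = random.Random(seed)
    clean = [math.sin(math.pi * i / (n - 1)) for i in range(n)]
    trials = [[c + rng.gauss(0.0, noise_sd) for c in clean]
              for _ in range(n_trials)]
    return clean, trials
```

Because the weights are updated sample by sample, the same loop can keep running as new trials arrive, which is what lets an LMS-based estimator follow slow changes in the underlying response.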
Bayesian inference for low-rank Ising networks
Marsman, M.; Maris, Gunter; Bechger, Timo; Glas, Cornelis A.W.
2015-01-01
Estimating the structure of Ising networks is a notoriously difficult problem. We demonstrate that using a latent variable representation of the Ising network, we can employ a full-data-information approach to uncover the network structure. Thereby, only ignoring information encoded in the prior
A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.
Directory of Open Access Journals (Sweden)
Xiangyun Xiao
The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from the Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge, the experimentally determined GRN of Escherichia coli, and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.
Moraes, Alvaro
2015-01-01
Epidemics have shaped, sometimes more than wars and natural disasters, demographic aspects of human populations around the world, their health habits and their economies. Ebola and the Middle East Respiratory Syndrome (MERS) are clear and current examples of potential hazards at planetary scale. During the spread of an epidemic disease, there are phenomena, like the sudden extinction of the epidemic, that cannot be captured by deterministic models. As a consequence, stochastic models have been proposed during the last decades. A typical forward problem in the stochastic setting could be the approximation of the expected number of infected individuals found in one month from now. On the other hand, a typical inverse problem could be, given a discretely observed set of epidemiological data, infer the transmission rate of the epidemic or its basic reproduction number. Markovian epidemic models are stochastic models belonging to a wide class of pure jump processes known as Stochastic Reaction Networks (SRNs), that are intended to describe the time evolution of interacting particle systems where one particle interacts with the others through a finite set of reaction channels. SRNs have been mainly developed to model biochemical reactions but they also have applications in neural networks, virus kinetics, and dynamics of social networks, among others. This PhD thesis is focused on novel fast simulation algorithms and statistical inference methods for SRNs. Our novel Multi-level Monte Carlo (MLMC) hybrid simulation algorithms provide accurate estimates of expected values of a given observable of SRNs at a prescribed final time. They are designed to control the global approximation error up to a user-selected accuracy and up to a certain confidence level, and with near optimal computational work. We also present novel dual-weighted residual expansions for fast estimation of weak and strong errors arising from the MLMC methodology. Regarding the statistical inference
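To make the notion of a Markovian epidemic model as a pure jump process concrete, the sketch below simulates one path of a susceptible-infected-recovered (SIR) reaction network with the standard Gillespie stochastic simulation algorithm. It is a plain exact simulator, not the multi-level Monte Carlo method of the thesis, and the SIR model, the rate parameters `beta` and `gamma` and the function name are assumptions made for the example.

```python
import random


def gillespie_sir(s, i, r, beta, gamma, t_end, rng):
    """Simulate one SIR path as a pure jump process until t_end or extinction."""
    n = s + i + r
    t = 0.0
    path = [(t, s, i, r)]
    while i > 0:
        rate_infect = beta * s * i / n  # channel S + I -> 2I
        rate_recover = gamma * i        # channel I -> R
        total = rate_infect + rate_recover
        t += rng.expovariate(total)     # exponential waiting time to next jump
        if t > t_end:
            break
        # Pick the firing channel with probability proportional to its rate.
        if rng.random() * total < rate_infect:
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
        path.append((t, s, i, r))
    return path
```

Averaging the number of infected individuals at the final time over many such paths gives a crude Monte Carlo answer to the forward problem described above, and a path in which `i` reaches zero early is an instance of the sudden extinction that deterministic models miss.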
LASSIM-A network inference toolbox for genome-wide mechanistic modeling.
Magnusson, Rasmus; Mariotti, Guido Pio; Köpsén, Mattias; Lövfors, William; Gawel, Danuta R; Jörnsten, Rebecka; Linde, Jörg; Nordling, Torbjörn E M; Nyman, Elin; Schulze, Sylvie; Nestor, Colm E; Zhang, Huan; Cedersund, Gunnar; Benson, Mikael; Tjärnberg, Andreas; Gustafsson, Mika
2017-06-01
Recent technological advancements have made time-resolved, quantitative, multi-omics data available for many model systems, which could be integrated for systems pharmacokinetic use. Here, we present large-scale simulation modeling (LASSIM), which is a novel mathematical tool for performing large-scale inference using mechanistically defined ordinary differential equations (ODE) for gene regulatory networks (GRNs). LASSIM integrates structural knowledge about regulatory interactions and non-linear equations with multiple steady state and dynamic response expression datasets. The rationale behind LASSIM is that biological GRNs can be simplified using a limited subset of core genes that are assumed to regulate all other gene transcription events in the network. The LASSIM method is implemented as a general-purpose toolbox using the PyGMO Python package to make the most of multicore computers and high performance clusters, and is available at https://gitlab.com/Gustafsson-lab/lassim. As a method, LASSIM works in two steps, where it first infers a non-linear ODE system of the pre-specified core gene expression. Second, LASSIM in parallel optimizes the parameters that model the regulation of peripheral genes by core system genes. We showed the usefulness of this method by applying LASSIM to infer a large-scale non-linear model of naïve Th2 cell differentiation, made possible by integrating Th2 specific bindings, time-series together with six public and six novel siRNA-mediated knock-down experiments. ChIP-seq showed significant overlap for all tested transcription factors. Next, we performed novel time-series measurements of total T-cells during differentiation towards Th2 and verified that our LASSIM model could monitor those data significantly better than comparable models that used the same Th2 bindings. In summary, the LASSIM toolbox opens the door to a new type of model-based data analysis that combines the strengths of reliable mechanistic models with truly
Otero-Muras, Irene; Yordanov, Pencho; Stelling, Joerg
2014-11-20
Within cells, stimuli are transduced into cell responses by complex networks of biochemical reactions. In many cell decision processes the underlying networks behave as bistable switches, converting graded stimuli or inputs into all or none cell responses. Observing how systems respond to different perturbations, insight can be gained into the underlying molecular mechanisms by developing mathematical models. Emergent properties of systems, like bistability, can be exploited to this purpose. One of the main challenges in modeling intracellular processes, from signaling pathways to gene regulatory networks, is to deal with high structural and parametric uncertainty, due to the complexity of the systems and the difficulty to obtain experimental measurements. Formal methods that exploit structural properties of networks for parameter estimation can help to overcome these problems. We here propose a novel method to infer the kinetic parameters of bistable biochemical network models. Bistable systems typically show hysteretic dose response curves, in which the so called bifurcation points can be located experimentally. We exploit the fact that, at the bifurcation points, a condition for multistationarity derived in the context of the Chemical Reaction Network Theory must be fulfilled. Chemical Reaction Network Theory has attracted attention from the (systems) biology community since it connects the structure of biochemical reaction networks to qualitative properties of the corresponding model of ordinary differential equations. The inverse bifurcation method developed here allows determining the parameters that produce the expected behavior of the dose response curves and, in particular, the observed location of the bifurcation points given by experimental data. Our inverse bifurcation method exploits inherent structural properties of bistable switches in order to estimate kinetic parameters of bistable biochemical networks, opening a promising route for developments in
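The hysteretic dose response mentioned in this abstract can be illustrated with a toy model. The sketch below is only an assumption-laden example: the one-variable model, the parameter `k`, and the function names are hypothetical choices, and the code does not implement the Chemical Reaction Network Theory condition or the inverse bifurcation method itself. It counts steady states as the stimulus `s` varies; the stimulus values where that count changes approximate the bifurcation points.

```python
import numpy as np

def steady_states(s, k=0.45):
    """Non-negative steady states of the toy model dx/dt = s + x^2/(1 + x^2) - k*x.

    Multiplying the right-hand side by (1 + x^2) gives the cubic
    -k*x^3 + (s + 1)*x^2 - k*x + s = 0, whose real roots are the steady states.
    """
    roots = np.roots([-k, s + 1.0, -k, s])
    real = roots[np.abs(roots.imag) < 1e-9].real
    return np.sort(real[real >= 0.0])

def bistable_range(stimuli, k=0.45):
    """Return the stimulus values for which three steady states coexist."""
    return [s for s in stimuli if len(steady_states(s, k)) == 3]

if __name__ == "__main__":
    grid = np.linspace(0.0, 0.2, 41)
    window = bistable_range(grid)
    # The edges of this window approximate the two bifurcation points
    # of the toy model on the chosen grid.
    print(min(window), max(window))
```

In an inverse setting one would instead fix the observed bifurcation points and search for parameters (here only `k`) that reproduce them.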
Peng, Yefei
2010-01-01
An ontology mapping neural network (OMNN) is proposed in order to learn and infer correspondences among ontologies. It extends the Identical Elements Neural Network (IENN)'s ability to represent and map complex relationships. The learning dynamics of simultaneous (interlaced) training of similar tasks interact at the shared connections of the…
Frank, Laurence Emmanuelle
2006-01-01
Feature Network Models (FNM) are graphical structures that represent proximity data in a discrete space with the use of features. A statistical inference theory is introduced, based on the additivity properties of networks and the linear regression framework. Considering features as predictor
Inference of a Probabilistic Boolean Network from a Single Observed Temporal Sequence
Directory of Open Access Journals (Sweden)
Xiao Yufei
2007-01-01
The inference of gene regulatory networks is a key issue for genomic signal processing. This paper addresses the inference of probabilistic Boolean networks (PBNs) from observed temporal sequences of network states. Since a PBN is composed of a finite number of Boolean networks, a basic observation is that the characteristics of a single Boolean network without perturbation may be determined by its pairwise transitions. Because the network function is fixed and there are no perturbations, a given state will always be followed by a unique state at the succeeding time point. Thus, a transition counting matrix compiled over a data sequence will be sparse and contain only one entry per line. If the network also has perturbations, with small perturbation probability, then the transition counting matrix would have some insignificant nonzero entries replacing some (or all) of the zeros. If a data sequence is sufficiently long to adequately populate the matrix, then determination of the functions and inputs underlying the model is straightforward. The difficulty comes when the transition counting matrix consists of data derived from more than one Boolean network. We address the PBN inference procedure in several steps: (1) separate the data sequence into "pure" subsequences corresponding to constituent Boolean networks; (2) given a subsequence, infer a Boolean network; and (3) infer the probabilities of perturbation, the probability of there being a switch between constituent Boolean networks, and the selection probabilities governing which network is to be selected given a switch. Capturing the full dynamic behavior of probabilistic Boolean networks, be they binary or multivalued, will require the use of temporal data, and a great deal of it. This should not be surprising given the complexity of the model and the number of parameters, both transitional and static, that must be estimated. In addition to providing an inference algorithm, this paper
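The transition counting matrix described in this abstract can be sketched as follows. This is a minimal illustration for a single Boolean network without perturbation, assuming each network state is encoded as an integer; the function names and the four-state example rule are hypothetical and are not taken from the paper.

```python
import numpy as np

def transition_count_matrix(states, n_states):
    """Count how often state i is immediately followed by state j."""
    counts = np.zeros((n_states, n_states), dtype=int)
    for current, nxt in zip(states[:-1], states[1:]):
        counts[current, nxt] += 1
    return counts

def infer_next_state(counts):
    """Most frequent successor of each state, or -1 if the state was never visited."""
    successors = counts.argmax(axis=1)
    successors[counts.sum(axis=1) == 0] = -1
    return successors

if __name__ == "__main__":
    # Hypothetical 2-gene network: states 0..3 with a fixed update rule.
    rule = {0: 1, 1: 3, 2: 0, 3: 2}
    sequence = [0]
    for _ in range(20):
        sequence.append(rule[sequence[-1]])
    counts = transition_count_matrix(sequence, 4)
    print(counts)                    # one nonzero entry per row
    print(infer_next_state(counts))  # recovers the update rule
```

With perturbations, small off-pattern counts would appear in some rows, and taking the dominant entry per row is one simple way to ignore them.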
Inference of a Probabilistic Boolean Network from a Single Observed Temporal Sequence
Directory of Open Access Journals (Sweden)
Le Yu
2007-05-01
The inference of gene regulatory networks is a key issue for genomic signal processing. This paper addresses the inference of probabilistic Boolean networks (PBNs) from observed temporal sequences of network states. Since a PBN is composed of a finite number of Boolean networks, a basic observation is that the characteristics of a single Boolean network without perturbation may be determined by its pairwise transitions. Because the network function is fixed and there are no perturbations, a given state will always be followed by a unique state at the succeeding time point. Thus, a transition counting matrix compiled over a data sequence will be sparse and contain only one entry per line. If the network also has perturbations, with small perturbation probability, then the transition counting matrix would have some insignificant nonzero entries replacing some (or all) of the zeros. If a data sequence is sufficiently long to adequately populate the matrix, then determination of the functions and inputs underlying the model is straightforward. The difficulty comes when the transition counting matrix consists of data derived from more than one Boolean network. We address the PBN inference procedure in several steps: (1) separate the data sequence into "pure" subsequences corresponding to constituent Boolean networks; (2) given a subsequence, infer a Boolean network; and (3) infer the probabilities of perturbation, the probability of there being a switch between constituent Boolean networks, and the selection probabilities governing which network is to be selected given a switch. Capturing the full dynamic behavior of probabilistic Boolean networks, be they binary or multivalued, will require the use of temporal data, and a great deal of it. This should not be surprising given the complexity of the model and the number of parameters, both transitional and static, that must be estimated. In addition to providing an inference algorithm
Inferring structural connectivity using Ising couplings in models of neuronal networks.
Kadirvelu, Balasundaram; Hayashi, Yoshikatsu; Nasuto, Slawomir J
2017-08-15
Functional connectivity metrics have been widely used to infer the underlying structural connectivity in neuronal networks. Maximum entropy based Ising models have been suggested to discount the effect of indirect interactions and give good results in inferring the true anatomical connections. However, no benchmarking is currently available to assess the performance of Ising couplings against other functional connectivity metrics in the microscopic scale of neuronal networks through a wide set of network conditions and network structures. In this paper, we study the performance of the Ising model couplings to infer the synaptic connectivity in in silico networks of neurons and compare its performance against partial and cross-correlations for different correlation levels, firing rates, network sizes, network densities, and topologies. Our results show that the relative performance amongst the three functional connectivity metrics depends primarily on the network correlation levels. Ising couplings detected the most structural links at very weak network correlation levels, and partial correlations outperformed Ising couplings and cross-correlations at strong correlation levels. The result was consistent across varying firing rates, network sizes, and topologies. The findings of this paper serve as a guide in choosing the right functional connectivity tool to reconstruct the structural connectivity.
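To make the difference between cross-correlation and partial correlation concrete, the sketch below computes both from a data matrix. It is an illustration under stated assumptions: it uses continuous Gaussian signals rather than spike trains, the function names are hypothetical, and it does not fit Ising couplings. Partial correlations are obtained from the inverse covariance (precision) matrix, which discounts indirect interactions.

```python
import numpy as np

def cross_correlation(data):
    """Zero-lag Pearson correlation between the columns of `data`."""
    return np.corrcoef(data, rowvar=False)

def partial_correlation(data):
    """Correlation between each pair of columns given all other columns."""
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 20000
    x = rng.standard_normal(n)
    y = x + rng.standard_normal(n)   # x drives y
    z = y + rng.standard_normal(n)   # y drives z; x and z are only indirectly linked
    data = np.column_stack([x, y, z])
    print(np.round(cross_correlation(data), 2))
    print(np.round(partial_correlation(data), 2))
```

In this chain the x-z cross-correlation is clearly nonzero, while the x-z partial correlation is close to zero, which is the kind of indirect link the abstract says such metrics are meant to discount.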
Vilanova, Pedro
2016-01-07
In this work, we present an extension of the forward-reverse representation introduced in "Simulation of forward-reverse stochastic representations for conditional diffusions", a 2014 paper by Bayer and Schoenmakers, to the context of stochastic reaction networks (SRNs). We apply this stochastic representation to the computation of efficient approximations of expected values of functionals of SRN bridges, i.e., SRNs conditional on their values in the extremes of given time-intervals. We then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve a set of deterministic optimization problems where the SRNs are replaced by their reaction-rate ordinary differential equations approximation; then, during phase II, we apply the Monte Carlo version of the Expectation-Maximization algorithm to the phase I output. By selecting a set of over-dispersed seeds as initial points in phase I, the output of parallel runs from our two-phase method is a cluster of approximate maximum likelihood estimates. Our results are supported by numerical examples.
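The idea behind phase I, replacing the SRN by its reaction-rate ODE and solving a deterministic fitting problem, can be sketched on a toy example. The code below assumes a single decay reaction X -> 0 with propensity c*X, whose reaction-rate ODE dx/dt = -c*x has the solution x(t) = x0*exp(-c*t); the function name and the log-linear least-squares fit are illustrative assumptions, not the authors' optimization procedure. Phase II (Monte Carlo EM over SRN bridges) is not shown.

```python
import numpy as np

def phase_one_rate(times, counts):
    """Estimate the decay rate c by fitting log-counts to log(x0) - c*t.

    Assumes all observed counts are strictly positive.
    """
    times = np.asarray(times, dtype=float)
    counts = np.asarray(counts, dtype=float)
    slope, _intercept = np.polyfit(times, np.log(counts), 1)
    return -slope

if __name__ == "__main__":
    t = np.array([0.0, 1.0, 2.0, 3.0])
    observed = 100.0 * np.exp(-0.3 * t)  # noise-free data from the ODE itself
    print(phase_one_rate(t, observed))
```

An estimate of this kind would serve only as a starting point for a stochastic refinement step.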
Bayer, Christian
2016-02-20
© 2016 Taylor & Francis Group, LLC. ABSTRACT: In this work, we present an extension of the forward–reverse representation introduced by Bayer and Schoenmakers (Annals of Applied Probability, 24(5):1994–2032, 2014) to the context of stochastic reaction networks (SRNs). We apply this stochastic representation to the computation of efficient approximations of expected values of functionals of SRN bridges, that is, SRNs conditional on their values in the extremes of given time intervals. We then employ this SRN bridge-generation technique to the statistical inference problem of approximating reaction propensities based on discretely observed data. To this end, we introduce a two-phase iterative inference method in which, during phase I, we solve a set of deterministic optimization problems where the SRNs are replaced by their reaction-rate ordinary differential equations approximation; then, during phase II, we apply the Monte Carlo version of the expectation-maximization algorithm to the phase I output. By selecting a set of overdispersed seeds as initial points in phase I, the output of parallel runs from our two-phase method is a cluster of approximate maximum likelihood estimates. Our results are supported by numerical examples.
Inferring Neuronal Network Connectivity from Spike Data: A Temporal Data Mining Approach
Directory of Open Access Journals (Sweden)
Debprakash Patnaik
2008-01-01
Understanding the functioning of a neural system in terms of its underlying circuitry is an important problem in neuroscience. Recent developments in electrophysiology and imaging allow one to simultaneously record activities of hundreds of neurons. Inferring the underlying neuronal connectivity patterns from such multi-neuronal spike train data streams is a challenging statistical and computational problem. This task involves finding significant temporal patterns from vast amounts of symbolic time series data. In this paper we show that the frequent episode mining methods from the field of temporal data mining can be very useful in this context. In the frequent episode discovery framework, the data is viewed as a sequence of events, each of which is characterized by an event type and its time of occurrence and episodes are certain types of temporal patterns in such data. Here we show that, using the set of discovered frequent episodes from multi-neuronal data, one can infer different types of connectivity patterns in the neural system that generated it. For this purpose, we introduce the notion of mining for frequent episodes under certain temporal constraints; the structure of these temporal constraints is motivated by the application. We present algorithms for discovering serial and parallel episodes under these temporal constraints. Through extensive simulation studies we demonstrate that these methods are useful for unearthing patterns of neuronal network connectivity.
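A very small instance of counting a serial episode under a temporal constraint can be sketched as follows. The example assumes events are given as (event type, time) pairs and counts non-overlapping occurrences of a two-node serial episode whose inter-event delay lies in a given interval; the function name, the greedy matching strategy, and the data are hypothetical and are not the algorithms presented in the paper.

```python
def count_serial_pairs(events, first, second, min_delay, max_delay):
    """Count non-overlapping occurrences of `first` followed by `second`
    with a delay between min_delay and max_delay (inclusive)."""
    times_first = sorted(t for kind, t in events if kind == first)
    times_second = sorted(t for kind, t in events if kind == second)
    count = 0
    j = 0
    last_end = float("-inf")
    for start in times_first:
        if start < last_end:
            continue  # this event falls inside the previous occurrence
        # Skip `second` events that come too early for this start time.
        while j < len(times_second) and times_second[j] < start + min_delay:
            j += 1
        if j < len(times_second) and times_second[j] <= start + max_delay:
            count += 1
            last_end = times_second[j]
            j += 1
    return count

if __name__ == "__main__":
    spikes = [("A", 0), ("B", 2), ("A", 10), ("B", 12), ("A", 20), ("B", 30)]
    print(count_serial_pairs(spikes, "A", "B", 1, 5))
    print(count_serial_pairs(spikes, "B", "A", 1, 5))
```

A high count for A followed by B, together with a low count for the reverse order, would be weak evidence for a directed A-to-B connection with the assumed delay.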
Directory of Open Access Journals (Sweden)
Guocai Chen
2014-06-01
Gene regulatory networks are a crucial aspect of systems biology in describing molecular mechanisms of the cell. Various computational models rely on random gene selection to infer such networks from microarray data. While incorporation of prior knowledge into data analysis has been deemed important, in practice, it has generally been limited to referencing genes in probe sets and using curated knowledge bases. We investigate the impact of augmenting microarray data with semantic relations automatically extracted from the literature, with the view that relations encoding gene/protein interactions eliminate the need for random selection of components in non-exhaustive approaches, producing a more accurate model of cellular behavior. A genetic algorithm is then used to optimize the strength of interactions using microarray data and an artificial neural network fitness function. The result is a directed and weighted network providing the individual contribution of each gene to its target. For testing, we used invasive ductal carcinoma of the breast to query the literature and a microarray set containing gene expression changes in these cells over several time points. Our model demonstrates significantly better fitness than the state-of-the-art model, which relies on an initial random selection of genes. Comparison to the component pathways of the KEGG Pathways in Cancer map reveals that the resulting networks contain both known and novel relationships. The p53 pathway results were manually validated in the literature. 60% of non-KEGG relationships were supported (74% for highly weighted interactions). The method was then applied to yeast data and our model again outperformed the comparison model. Our results demonstrate the advantage of combining gene interactions extracted from the literature in the form of semantic relations with microarray analysis in generating contribution-weighted gene regulatory networks. This methodology can make a
Christensen, Claire Petra
's own publications have contributed network inference, simulation, modeling, and analysis methods to the much larger body of work in systems biology, and indeed, in network science. The aim of this thesis is therefore twofold: to present this original work in the historical context of network science, but also to provide sufficient review and reference regarding complex systems (with an emphasis on complex networks in systems biology) and tools and techniques for their inference, simulation, analysis, and modeling, such that the reader will be comfortable in seeking out further information on the subject. The review-like Chapters 1, 2, and 4 are intended to convey the co-evolution of network science and the slow but noticeable breakdown of boundaries between disciplines in academia as research and comparison of diverse systems has brought to light the shared properties of these systems. It is the author's hope that these chapters impart some sense of the remarkable and rapid progress in complex systems research that has led to this unprecedented academic synergy. Chapters 3 and 5 detail the author's original work in the context of complex systems research. Chapter 3 presents the methods and results of a two-stage modeling process that generates candidate gene-regulatory networks of the bacterium B. subtilis from experimentally obtained, yet mathematically underdetermined microchip array data. These networks are then analyzed from a graph theoretical perspective, and their biological viability is critiqued by comparing the networks' graph theoretical properties to those of other biological systems. The results of topological perturbation analyses revealing commonalities in behavior at multiple levels of complexity are also presented, and are shown to be an invaluable means by which to ascertain the level of complexity to which the network inference process is robust to noise. Chapter 5 outlines a learning algorithm for the development of a realistic, evolving social
Proximity-Based Trust Inference for Mobile Social Networking
Seyedi A.; Saadi R.; Issarny V.
2011-01-01
The growing trend toward social networking and the increased prevalence of new mobile devices have led to the emergence of mobile social networking applications in which users can share experiences in an impromptu way as they move. However, this puts mobile users at risk, since they may not have any knowledge about the users they socially connect with. Trust management then appears as a promising decision support for mobile users in establishing social...
Nitrate leaching from a potato field using adaptive network-based fuzzy inference system
DEFF Research Database (Denmark)
Shekofteh, Hosein; Afyuni, Majid M; Hajabbasi, Mohammad-Ali
2013-01-01
The conventional methods of application of nitrogen fertilizers might be responsible for the increased nitrate concentration in groundwater of areas dominated by irrigated agriculture. Appropriate water and nutrient management strategies are required to minimize groundwater pollution...... of nitrate (NO3) leaching from a potato field under a drip fertigation system. In the first part of the study, a two-dimensional solute transport model was used to simulate nitrate leaching from a sandy soil with varying emitter discharge rates and fertilizer doses. The results from the modeling were used...... to train and validate an adaptive network-based fuzzy inference system (ANFIS) in order to estimate nitrate leaching. Two performance functions, namely mean absolute percentage error (MAPE) and correlation coefficient (R), were used to evaluate the adequacy of the ANFIS. Results showed that ANFIS can...
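The two performance functions named in this abstract, mean absolute percentage error (MAPE) and the correlation coefficient (R), can be computed as in the sketch below. The function names and the sample values are hypothetical and are not data from the study.

```python
import numpy as np

def mape(observed, predicted):
    """Mean absolute percentage error, in percent. Observed values must be nonzero."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((observed - predicted) / observed))

def correlation_coefficient(observed, predicted):
    """Pearson correlation coefficient R between observed and predicted values."""
    return np.corrcoef(observed, predicted)[0, 1]

if __name__ == "__main__":
    simulated = [100.0, 200.0, 150.0]  # e.g. leaching values from a transport model
    estimated = [110.0, 180.0, 150.0]  # e.g. values estimated by a trained model
    print(mape(simulated, estimated))
    print(correlation_coefficient(simulated, estimated))
```

A low MAPE together with R close to 1 would indicate that the estimates track the simulated values closely.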
Inferring Plasmodium vivax transmission networks from tempo-spatial surveillance data.
Shi, Benyun; Liu, Jiming; Zhou, Xiao-Nong; Yang, Guo-Jing
2014-02-01
The transmission networks of Plasmodium vivax characterize how the parasite transmits from one location to another, which are informative and insightful for public health policy makers to accurately predict the patterns of its geographical spread. However, such networks are not apparent from surveillance data because P. vivax transmission can be affected by many factors, such as the biological characteristics of mosquitoes and the mobility of human beings. Here, we pay special attention to the problem of how to infer the underlying transmission networks of P. vivax based on available tempo-spatial patterns of reported cases. We first define a spatial transmission model, which involves representing both the heterogeneous transmission potential of P. vivax at individual locations and the mobility of infected populations among different locations. Based on the proposed transmission model, we further introduce a recurrent neural network model to infer the transmission networks from surveillance data. Specifically, in this model, we take into account multiple real-world factors, including the length of P. vivax incubation period, the impact of malaria control at different locations, and the total number of imported cases. We implement our proposed models by focusing on the P. vivax transmission among 62 towns in Yunnan province, People's Republic of China, which have been experiencing high malaria transmission in the past years. By conducting scenario analysis with respect to different numbers of imported cases, we can (i) infer the underlying P. vivax transmission networks, (ii) estimate the number of imported cases for each individual town, and (iii) quantify the roles of individual towns in the geographical spread of P. vivax. The demonstrated models have presented a general means for inferring the underlying transmission networks from surveillance data. The inferred networks will offer new insights into how to improve the predictability of P. vivax transmission.
INCITE: Edge-based Traffic Processing and Inference for High-Performance Networks
Energy Technology Data Exchange (ETDEWEB)
Baraniuk, Richard G.; Feng, Wu-chun; Cottrell, Les; Knightly, Edward; Nowak, Robert; Riedi, Rolf
2005-06-20
The INCITE (InterNet Control and Inference Tools at the Edge) Project developed on-line tools to characterize and map host and network performance as a function of space, time, application, protocol, and service. In addition to their utility for trouble-shooting problems, these tools will enable a new breed of applications and operating systems that are network aware and resource aware. Launching from the foundation provided by our recent leading-edge research on network measurement, multifractal signal analysis, multiscale random fields, and quality of service, our effort consisted of three closely integrated research thrusts that directly attack several key networking challenges of DOE's SciDAC program. These are: Thrust 1, Multiscale traffic analysis and modeling techniques; Thrust 2, Inference and control algorithms for network paths, links, and routers; and Thrust 3, Data collection tools.
Measuring the wisdom of the crowds in network-based gene function inference.
Verleyen, W; Ballouz, S; Gillis, J
2015-03-01
Network-based gene function inference methods have proliferated in recent years, but measurable progress remains elusive. We wished to better explore performance trends by controlling data and algorithm implementation, with a particular focus on the performance of aggregate predictions. Hypothesizing that popular methods would perform well without hand-tuning, we used well-characterized algorithms to produce verifiably 'untweaked' results. We find that most state-of-the-art machine learning methods obtain 'gold standard' performance as measured in critical assessments in defined tasks. Across a broad range of tests, we see close alignment in algorithm performances after controlling for the underlying data being used. We find that algorithm aggregation provides only modest benefits, with a 17% increase in area under the ROC (AUROC) above the mean AUROC. In contrast, data aggregation gains are enormous with an 88% improvement in mean AUROC. Altogether, we find substantial evidence to support the view that additional algorithm development has little to offer for gene function prediction. The supplementary information contains a description of the algorithms, the network data parsed from different biological data resources and a guide to the source code (available at: http://gillislab.cshl.edu/supplements/). © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
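The AUROC figures quoted in this abstract rest on a simple pairwise definition, and aggregating several algorithms can be as simple as averaging their ranks. The sketch below shows both under stated assumptions: the function names, the rank-averaging scheme, and the toy scores are hypothetical and are not the authors' pipeline.

```python
import numpy as np

def auroc(scores, labels):
    """Probability that a random positive scores higher than a random negative
    (ties count as one half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return wins / (pos.size * neg.size)

def aggregate_by_rank(score_lists):
    """Average the per-method ranks of each gene (higher rank = stronger prediction)."""
    ranks = [np.argsort(np.argsort(s)) for s in score_lists]
    return np.mean(ranks, axis=0)

if __name__ == "__main__":
    labels = [1, 0, 1, 0]
    method_a = [0.9, 0.8, 0.7, 0.1]
    method_b = [0.6, 0.2, 0.9, 0.4]
    combined = aggregate_by_rank([method_a, method_b])
    print(auroc(method_a, labels), auroc(method_b, labels), auroc(combined, labels))
```

Rank averaging is used here because it puts methods with different score scales on a common footing before they are combined.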
Recognizing recurrent neural networks (rRNN): Bayesian inference for recurrent neural networks.
Bitzer, Sebastian; Kiebel, Stefan J
2012-07-01
Recurrent neural networks (RNNs) are widely used in computational neuroscience and machine learning applications. In an RNN, each neuron computes its output as a nonlinear function of its integrated input. While the importance of RNNs, especially as models of brain processing, is undisputed, it is also widely acknowledged that the computations in standard RNN models may be an over-simplification of what real neuronal networks compute. Here, we suggest that the RNN approach may be made computationally more powerful by its fusion with Bayesian inference techniques for nonlinear dynamical systems. In this scheme, we use an RNN as a generative model of dynamic input caused by the environment, e.g. of speech or kinematics. Given this generative RNN model, we derive Bayesian update equations that can decode its output. Critically, these updates define a 'recognizing RNN' (rRNN), in which neurons compute and exchange prediction and prediction error messages. The rRNN has several desirable features that a conventional RNN does not have, e.g. fast decoding of dynamic stimuli and robustness to initial conditions and noise. Furthermore, it implements a predictive coding scheme for dynamic inputs. We suggest that the Bayesian inversion of RNNs may be useful both as a model of brain function and as a machine learning tool. We illustrate the use of the rRNN by an application to the online decoding (i.e. recognition) of human kinematics.
Directory of Open Access Journals (Sweden)
A. A. Zolotin
2015-07-01
Posteriori inference is one of the three kinds of probabilistic-logic inferences in the probabilistic graphical models theory and the base for processing of knowledge patterns with probabilistic uncertainty using Bayesian networks. The paper deals with a task of local posteriori inference description in algebraic Bayesian networks that represent a class of probabilistic graphical models by means of matrix-vector equations. The latter are essentially based on the use of tensor product of matrices, Kronecker degree and Hadamard product. Matrix equations for calculating posteriori probabilities vectors within posteriori inference in knowledge patterns with quanta propositions are obtained. Similar equations of the same type have already been discussed within the confines of the theory of algebraic Bayesian networks, but they were built only for the case of posteriori inference in the knowledge patterns on the ideals of conjuncts. During synthesis and development of matrix-vector equations on quanta propositions probability vectors, a number of earlier results concerning normalizing factors in posteriori inference and assignment of linear projective operator with a selector vector was adapted. We consider all three types of incoming evidences - deterministic, stochastic and inaccurate - combined with scalar and interval estimation of probability truth of propositional formulas in the knowledge patterns. Linear programming problems are formed. Their solution gives the desired interval values of posterior probabilities in the case of inaccurate evidence or interval estimates in a knowledge pattern. That sort of description of a posteriori inference gives the possibility to extend the set of knowledge pattern types that we can use in the local and global posteriori inference, as well as simplify complex software implementation by use of existing third-party libraries, effectively supporting submission and processing of matrices and vectors when
Directory of Open Access Journals (Sweden)
MONIQUE S. FERREIRA
Considering the importance of monitoring the water quality parameters, remote sensing is a practicable alternative to limnological variables detection, which interacts with electromagnetic radiation, called optically active components (OAC). Among these, the phytoplankton pigment chlorophyll a is the most representative pigment of photosynthetic activity in all classes of algae. In this sense, this work aims to develop a method of spatial inference of chlorophyll a concentration using Artificial Neural Networks (ANN). To achieve this purpose, a multispectral image and fluorometric measurements were used as input data. The multispectral image was processed and the net training and validation dataset were carefully chosen. From this, the neural net architecture and its parameters were defined to model the variable of interest. In the end of training phase, the trained network was applied to the image and a qualitative analysis was done. Thus, it was noticed that the integration of fluorometric and multispectral data provided good results in the chlorophyll a inference, when combined in a structure of artificial neural networks.
Huang, Wei; Oh, Sung-Kwun; Pedrycz, Witold
2017-08-11
This paper presents a hybrid fuzzy wavelet neural network (HFWNN) realized with the aid of polynomial neural networks (PNNs) and fuzzy inference-based wavelet neurons (FIWNs). Two types of FIWNs including fuzzy set inference-based wavelet neurons (FSIWNs) and fuzzy relation inference-based wavelet neurons (FRIWNs) are proposed. In particular, a FIWN without any fuzzy set component (viz., a premise part of fuzzy rule) becomes a wavelet neuron (WN). To alleviate the limitations of the conventional wavelet neural networks or fuzzy wavelet neural networks whose parameters are determined based on a purely random basis, the parameters of wavelet functions standing in FIWNs or WNs are initialized by using the C-Means clustering method. The overall architecture of the HFWNN is similar to the one of the typical PNNs. The main strategies in the design of HFWNN are developed as follows. First, the first layer of the network consists of FIWNs (e.g., FSIWN or FRIWN) that are used to reflect the uncertainty of data, while the second and higher layers consist of WNs, which exhibit a high level of flexibility and realize a linear combination of wavelet functions. Second, the parameters used in the design of the HFWNN are adjusted through genetic optimization. To evaluate the performance of the proposed HFWNN, several publicly available data are considered. Furthermore a thorough comparative analysis is covered.
Statistical Inferences from the Topology of Complex Networks
2016-10-04
The causal inference of cortical neural networks during music improvisations.
Wan, Xiaogeng; Crüts, Björn; Jensen, Henrik Jeldtoft
2014-01-01
We present an EEG study of two music improvisation experiments. Professional musicians with high level of improvisation skills were asked to perform music either according to notes (composed music) or in improvisation. Each piece of music was performed in two different modes: strict mode and "let-go" mode. Synchronized EEG data was measured from both musicians and listeners. We used one of the most reliable causality measures: conditional Mutual Information from Mixed Embedding (MIME), to analyze directed correlations between different EEG channels, which was combined with network theory to construct both intra-brain and cross-brain networks. Differences were identified in intra-brain neural networks between composed music and improvisation and between strict mode and "let-go" mode. Particular brain regions such as frontal, parietal and temporal regions were found to play a key role in differentiating the brain activities between different playing conditions. By comparing the level of degree centralities in intra-brain neural networks, we found a difference between the response of musicians and the listeners when comparing the different playing conditions.
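The degree centralities compared in this abstract can be computed once a directed network has been built from pairwise causality values. The sketch below assumes a matrix `weights` in which entry (i, j) is the strength of the directed influence from channel i to channel j, and thresholds it to obtain links; the threshold, the function name, and the example values are hypothetical, and the MIME measure itself is not computed here.

```python
import numpy as np

def degree_centrality(weights, threshold):
    """Return (out-degree, in-degree) centralities of the thresholded directed graph,
    each normalised by the number of other nodes."""
    weights = np.asarray(weights, dtype=float)
    adjacency = weights > threshold
    np.fill_diagonal(adjacency, False)  # ignore self-links
    n = adjacency.shape[0]
    out_degree = adjacency.sum(axis=1) / (n - 1)
    in_degree = adjacency.sum(axis=0) / (n - 1)
    return out_degree, in_degree

if __name__ == "__main__":
    # Hypothetical causality values between three EEG channels.
    w = [[0.0, 0.8, 0.1],
         [0.2, 0.0, 0.9],
         [0.7, 0.6, 0.0]]
    out_c, in_c = degree_centrality(w, threshold=0.5)
    print(out_c, in_c)
```

Comparing such centrality vectors between playing conditions, or between musicians and listeners, is one way the reported differences could be summarised.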
Inferring a transcription regulatory network by directed perturbation
Sameith, K.
2013-01-01
Transcription plays a key role in cellular processes and its regulation is of paramount importance. The aim of the work described in this thesis is to study the transcription regulatory network of Saccharomyces cerevisiae, employing genome-wide approaches. All the three presented research studies
Network Inference and Maximum Entropy Estimation on Information Diagrams
Czech Academy of Sciences Publication Activity Database
Martin, E.A.; Hlinka, J.; Meinke, A.; Děchtěrenko, Filip; Tintěra, J.; Oliver, I.; Davidsen, J.
2017-01-01
Vol. 7, No. 1 (2017), pp. 1-15, article no. 7062. ISSN 2045-2322 R&D Projects: GA ČR GA13-23940S Institutional support: RVO:68081740 Keywords: complex networks * mutual information * entropy maximization * fMRI Subject RIV: AN - Psychology Impact factor: 4.259, year: 2016
Unifying Inference of Meso-Scale Structures in Networks.
Tunç, Birkan; Verma, Ragini
2015-01-01
Networks are among the most prevalent formal representations in scientific studies, employed to depict interactions between objects such as molecules, neuronal clusters, or social groups. Studies performed at meso-scale that involve grouping of objects based on their distinctive interaction patterns form one of the main lines of investigation in network science. In a social network, for instance, meso-scale structures can correspond to isolated social groupings or groups of individuals that serve as a communication core. Currently, the research on different meso-scale structures such as community and core-periphery structures has been conducted via independent approaches, which precludes the possibility of an algorithmic design that can handle multiple meso-scale structures and deciding which structure explains the observed data better. In this study, we propose a unified formulation for the algorithmic detection and analysis of different meso-scale structures. This facilitates the investigation of hybrid structures that capture the interplay between multiple meso-scale structures and statistical comparison of competing structures, all of which have been hitherto unavailable. We demonstrate the applicability of the methodology in analyzing the human brain network, by determining the dominant organizational structure (communities) of the brain, as well as its auxiliary characteristics (core-periphery).
Wang, Hue-Yu; Wen, Ching-Feng; Chiu, Yu-Hsien; Lee, I-Nong; Kao, Hao-Yun; Lee, I-Chen; Ho, Wen-Hsien
2013-01-01
BACKGROUND: An adaptive-network-based fuzzy inference system (ANFIS) was compared with an artificial neural network (ANN) in terms of accuracy in predicting the combined effects of temperature (10.5 to 24.5°C), pH level (5.5 to 7.5), sodium chloride level (0.25% to 6.25%) and sodium nitrite level (0 to 200 ppm) on the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. METHODS: The ANFIS and ANN models were compared in terms of six statistical indices calculated b...
Statistical Inference for Detecting Structures and Anomalies in Networks
2015-08-27
...asymptotically optimal accuracy, and the second is a fast spectral clustering algorithm...from a null model where the network is rewired randomly. However, maximizing the modularity can lead to many competing partitions, which have almost the...same modularity but which have little in common with each other; it can also overfit, producing illusory "communities" in random graphs where none
Network Inference and Maximum Entropy Estimation on Information Diagrams
Czech Academy of Sciences Publication Activity Database
Martin, E.A.; Hlinka, Jaroslav; Meinke, A.; Děchtěrenko, Filip; Tintěra, J.; Oliver, I.; Davidsen, J.
2017-01-01
Vol. 7, No. 1 (2017), article no. 7062. ISSN 2045-2322 R&D Projects: GA ČR GA13-23940S; GA MZd(CZ) NV15-29835A Grant - others: GA MŠk(CZ) LO1611 Institutional support: RVO:67985807 Keywords: complex networks * mutual information * entropy maximization * fMRI Subject RIV: BD - Theory of Information Impact factor: 4.259, year: 2016
Pecevski, Dejan; Buesing, Lars; Maass, Wolfgang
2011-01-01
An important open problem of computational neuroscience is the generic organization of computations in networks of neurons in the brain. We show here through rigorous theoretical analysis that inherent stochastic features of spiking neurons, in combination with simple nonlinear computational operations in specific network motifs and dendritic arbors, enable networks of spiking neurons to carry out probabilistic inference through sampling in general graphical models. In particular, it enables them to carry out probabilistic inference in Bayesian networks with converging arrows (“explaining away”) and with undirected loops, that occur in many real-world tasks. Ubiquitous stochastic features of networks of spiking neurons, such as trial-to-trial variability and spontaneous activity, are necessary ingredients of the underlying computational organization. We demonstrate through computer simulations that this approach can be scaled up to neural emulations of probabilistic inference in fairly large graphical models, yielding some of the most complex computations that have been carried out so far in networks of spiking neurons. PMID:22219717
Directory of Open Access Journals (Sweden)
Dejan Pecevski
2011-12-01
An important open problem of computational neuroscience is the generic organization of computations in networks of neurons in the brain. We show here through rigorous theoretical analysis that inherent stochastic features of spiking neurons, in combination with simple nonlinear computational operations in specific network motifs and dendritic arbors, enable networks of spiking neurons to carry out probabilistic inference through sampling in general graphical models. In particular, it enables them to carry out probabilistic inference in Bayesian networks with converging arrows ("explaining away") and with undirected loops, that occur in many real-world tasks. Ubiquitous stochastic features of networks of spiking neurons, such as trial-to-trial variability and spontaneous activity, are necessary ingredients of the underlying computational organization. We demonstrate through computer simulations that this approach can be scaled up to neural emulations of probabilistic inference in fairly large graphical models, yielding some of the most complex computations that have been carried out so far in networks of spiking neurons.
Inferring cultural regions from correlation networks of given baby names
Pomorski, Mateusz; Kulakowski, Krzysztof; Kwapien, Jaroslaw; Ausloos, Marcel
2016-01-01
We report investigations on the statistical characteristics of the baby names given between 1910 and 2010 in the United States of America. For each year, the 100 most frequent names in the USA are sorted out. For these names, the correlations between the names profiles are calculated for all pairs of states (minus Hawaii and Alaska). The correlations are used to form a weighted network which is found to vary mildly in time. In fact, the structure of communities in the network remains quite stable till about 1980. The goal is that the calculated structure approximately reproduces the usually accepted geopolitical regions: the North East, the South, and the "Midwest + West" as the third one. Furthermore, the dataset reveals that the name distribution satisfies the Zipf law, separately for each state and each year, i.e. the name frequency $f \propto r^{-\alpha}$, where $r$ is the name rank. Between 1920 and 1980, the exponent $\alpha$ is the largest one for the set of states classified as 'the South', but the smallest...
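The Zipf relation quoted above, $f \propto r^{-\alpha}$, can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit of log frequency against log rank, which is one common way to estimate the exponent and is not necessarily the procedure used by the authors. The function name `zipf_exponent` and the synthetic counts are hypothetical.

```python
import math


def zipf_exponent(frequencies):
    """Estimate alpha in f ~ r^(-alpha) by least squares on log f versus log r.

    Assumption: a plain log-log regression; the cited study may have fitted
    the exponent differently.
    """
    ranked = sorted((f for f in frequencies if f > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive frequencies")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(f) for f in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The slope of the log-log line is -alpha.
    return -sxy / sxx


# Synthetic frequencies for 100 ranked names with a known exponent of 1.2
# (made-up data, used only to exercise the function).
synthetic = [1000.0 * r ** -1.2 for r in range(1, 101)]
alpha_hat = zipf_exponent(synthetic)
```

On real name counts the fit would be run once per state and per year, which is how the abstract describes the exponent being compared across regions.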
Cabral, Mariza Castanheira De Moura Da Costa
In the fifty-two years since Robert Horton's 1945 pioneering quantitative description of channel network planform (or plan view morphology), no conclusive findings have been presented that permit inference of geomorphological processes from any measures of network planform. All measures of network planform studied exhibit limited geographic variability across different environments. Horton (1945), Langbein et al. (1947), Schumm (1956), Hack (1957), Melton (1958), and Gray (1961) established various "laws" of network planform, that is, statistical relationships between different variables which have limited variability. A wide variety of models which have been proposed to simulate the growth of channel networks in time over a land surface are generally also in agreement with the above planform laws. An explanation is proposed for the generality of the channel network planform laws. Channel networks must be space filling, that is, they must extend over the landscape to drain every hillslope, leaving no large undrained areas, and with no crossing of channels, often achieving a roughly uniform drainage density in a given environment. It is shown that the space-filling constraint can reduce the sensitivity of planform variables to different network growth models, and it is proposed that this constraint may determine the planform laws. The "Q model" of network growth of Van Pelt and Verwer (1985) is used to generate samples of networks. Sensitivity to the model parameter Q is markedly reduced when the networks generated are required to be space filling. For a wide variety of Q values, the space-filling networks are in approximate agreement with the various channel network planform laws. Additional constraints, including energy efficiency, were not studied but may further reduce the variability of planform laws. Inference of model parameter Q from network topology is successful only in networks not subject to spatial constraints. In space-filling networks, for a wide
Statistical Inference Methods for Sparse Biological Time Series Data
Directory of Open Access Journals (Sweden)
Voit Eberhard O
2011-04-01
Background: Comparing metabolic profiles under different biological perturbations has become a powerful approach to investigating the functioning of cells. The profiles can be taken as single snapshots of a system, but more information is gained if they are measured longitudinally over time. The results are short time series consisting of relatively sparse data that cannot be analyzed effectively with standard time series techniques, such as autocorrelation and frequency domain methods. In this work, we study longitudinal time series profiles of glucose consumption in the yeast Saccharomyces cerevisiae under different temperatures and preconditioning regimens, which we obtained with methods of in vivo nuclear magnetic resonance (NMR) spectroscopy. For the statistical analysis we first fit several nonlinear mixed effect regression models to the longitudinal profiles and then used an ANOVA likelihood ratio method in order to test for significant differences between the profiles. Results: The proposed methods are capable of distinguishing metabolic time trends resulting from different treatments and associate significance levels to these differences. Among several nonlinear mixed-effects regression models tested, a three-parameter logistic function represents the data with highest accuracy. ANOVA and likelihood ratio tests suggest that there are significant differences between the glucose consumption rate profiles for cells that had been--or had not been--preconditioned by heat during growth. Furthermore, pair-wise t-tests reveal significant differences in the longitudinal profiles for glucose consumption rates between optimal conditions and heat stress, optimal and recovery conditions, and heat stress and recovery conditions (p-values ...). Conclusion: We have developed a nonlinear mixed effects model that is appropriate for the analysis of sparse metabolic and physiological time profiles. The model permits sound statistical inference procedures
Energy Technology Data Exchange (ETDEWEB)
Sadeh, Javad; Afradi, Hamid [Electrical Engineering Department, Faculty of Engineering, Ferdowsi University of Mashhad, P.O. Box: 91775-1111, Mashhad (Iran)
2009-11-15
This paper presents a new and accurate algorithm for locating faults in a combined overhead transmission line with underground power cable using an Adaptive Network-Based Fuzzy Inference System (ANFIS). The proposed method uses 10 ANFIS networks and consists of 3 stages: fault type classification, faulty section detection and exact fault location. In the first stage, an ANFIS is used to determine the fault type, applying four inputs, i.e., the fundamental components of the three phase currents and the zero sequence current. Another ANFIS network is used to detect the faulty section, that is, whether the fault is on the overhead line or on the underground cable. The other eight ANFIS networks are utilized to pinpoint the faults (two for each fault type). Four inputs, i.e., the dc component of the current, the fundamental frequency of the voltage and current and the angle between them, are used to train the neuro-fuzzy inference systems in order to accurately locate the faults on each part of the combined line. The proposed method is evaluated under different fault conditions such as different fault locations, different fault inception angles and different fault resistances. Simulation results confirm that the proposed method can be used as an efficient means for accurate fault location on combined transmission lines.
DEFF Research Database (Denmark)
Lopes, Miguel; Kutlu, Burak; Miani, Michela
2014-01-01
Type 1 Diabetes (T1D) is an autoimmune disease where local release of cytokines such as IL-1β and IFN-γ contributes to β-cell apoptosis. To identify relevant genes regulating this process we performed a meta-analysis of 8 datasets of β-cell gene expression after exposure to IL-1β and IFN-γ. Two of these datasets are novel and contain time-series expressions in human islet cells and rat INS-1E cells. Genes were ranked according to their differential expression within and after 24 h from exposure, and characterized by function and prior knowledge in the literature. A regulatory network was then inferred from the human time expression datasets, using a time-series extension of a network inference method. The two most differentially expressed genes previously unknown in T1D literature (RIPK2 and ELF3) were found to modulate cytokine-induced apoptosis. The inferred regulatory network is thus supported...
Parallel mutual information estimation for inferring gene regulatory networks on GPUs
Directory of Open Access Journals (Sweden)
Liu Weiguo
2011-06-01
Background: Mutual information is a measure of similarity between two variables. It has been widely used in various application domains including computational biology, machine learning, statistics, image processing, and financial computing. Previously used simple histogram based mutual information estimators lack the precision in quality compared to kernel based methods. The recently introduced B-spline function based mutual information estimation method is competitive with the kernel based methods in terms of quality but at a lower computational complexity. Results: We present a new approach to accelerate the B-spline function based mutual information estimation algorithm with commodity graphics hardware. To derive an efficient mapping onto this type of architecture, we have used the Compute Unified Device Architecture (CUDA) programming model to design and implement a new parallel algorithm. Our implementation, called CUDA-MI, can achieve speedups of up to 82 using double precision on a single GPU compared to a multi-threaded implementation on a quad-core CPU for large microarray datasets. We have used the results obtained by CUDA-MI to infer gene regulatory networks (GRNs) from microarray data. The comparisons to existing methods including ARACNE and TINGe show that CUDA-MI produces GRNs of higher quality in less time. Conclusions: CUDA-MI is publicly available open-source software, written in CUDA and C++ programming languages. It obtains significant speedup over sequential multi-threaded implementation by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs.
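For orientation, the simple histogram-based estimator that the abstract names as the lower-quality baseline can be sketched as follows. This is an illustrative plug-in estimator with equal-width bins, not the B-spline estimator and not the CUDA-MI implementation; the function name `histogram_mi` and the default bin count are assumptions.

```python
import math
from collections import Counter


def histogram_mi(xs, ys, bins=4):
    """Plug-in mutual information (in bits) from an equal-width 2-D histogram.

    Assumption: this is the plain histogram baseline, chosen for brevity;
    the B-spline method spreads each sample over neighbouring bins instead.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")

    def bin_indices(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins
        if width == 0:
            width = 1.0  # constant input: everything falls in bin 0
        # Clamp the maximum value into the last bin.
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = bin_indices(xs), bin_indices(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum p(i,j) * log2( p(i,j) / (p(i) p(j)) ) over occupied cells.
    return sum(
        (c / n) * math.log2(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


# Identical variables with four equally likely levels share log2(4) bits.
levels = [0, 1, 2, 3] * 25
mi_same = histogram_mi(levels, levels, bins=4)

# Two balanced, independent binary variables share no information.
mi_indep = histogram_mi([0, 0, 1, 1] * 10, [0, 1, 0, 1] * 10, bins=2)
```

In a network-inference setting this pairwise score would be computed for every pair of genes, which is the quadratic workload that makes a parallel implementation attractive.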
Inference of the oxidative stress network in Anopheles stephensi upon Plasmodium infection.
Directory of Open Access Journals (Sweden)
Jatin Shrinet
Ookinete invasion of Anopheles midgut is a critical step for malaria transmission; the parasite numbers drop drastically and practically reach a minimum during the parasite's whole life cycle. At this stage, the parasite as well as the vector undergoes immense oxidative stress. Thereafter, the vector undergoes oxidative stress at different time points as the parasite invades its tissues during the parasite development. The present study was undertaken to reconstruct the network of differentially expressed genes involved in oxidative stress in Anopheles stephensi during Plasmodium development and maturation in the midgut. Using high throughput next generation sequencing methods, we generated the transcriptome of the An. stephensi midgut during Plasmodium vinckei petteri oocyst invasion of the midgut epithelium. Further, we utilized large datasets available in the public domain on Anopheles during Plasmodium ookinete invasion and Drosophila datasets and arrived at clusters of genes that may play a role in oxidative stress. Finally, we used support vector machines for the functional prediction of the un-annotated genes of An. stephensi. Integrating the results from all the different data analyses, we identified a total of 516 genes that were involved in oxidative stress in An. stephensi during Plasmodium development. The significantly regulated genes were further extracted from this gene cluster and used to infer an oxidative stress network of An. stephensi. Using systems biology approaches, we have been able to ascertain the role of several putative genes in An. stephensi with respect to oxidative stress. Further experimental validations of these genes are underway.
F-MAP: A Bayesian approach to infer the gene regulatory network using external hints.
Shahdoust, Maryam; Pezeshk, Hamid; Mahjub, Hossein; Sadeghi, Mehdi
2017-01-01
The common topological features of the gene regulatory networks of related species suggest reconstructing the network of one species by using further information from the gene expression profiles of related species. We present an algorithm for reconstructing the gene regulatory network, named F-MAP, which applies the knowledge about gene interactions from related species. Our algorithm sets up a Bayesian framework to estimate the precision matrix of one species' microarray gene expression dataset in order to infer the Gaussian graphical model of the network. The conjugate Wishart prior is used, and the information from related species is applied to estimate the hyperparameters of the prior distribution by using factor analysis. Applying the proposed algorithm to six related species of Drosophila shows that the precision of the reconstructed networks is improved considerably compared to the precision of networks constructed by other Bayesian approaches.
Inferring combinatorial association logic networks in multimodal genome-wide screens.
de Ridder, Jeroen; Gerrits, Alice; Bot, Jan; de Haan, Gerald; Reinders, Marcel; Wessels, Lodewyk
2010-06-15
We propose an efficient method to infer combinatorial association logic networks from multiple genome-wide measurements from the same sample. We demonstrate our method on a genetical genomics dataset, in which we search for Boolean combinations of multiple genetic loci that associate with transcript levels. Our method provably finds the global solution and is very efficient with runtimes of up to four orders of magnitude faster than the exhaustive search. This enables permutation procedures for determining accurate false positive rates and allows selection of the most parsimonious model. When applied to transcript levels measured in myeloid cells from 24 genotyped recombinant inbred mouse strains, we discovered that nine gene clusters are putatively modulated by a logical combination of trait loci rather than a single locus. A literature survey supports and further elucidates one of these findings. Due to our approach, optimal solutions for multi-locus logic models and accurate estimates of the associated false discovery rates become feasible. Our algorithm, therefore, offers a valuable alternative to approaches employing complex, albeit suboptimal optimization strategies to identify complex models. The MATLAB code of the prototype implementation is available on: http://bioinformatics.tudelft.nl/ or http://bioinformatics.nki.nl/.
Zhang, Chaoyang; Chen, Yang; Hu, Gang
2017-08-01
Most complex social, biological and technological systems can be described by dynamic networks. Reconstructing network structures from measurable data is a fundamental problem in almost all interdisciplinary fields. Network nodes interact with each other; therefore, the accurate reconstruction of any interaction to a node requires data measurements of all its neighboring nodes. When networks are large, these data are often unavailable and thus network inference becomes difficult. Here, we propose a method that uses fast-varying noise driving (FVND) to enhance targeted interactions. By applying noise driving we can infer any interaction from a driving node to a driven node with known data of these two nodes only, while all other nodes are hidden, even though the driven node may actually be driven by a large number of hidden nodes. Analytical derivation of the FVND method is conducted and numerical simulations perfectly justify the theoretical derivation.
Inferring meaningful communities from topology-constrained correlation networks.
Hleap, Jose Sergio; Blouin, Christian
2014-01-01
Community structure detection is an important tool in graph analysis. This can be done, among other ways, by solving for the partition set which optimizes the modularity scores [Formula: see text]. Here it is shown that topological constraints in correlation graphs induce over-fragmentation of community structures. A refinement step to this optimization based on Linear Discriminant Analysis (LDA) and a statistical test for significance is proposed. In structured simulation constrained by topology, this novel approach performs better than the optimization of modularity alone. This method was also tested with two empirical datasets: the Roll-Call voting in the 110th US Senate constrained by geographic adjacency, and a biological dataset of 135 protein structures constrained by inter-residue contacts. The former dataset showed sub-structures in the communities that revealed a regional bias in the votes which transcend party affiliations. This is an interesting pattern given that the 110th Legislature was assumed to be a highly polarized government. The [Formula: see text]-amylase catalytic domain dataset (biological dataset) was analyzed with and without topological constraints (inter-residue contacts). The results without topological constraints showed differences with the topology constrained one, but the LDA filtering did not change the outcome of the latter. This suggests that the LDA filtering is a robust way to solve the possible over-fragmentation when present, and that this method will not affect the results where there is no evidence of over-fragmentation.
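The modularity score that the abstract refers to can be made concrete with a short sketch. It assumes the standard Newman-Girvan definition for an undirected, unweighted graph, $Q = \sum_c \left[ L_c/m - (d_c/2m)^2 \right]$, where $L_c$ is the number of edges inside community $c$, $d_c$ the total degree of its nodes, and $m$ the number of edges. The elided formula in the source is not reproduced here, and the function name and toy graph are hypothetical. The LDA refinement step proposed in the paper is not shown.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q for a partition of an undirected graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to a community label.
    Assumption: unweighted edges and the standard definition of Q.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)    # L_c: edges with both ends in c
    degree_sum = defaultdict(int)  # d_c: sum of node degrees in c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Toy graph: two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_split = modularity(toy_edges, two_groups)   # 5/14 for this toy graph
q_single = modularity(toy_edges, one_group)   # 0 for a single community
```

A partition optimizer searches for the assignment that maximizes this quantity; the over-fragmentation discussed in the abstract concerns partitions that score well on Q while splitting communities too finely.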
Golightly, Andrew; Wilkinson, Darren J
2011-12-06
Computational systems biology is concerned with the development of detailed mechanistic models of biological processes. Such models are often stochastic and analytically intractable, containing uncertain parameters that must be estimated from time course data. In this article, we consider the task of inferring the parameters of a stochastic kinetic model defined as a Markov (jump) process. Inference for the parameters of complex nonlinear multivariate stochastic process models is a challenging problem, but we find here that algorithms based on particle Markov chain Monte Carlo turn out to be a very effective computationally intensive approach to the problem. Approximations to the inferential model based on stochastic differential equations (SDEs) are considered, as well as improvements to the inference scheme that exploit the SDE structure. We apply the methodology to a Lotka-Volterra system and a prokaryotic auto-regulatory network.
Directory of Open Access Journals (Sweden)
Steffen Sass
2015-12-01
MicroRNAs represent ~22 nt long endogenous small RNA molecules that have been experimentally shown to regulate gene expression post-transcriptionally. One main interest in miRNA research is the investigation of their functional roles, which can typically be accomplished by identification of mi-/mRNA interactions and functional annotation of target gene sets. We here present a novel method "miRlastic", which infers miRNA-target interactions using transcriptomic data as well as prior knowledge and performs functional annotation of target genes by exploiting the local structure of the inferred network. For the network inference, we applied linear regression modeling with elastic net regularization on matched microRNA and messenger RNA expression profiling data to perform feature selection on prior knowledge from sequence-based target prediction resources. The novelty of miRlastic inference originates in predicting data-driven intra-transcriptome regulatory relationships through feature selection. With synthetic data, we showed that miRlastic outperformed commonly used methods and was suitable even for low sample sizes. To gain insight into the functional role of miRNAs and to determine joint functional properties of miRNA clusters, we introduced a local enrichment analysis procedure. The principle of this procedure lies in identifying regions of high functional similarity by evaluating the shortest paths between genes in the network. We can finally assign functional roles to the miRNAs by taking their regulatory relationships into account. We thoroughly evaluated miRlastic on a cohort of head and neck cancer (HNSCC) patients provided by The Cancer Genome Atlas. We inferred an mi-/mRNA regulatory network for human papilloma virus (HPV)-associated miRNAs in HNSCC. The resulting network best enriched for experimentally validated miRNA-target interaction, when compared to common methods. Finally, the local enrichment step identified two functional
Directory of Open Access Journals (Sweden)
Dong Ling Tong
OBJECTIVE: To model the potential interaction between previously identified biomarkers in children sarcomas using artificial neural network inference (ANNI). METHOD: To concisely demonstrate the biological interactions between correlated genes in an interaction network map, only 2 types of sarcomas in the children small round blue cell tumors (SRBCTs) dataset are discussed in this paper. A backpropagation neural network was used to model the potential interaction between genes. The prediction weights and signal directions were used to model the strengths of the interaction signals and the direction of the interaction link between genes. The ANN model was validated using Monte Carlo cross-validation to minimize the risk of over-fitting and to optimize generalization ability of the model. RESULTS: Strong connection links on certain genes (TNNT1 and FNDC5 in rhabdomyosarcoma (RMS); FCGRT and OLFM1 in Ewing's sarcoma (EWS)) suggested their potency as central hubs in the interconnection of genes with different functionalities. The results showed that the RMS patients in this dataset are likely to be congenital and at low risk of cardiomyopathy development. The EWS patients are likely to be complicated by EWS-FLI fusion and deficiency in various signaling pathways, including Wnt, Fas/Rho and intracellular oxygen. CONCLUSIONS: The ANN network inference approach and the examination of identified genes in the published literature within the context of the disease highlights the substantial influence of certain genes in sarcomas.
Inferring hidden states in Langevin dynamics on large networks: Average case performance
Bravi, B.; Opper, M.; Sollich, P.
2017-01-01
We present average performance results for dynamical inference problems in large networks, where a set of nodes is hidden while the time trajectories of the others are observed. Examples of this scenario can occur in signal transduction and gene regulation networks. We focus on the linear stochastic dynamics of continuous variables interacting via random Gaussian couplings of generic symmetry. We analyze the inference error, given by the variance of the posterior distribution over hidden paths, in the thermodynamic limit and as a function of the system parameters and the ratio α between the number of hidden and observed nodes. By applying Kalman filter recursions we find that the posterior dynamics is governed by an "effective" drift that incorporates the effect of the observations. We present two approaches for characterizing the posterior variance that allow us to tackle, respectively, equilibrium and nonequilibrium dynamics. The first appeals to Random Matrix Theory and reveals average spectral properties of the inference error and typical posterior relaxation times; the second is based on dynamical functionals and yields the inference error as the solution of an algebraic equation.
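The Kalman filter recursions mentioned in this abstract can be illustrated with a deliberately reduced sketch. It assumes a single scalar state in discrete time, x_t = a*x_{t-1} + noise with variance q, observed directly as y_t = x_t + noise with variance r; the paper itself treats continuous-time networks with hidden nodes, so the function names and parameters below are illustrative assumptions only.

```python
def kalman_step(mean, var, y, a, q, r):
    """One predict/update cycle for x_t = a*x_{t-1} + noise(q), y_t = x_t + noise(r)."""
    # Predict: push the previous posterior through the linear dynamics.
    mean_pred = a * mean
    var_pred = a * a * var + q
    # Update: blend prediction and observation using the Kalman gain.
    gain = var_pred / (var_pred + r)
    new_mean = mean_pred + gain * (y - mean_pred)
    new_var = (1.0 - gain) * var_pred
    return new_mean, new_var


def stationary_variance(a, q, r, steps=200):
    """Iterate the variance recursion until it settles at its fixed point."""
    var = q  # arbitrary positive starting value (an assumption of this sketch)
    for _ in range(steps):
        # The variance update does not depend on the observed value,
        # so a dummy observation of 0.0 is enough here.
        _, var = kalman_step(0.0, var, 0.0, a, q, r)
    return var


if __name__ == "__main__":
    print(kalman_step(0.0, 1.0, 2.0, a=1.0, q=0.0, r=1.0))
    print(stationary_variance(a=0.9, q=1.0, r=1.0))
```

In this toy setting the posterior variance plays the role of the inference error: it evolves independently of the data and converges to a stationary value set by a, q and r.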
AF-DHNN: Fuzzy Clustering and Inference-Based Node Fault Diagnosis Method for Fire Detection.
Jin, Shan; Cui, Wen; Jin, Zhigang; Wang, Ying
2015-07-17
Wireless Sensor Networks (WSNs) have been utilized for node fault diagnosis in the fire detection field since the 1990s. However, the traditional methods have some problems, including complicated system structures, intensive computation needs, unsteady data detection and local minimum values. In this paper, a new diagnosis mechanism for WSN nodes is proposed, which is based on fuzzy theory and an Adaptive Fuzzy Discrete Hopfield Neural Network (AF-DHNN). First, the original status of each sensor over time is obtained with two features. One is the root mean square of the filtered signal (FRMS), the other is the normalized summation of the positive amplitudes of the difference spectrum between the measured signal and the healthy one (NSDS). Secondly, distributed fuzzy inference is introduced. The status of evidently abnormal nodes is pre-alarmed to save time. Thirdly, according to the dimensions of the diagnostic data, an adaptive diagnostic status system is established with a Fuzzy C-Means Algorithm (FCMA) and a Sorting and Classification Algorithm to reduce the complexity of the fault determination. Fourthly, a Discrete Hopfield Neural Network (DHNN) with iterations is improved with the optimization of the sensors' detected status information and standard diagnostic levels, with which the associative memory is achieved, and the search efficiency is improved. The experimental results show that the AF-DHNN method can diagnose abnormal WSN node faults promptly and effectively, which improves the WSN reliability.
Gao, Zhong-Ke; Cai, Qing; Dong, Na; Zhang, Shan-Shan; Bo, Yun; Zhang, Jie
2016-10-01
Distinguishing the brain cognitive behavior of disabled and able-bodied subjects is a challenging problem of significant importance. Complex network analysis has established itself as a powerful tool for exploring functional brain networks, which sheds light on the inner workings of the human brain. Most existing work on constructing brain networks focuses on phase-synchronization measures between regional neural activities. In contrast, we propose a novel approach for inferring functional networks from P300 event-related potentials by integrating time and frequency domain information extracted from each channel signal, which we show to be efficient in subsequent pattern recognition. In particular, we construct a brain network by regarding each channel signal as a node and determining the edges in terms of the correlation of the extracted feature vectors. A six-choice P300 paradigm with six different images is used in testing our new approach, involving one able-bodied subject and three disabled subjects suffering from multiple sclerosis, cerebral palsy, traumatic brain and spinal-cord injury, respectively. We then exploit global efficiency, local efficiency and small-world indices from the derived brain networks to assess the network topological structure associated with different target images. The findings suggest that our method allows identifying brain cognitive behaviors related to visual stimulus between able-bodied and disabled subjects.
Hasegawa, Takanori; Yamaguchi, Rui; Nagasaki, Masao; Miyano, Satoru; Imoto, Seiya
2014-01-01
Comprehensive understanding of gene regulatory networks (GRNs) is a major challenge in the field of systems biology. Currently, there are two main approaches in GRN analysis using time-course observation data, namely an ordinary differential equation (ODE)-based approach and a statistical model-based approach. The ODE-based approach can generate complex dynamics of GRNs according to biologically validated nonlinear models. However, it cannot be applied to ten or more genes to simultaneously estimate system dynamics and regulatory relationships due to the computational difficulties. The statistical model-based approach uses highly abstract models to simply describe biological systems and to infer relationships among several hundreds of genes from the data. However, the high abstraction generates false regulations that are not permitted biologically. Thus, when dealing with several tens of genes of which the relationships are partially known, a method that can infer regulatory relationships based on a model with low abstraction and that can emulate the dynamics of ODE-based models while incorporating prior knowledge is urgently required. To accomplish this, we propose a method for inference of GRNs using a state space representation of a vector auto-regressive (VAR) model with L1 regularization. This method can estimate the dynamic behavior of genes based on linear time-series modeling constructed from an ODE-based model and can infer the regulatory structure among several tens of genes maximizing prediction ability for the observational data. Furthermore, the method is capable of incorporating various types of existing biological knowledge, e.g., drug kinetics and literature-recorded pathways. The effectiveness of the proposed method is shown through a comparison of simulation studies with several previous methods. For an application example, we evaluated mRNA expression profiles over time upon corticosteroid stimulation in rats, thus incorporating corticosteroid
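The idea of an L1-regularized vector auto-regressive model can be sketched in a much simpler form than the state space formulation described in this abstract. The sketch below assumes a plain first-order model x_t ≈ A x_{t-1} and fits each row of A by lasso coordinate descent; the function names, the penalty value and the toy data are all hypothetical and are not taken from the paper.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become exactly 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_cd(X, y, lam, sweeps=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0  # column j dotted with the residual that ignores b[j]
            z = 0.0    # squared norm of column j
            for i in range(n):
                partial = y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * partial
                z += X[i][j] ** 2
            b[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return b


def fit_sparse_var1(series, lam):
    """series[t][g] is the level of gene g at time t; returns A with x_t ~ A x_{t-1}."""
    X = series[:-1]  # lagged values are the predictors
    genes = len(series[0])
    return [lasso_cd(X, [row[g] for row in series[1:]], lam) for g in range(genes)]


if __name__ == "__main__":
    # Toy design with orthogonal columns: the response depends strongly on the
    # first predictor and only weakly on the second.
    X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
    y = [2.1, -1.9, 1.9, -2.1]
    print(lasso_cd(X, y, lam=0.5))
```

The soft-thresholding step is what removes weak regulatory links: in the toy design the weak second coefficient is set exactly to zero, while the strong first one is kept but shrunk by the penalty.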
Cross-Dependency Inference in Multi-Layered Networks: A Collaborative Filtering Perspective.
Chen, Chen; Tong, Hanghang; Xie, Lei; Ying, Lei; He, Qing
2017-08-01
The increasingly connected world has catalyzed the fusion of networks from different domains, which facilitates the emergence of a new network model: multi-layered networks. Examples of such network systems include critical infrastructure networks, biological systems, organization-level collaborations, cross-platform e-commerce, and so forth. One crucial structure that distinguishes multi-layered networks from other network models is the cross-layer dependency, which describes the associations between the nodes from different layers. The cross-layer dependency plays an essential role in many data mining applications such as system robustness analysis and complex network control. However, it remains a daunting task to know the exact dependency relationships due to noise, limited accessibility, and so forth. In this article, we tackle the cross-layer dependency inference problem by modeling it as a collective collaborative filtering problem. Based on this idea, we propose an effective algorithm, Fascinate, that can reveal unobserved dependencies with linear complexity. Moreover, we derive Fascinate-ZERO, an online variant of Fascinate that can respond promptly to a newly added node by checking its neighborhood dependencies. We perform extensive evaluations on real datasets to substantiate the superiority of our proposed approaches.
Directory of Open Access Journals (Sweden)
Savić Marija
2014-01-01
This paper presents the results of modeling tropospheric ozone concentration as a function of volatile organic compounds, VOCs (benzene, toluene, m,p-xylene, o-xylene, ethylbenzene); inorganic compounds, NOx (NO, NO2, NOx), CO, H2S, SO2; and PM10 in the ambient air, in parallel with the meteorological parameters: temperature, solar radiation, relative humidity, wind speed and direction. Modeling is based on measurements obtained during the year 2009. The measurements were performed at a measuring station located within an agricultural area, in the vicinity of the city of Zrenjanin (Serbian Banat, Serbia). Statistical analysis of the obtained data, based on bivariate correlation analysis, indicated that accurate modeling cannot be performed using a linear statistics approach. Also, considering that almost all input variables have a wide range of relative change (ratio of variance compared to range), a nonlinear statistical analysis method based on only one rule describing the behavior of each input variable would most certainly not give sufficiently accurate results. For that reason, the modeling approach was based on the Adaptive-Network-Based Fuzzy Inference System (ANFIS). The model obtained using the ANFIS methodology showed high accuracy, with a prediction potential above 80%, given that the coefficient of determination for the final model was R2 = 0.802.
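The coefficient of determination quoted in this abstract can be computed as follows. This is a minimal sketch using the usual definition, one minus the ratio of residual to total sum of squares; the ozone values in the usage example are invented for illustration and are not data from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical observed and modeled concentrations (arbitrary units).
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.1, 1.9, 3.2, 3.8]
    print(round(r_squared(observed, predicted), 3))
```

A value near 1 means the model reproduces most of the variance in the measurements, and a model that always predicts the observed mean scores 0.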
An improved algorithm for generalized community structure inference in complex networks
Qu, Yingfei; Shi, Weiren; Shi, Xin
2017-07-01
In recent years, research on community detection has addressed not only structures that are densely connected internally, but also structures with other patterns, such as heterogeneity, overlapping, and core-periphery. In this paper, we build a network model based on random graph models and propose an improved algorithm to infer generalized community structures. We achieve this by introducing generalized Bernstein polynomials and computing the latent parameters of vertices. The algorithm is tested on both computer-generated benchmark networks and real-world networks. Results show that the algorithm converges faster and is able to discover the latent continuous structures in networks.
Interrogation Methods and Terror Networks
Baccara, Mariagiovanna; Bar-Isaac, Heski
We examine how the structure of terror networks varies with legal limits on interrogation and the ability of authorities to extract information from detainees. We assume that terrorist networks are designed to respond optimally to a tradeoff caused by information exchange: Diffusing information widely leads to greater internal efficiency, but it leaves the organization more vulnerable to law enforcement. The extent of this vulnerability depends on the law enforcement authority’s resources, strategy and interrogation methods. Recognizing that the structure of a terrorist network responds to the policies of law enforcement authorities allows us to begin to explore the most effective policies from the authorities’ point of view.
Multiple network interface core apparatus and method
Underwood, Keith D [Albuquerque, NM; Hemmert, Karl Scott [Albuquerque, NM
2011-04-26
A network interface controller and network interface control method comprising providing a single integrated circuit as a network interface controller and employing a plurality of network interface cores on the single integrated circuit.
Reverse Engineering Cellular Networks with Information Theoretic Methods
Directory of Open Access Journals (Sweden)
Julio R. Banga
2013-05-01
Building mathematical models of cellular networks lies at the core of systems biology. It involves, among other tasks, the reconstruction of the structure of interactions between molecular components, which is known as network inference or reverse engineering. Information theory can help in the goal of extracting as much information as possible from the available data. A large number of methods founded on these concepts have been proposed in the literature, not only in biology journals, but in a wide range of areas. Their critical comparison is difficult due to the different focuses and the adoption of different terminologies. Here we attempt to review some of the existing information theoretic methodologies for network inference, and clarify their differences. While some of these methods have achieved notable success, many challenges remain, among which we can mention dealing with incomplete measurements, noisy data, counterintuitive behaviour emerging from nonlinear relations or feedback loops, and computational burden of dealing with large data sets.
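Many information theoretic inference methods start from the mutual information between pairs of variables. The sketch below shows a plug-in estimate for discretized profiles; the gene names and the binary discretization are assumptions made for the example, and practical methods usually add bias correction and significance thresholds.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    px = Counter(x)
    py = Counter(y)
    pxy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi


if __name__ == "__main__":
    # Hypothetical discretized expression levels (0 = low, 1 = high).
    gene_a = [0, 0, 1, 1, 0, 1, 0, 1]
    gene_b = [0, 0, 1, 1, 0, 1, 0, 1]  # identical to gene_a
    gene_c = [0, 1, 0, 1, 0, 0, 1, 1]  # independent of gene_a in this sample
    print(mutual_information(gene_a, gene_b))
    print(mutual_information(gene_a, gene_c))
```

A pair with high mutual information becomes a candidate edge. The estimate is symmetric, so it cannot by itself indicate the direction of an interaction.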
Directory of Open Access Journals (Sweden)
Lester L. Yuan
2007-06-01
This paper provides a brief introduction to the R package bio.infer, a set of scripts that facilitates the use of maximum likelihood (ML) methods for predicting environmental conditions from assemblage composition. Environmental conditions can often be inferred from only biological data, and these inferences are useful when other sources of data are unavailable. ML prediction methods are statistically rigorous and applicable to a broader set of problems than more commonly used weighted averaging techniques. However, ML methods require a substantially greater investment of time to program algorithms and to perform computations. This package is designed to reduce the effort required to apply ML prediction methods.
Directory of Open Access Journals (Sweden)
Michael J McGeachie
2014-06-01
Bayesian Networks (BNs) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBN) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com.
Opinion Dynamics on Networks with Inference of Unobservable States of Others
Fujie, Ryo
In most opinion formation models that have been proposed, the agents decide their states (i.e. opinions) by referring to the states of others. However, the referred states of others are not necessarily observable and may be inferred. To investigate the effect of an inference of the states of others on opinion dynamics, I propose an extended voter model on networks where observable and referable node sets are different. These sets for a node are defined as the nearest to the mo-th neighbors for observable nodes and the nearest to the mr-th neighbors for referable nodes. The state of referable but unobservable node which is the m-th neighbor (mo pagerank'' is conserved. This conserved quantity coincides with the fixation probability. On the other hand, in the case of mo = mr = 1, the model reduces to the standard voter model on networks and the conserved quantity is a degree-weighted superposition of the states. Thus, the introduction of the inference changes the important opinion spreaders from the high-degree nodes to the high-betweenness pagerank nodes. This work is supported by the Collaboration Research Program of IDEAS, Chubu University IDEAS2016233.
Inferring biological functions of guanylyl cyclases with computational methods
Alquraishi, May Majed
2013-09-03
A number of studies have shown that functionally related genes are often co-expressed and that computation-based co-expression analysis can be used to accurately identify functional relationships between genes and, by inference, their encoded proteins. Here we describe how a computation-based co-expression analysis can be used to link the function of a specific gene of interest to a defined cellular response. Using a worked example, we demonstrate how this methodology is used to link the function of the Arabidopsis Wall-Associated Kinase-Like 10 gene, which encodes a functional guanylyl cyclase, to host responses to pathogens.
RENT+: an improved method for inferring local genealogical trees from haplotypes with recombination.
Mirzaei, Sajad; Wu, Yufeng
2017-04-01
Haplotypes from one or multiple related populations share a common genealogical history. If this shared genealogy can be inferred from haplotypes, it can be very useful for many population genetics problems. However, with the presence of recombination, the genealogical history of haplotypes is complex and cannot be represented by a single genealogical tree. Therefore, inference of genealogical history with recombination is much more challenging than the case of no recombination. In this paper, we present a new approach called RENT+ for the inference of local genealogical trees from haplotypes with the presence of recombination. RENT+ builds on a previous genealogy inference approach called RENT, which infers a set of related genealogical trees at different genomic positions. RENT+ represents a significant improvement over RENT in the sense that it is more effective in extracting information contained in the haplotype data about the underlying genealogy than RENT. The key components of RENT+ are several greatly enhanced genealogy inference rules. Through simulation, we show that RENT+ is more efficient and accurate than several existing genealogy inference methods. As an application, we apply RENT+ in the inference of population demographic history from haplotypes, which outperforms several existing methods. RENT+ is implemented in Java, and is freely available for download from: https://github.com/SajadMirzaei/RentPlus. Contact: sajad@engr.uconn.edu or ywu@engr.uconn.edu. Supplementary data are available at Bioinformatics online.
Sahoo, Ramendra; Jain, Vikrant
2017-04-01
The morphology of the landscape and its derived features are regarded as an important tool for inferring tectonic activity in an area, since surface exposures of these subsurface processes may not be available or may get eroded away over time. This has led to extensive research on the application of non-planar morphological attributes, such as river long profile and hypsometry, to tectonic studies, whereas the drainage network as a proxy for tectonic activity has not been explored greatly. Significant work has nevertheless been done on drainage network pattern; it started in a qualitative manner and has evolved over the years to incorporate more quantitative aspects, such as studying the evolution of a network under the influence of external and internal controls. The Random Topology (RT) model is one of these concepts. It elucidates the connection between the evolution of a drainage network pattern and the entropy of the drainage system, and it states that, in the absence of any geological controls, a natural population of channel networks will be topologically random. We have used the entropy maximization principle to provide a theoretical structure for the RT model. Furthermore, analysis was carried out on the drainage network structures around the Jwalamukhi thrust in the Kangra reentrant in the western Himalayas, India, to investigate the tectonic activity in the region. Around one thousand networks were extracted from the foot-wall (fw) and hanging-wall (hw) regions of the thrust sheet and later categorized based on their magnitudes. We have adopted the goodness of fit test for comparing the network patterns in fw and hw drainage with those derived using the RT model. The null hypothesis for the test was that, for any given magnitude, the drainage networks in the fw are statistically more similar than those in the hw to the network patterns derived using the RT model. The test results are favorable to our null hypothesis for networks with smaller magnitudes (< 9), whereas for larger
Lo, Benjamin W Y; Macdonald, R Loch; Baker, Andrew; Levine, Mitchell A H
2013-01-01
The novel clinical prediction approach of Bayesian neural networks with fuzzy logic inferences is created and applied to derive prognostic decision rules in cerebral aneurysmal subarachnoid hemorrhage (aSAH). The approach of Bayesian neural networks with fuzzy logic inferences was applied to data from five trials of Tirilazad for aneurysmal subarachnoid hemorrhage (3551 patients). Bayesian meta-analyses of observational studies on aSAH prognostic factors gave generalizable posterior distributions of population mean log odds ratios (ORs). Similar trends were noted in Bayesian and linear regression ORs. Significant outcome predictors include normal motor response, cerebral infarction, history of myocardial infarction, cerebral edema, history of diabetes mellitus, fever on day 8, prior subarachnoid hemorrhage, admission angiographic vasospasm, neurological grade, intraventricular hemorrhage, ruptured aneurysm size, history of hypertension, vasospasm day, age and mean arterial pressure. Heteroscedasticity was present in the nontransformed dataset. Artificial neural networks found nonlinear relationships with 11 hidden variables in 1 layer, using the multilayer perceptron model. Fuzzy logic decision rules (centroid defuzzification technique) denoted cut-off points for poor prognosis at greater than 2.5 clusters. This aSAH prognostic system makes use of existing knowledge, recognizes unknown areas, incorporates one's clinical reasoning, and compensates for uncertainty in prognostication.
Genetic Network Inference: From Co-Expression Clustering to Reverse Engineering
Dhaeseleer, Patrik; Liang, Shoudan; Somogyi, Roland
2000-01-01
Advances in molecular biological, analytical, and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review data mining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-cluster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e., who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks, to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting, and bioengineering.
Statistical Sensitive Data Protection and Inference Prevention with Decision Tree Methods
National Research Council Canada - National Science Library
Chang, LiWu
2003-01-01
.... We consider inference as correct classification and approach it with decision tree methods. As in our previous work, sensitive data are viewed as classes of those test data and non-sensitive data are the rest attribute values...
An Efficient Forward-Reverse EM Algorithm for Statistical Inference in Stochastic Reaction Networks
Bayer, Christian
2016-01-06
In this work [1], we present an extension of the forward-reverse algorithm by Bayer and Schoenmakers [2] to the context of stochastic reaction networks (SRNs). We then apply this bridge-generation technique to the statistical inference problem of approximating the reaction coefficients based on discretely observed data. To this end, we introduce an efficient two-phase algorithm in which the first phase is deterministic and it is intended to provide a starting point for the second phase which is the Monte Carlo EM Algorithm.
Directory of Open Access Journals (Sweden)
Amol P. Bhondekar
2010-03-01
The sensor deployment scheme largely determines the effectiveness of a distributed wireless sensor network. Issues such as energy conservation and clustering make the deployment problem much more complex. A multiobjective Fuzzy Inference System based strategy for mobile sensor deployment is presented in this paper. This strategy gives a synergistic combination of energy capacity, clustering and peer-to-peer deployment. The performance of our strategy is evaluated in terms of coverage, uniformity, speed and clustering. Our algorithm is compared against a modified distributed self-spreading algorithm and exhibits better performance.
Modular Semantic Tagging of Medline Abstracts and its Use in Inferring Regulatory Networks
Energy Technology Data Exchange (ETDEWEB)
Verhagen, Marc; Pustejovsky, James; Taylor, Ronald C.; Sanfilippo, Antonio P.
2011-09-19
We describe MedstractPlus, a resource for mining relations from the Medline bibliographic database that is currently under construction. It was built on the remains of Medstract, a previously created resource that included a biorelation server and an acronym database. MedstractPlus uses simple and scalable natural language processing modules to structure text, is designed with reusability and extendibility in mind, and adheres to the philosophy of the Linguistic Annotation Framework. We show how MedstractPlus has been used to provide seeds for a novel approach to inferring transcriptional regulatory networks from gene expression data.
Directory of Open Access Journals (Sweden)
Fahrur Rozi
2016-12-01
Prediction is needed in many sectors of life, and weather prediction is one of them. Weather can be predicted over a given time span, so to predict weather conditions within such a span this study uses a moving average with a hybrid method combining an artificial neural network and a fuzzy inference system. The data come from BMKG Karangploso, Malang, and use four parameters that influence weather conditions: temperature, air pressure, air humidity, and wind speed. The model achieves an accuracy of 73.91%.
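The moving-average step named in this abstract can be sketched as below. This is only the smoothing component, under the assumption of a simple unweighted window; the window length and the temperature readings are hypothetical, and the neural network and fuzzy inference stages of the hybrid model are not shown.

```python
def moving_average(values, window):
    """Simple moving average; returns one value per full window."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest value, drop the oldest.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


if __name__ == "__main__":
    # Hypothetical daily temperatures in degrees Celsius.
    temperatures = [24.0, 26.0, 25.0, 27.0, 29.0]
    print(moving_average(temperatures, 3))  # [25.0, 26.0, 27.0]
```

Smoothing each of the four weather parameters this way reduces short-term fluctuations before the values are passed to a predictive model.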
Xing, Linlin; Guo, Maozu; Liu, Xiaoyan; Wang, Chunyu; Wang, Lei; Zhang, Yin
2017-11-17
The reconstruction of gene regulatory networks (GRNs) from gene expression data can discover regulatory relationships among genes and give deep insights into the complicated regulation mechanisms of life. However, it is still a great challenge in systems biology and bioinformatics. In past years, numerous computational approaches have been developed for this goal, and Bayesian network (BN) methods have drawn the most attention because of their inherent probabilistic characteristics. However, Bayesian network methods are time consuming and cannot handle large-scale networks due to their high computational complexity, while mutual information-based methods are highly effective but directionless and have a high false-positive rate. To solve these problems, we propose a Candidate Auto Selection algorithm (CAS) based on mutual information and breakpoint detection to restrict the search space in order to accelerate the learning process of Bayesian networks. First, the proposed CAS algorithm automatically selects the neighbor candidates of each node before searching for the best structure of the GRN. Then, based on the CAS algorithm, we propose a globally optimal greedy search method (CAS + G), which focuses on finding the highest rated network structure, and a local learning method (CAS + L), which focuses on learning the structure faster with little loss of quality. Results show that the proposed CAS algorithm can effectively reduce the search space of Bayesian networks by identifying the neighbor candidates of each node. In our experiments, the CAS + G method outperforms the state-of-the-art method on simulation data for inferring GRNs, and the CAS + L method is significantly faster than the state-of-the-art method with little loss of accuracy. Hence, the CAS based methods effectively decrease the computational complexity of Bayesian networks and are more suitable for GRN inference.
Note on neural network sampling for Bayesian inference of mixture processes
L.F. Hoogerheide (Lennart); H.K. van Dijk (Herman)
2007-01-01
In this paper we show some further experiments with neural network sampling, a class of sampling methods that make use of neural network approximations to (posterior) densities, introduced by Hoogerheide et al. (2007). We consider a method where a mixture of Student's t densities, which
Non-linear methods for inferring lidar metrics using SPOT-5 textural data
Directory of Open Access Journals (Sweden)
A. Shamsoddini
2013-10-01
Although many studies have demonstrated the utility of airborne lidar for forest inventory, the acquisition and processing of the data can be cost prohibitive for small areas. In such cases, it may be possible to emulate lidar metrics using more affordable optical data. This study explored processing methods for predicting lidar metrics using SPOT-5 textural data. Multiple-linear regression (MLR) was compared with non-linear machine learning techniques including multi-layer perceptron (MLP) artificial neural networks (ANN), radial basis function (RBF) ANN and regression tree (RT). For this purpose, 11 grey level co-occurrence matrix (GLCM) indices were calculated for bands, band ratios and principal components (PCs) of a SPOT-5 multispectral image. SPOT-5 metrics were correlated with 25 lidar metrics collected over a Pinus radiata plantation. After dimensionality reduction, random forest feature selection was applied to select the most relevant SPOT-5 textural attributes for inferring each lidar metric. The results showed that the non-linear methods, including the MLP and RBF methods, are more promising for modelling lidar metrics using SPOT-5 data than MLR and RT.
Hiratani, Naoki; Fukai, Tomoki
2016-01-01
In the adult mammalian cortex, a small fraction of spines are created and eliminated every day, and the resultant synaptic connection structure is highly nonrandom, even in local circuits. However, it remains unknown whether a particular synaptic connection structure is functionally advantageous in local circuits, and why creation and elimination of synaptic connections is necessary in addition to rich synaptic weight plasticity. To answer these questions, we studied an inference task model through theoretical and numerical analyses. We demonstrate that a robustly beneficial network structure naturally emerges by combining Hebbian-type synaptic weight plasticity and wiring plasticity. Especially in a sparsely connected network, wiring plasticity achieves reliable computation by enabling efficient information transmission. Furthermore, the proposed rule reproduces experimental observed correlation between spine dynamics and task performance.
Acerbi, Enzo; Zelante, Teresa; Narang, Vipin; Stella, Fabio
2014-12-11
Dynamic aspects of gene regulatory networks are typically investigated by measuring system variables at multiple time points. Current state-of-the-art computational approaches for reconstructing gene networks directly build on such data, making a strong assumption that the system evolves in a synchronous fashion at fixed points in time. However, nowadays omics data are being generated with increasing time course granularity. Thus, modellers now have the possibility to represent the system as evolving in continuous time and to improve the models' expressiveness. Continuous time Bayesian networks are proposed as a new approach for gene network reconstruction from time course expression data. Their performance was compared to two state-of-the-art methods: dynamic Bayesian networks and Granger causality analysis. On simulated data, the methods comparison was carried out for networks of increasing size, for measurements taken at different time granularity densities and for measurements unevenly spaced over time. Continuous time Bayesian networks outperformed the other methods in terms of the accuracy of regulatory interactions learnt from data for all network sizes. Furthermore, their performance degraded smoothly as the size of the network increased. Continuous time Bayesian networks were significantly better than dynamic Bayesian networks for all time granularities tested and better than Granger causality for dense time series. Both continuous time Bayesian networks and Granger causality performed robustly for unevenly spaced time series, with no significant loss of performance compared to the evenly spaced case, while the same did not hold true for dynamic Bayesian networks. The comparison included the IRMA experimental datasets which confirmed the effectiveness of the proposed method. Continuous time Bayesian networks were then applied to elucidate the regulatory mechanisms controlling murine T helper 17 (Th17) cell differentiation and were found to be effective in
Methods for Analyzing Pipe Networks
DEFF Research Database (Denmark)
Nielsen, Hans Bruun
1989-01-01
The governing equations for a general network are first set up and then reformulated in terms of matrices. This is developed to show that the choice of model for the flow equations is essential for the behavior of the iterative method used to solve the problem. It is shown that it is better to formulate the flow equations in terms of pipe discharges than in terms of energy heads. The behavior of some iterative methods is compared in the initial phase with large errors. It is explained why the linear theory method oscillates when the iteration gets close to the solution, and it is further demonstrated that this method offers good starting values for a Newton-Raphson iteration.
Evaluating the Limits of Network Topology Inference Via Virtualized Network Emulation
2015-06-01
activities from identifying Internet censorship under oppressive regimes and tracking the Internet’s penetration into previously unserved countries and... Approved for public release; distribution is unlimited. The Internet ...ability to induce link failures within the network. In addition, this thesis reexamines previous work in sampling Autonomous System-level Internet
A Systematic, Automated Network Planning Method
DEFF Research Database (Denmark)
Holm, Jens Åge; Pedersen, Jens Myrup
2006-01-01
This paper describes a case study conducted to evaluate the viability of a systematic, automated network planning method. The motivation for developing the network planning method was that many data networks are planned in an ad hoc manner with no assurance of quality of the solution with respect to consistency and long-term characteristics. The developed method gives significant improvements on these parameters. The case study was conducted as a comparison between an existing network where the traffic was known and a proposed network designed by the developed method. It turned out that the proposed network performed better than the existing network with regard to the performance measurements used, which reflected how well the traffic was routed in the networks and the cost of establishing the networks. Challenges that need to be solved before the developed method can be used to design network...
Generalized Bootstrap Method for Assessment of Uncertainty in Semivariogram Inference
Olea, R.A.; Pardo-Iguzquiza, E.
2011-01-01
The semivariogram and its related function, the covariance, play a central role in classical geostatistics for modeling the average continuity of spatially correlated attributes. Whereas all methods are formulated in terms of the true semivariogram, in practice what can be used are estimated semivariograms and models based on samples. A generalized form of the bootstrap method to properly model spatially correlated data is used to advance knowledge about the reliability of empirical semivariograms and semivariogram models based on a single sample. Among several methods available to generate spatially correlated resamples, we selected a method based on the LU decomposition and used several examples to illustrate the approach. The first is a synthetic, isotropic, exhaustive sample following a normal distribution, the second is also synthetic but follows a non-Gaussian random field, and the third, an empirical sample, consists of actual rain gauge measurements. Results show wider confidence intervals than those found previously by others with inadequate application of the bootstrap. Also, even for the Gaussian example, distributions for estimated semivariogram values and model parameters are positively skewed. In this sense, bootstrap percentile confidence intervals, which are not centered around the empirical semivariogram and do not require distributional assumptions for their construction, provide an achieved coverage similar to the nominal coverage. The latter cannot be achieved by symmetrical confidence intervals based on the standard error, regardless of whether the standard error is estimated from a parametric equation or from the bootstrap. © 2010 International Association for Mathematical Geosciences.
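Generating spatially correlated values from a triangular factorization of a covariance matrix, as mentioned in this abstract, can be sketched as follows. The sketch uses the Cholesky form of the LU decomposition (C = L Lᵀ) and draws z = L w with independent standard normal w. The exponential covariance model, the one-dimensional coordinates, and all function and parameter names are assumptions for illustration; the paper's full generalized bootstrap procedure is not reproduced here.

```python
import math
import random


def exponential_covariance(coords, sill=1.0, corr_range=10.0):
    """Covariance matrix C_ij = sill * exp(-|x_i - x_j| / corr_range) for 1-D locations."""
    return [[sill * math.exp(-abs(a - b) / corr_range) for b in coords] for a in coords]


def cholesky(c):
    """Lower-triangular L with L L^T = C, for a symmetric positive definite C."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low


def correlated_resample(low, rng):
    """One realization z = L w, where w holds independent standard normal draws."""
    n = len(low)
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(low[i][k] * w[k] for k in range(i + 1)) for i in range(n)]


coords = [0.0, 1.0, 2.0, 5.0, 9.0]  # hypothetical sampling locations
cov = exponential_covariance(coords)
low = cholesky(cov)
rng = random.Random(0)
resamples = [correlated_resample(low, rng) for _ in range(200)]
```

Each resample has covariance L Lᵀ = C by construction, so an empirical semivariogram computed on every resample gives a distribution from which percentile intervals could be read.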
A statistical inference method for the stochastic reachability analysis
Bujorianu, L.M.
2005-01-01
Many control systems have large, infinite state space that can not be easily abstracted. One method to analyse and verify these systems is reachability analysis. It is frequently used for air traffic control and power plants. Because of lack of complete information about the environment or
Inferring protein function by domain context similarities in protein-protein interaction networks
Directory of Open Access Journals (Sweden)
Sun Zhirong
2009-12-01
Background: Genome sequencing projects generate massive amounts of sequence data but there are still many proteins whose functions remain unknown. The availability of large scale protein-protein interaction data sets makes it possible to develop new function prediction methods based on protein-protein interaction (PPI) networks. Although several existing methods combine multiple information resources, there is no study that integrates protein domain information and PPI networks to predict protein functions. Results: The domain context similarity can be a useful index to predict protein function similarity. The prediction accuracy of our method in yeast is between 63% and 67%, which outperforms the other methods in terms of ROC curves. Conclusion: This paper presents a novel protein function prediction method that combines protein domain composition information and PPI networks. Performance evaluations show that this method outperforms existing methods.
A Photometric Machine-Learning Method to Infer Stellar Metallicity
Miller, Adam A.
2015-01-01
Following its formation, a star's metal content is one of the few factors that can significantly alter its evolution. Measurements of stellar metallicity ([Fe/H]) typically require a spectrum, but spectroscopic surveys are limited to a few × 10^6 targets; photometric surveys, on the other hand, have detected > 10^9 stars. I present a new machine-learning method to predict [Fe/H] from photometric colors measured by the Sloan Digital Sky Survey (SDSS). The training set consists of approximately 120,000 stars with SDSS photometry and reliable [Fe/H] measurements from the SEGUE Stellar Parameters Pipeline (SSPP). For bright stars (g' ... machine-learning method is similar to the scatter in [Fe/H] measurements from low-resolution spectra.
Fast State-Space Methods for Inferring Dendritic Synaptic Connectivity
2013-08-08
observations yt into the ST-vector Y. The complete log-likelihood for the combined V and Y variables is (Durbin et al., 2001) log p(Y, V | W) = log p(Y...1003. Durbin, J., Koopman, S. and Atkinson, A. (2001), Time series analysis by state space methods, Vol. 15, Oxford University Press Oxford. Efron, B...Association 83(404), 1023–1032. Nikolenko, V., Watson, B., Araya, R., Woodruff, A., Peterka, D. and Yuste, R. (2008), ‘SLM microscopy: Scanless two
Computational methods for analysis and inference of kinase/inhibitor relationships
Directory of Open Access Journals (Sweden)
Fabrizio Ferrè
2014-06-01
The central role of kinases in virtually all signal transduction networks is the driving motivation for the development of compounds modulating their activity. ATP-mimetic inhibitors are essential tools for elucidating signaling pathways and are emerging as promising therapeutic agents. However, off-target ligand binding and complex and sometimes unexpected kinase/inhibitor relationships can occur for seemingly unrelated kinases, stressing that computational approaches are needed for learning the interaction determinants and for inferring the effect of small compounds on a given kinase. Recently published high-throughput profiling studies assessed the effects of thousands of small compound inhibitors, covering a substantial portion of the kinome. This wealth of data paved the way for computational resources and methods that can make a major contribution to understanding the reasons for the inhibition, helping in the rational design of more specific molecules, in the in silico prediction of inhibition for those neglected kinases for which no systematic analysis has been carried out yet, and in the selection of novel inhibitors with desired selectivity, and offering novel avenues for personalized therapies.
Bayesian network reconstruction using systems genetics data: comparison of MCMC methods.
Tasaki, Shinya; Sauerwine, Ben; Hoff, Bruce; Toyoshiba, Hiroyoshi; Gaiteri, Chris; Chaibub Neto, Elias
2015-04-01
Reconstructing biological networks using high-throughput technologies has the potential to produce condition-specific interactomes. But are these reconstructed networks a reliable source of biological interactions? Do some network inference methods offer dramatically improved performance on certain types of networks? To facilitate the use of network inference methods in systems biology, we report a large-scale simulation study comparing the ability of Markov chain Monte Carlo (MCMC) samplers to reverse engineer Bayesian networks. The MCMC samplers we investigated included foundational and state-of-the-art Metropolis-Hastings and Gibbs sampling approaches, as well as novel samplers we have designed. To enable a comprehensive comparison, we simulated gene expression and genetics data from known network structures under a range of biologically plausible scenarios. We examine the overall quality of network inference via different methods, as well as how their performance is affected by network characteristics. Our simulations reveal that network size, edge density, and strength of gene-to-gene signaling are major parameters that differentiate the performance of various samplers. Specifically, more recent samplers including our novel methods outperform traditional samplers for highly interconnected large networks with strong gene-to-gene signaling. Our newly developed samplers show comparable or superior performance to the top existing methods. Moreover, this performance gain is strongest in networks with biologically oriented topology, which indicates that our novel samplers are suitable for inferring biological networks. The performance of MCMC samplers in this simulation framework can guide the choice of methods for network reconstruction using systems genetics data. Copyright © 2015 by the Genetics Society of America.
Challenges to inferring causality from viral information dispersion in dynamic social networks
Ternovski, John
2014-06-01
Understanding the mechanism behind large-scale information dispersion through complex networks has important implications for a variety of industries ranging from cyber-security to public health. With the unprecedented availability of public data from online social networks (OSNs) and the low cost nature of most OSN outreach, randomized controlled experiments, the "gold standard" of causal inference methodologies, have been used with increasing regularity to study viral information dispersion. And while these studies have dramatically furthered our understanding of how information disseminates through social networks by isolating causal mechanisms, there are still major methodological concerns that need to be addressed in future research. This paper delineates why modern OSNs are markedly different from traditional sociological social networks and why these differences present unique challenges to experimentalists and data scientists. The dynamic nature of OSNs is particularly troublesome for researchers implementing experimental designs, so this paper identifies major sources of bias arising from network mutability and suggests strategies to circumvent and adjust for these biases. This paper also discusses the practical considerations of data quality and collection, which may adversely impact the efficiency of the estimator. The major experimental methodologies used in the current literature on virality are assessed at length, and their strengths and limits identified. Other, as-yet-unsolved threats to the efficiency and unbiasedness of causal estimators--such as missing data--are also discussed. This paper integrates methodologies and learnings from a variety of fields under an experimental and data science framework in order to systematically consolidate and identify current methodological limitations of randomized controlled experiments conducted in OSNs.
Schmit, C. J.; Pritchard, J. R.
2018-03-01
Next generation radio experiments such as LOFAR, HERA, and SKA are expected to probe the Epoch of Reionization (EoR) and claim a first direct detection of the cosmic 21cm signal within the next decade. Data volumes will be enormous and can thus potentially revolutionize our understanding of the early Universe and galaxy formation. However, numerical modelling of the EoR can be prohibitively expensive for Bayesian parameter inference and how to optimally extract information from incoming data is currently unclear. Emulation techniques for fast model evaluations have recently been proposed as a way to bypass costly simulations. We consider the use of artificial neural networks as a blind emulation technique. We study the impact of training duration and training set size on the quality of the network prediction and the resulting best-fitting values of a parameter search. A direct comparison is drawn between our emulation technique and an equivalent analysis using 21CMMC. We find good predictive capabilities of our network using training sets of as low as 100 model evaluations, which is within the capabilities of fully numerical radiative transfer codes.
Energy Technology Data Exchange (ETDEWEB)
Karri, Vishy; Ho, Tien [School of Engineering, University of Tasmania, GPO Box 252-65, Hobart, Tasmania 7001 (Australia); Madsen, Ole [Department of Production, Aalborg University, Fibigerstraede 16, DK-9220 Aalborg (Denmark)
2008-06-15
Hydrogen is increasingly investigated as an alternative fuel to petroleum products for running internal combustion engines and for powering remote area power systems using generators. The safety issues related to hydrogen gas are further exacerbated by the expensive instrumentation required to measure the percentage of explosive limits, flow rates and production pressure. This paper investigates the use of model-based virtual sensors (rather than expensive physical sensors) in connection with hydrogen production with a Hogen 20 electrolyzer system. The virtual sensors are used to predict relevant hydrogen safety parameters, such as the percentage of lower explosive limit, hydrogen pressure and hydrogen flow rate, as a function of different input conditions of power supplied (voltage and current), the feed of de-ionized water and Hogen 20 electrolyzer system parameters. The virtual sensors are developed by applying various artificial intelligence techniques. To train and appraise the neural network models as virtual sensors, the Hogen 20 electrolyzer is instrumented with the necessary sensors to gather experimental data, which, together with the MATLAB neural networks toolbox and tailor-made adaptive neuro-fuzzy inference systems (ANFIS), were used as predictive tools to estimate hydrogen safety parameters. It was shown that, using the neural networks, hydrogen safety parameters were predicted with a percentage average root mean square error of less than 3%. The most accurate prediction was achieved by using ANFIS.
Mocapy++ - A toolkit for inference and learning in dynamic Bayesian networks
Directory of Open Access Journals (Sweden)
Hamelryck Thomas
2010-03-01
Background: Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations). Results: The program package is freely available under the GNU General Public Licence (GPL) from SourceForge http://sourceforge.net/projects/mocapy. The package contains the source for building the Mocapy++ library, several usage examples and the user manual. Conclusions: Mocapy++ is especially suitable for constructing probabilistic models of biomolecular structure, due to its support for directional statistics. In particular, it supports the Kent distribution on the sphere and the bivariate von Mises distribution on the torus. These distributions have proven useful to formulate probabilistic models of protein and RNA structure in atomic detail.
Inference of the sparse kinetic Ising model using the decimation method.
Decelle, Aurélien; Zhang, Pan
2015-05-01
In this paper we study the inference of the kinetic Ising model on sparse graphs by the decimation method. The decimation method, which was first proposed in Decelle and Ricci-Tersenghi [Phys. Rev. Lett. 112, 070603 (2014)] for the static inverse Ising problem, tries to recover the topology of the inferred system by setting the weakest couplings to zero iteratively. During the decimation process the likelihood function is maximized over the remaining couplings. Unlike the ℓ1-optimization-based methods, the decimation method does not use the Laplace distribution as a heuristic choice of prior to select a sparse solution. In our case, the whole process can be done automatically without fixing any parameters by hand. We show that in the dynamical inference problem, where the task is to reconstruct the couplings of an Ising model given the data, the decimation process can be applied naturally in a maximum-likelihood optimization algorithm, as opposed to the static case where the pseudolikelihood method needs to be adopted. We also use extensive numerical studies to validate the accuracy of our methods in dynamical inference problems. Our results illustrate that, on various topologies and with different distributions of couplings, the decimation method outperforms the widely used ℓ1-optimization-based methods.
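The loop described in this abstract (maximize the likelihood, set the weakest couplings to zero, refit over the remaining couplings) can be sketched for a small kinetic Ising model with parallel Glauber dynamics. This is only an illustration under stated assumptions: the fixed number of decimation rounds, the number of couplings removed per round, the plain gradient-ascent fit, and every function name are hypothetical, and the fixed-round stopping rule is a placeholder rather than the parameter-free criterion the abstract refers to.

```python
import math
import random


def simulate(couplings, steps, rng):
    """Parallel Glauber dynamics: P(s_i(t+1) = +1) = 1 / (1 + exp(-2 h_i)), h_i = sum_j J_ij s_j(t)."""
    n = len(couplings)
    state = [rng.choice((-1, 1)) for _ in range(n)]
    traj = [state]
    for _ in range(steps):
        fields = [sum(couplings[i][j] * state[j] for j in range(n)) for i in range(n)]
        state = [1 if rng.random() < 1.0 / (1.0 + math.exp(-2.0 * h)) else -1 for h in fields]
        traj.append(state)
    return traj


def fit(traj, mask, iters=100, lr=0.2):
    """Gradient ascent on the log-likelihood, updating only couplings allowed by `mask`."""
    n = len(mask)
    steps = len(traj) - 1
    est = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        grad = [[0.0] * n for _ in range(n)]
        for t in range(steps):
            cur, nxt = traj[t], traj[t + 1]
            for i in range(n):
                field = sum(est[i][j] * cur[j] for j in range(n))
                resid = nxt[i] - math.tanh(field)  # dL/dh_i at time t
                for j in range(n):
                    if mask[i][j]:
                        grad[i][j] += resid * cur[j]
        for i in range(n):
            for j in range(n):
                est[i][j] += lr * grad[i][j] / steps
    return est


def decimate(traj, n, rounds, per_round):
    """Alternate fitting with zeroing the `per_round` weakest remaining couplings."""
    mask = [[True] * n for _ in range(n)]
    est = fit(traj, mask)
    for _ in range(rounds):
        active = sorted((abs(est[i][j]), i, j) for i in range(n) for j in range(n) if mask[i][j])
        for _mag, i, j in active[:per_round]:
            mask[i][j] = False
        est = fit(traj, mask)
    return est, mask


rng = random.Random(1)
true_j = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
]  # hypothetical sparse coupling matrix
traj = simulate(true_j, 200, rng)
est, mask = decimate(traj, n=4, rounds=3, per_round=3)
```

Because a decimated coupling is fixed at zero and excluded from later gradient updates, the surviving couplings are re-estimated without any shrinkage penalty, which is the contrast with ℓ1-regularized fitting that the abstract draws.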
Inferring the mesoscale structure of layered, edge-valued and time-varying networks
Peixoto, Tiago P
2015-01-01
Many network systems are composed of interdependent but distinct types of interactions, which cannot be fully understood in isolation. These different types of interactions are often represented as layers, attributes on the edges or as a time-dependence of the network structure. Although they are crucial for a more comprehensive scientific understanding, these representations offer substantial challenges. Namely, it is an open problem how to precisely characterize the large or mesoscale structure of network systems in relation to these additional aspects. Furthermore, the direct incorporation of these features invariably increases the effective dimension of the network description, and hence aggravates the problem of overfitting, i.e. the use of overly-complex characterizations that mistake purely random fluctuations for actual structure. In this work, we propose a robust and principled method to tackle these problems, by constructing generative models of modular network structure, incorporating layered, attr...
Understanding the Scalability of Bayesian Network Inference Using Clique Tree Growth Curves
Mengshoel, Ole J.
2010-01-01
One of the main approaches to performing computation in Bayesian networks (BNs) is clique tree clustering and propagation. The clique tree approach consists of propagation in a clique tree compiled from a Bayesian network, and while it was introduced in the 1980s, there is still a lack of understanding of how clique tree computation time depends on variations in BN size and structure. In this article, we improve this understanding by developing an approach to characterizing clique tree growth as a function of parameters that can be computed in polynomial time from BNs, specifically: (i) the ratio of the number of a BN's non-root nodes to the number of root nodes, and (ii) the expected number of moral edges in their moral graphs. Analytically, we partition the set of cliques in a clique tree into different sets, and introduce a growth curve for the total size of each set. For the special case of bipartite BNs, there are two sets and two growth curves, a mixed clique growth curve and a root clique growth curve. In experiments, where random bipartite BNs generated using the BPART algorithm are studied, we systematically increase the out-degree of the root nodes in bipartite Bayesian networks, by increasing the number of leaf nodes. Surprisingly, root clique growth is well-approximated by Gompertz growth curves, an S-shaped family of curves that has previously been used to describe growth processes in biology, medicine, and neuroscience. We believe that this research improves the understanding of the scaling behavior of clique tree clustering for a certain class of Bayesian networks; presents an aid for trade-off studies of clique tree clustering using growth curves; and ultimately provides a foundation for benchmarking and developing improved BN inference and machine learning algorithms.
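For readers unfamiliar with the Gompertz family mentioned in this abstract, a minimal sketch of the standard three-parameter form g(x) = a·exp(−b·exp(−c·x)) follows. The parameter names and the numeric values are illustrative assumptions and are not taken from the article's fitted curves.

```python
import math


def gompertz(x, asymptote, displacement, rate):
    """Gompertz curve g(x) = a * exp(-b * exp(-c * x)).

    a (asymptote) is the upper limit, b (displacement) shifts the curve along x,
    and c (rate) controls how quickly the curve saturates.
    """
    return asymptote * math.exp(-displacement * math.exp(-rate * x))


# Evaluate at x = 0, 5, 10, 15, 20 with arbitrary illustrative parameters.
values = [gompertz(x, asymptote=100.0, displacement=5.0, rate=0.5) for x in range(0, 21, 5)]
```

The curve starts near a·exp(−b), rises through an S-shaped transition, and flattens toward the asymptote a, which is the qualitative shape the abstract reports for root clique growth.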
Inference of Extreme Synchrony with an Entropy Measure on a Bipartite Network
Sato, Aki-Hiro
2012-01-01
This article proposes a method to quantify the structure of a bipartite graph with a network entropy from a statistical-physical point of view. The network entropy of a bipartite graph with random links is computed from numerical simulation. As an application of the proposed method to analyze collective behavior, the affairs in which participants quote and trade in the foreign exchange market are quantified. The network entropy per node is found to correspond to the macroeconomic situation. A finite mixture of Gumbel distributions is used to fit the empirical distribution of the minimum values of network entropy per node in each week. The mixture of Gumbel distributions with parameters estimated by a segmentation procedure is verified by the Kolmogorov-Smirnov test. The finite mixture of Gumbel distributions can extrapolate the probability of extreme events that have never been observed.
Random Walks on Directed Networks: Inference and Respondent-driven Sampling
Malmros, Jens; Britton, Tom
2013-01-01
Respondent driven sampling (RDS) is a method often used to estimate population properties (e.g. sexual risk behavior) in hard-to-reach populations. It combines an effective modified snowball sampling methodology with an estimation procedure that yields unbiased population estimates under the assumption that the sampling process behaves like a random walk on the social network of the population. Current RDS estimation methodology assumes that the social network is undirected, i.e. that all edges are reciprocal. However, empirical social networks in general also have non-reciprocated edges. To account for this fact, we develop a new estimation method for RDS in the presence of directed edges on the basis of random walks on directed networks. We distinguish directed and undirected edges and consider the possibility that the random walk returns to its current position in two steps through an undirected edge. We derive estimators of the selection probabilities of individuals as a function of the number of outgoing...
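The random-walk assumption for undirected networks can be made concrete with a short sketch: a random walk on an undirected network visits individuals with probability proportional to their degree, so weighting each respondent by the inverse of the reported degree corrects for that over-sampling. This shows the standard inverse-degree weighting for the undirected case only, not the directed-network estimator developed in the paper; the function name `rds_estimate` and the sample values are assumptions for illustration.

```python
def rds_estimate(sample):
    """Inverse-degree weighted mean of a trait.

    `sample` is a list of (trait_value, degree) pairs, where degree is the
    respondent's number of network contacts (must be positive).
    """
    numerator = sum(y / d for y, d in sample)
    denominator = sum(1.0 / d for _, d in sample)
    return numerator / denominator


# Hypothetical respondents: (has_trait, reported_degree)
sample = [(1, 10), (0, 2), (1, 5), (0, 1), (0, 4)]
naive = sum(y for y, _ in sample) / len(sample)
weighted = rds_estimate(sample)
```

In this toy sample the trait is concentrated among high-degree respondents, so the weighted estimate falls below the naive sample proportion.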
Gong, Junhui; Liu, Xiaoyan; Liu, Tianming; Zhou, Jiansong; Sun, Gang; Tian, Juanxiu
2017-08-09
Recently, sparse representation has been successfully used to identify brain networks from task-based fMRI datasets. However, when using this strategy to analyze resting-state fMRI datasets, it is still a challenge to automatically infer the group-wise brain networks while taking both group commonalities and subject-specific characteristics into account. In this paper, a novel method based on dual temporal and spatial sparse representation (DTSSR) is proposed to meet this challenge. First, the brain functional networks (BFNs) with subject-specific characteristics are obtained via sparse representation with online dictionary learning for the fMRI time series (temporal domain) of each subject. Next, based on current brain science knowledge, a simple mathematical model is proposed to describe the complex nonlinear dynamic coupling mechanism of the brain networks, with which the group-wise intrinsic connectivity networks (ICNs) can be inferred by sparse representation of these brain functional networks (spatial domain) across all subjects. Experiments on the Leiden_2180 dataset show that most group-wise ICNs obtained by the proposed DTSSR are interpretable by current brain science knowledge and are consistent with previous literature reports. The robustness of DTSSR and the reproducibility of the results are demonstrated by experiments on three different datasets (Leiden_2180, Leiden_2200 and our own dataset). Results of the present work shed new light on exploring the coupling mechanism of BFNs from the perspective of information science.
Benchmarking Relatedness Inference Methods with Genome-Wide Data from Thousands of Relatives.
Ramstetter, Monica D; Dyer, Thomas D; Lehman, Donna M; Curran, Joanne E; Duggirala, Ravindranath; Blangero, John; Mezey, Jason G; Williams, Amy L
2017-09-01
Inferring relatedness from genomic data is an essential component of genetic association studies, population genetics, forensics, and genealogy. While numerous methods exist for inferring relatedness, thorough evaluation of these approaches in real data has been lacking. Here, we report an assessment of 12 state-of-the-art pairwise relatedness inference methods using a data set with 2485 individuals contained in several large pedigrees that span up to six generations. We find that all methods have high accuracy (92-99%) when detecting first- and second-degree relationships, but their accuracy dwindles to 76% of relative pairs. Overall, the most accurate methods are Estimation of Recent Shared Ancestry (ERSA) and approaches that compute total IBD sharing using the output from GERMLINE and Refined IBD to infer relatedness. Combining information from the most accurate methods provides little accuracy improvement, indicating that novel approaches, such as new methods that leverage relatedness signals from multiple samples, are needed to achieve a sizeable jump in performance. Copyright © 2017 Ramstetter et al.
Sensor Network Data Fusion Methods
Directory of Open Access Journals (Sweden)
Martynas Vervečka
2011-03-01
Sensor network data fusion is widely used in warfare, in areas such as automatic target recognition, battlefield surveillance, automatic vehicle control, multiple target surveillance, etc. Non-military examples include medical equipment status monitoring and intelligent homes. The paper describes sensor network topologies, the advantages of sensor networks over isolated sensors, and the most common network topologies with their advantages and disadvantages. Article in Lithuanian.
Directory of Open Access Journals (Sweden)
Ying-Yi Hong
2014-04-01
Microgrids are a highly efficient means of embedding distributed generation sources in a power system. However, if a fault occurs inside or outside the microgrid, the microgrid should be immediately disconnected from the main grid using a static switch installed at the secondary side of the main transformer near the point of common coupling (PCC). The static switch should have a reliable module implemented in a chip to detect/locate the fault and activate the breaker to open the circuit immediately. This paper proposes a novel approach to design this module in a static switch using the discrete wavelet transform (DWT) and adaptive network-based fuzzy inference system (ANFIS). The wavelet coefficient of the fault voltage and the inference results of ANFIS with the wavelet energy of the fault current at the secondary side of the main transformer determine the control action (open or close) of a static switch. The ANFIS identifies the faulty zones inside or outside the microgrid. The proposed method is applied to the first outdoor microgrid test bed in Taiwan, with a generation capacity of 360.5 kW. This microgrid test bed is studied using the real-time simulator eMegaSim developed by Opal-RT Technology Inc. (Montreal, QC, Canada). The proposed method based on DWT and ANFIS is implemented in a field programmable gate array (FPGA) by using the Xilinx System Generator. Simulation results reveal that the proposed method is efficient and applicable in the real-time control environment of a power system.
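The wavelet-energy feature mentioned in this abstract can be illustrated with a minimal sketch. It assumes a Haar wavelet, a two-level decomposition, and a made-up sampled current with a step disturbance; the abstract does not state which wavelet, how many levels, or which signals the authors actually used, and the function names here are hypothetical.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def detail_energy(signal, levels=2):
    """Energy (sum of squares) of the detail coefficients at each level."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        energies.append(sum(d * d for d in detail))
    return energies

# Hypothetical sampled current: one sinusoid cycle, and the same cycle with a
# step disturbance injected from sample 33 onward (illustrative values only).
n = 64
normal = [math.sin(2 * math.pi * k / n) for k in range(n)]
faulty = [x + (3.0 if k >= 33 else 0.0) for k, x in enumerate(normal)]

print(detail_energy(normal))
print(detail_energy(faulty))
```

The abrupt step concentrates energy in the fine-scale detail coefficients, which is why such energies are a plausible input feature for a fault classifier such as ANFIS.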
Network-Based Inference Framework for Identifying Cancer Genes from Gene Expression Data
Directory of Open Access Journals (Sweden)
Bo Yang
2013-01-01
Great efforts have been devoted to alleviating the uncertainty of detected cancer genes, as accurate identification of oncogenes is of tremendous significance and helps unravel the biological behavior of tumors. In this paper, we present a differential network-based framework to detect biologically meaningful cancer-related genes. Firstly, a gene regulatory network construction algorithm is proposed, in which a boosting regression based on likelihood score and informative prior is employed to improve the accuracy of identification. Secondly, with the algorithm, two gene regulatory networks are constructed from case and control samples independently. Thirdly, by subtracting the two networks, a differential-network model is obtained and then used to rank differentially expressed hub genes for identification of cancer biomarkers. Compared with two existing gene-based methods (t-test and lasso), the method shows a significant improvement in accuracy both on synthetic datasets and on two real breast cancer datasets. Furthermore, the six identified breast cancer susceptibility genes (TSPYL5, CD55, CCNE2, DCK, BBC3, and MUC1) were verified through literature mining, GO analysis, and pathway functional enrichment analysis. Among these oncogenes, TSPYL5 and CCNE2 are already known as prognostic biomarkers in breast cancer, CD55 has been suspected of playing an important role in breast cancer prognosis from literature evidence, and the other three genes are newly discovered breast cancer biomarkers. More generally, the differential-network schema can be extended to other complex diseases for detection of disease-associated genes.
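The "subtract two networks, then rank hubs" step can be sketched as follows. This is only an illustration under stated assumptions: networks are represented as dictionaries of edge weights, and a gene's score is the total absolute change over its incident edges. The abstract does not give the authors' scoring rule, and the gene names and weights below are invented.

```python
def differential_hubs(case_net, control_net, top_k=3):
    """Rank genes by total absolute change in edge weight between two networks.

    Each network maps a (regulator, target) pair to an edge weight;
    an edge missing from one network counts as weight 0 there.
    """
    edges = set(case_net) | set(control_net)
    score = {}
    for edge in edges:
        diff = abs(case_net.get(edge, 0.0) - control_net.get(edge, 0.0))
        for gene in edge:
            score[gene] = score.get(gene, 0.0) + diff
    # Highest score first; ties broken alphabetically for reproducibility.
    ranked = sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]

# Toy networks with made-up gene names and weights.
case = {("G1", "G2"): 0.9, ("G1", "G3"): 0.8, ("G2", "G3"): 0.1}
control = {("G1", "G2"): 0.1, ("G2", "G3"): 0.1, ("G3", "G4"): 0.5}
print(differential_hubs(case, control))
```

In this toy example G1 ranks first because both of its edges change strongly between the two conditions, while the unchanged G2-G3 edge contributes nothing.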
DEFF Research Database (Denmark)
2013-01-01
The present invention relates to a method, computer program and system for inferring relations between cultural specific concepts (CSC) in two cultures at least comprising the steps of - extracting and listing said cultural specific concepts (CSCs) and features of said CSCs from at least a first...
A haplotype inference method based on sparsely connected multi-body ising model
Energy Technology Data Exchange (ETDEWEB)
Kato, Masashi; Gao, Qian Ji; Chigira, Hiroshi; Shindo, Hiroyuki; Inoue, Masato, E-mail: masato.inoue@eb.waseda.ac.j [Department of Electrical Engineering and Bioscience, (Graduate) School of Advanced Science and Engineering, Waseda University, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555 (Japan)
2010-06-01
Statistical haplotype inference is an indispensable technique in the field of medical science. The method usually has two steps: inference of haplotype frequencies and inference of diplotype for each subject. The first step can be done by using the expectation-maximization (EM) algorithm, but it incurs an unreasonably large calculation cost when the number of single-nucleotide polymorphism (SNP) loci of concern is large. In this article, we describe an approximate probabilistic model of haplotype frequencies. The model is constructed by using several distributions of nearby local SNPs. This approximation seems good because SNPs are generally more strongly correlated when they are close to one another on a chromosome. To implement this approach, we use a log linear model, the Walsh-Hadamard transform, and a combinatorial optimization method. Artificial data suggested that the overall haplotype inference of our method is good if there are nine or more local consecutive SNPs. Some minor problems should be dealt with before this method can be applied to real data.
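For readers unfamiliar with the Walsh-Hadamard transform named in this abstract, here is a minimal sketch of the standard fast algorithm applied to a toy haplotype frequency table. The encoding (haplotype index equals the bit pattern of its alleles) and the frequencies are assumptions for illustration, not the authors' model.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform (unnormalised, natural/Hadamard order)."""
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y  # butterfly step
        h *= 2
    return a

# Toy frequencies for the four haplotypes 00, 01, 10, 11 over two SNP loci.
freqs = [0.4, 0.1, 0.1, 0.4]
print(fwht(freqs))
```

With this encoding, coefficient 0 is the total probability, coefficients 1 and 2 capture single-locus allele imbalance, and coefficient 3 captures the pairwise correlation between the two loci, which is the kind of interaction term a log-linear model of nearby SNPs works with.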
Chen, Xi; Gu, Jinghua; Wang, Xiao; Jung, Jin-Gyoung; Wang, Tian-Li; Hilakivi-Clarke, Leena; Clarke, Robert; Xuan, Jianhua
2017-12-21
NGS techniques have been widely applied in genetic and epigenetic studies. Multiple ChIP-seq and RNA-seq profiles can now be jointly used to infer functional regulatory networks (FRNs). However, existing methods suffer from either oversimplified assumptions on transcription factor (TF) regulation or slow convergence of sampling for FRN inference from large-scale ChIP-seq and time-course RNA-seq data. We developed an efficient Bayesian integration method (CRNET) for FRN inference using a two-stage Gibbs sampler to estimate iteratively hidden TF activities and the posterior probabilities of binding events. A novel statistical measure that jointly considers regulation strength and regression error enables the sampling process of CRNET to converge quickly, thus making CRNET very efficient for large-scale FRN inference. Experiments on synthetic and benchmark data showed a significantly improved performance of CRNET when compared with existing methods. CRNET was applied to breast cancer data to identify FRNs functional at promoter or enhancer regions in breast cancer MCF-7 cells. Transcription factor MYC is predicted as a key functional factor in both promoter and enhancer FRNs. We experimentally validated the regulation effects of MYC on CRNET-predicted target genes using appropriate RNAi approaches in MCF-7 cells. R scripts of CRNET are available at http://www.cbil.ece.vt.edu/software.htm. Contact: xuan@vt.edu. Supplementary data are available at Bioinformatics online.
Inferring the interplay of network structure and market effects in Bitcoin
Kondor, Dániel; Szüle, János; Pósfai, Márton; Vattay, Gábor
2014-01-01
A main focus in economics research is understanding the time series of prices of goods and assets. While statistical models using only the properties of the time series itself have been successful in many aspects, we expect to gain a better understanding of the phenomena involved if we can model the underlying system of interacting agents. In this article, we consider the history of Bitcoin, a novel digital currency system, for which the complete list of transactions is available for analysis. Using this dataset, we reconstruct the transaction network between users and analyze changes in the structure of the subgraph induced by the most active users. Our approach is based on the unsupervised identification of important features of the time variation of the network. Applying the widely used method of Principal Component Analysis to the matrix constructed from snapshots of the network at different times, we are able to show how structural changes in the network accompany significant changes in the exchange pric...
Designs and Methods for Association Studies and Population Size Inference in Statistical Genetics
DEFF Research Database (Denmark)
Waltoft, Berit Lindum
estimator of the IRR. The difference between the OR and the IRR is reflected in the p-value of the null hypothesis of no exposure effect. For multiple testing scenarios, e.g. in a GWAS, these differences in estimators imply a change in comparison between the null hypotheses for different sampling schemes of controls...... method provides a simple goodness of fit test by comparing the observed SFS with the expected SFS under a given model of population size changes. By the use of Monte Carlo estimation the expected time between coalescent events can be estimated and the expected SFS can thereby be evaluated. Using...... the classical chi-square statistics we are able to infer single parameter models. Multiple parameter models, e.g. multiple epochs, are harder to identify. By introducing the inference of population size back in time as an inverse problem, the second procedure applies the theory of smoothing splines to infer...
Bermudez Corrales, Ignacio Nicolas
2013-01-01
The Internet evolves along with us over time, and people now depend on it for most of the simple activities of their lives. It is not uncommon to use the Internet for voice and video communications, social networking, banking and shopping. Current trends in Internet applications such as Web 2.0, cloud computing, and the internet of things are bound to bring higher traffic volume and more heterogeneous traffic. In addition, privacy concerns and network security traits ...
Pavone, Andrea; Svensson, Jakob; Langenberg, Andreas; Pablant, Novimir; Wolf, Robert C.
2017-10-01
Artificial neural networks (ANNs) can reduce the computation time required for the application of Bayesian inference on large amounts of data by several orders of magnitude, making real-time analysis possible and, at the same time, providing a reliable alternative to more conventional inversion routines. The large scale fusion experiment Wendelstein 7-X (W7-X) requires tens of diagnostics for plasma parameter measurements and is using the Minerva Bayesian modelling framework as its main inference engine, which can handle joint inference in complex systems made of several physics models. Conventional inversion routines are applied to measured data to infer the posterior distribution of the free parameters of the models implemented in the framework. We have trained ANNs on a training set made of samples from the prior distribution of the free parameters and the corresponding data calculated with the forward model, so that the trained ANNs constitute a surrogate model of the physics model. The ANNs have been then applied to 2D images measured by an X-ray spectrometer, representing the spectral emission from plasma impurities measured along a fan of lines of sight covering a major fraction of the plasma cross-section, for the inference of ion temperature profiles and then compared with the conventional inversion routines, showing that they constitute a robust and reliable alternative for real time plasma parameter inference.
Directory of Open Access Journals (Sweden)
Jing Li
2017-01-01
The goal of this study is to improve thermal comfort and indoor air quality with the adaptive network-based fuzzy inference system (ANFIS) model and improved particle swarm optimization (PSO) algorithm. A method to optimize air conditioning parameters and installation distance is proposed. The methodology is demonstrated through a prototype case, which corresponds to a typical laboratory in colleges and universities. A laboratory model is established, and simulated flow field information is obtained with the CFD software. Subsequently, the ANFIS model is employed instead of the CFD model to predict indoor flow parameters, and the CFD database is utilized to train ANN input-output “metamodels” for the subsequent optimization. With the improved PSO algorithm and the stratified sequence method, the objective functions are optimized. The functions comprise PMV, PPD, and mean age of air. The optimal installation distance is determined with the hemisphere model. Results show that most of the staff obtain a satisfactory degree of thermal comfort and that the proposed method can significantly reduce the cost of building an experimental device. The proposed methodology can be used to determine appropriate air supply parameters and air conditioner installation position for a pleasant and healthy indoor environment.
Visual Inference Specification Methods for Modularized Rulebases. Overview and Integration Proposal
Kluza, Krzysztof; Nalepa, Grzegorz J.; Łysik, Łukasz
2011-01-01
The paper concerns selected rule modularization techniques. Three visual methods for inference specification for modularized rulebases are described: Drools Flow, BPMN and XTT2. Drools Flow is a popular technology for workflow or process modeling, BPMN is an OMG standard for modeling business processes, and XTT2 is a hierarchical tabular system specification method. Because of some limitations of these solutions, several proposals of their integration are given.
Computational Inference Methods for Selective Sweeps Arising in Acute HIV Infection
Leviyang, Sivan
2013-01-01
During the first weeks of human immunodeficiency virus-1 (HIV-1) infection, cytotoxic T-lymphocytes (CTLs) select for multiple escape mutations in the infecting HIV population. In recent years, methods that use escape mutation data to estimate rates of HIV escape have been developed, thereby providing a quantitative framework for exploring HIV escape from CTL response. Current methods for escape-rate inference focus on a specific HIV mutant selected by a single CTL response. However, recent s...
Du, Yuanwei; Guo, Yubin
2015-01-01
The intrinsic mechanism of multimorbidity is difficult to recognize, and prediction and diagnosis are accordingly difficult to carry out. Bayesian networks can help to diagnose multimorbidity in health care, but it is difficult to obtain the conditional probability table (CPT) because of the lack of clinical statistical data. Today, expert knowledge and experience are increasingly used in training Bayesian networks in order to help predict or diagnose diseases, but the resulting CPT is often irrational or ineffective because realistic constraints are ignored, especially in multimorbidity. In order to solve these problems, an evidence reasoning (ER) approach is employed to extract and fuse inference data from experts using a belief distribution and a recursive ER algorithm. On this basis, an evidence reasoning method for constructing conditional probability tables in a Bayesian network of multimorbidity is presented step by step. A multimorbidity numerical example is used to demonstrate the method and show its feasibility and applicability. The Bayesian network can be determined as long as each expert provides an inference assessment according to his/her knowledge or experience. Our method is more effective than existing methods at extracting expert inference data accurately and fusing them effectively for constructing CPTs in a Bayesian network of multimorbidity.
Directory of Open Access Journals (Sweden)
Hossein Zare
Transcriptional networks consist of multiple regulatory layers corresponding to the activity of global regulators, specialized repressors and activators as well as proteins and enzymes shaping the DNA template. Such intrinsic complexity makes uncovering connections difficult and it calls for corresponding methodologies, which are adapted to the available data. Here we present a new computational method that predicts interactions between transcription factors and target genes using compendia of microarray gene expression data and documented interactions between genes and transcription factors. The proposed method, called Kernel Embedding of Regulatory Networks (KEREN), is based on the concept of gene-regulon association, and captures hidden geometric patterns of the network via manifold embedding. We applied KEREN to reconstruct transcription regulatory interactions on a genome-wide scale in the model bacteria Escherichia coli (E. coli). Application of the method not only yielded accurate predictions of verifiable interactions, which outperformed on certain metrics comparable methodologies, but also demonstrated the utility of a geometric approach in the analysis of high-dimensional biological data. We also described possible applications of kernel embedding techniques to other function and network discovery algorithms.
Zare, Hossein; Kaveh, Mostafa; Khodursky, Arkady
2011-01-01
Transcriptional networks consist of multiple regulatory layers corresponding to the activity of global regulators, specialized repressors and activators as well as proteins and enzymes shaping the DNA template. Such intrinsic complexity makes uncovering connections difficult and it calls for corresponding methodologies, which are adapted to the available data. Here we present a new computational method that predicts interactions between transcription factors and target genes using compendia of microarray gene expression data and documented interactions between genes and transcription factors. The proposed method, called Kernel Embedding of Regulatory Networks (KEREN), is based on the concept of gene-regulon association, and captures hidden geometric patterns of the network via manifold embedding. We applied KEREN to reconstruct transcription regulatory interactions on a genome-wide scale in the model bacteria Escherichia coli (E. coli). Application of the method not only yielded accurate predictions of verifiable interactions, which outperformed on certain metrics comparable methodologies, but also demonstrated the utility of a geometric approach in the analysis of high-dimensional biological data. We also described possible applications of kernel embedding techniques to other function and network discovery algorithms.
Cohen, Trevor; Schvaneveldt, Roger; Widdows, Dominic
2010-04-01
The discovery of implicit connections between terms that do not occur together in any scientific document underlies the model of literature-based knowledge discovery first proposed by Swanson. Corpus-derived statistical models of semantic distance such as Latent Semantic Analysis (LSA) have been evaluated previously as methods for the discovery of such implicit connections. However, LSA in particular is dependent on a computationally demanding method of dimension reduction as a means to obtain meaningful indirect inference, limiting its ability to scale to large text corpora. In this paper, we evaluate the ability of Random Indexing (RI), a scalable distributional model of word associations, to draw meaningful implicit relationships between terms in general and biomedical language. Proponents of this method have achieved comparable performance to LSA on several cognitive tasks while using a simpler and less computationally demanding method of dimension reduction than LSA employs. In this paper, we demonstrate that the original implementation of RI is ineffective at inferring meaningful indirect connections, and evaluate Reflective Random Indexing (RRI), an iterative variant of the method that is better able to perform indirect inference. RRI is shown to lead to more clearly related indirect connections and to outperform existing RI implementations in the prediction of future direct co-occurrence in the MEDLINE corpus. 2009 Elsevier Inc. All rights reserved.
Directory of Open Access Journals (Sweden)
Lanay Tierney
2012-03-01
The ability to adapt to diverse micro-environmental challenges encountered within a host is of pivotal importance to the opportunistic fungal pathogen C. albicans. We have quantified C. albicans and M. musculus gene expression dynamics during phagocytosis by dendritic cells in a genome-wide, time-resolved analysis using simultaneous RNA-seq. A robust network inference map was generated from this dataset using NetGenerator, predicting novel interactions between the host and the pathogen. We experimentally verified predicted interdependent sub-networks comprising Hap3 in C. albicans, and Ptx3 and Mta2 in M. musculus. Remarkably, binding of recombinant Ptx3 to the C. albicans cell wall was found to regulate the expression of fungal Hap3 target genes as predicted by the network inference model. Pre-incubation of C. albicans with recombinant Ptx3 significantly altered the expression of Mta2 target cytokines such as IL-2 and IL-4 in a Hap3-dependent manner, further suggesting a role for Mta2 in host-pathogen interplay as predicted in the network inference model. We propose an integrated model for the functionality of these sub-networks during fungal invasion of immune cells, according to which binding of Ptx3 to the C. albicans cell wall induces remodelling via fungal Hap3 target genes, thereby altering the immune response to the pathogen. We show the applicability of network inference to predict interactions between host-pathogen pairs, demonstrating the usefulness of this systems biology approach to decipher mechanisms of microbial pathogenesis.
Probabilistic Methods for the Inference of Selection and Demography from Ancient Human Genomes
Racimo, Fernando
2016-01-01
Recently developed technologies for the recovery and sequencing of ancient DNA have generated an explosion of paleogenomic data in the last five years. In particular, human paleogenomics has become a thriving field for understanding evolutionary patterns of different hominin groups over time. However, there is still a dearth of statistical tools that can allow biologists to discern meaningful patterns from ancient genomes. Here, I present three methods designed for inferring past demographic ...
A novel prediction method for back pressure based on fuzzy inference theory
Chen, Guanghua; Zhang, Kunting; Qi, Hongyuan; Nan, Bingshen
2017-01-01
In order to solve the problem of unreasonable back pressure settings in direct air-cooling units, a back-pressure fuzzy inference machine is established in this paper, with the environmental temperature and wind speed as the inputs and the optimal back pressure as the output. The feasibility of the novel method is verified by simulation and experimental results, and the accuracy of the fuzzy back pressure prediction satisfies the operating requirements.
Inferring cell type innovations by phylogenetic methods-concepts, methods, and limitations.
Kin, Koryu
2015-12-01
Multicellular organisms are composed of distinct cell types that have specific roles in the body. Each cell type is a product of two kinds of historical processes: development and evolution. Although the concept of a cell type is difficult to define, the cell type concept based on the idea of the core regulatory network (CRN), a gene regulatory network that determines the identity of a cell type, illustrates the essential aspects of the cell type concept. The first step toward elucidating cell type evolution is to reconstruct the evolutionary relationships of cell types, or the cell type tree. The sister cell type model assumes that a new cell type evolves through divergence from a multifunctional ancestral cell type, creating tree-like evolutionary relationships between cell types. The process of generating a cell type tree can also be understood as the sequential addition of a new branching point on an ancestral cell differentiation hierarchy in evolution. A cell type tree thus represents an intertwined history of cell type evolution and development. Cell type trees can be reconstructed from high-throughput sequencing data, and the reconstruction of a cell type tree leads to the discovery of genes that are functionally important for a cell type. Although many issues including the lack of cross-species comparisons and the lack of a proper model for cell type evolution remain, the study of the origin of a new cell type using phylogenetic methods offers a promising new research avenue in developmental evolution. J. Exp. Zool. (Mol. Dev. Evol.) 324B: 653-661, 2015. © 2015 Wiley Periodicals, Inc.
Quantitative Method for Network Security Situation Based on Attack Prediction
Directory of Open Access Journals (Sweden)
Hao Hu
2017-01-01
Multistep attack prediction and security situation awareness are two big challenges for network administrators because the future is generally unknown. Many investigations have been made in recent years, but they are not sufficient. To improve the comprehensiveness of prediction, in this paper we quantitatively convert attack threat into security situation. Two algorithms are proposed, namely, an attack prediction algorithm using a dynamic Bayesian attack graph and a security situation quantification algorithm based on attack prediction. The first algorithm aims to provide richer information about future attack behaviors by simulating incremental network penetration. By evaluating the attack capacity of the intruder and the defense strategies of the defender in a timely manner, the likely attack goal, path, probability and time-cost are predicted dynamically along with the ongoing security events. Furthermore, in combination with the common vulnerability scoring system (CVSS) metric and network asset information, the second algorithm quantifies the concealed attack threat into the surfaced security risk at two levels: host and network. Examples show that our method is feasible and flexible for the attack-defense adversarial network environment, helping the administrator to infer the security situation in advance and pre-repair the critical compromised hosts to maintain normal network communication.
Inferring the interplay between network structure and market effects in Bitcoin
Kondor, Dániel; Csabai, István; Szüle, János; Pósfai, Márton; Vattay, Gábor
2014-12-01
A main focus in economics research is understanding the time series of prices of goods and assets. While statistical models using only the properties of the time series itself have been successful in many aspects, we expect to gain a better understanding of the phenomena involved if we can model the underlying system of interacting agents. In this article, we consider the history of Bitcoin, a novel digital currency system, for which the complete list of transactions is available for analysis. Using this dataset, we reconstruct the transaction network between users and analyze changes in the structure of the subgraph induced by the most active users. Our approach is based on the unsupervised identification of important features of the time variation of the network. Applying the widely used method of Principal Component Analysis to the matrix constructed from snapshots of the network at different times, we are able to show how structural changes in the network accompany significant changes in the exchange price of bitcoins.
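The idea of applying Principal Component Analysis to a matrix of network snapshots can be sketched as below. This is an illustrative stand-in, not the authors' pipeline: each row is assumed to be one snapshot flattened into a vector of edge weights, the data are invented, and the leading component is found by plain power iteration rather than a library PCA routine.

```python
import math

def first_principal_component(rows, iterations=200):
    """Leading principal direction of `rows` (one row per snapshot).

    Uses power iteration on the covariance matrix without forming it.
    Caveat: the uniform start vector fails if it happens to be exactly
    orthogonal to the leading direction.
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        proj = [sum(c[j] * v[j] for j in range(m)) for c in centered]   # X v
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(m)]  # X^T (X v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(m)) for c in centered]
    return v, scores

# Four made-up snapshots of a three-edge network: edges 0 and 1 grow together,
# edge 2 stays constant.
snapshots = [[1, 1, 5], [2, 2, 5], [3, 3, 5], [4, 4, 5]]
direction, scores = first_principal_component(snapshots)
print(direction)
print(scores)
```

The direction vector shows which edges change together, and the per-snapshot scores give a time series of that structural change, which is the kind of signal one could then compare against a price series.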
Maji, PP; Mullins, R.
2017-01-01
Deep convolutional neural networks (CNNs), which are at the heart of many new emerging applications, achieve remarkable performance in audio and visual recognition tasks, at the expense of high computational complexity, limiting their deployability. In modern CNNs, convolutional layers consume most of the processing time (about 90%) during a forward inference, and acceleration of these layers is of great research and commercial interest. In this paper, we examine the effects of co-optimizing intern...
Inferring monopartite projections of bipartite networks: an entropy-based approach
Saracco, Fabio; Straka, Mika J.; Di Clemente, Riccardo; Gabrielli, Andrea; Caldarelli, Guido; Squartini, Tiziano
2017-05-01
Bipartite networks are currently regarded as providing a major insight into the organization of many real-world systems, unveiling the mechanisms driving the interactions occurring between distinct groups of nodes. One of the most important issues encountered when modeling bipartite networks is devising a way to obtain a (monopartite) projection on the layer of interest, which preserves as much as possible the information encoded into the original bipartite structure. In the present paper we propose an algorithm to obtain statistically-validated projections of bipartite networks, according to which any two nodes sharing a statistically-significant number of neighbors are linked. Since assessing the statistical significance of nodes similarity requires a proper statistical benchmark, here we consider a set of four null models, defined within the exponential random graph framework. Our algorithm outputs a matrix of link-specific p-values, from which a validated projection is straightforwardly obtainable, upon running a multiple hypothesis testing procedure. Finally, we test our method on an economic network (i.e. the countries-products World Trade Web representation) and a social network (i.e. MovieLens, collecting the users’ ratings of a list of movies). In both cases non-trivial communities are detected: while projecting the World Trade Web on the countries layer reveals modules of similarly-industrialized nations, projecting it on the products layer allows communities characterized by an increasing level of complexity to be detected; in the second case, projecting MovieLens on the films layer allows clusters of movies whose affinity cannot be fully accounted for by genre similarity to be individuated.
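The overall recipe in this abstract (a p-value per node pair, then multiple hypothesis testing) can be sketched with a deliberately simpler null model. The code below swaps in a hypergeometric null for the exponential random graph null models used in the paper, and uses the Benjamini-Hochberg procedure as one possible multiple-testing correction; both choices, the function names, and the toy data are assumptions for illustration.

```python
from itertools import combinations
from math import comb

def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    return sum(
        comb(successes, x) * comb(population - successes, draws - x)
        for x in range(k, upper + 1)
    ) / total

def validated_projection(neighbors, n_other, alpha=0.05):
    """Link two nodes if they share significantly many neighbours.

    neighbors: dict node -> set of neighbours on the opposite layer
    n_other:   number of nodes on the opposite layer
    """
    pvals = []
    for u, v in combinations(sorted(neighbors), 2):
        shared = len(neighbors[u] & neighbors[v])
        p = hypergeom_sf(shared, n_other, len(neighbors[u]), len(neighbors[v]))
        pvals.append((p, u, v))
    pvals.sort()
    m = len(pvals)
    # Benjamini-Hochberg: keep everything up to the largest rank i
    # (1-based) whose p-value satisfies p_(i) <= i * alpha / m.
    cutoff = 0
    for i, (p, _, _) in enumerate(pvals, start=1):
        if p <= i * alpha / m:
            cutoff = i
    return [(u, v) for _, u, v in pvals[:cutoff]]

# Toy data: A and B rate the same 8 of 20 movies, C rates 8 different ones.
ratings = {"A": set(range(8)), "B": set(range(8)), "C": set(range(10, 18))}
print(validated_projection(ratings, n_other=20))
```

Only the A-B link survives here, because sharing all eight neighbours out of twenty is very unlikely under the null, whereas sharing none is uninformative.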
Directory of Open Access Journals (Sweden)
Hue-Yu Wang
BACKGROUND: An adaptive-network-based fuzzy inference system (ANFIS) was compared with an artificial neural network (ANN) in terms of accuracy in predicting the combined effects of temperature (10.5 to 24.5°C), pH level (5.5 to 7.5), sodium chloride level (0.25% to 6.25%) and sodium nitrite level (0 to 200 ppm) on the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. METHODS: The ANFIS and ANN models were compared in terms of six statistical indices calculated by comparing their prediction results with actual data: mean absolute percentage error (MAPE), root mean square error (RMSE), standard error of prediction percentage (SEP), bias factor (Bf), accuracy factor (Af), and absolute fraction of variance (R²). Graphical plots were also used for model comparison. CONCLUSIONS: The learning-based systems obtained encouraging prediction results. Sensitivity analyses of the four environmental factors showed that temperature and, to a lesser extent, NaCl had the most influence on accuracy in predicting the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. The observed effectiveness of ANFIS for modeling microbial kinetic parameters confirms its potential use as a supplemental tool in predictive mycology. Comparisons between growth rates predicted by ANFIS and actual experimental data also confirmed the high accuracy of the Gaussian membership function in ANFIS. Comparisons of the six statistical indices under both aerobic and anaerobic conditions also showed that the ANFIS model was better than all ANN models in predicting the four kinetic parameters. Therefore, the ANFIS model is a valuable tool for quickly predicting the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions.
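Five of the indices listed in this abstract can be computed as in the sketch below. The formulas are the definitions commonly used in predictive microbiology (for example, the bias and accuracy factors as powers of ten of the mean log ratio); the abstract does not state the exact formulas the authors used, so treat these as assumptions. R² is left out because its "absolute fraction of variance" form has more than one definition in the literature. The data are invented.

```python
import math

def prediction_indices(observed, predicted):
    """MAPE, RMSE, SEP, bias factor and accuracy factor for paired values.

    Assumes all observed and predicted values are strictly positive.
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mape = 100 / n * sum(abs(e / o) for e, o in zip(errors, observed))
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    sep = 100 * rmse / (sum(observed) / n)           # RMSE relative to the mean observation
    logs = [math.log10(p / o) for o, p in zip(observed, predicted)]
    bias = 10 ** (sum(logs) / n)                     # > 1 means over-prediction on average
    accuracy = 10 ** (sum(abs(x) for x in logs) / n) # 1 means perfect agreement
    return {"MAPE": mape, "RMSE": rmse, "SEP": sep, "Bf": bias, "Af": accuracy}

# Made-up growth rates: the model over-predicts every value by a factor of two.
observed = [1.0, 2.0, 4.0]
predicted = [2.0, 4.0, 8.0]
print(prediction_indices(observed, predicted))
```

A bias factor near 1 with an accuracy factor well above 1 would indicate errors that cancel on average, which is why the two are usually reported together.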
Predictive Distribution of the Dirichlet Mixture Model by the Local Variational Inference Method
DEFF Research Database (Denmark)
Ma, Zhanyu; Leijon, Arne; Tan, Zheng-Hua
2014-01-01
In Bayesian analysis of a statistical model, the predictive distribution is obtained by marginalizing over the parameters with their posterior distributions. Compared to the frequently used point estimate plug-in method, the predictive distribution leads to a more reliable result in calculating the predictive likelihood of the new upcoming data, especially when the amount of training data is small. The Bayesian estimation of a Dirichlet mixture model (DMM) is, in general, not analytically tractable. In our previous work, we have proposed a global variational inference-based method for approximately...
Sampling of temporal networks: Methods and biases
Rocha, Luis E. C.; Masuda, Naoki; Holme, Petter
2017-11-01
Temporal networks have been increasingly used to model a diversity of systems that evolve in time; for example, human contact structures over which dynamic processes such as epidemics take place. A fundamental aspect of real-life networks is that they are sampled within temporal and spatial frames. Furthermore, one might wish to subsample networks to reduce their size for better visualization or to perform computationally intensive simulations. The sampling method may affect the network structure and thus caution is necessary to generalize results based on samples. In this paper, we study four sampling strategies applied to a variety of real-life temporal networks. We quantify the biases generated by each sampling strategy on a number of relevant statistics such as link activity, temporal paths and epidemic spread. We find that some biases are common in a variety of networks and statistics, but one strategy, uniform sampling of nodes, shows improved performance in most scenarios. Given the particularities of temporal network data and the variety of network structures, we recommend that the choice of sampling methods be problem oriented to minimize the potential biases for the specific research questions on hand. Our results help researchers to better design network data collection protocols and to understand the limitations of sampled temporal network data.
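As a rough illustration of the strategy singled out above, the sketch below subsamples a temporal network by drawing nodes uniformly at random and keeping only the contacts between sampled nodes. Representing the network as a list of `(u, v, t)` contact triples and keeping the induced contacts are assumptions made here, not details given in the abstract. The function name and parameters are hypothetical.

```python
import random


def sample_nodes_uniform(contacts, fraction, seed=0):
    """Keep the contacts (u, v, t) whose two endpoints both belong to a
    uniform random sample containing the given fraction of the nodes."""
    rng = random.Random(seed)
    nodes = sorted({n for u, v, _ in contacts for n in (u, v)})
    k = max(1, round(fraction * len(nodes)))
    kept = set(rng.sample(nodes, k))
    return [(u, v, t) for u, v, t in contacts if u in kept and v in kept]
```

Statistics such as link activity can then be computed on both the full and the sampled contact lists to estimate the bias that the sampling introduces.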
Constructing an Intelligent Patent Network Analysis Method
Chao-Chan Wu; Ching-Bang Yao
2012-01-01
Patent network analysis, an advanced method of patent analysis, is a useful tool for technology management. This method visually displays all the relationships among the patents and enables the analysts to intuitively comprehend the overview of a set of patents in the field of the technology being studied. Although patent network analysis possesses relative advantages different from traditional methods of patent analysis, it is subject to several crucial limitations. To overcome the drawbacks...
A Bayesian method for inferring transmission chains in a partially observed epidemic.
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef M.; Ray, Jaideep
2008-10-01
We present a Bayesian approach for estimating transmission chains and rates in the Abakaliki smallpox epidemic of 1967. The epidemic affected 30 individuals in a community of 74; only the dates of appearance of symptoms were recorded. Our model assumes stochastic transmission of the infections over a social network. Distinct binomial random graphs model intra- and inter-compound social connections, while disease transmission over each link is treated as a Poisson process. Link probabilities and rate parameters are objects of inference. Dates of infection and recovery comprise the remaining unknowns. Distributions for smallpox incubation and recovery periods are obtained from historical data. Using Markov chain Monte Carlo, we explore the joint posterior distribution of the scalar parameters and provide an expected connectivity pattern for the social graph and infection pathway.
Directory of Open Access Journals (Sweden)
Jung-Shyr Wu
2012-01-01
CAC (Call Admission Control) plays a significant role in providing QoS (Quality of Service) in mobile wireless networks. In addition to the many studies that modify Mobile IP to achieve more efficient handover performance, CAC should be introduced to Mobile IP-based networks to guarantee QoS for users. In this paper, we propose a CAC scheme which incorporates multiple traffic types and adjusts the admission threshold dynamically using fuzzy control logic to achieve better usage of resources. The method can provide QoS in Mobile IPv6 networks with few modifications to MAP (Mobility Anchor Point) functionality and a slight change in BU (Binding Update) message formats. According to the simulation results, the proposed scheme gives good performance for voice and video traffic at the expense of poor performance for data traffic. It is evident that these CAC schemes can reduce the probability of handoff dropping and cell overload, and limit the probability of new call blocking.
Directory of Open Access Journals (Sweden)
Yifei Zhang
The diversity of microbiota is best explored by understanding the phylogenetic structure of the microbial communities. Traditionally, sequence alignment has been used for phylogenetic inference. However, alignment-based approaches come with significant challenges and limitations when massive amounts of data are analyzed. In the past decade, alignment-free approaches have enabled genome-scale phylogenetic inference. Here we evaluate three alignment-free methods (ACS, CVTree, and Kr) for phylogenetic inference with 16S rRNA gene data. We use a taxonomic gold standard to compare the accuracy of alignment-free phylogenetic inference with that of common microbiome-wide phylogenetic inference pipelines based on PyNAST and MUSCLE alignments with FastTree and RAxML. We re-simulate fecal communities from Human Microbiome Project data to evaluate the performance of the methods on datasets with properties of real data. Our comparisons show that alignment-free methods are not inferior to alignment-based methods in giving accurate and robust phylogenetic trees. Moreover, consensus ensembles of alignment-free phylogenies are superior to those built from alignment-based methods in their ability to highlight community differences in low power settings. In addition, the overall running times of alignment-based and alignment-free phylogenetic inference are comparable. Taken together, our empirical results suggest that alignment-free methods provide a viable approach for microbiome-wide phylogenetic inference.
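To make the idea of an alignment-free comparison concrete, the sketch below computes a cosine distance between k-mer count profiles of two sequences. This is a generic illustration and not an implementation of ACS, CVTree, or Kr; the function names and the default `k=3` are assumptions.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=3):
    """Count every overlapping k-mer in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_distance(p, q):
    """1 - cosine similarity between two k-mer count profiles."""
    dot = sum(p[m] * q[m] for m in p.keys() & q.keys())
    norm = sqrt(sum(c * c for c in p.values())) * sqrt(sum(c * c for c in q.values()))
    return 1.0 - dot / norm
```

A matrix of such pairwise distances could then be passed to any distance-based tree-building method, with no alignment step involved.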
Using Characteristics Method to Infer Sound Speed in Nonsymmetric Impact and Release Experiment
Hu, Xiaomian; Pan, Hao; Wu, Zihui
2017-06-01
Sound speed is important to high velocity impact phenomena because it is used to deduce the shear moduli, strength and phase transition of materials at high pressure. Historically, sound speed analysis methods could not infer correct results from the velocity-time history of a windowed surface in the nonsymmetric impact and release experiment, due to the impedance mismatch between the flyer, sample and window. A characteristics method has been modified to account for the effect of the flyer/sample and sample/window interactions, so that it can be applied to the nonsymmetric impact and release experiment with only one depth of material. Synthetic analyses of the nonsymmetric impact suggest that this method can give accurate results, including the sound speed-particle velocity relation and the release path at high pressure. Moreover, the method does not require knowledge of the form of the equation of state (EOS) or the constitutive model of the sample. These features facilitate applying this method to infer sound speed from the velocity profile of nonsymmetric impact experiments.
Inferring task-related networks using independent component analysis in magnetoencephalography.
Luckhoo, H; Hale, J R; Stokes, M G; Nobre, A C; Morris, P G; Brookes, M J; Woolrich, M W
2012-08-01
A novel framework for analysing task-positive data in magnetoencephalography (MEG) is presented that can identify task-related networks. Techniques that combine beamforming, the Hilbert transform and temporal independent component analysis (ICA) have recently been applied to resting-state MEG data and have been shown to extract resting-state networks similar to those found in fMRI. Here we extend this approach in two ways. First, we systematically investigate optimisation of time-frequency windows for connectivity measurement. This is achieved by estimating the distribution of functional connectivity scores between nodes of known resting-state networks and contrasting it with a distribution of artefactual scores that are entirely due to spatial leakage caused by the inverse problem. We find that functional connectivity, both in the resting-state and during a cognitive task, is best estimated via correlations in the oscillatory envelope in the 8-20 Hz frequency range, temporally down-sampled with windows of 1-4s. Second, we combine ICA with the general linear model (GLM) to incorporate knowledge of task structure into our connectivity analysis. The combination of ICA with the GLM helps overcome problems of these techniques when used independently: namely, the interpretation and separation of interesting independent components from those that represent noise in ICA and the correction for multiple comparisons when applying the GLM. We demonstrate the approach on a 2-back working memory task and show that this novel analysis framework is able to elucidate the functional networks involved in the task beyond that which is achieved using the GLM alone. We find evidence of localised task-related activity in the area of the hippocampus, which is difficult to detect reliably using standard methods. Task-positive ICA, coupled with the GLM, has the potential to be a powerful tool in the analysis of MEG data. Copyright © 2012 Elsevier Inc. All rights reserved.
Constructing an Intelligent Patent Network Analysis Method
Directory of Open Access Journals (Sweden)
Chao-Chan Wu
2012-11-01
Patent network analysis, an advanced method of patent analysis, is a useful tool for technology management. This method visually displays all the relationships among the patents and enables the analysts to intuitively comprehend the overview of a set of patents in the field of the technology being studied. Although patent network analysis possesses relative advantages different from traditional methods of patent analysis, it is subject to several crucial limitations. To overcome the drawbacks of the current method, this study proposes a novel patent analysis method, called the intelligent patent network analysis method, to make a visual network with great precision. Based on artificial intelligence techniques, the proposed method provides an automated procedure for searching patent documents, extracting patent keywords, and determining the weight of each patent keyword in order to generate a sophisticated visualization of the patent network. This study proposes a detailed procedure for generating an intelligent patent network that is helpful for improving the efficiency and quality of patent analysis. Furthermore, patents in the field of Carbon Nanotube Backlight Unit (CNT-BLU) were analyzed to verify the utility of the proposed method.
Directory of Open Access Journals (Sweden)
Cresten B Mansfeldt
The interpretation of high-throughput gene expression data for non-model microorganisms remains obscured because of the high fraction of hypothetical genes and the limited number of methods for the robust inference of gene networks. Therefore, to elucidate gene-gene and gene-condition linkages in the bioremediation-important genus Dehalococcoides, we applied a Bayesian inference strategy called Reverse Engineering/Forward Simulation (REFS™) on transcriptomic data collected from two organohalide-respiring communities containing different Dehalococcoides mccartyi strains: the Cornell University mixed community D2 and the commercially available KB-1® bioaugmentation culture. In total, 49 and 24 microarray datasets were included in the REFS™ analysis to generate an ensemble of 1,000 networks for the Dehalococcoides population in the Cornell D2 and KB-1® culture, respectively. Considering only linkages that appeared in the consensus network for each culture (exceeding the determined frequency cutoff of ≥ 60%), the resulting Cornell D2 and KB-1® consensus networks maintained 1,105 nodes (genes or conditions) with 974 edges and 1,714 nodes with 1,455 edges, respectively. These consensus networks captured multiple strong and biologically informative relationships. One of the main highlighted relationships shared between these two cultures was a direct edge between the transcript encoding for the major reductive dehalogenase (tceA (D2) or vcrA (KB-1®)) and the transcript for the putative S-layer cell wall protein (DET1407 (D2) or KB1_1396 (KB-1®)). Additionally, transcripts for two key oxidoreductases (a [Ni Fe] hydrogenase, Hup, and a protein with similarity to a formate dehydrogenase, "Fdh") were strongly linked, generalizing a strong relationship noted previously for Dehalococcoides mccartyi strain 195 to multiple strains of Dehalococcoides. Notably, the pangenome array utilized when monitoring the KB-1® culture was capable of resolving signals from
Complex networks principles, methods and applications
Latora, Vito; Russo, Giovanni
2017-01-01
Networks constitute the backbone of complex systems, from the human brain to computer communications, transport infrastructures to online social systems and metabolic reactions to financial markets. Characterising their structure improves our understanding of the physical, biological, economic and social phenomena that shape our world. Rigorous and thorough, this textbook presents a detailed overview of the new theory and methods of network science. Covering algorithms for graph exploration, node ranking and network generation, among others, the book allows students to experiment with network models and real-world data sets, providing them with a deep understanding of the basics of network theory and its practical applications. Systems of growing complexity are examined in detail, challenging students to increase their level of skill. An engaging presentation of the important principles of network science makes this the perfect reference for researchers and undergraduate and graduate students in physics, ...
Binary Classification Method of Social Network Users
Directory of Open Access Journals (Sweden)
I. A. Poryadin
2017-01-01
The subject of this research is a method for the binary classification of social network users based on analysis of the data they have posted. The relevance of gaining information about a person by examining the content of his or her social network pages is illustrated by example. The most common approach to this task is visual browsing. An order issued by a regional authority in our country illustrates that its use in school education is needed. The article shows the limitations of visually browsing pupils' social network pages as a tool for teachers and school psychologists, and argues that the analysis of social network users' data should be automated. It reviews publications that describe methods for acquiring, processing, and analysing such data, and considers their advantages and disadvantages. The article also gives arguments in support of studying a classification method for social network users. One such method is credit scoring, which is used in banks and credit institutions to assess the solvency of clients. Given the method's high efficiency, it has been proposed to significantly expand its use to other areas of society. The use of logistic regression as the mathematical apparatus of the proposed binary classification method is justified. Such an approach makes it possible to take into account the different types of data extracted from social networks, among them personal user data, information about hobbies and friends, graphic and text information, and behavioural characteristics. The article describes a number of existing methods of data transformation that can be applied to solve the problem. An experiment in binary gender-based classification of social network users is described. The logistic model obtained for this example includes multiple logical variables derived by transforming the users' surnames. This experiment confirms the feasibility of the proposed method. Further work is to define a system
Computational inference methods for selective sweeps arising in acute HIV infection.
Leviyang, Sivan
2013-07-01
During the first weeks of human immunodeficiency virus-1 (HIV-1) infection, cytotoxic T-lymphocytes (CTLs) select for multiple escape mutations in the infecting HIV population. In recent years, methods that use escape mutation data to estimate rates of HIV escape have been developed, thereby providing a quantitative framework for exploring HIV escape from CTL response. Current methods for escape-rate inference focus on a specific HIV mutant selected by a single CTL response. However, recent studies have shown that during the first weeks of infection, CTL responses occur at one to three epitopes and HIV escape occurs through complex mutation pathways. Consequently, HIV escape from CTL response forms a complex, selective sweep that is difficult to analyze. In this work, we develop a model of initial infection, based on the well-known standard model, that allows for a description of multi-epitope response and the complex mutation pathways of HIV escape. Under this model, we develop Bayesian and hypothesis-test inference methods that allow us to analyze and estimate HIV escape rates. The methods are applied to two HIV patient data sets, concretely demonstrating the utility of our approach.
Inferring the photometric and size evolution of galaxies from image simulations. I. Method
Carassou, Sébastien; de Lapparent, Valérie; Bertin, Emmanuel; Le Borgne, Damien
2017-09-01
Context. Current constraints on models of galaxy evolution rely on morphometric catalogs extracted from multi-band photometric surveys. However, these catalogs are altered by selection effects that are difficult to model, that correlate in non trivial ways, and that can lead to contradictory predictions if not taken into account carefully. Aims: To address this issue, we have developed a new approach combining parametric Bayesian indirect likelihood (pBIL) techniques and empirical modeling with realistic image simulations that reproduce a large fraction of these selection effects. This allows us to perform a direct comparison between observed and simulated images and to infer robust constraints on model parameters. Methods: We use a semi-empirical forward model to generate a distribution of mock galaxies from a set of physical parameters. These galaxies are passed through an image simulator reproducing the instrumental characteristics of any survey and are then extracted in the same way as the observed data. The discrepancy between the simulated and observed data is quantified, and minimized with a custom sampling process based on adaptive Markov chain Monte Carlo methods. Results: Using synthetic data matching most of the properties of a Canada-France-Hawaii Telescope Legacy Survey Deep field, we demonstrate the robustness and internal consistency of our approach by inferring the parameters governing the size and luminosity functions and their evolutions for different realistic populations of galaxies. We also compare the results of our approach with those obtained from the classical spectral energy distribution fitting and photometric redshift approach. Conclusions: Our pipeline infers efficiently the luminosity and size distribution and evolution parameters with a very limited number of observables (three photometric bands). When compared to SED fitting based on the same set of observables, our method yields results that are more accurate and free from
Advanced fault diagnosis methods in molecular networks.
Habibi, Iman; Emamian, Effat S; Abdi, Ali
2014-01-01
Analysis of the failure of cell signaling networks is an important topic in systems biology and has applications in target discovery and drug development. In this paper, some advanced methods for fault diagnosis in signaling networks are developed and then applied to a caspase network and an SHP2 network. The goal is to understand how, and to what extent, the dysfunction of molecules in a network contributes to the failure of the entire network. Network dysfunction (failure) is defined as failure to produce the expected outputs in response to the input signals. Vulnerability level of a molecule is defined as the probability of the network failure, when the molecule is dysfunctional. In this study, a method to calculate the vulnerability level of single molecules for different combinations of input signals is developed. Furthermore, a more complex yet biologically meaningful method for calculating the multi-fault vulnerability levels is suggested, in which two or more molecules are simultaneously dysfunctional. Finally, a method is developed for fault diagnosis of networks based on a ternary logic model, which considers three activity levels for a molecule instead of the previously published binary logic model, and provides equations for the vulnerabilities of molecules in a ternary framework. Multi-fault analysis shows that the pairs of molecules with high vulnerability typically include a highly vulnerable molecule identified by the single fault analysis. The ternary fault analysis for the caspase network shows that predictions obtained using the more complex ternary model are about the same as the predictions of the simpler binary approach. This study suggests that by increasing the number of activity levels the complexity of the model grows; however, the predictive power of the ternary model does not appear to be increased proportionally.
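The single-fault vulnerability level defined above can be illustrated on a toy binary network. In the sketch below, the three-molecule network, the stuck-at fault model, and the equiprobable input combinations are all assumptions chosen for illustration; they are not the caspase or SHP2 networks analysed in the paper.

```python
from itertools import product


def simulate(a, b, stuck=None):
    """Evaluate a toy Boolean network: m1 = a AND b, m2 = NOT b, out = m1 OR m2.
    stuck is an optional (molecule, value) pair that forces one molecule's state."""
    def state(name, value):
        return stuck[1] if stuck and stuck[0] == name else value

    m1 = state("m1", a and b)
    m2 = state("m2", not b)
    return state("out", m1 or m2)


def vulnerability(name, stuck_value):
    """Fraction of input combinations for which the faulty network output
    differs from the fault-free output (inputs assumed equiprobable)."""
    inputs = list(product([False, True], repeat=2))
    failures = sum(simulate(a, b) != simulate(a, b, (name, stuck_value)) for a, b in inputs)
    return failures / len(inputs)
```

Multi-fault analysis would follow the same pattern with several molecules forced at once, and a ternary model would replace the Boolean states with three activity levels.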
Inferences on weather extremes and weather-related disasters: a review of statistical methods
Directory of Open Access Journals (Sweden)
H. Visser
2012-02-01
The study of weather extremes and their impacts, such as weather-related disasters, plays an important role in research of climate change. Due to the great societal consequences of extremes, historically, now and in the future, the peer-reviewed literature on this theme has been growing enormously since the 1980s. Data sources have a wide origin, from century-long climate reconstructions from tree rings to relatively short (30 to 60 yr) databases with disaster statistics and human impacts.
When scanning peer-reviewed literature on weather extremes and their impacts, it is noticeable that many different methods are used to make inferences. However, discussions on these methods are rare. Such discussions are important since a particular methodological choice might substantially influence the inferences made. A calculation of a return period of once in 500 yr based on a normal distribution will deviate from one based on a Gumbel distribution. The particular choice between a linear and a flexible trend model might influence inferences as well.
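The sensitivity to the distributional choice can be made concrete with a small sketch that computes the level exceeded on average once per `period` years under a normal and under a Gumbel distribution. Fitting both by the method of moments is an assumption made here for brevity, and the function name is hypothetical.

```python
from math import log, pi, sqrt
from statistics import NormalDist, mean, stdev

EULER_GAMMA = 0.5772156649015329


def return_levels(annual_maxima, period):
    """Return (normal_level, gumbel_level) for the given return period in years."""
    m, s = mean(annual_maxima), stdev(annual_maxima)
    p = 1.0 - 1.0 / period  # non-exceedance probability per year
    normal_level = NormalDist(m, s).inv_cdf(p)
    # Method-of-moments Gumbel fit: scale from the standard deviation,
    # location from the mean.
    beta = s * sqrt(6) / pi
    mu = m - EULER_GAMMA * beta
    gumbel_level = mu - beta * log(-log(p))
    return normal_level, gumbel_level
```

For long return periods the Gumbel level lies well above the normal one because of its heavier upper tail, so the same observed value is assigned a much shorter return period under the Gumbel fit.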
In this article, a concise overview of statistical methods applied in the field of weather extremes and weather-related disasters is given. Methods have been evaluated as to stationarity assumptions, the choice for specific probability density functions (PDFs) and the availability of uncertainty information. As for stationarity assumptions, the outcome was that good testing is essential. Inferences on extremes may be wrong if data are assumed stationary while they are not. The same holds for the block-stationarity assumption. As for PDF choices it was found that often more than one PDF shape fits to the same data. From a simulation study the conclusion can be drawn that both the generalized extreme value (GEV) distribution and the log-normal PDF fit very well to a variety of indicators. The application of the normal and Gumbel distributions is more limited. As for uncertainty, it is
Directory of Open Access Journals (Sweden)
Joshi Anagha
2009-05-01
Abstract. Background: A myriad of methods to reverse-engineer transcriptional regulatory networks have been developed in recent years. Direct methods directly reconstruct a network of pairwise regulatory interactions while module-based methods predict a set of regulators for modules of coexpressed genes treated as a single unit. To date, there has been no systematic comparison of the relative strengths and weaknesses of both types of methods. Results: We have compared a recently developed module-based algorithm, LeMoNe (Learning Module Networks), to a mutual information based direct algorithm, CLR (Context Likelihood of Relatedness), using benchmark expression data and databases of known transcriptional regulatory interactions for Escherichia coli and Saccharomyces cerevisiae. A global comparison using recall versus precision curves hides the topologically distinct nature of the inferred networks and is not informative about the specific subtasks for which each method is most suited. Analysis of the degree distributions and a regulator specific comparison show that CLR is 'regulator-centric', making true predictions for a higher number of regulators, while LeMoNe is 'target-centric', recovering a higher number of known targets for fewer regulators, with limited overlap in the predicted interactions between both methods. Detailed biological examples in E. coli and S. cerevisiae are used to illustrate these differences and to prove that each method is able to infer parts of the network where the other fails. Biological validation of the inferred networks cautions against over-interpreting recall and precision values computed using incomplete reference networks. Conclusion: Our results indicate that module-based and direct methods retrieve largely distinct parts of the underlying transcriptional regulatory networks. The choice of algorithm should therefore be based on the particular biological problem of interest and not on global metrics which cannot be
Cycle-Based Cluster Variational Method for Direct and Inverse Inference
Furtlehner, Cyril; Decelle, Aurélien
2016-08-01
Large scale inference problems of practical interest can often be addressed with the help of Markov random fields (MRFs). This requires, in principle, solving two related problems: the first one is to find offline the parameters of the MRF from empirical data (inverse problem); the second one (direct problem) is to set up the inference algorithm to make it as precise, robust and efficient as possible. In this work we address both the direct and inverse problem with mean-field methods of statistical physics, going beyond the Bethe approximation and associated belief propagation algorithm. We elaborate on the idea that loop corrections to belief propagation can be dealt with in a systematic way on pairwise Markov random fields, by using the elements of a cycle basis to define regions in a generalized belief propagation setting. For the direct problem, the region graph is specified in such a way as to avoid feedback loops as much as possible by selecting a minimal cycle basis. Following this line we are led to propose a two-level algorithm, where a belief propagation algorithm is run alternately at the level of each cycle and at the inter-region level. Next we observe that the inverse problem can be addressed region by region independently, with one small inverse problem per region to be solved. It turns out that each elementary inverse problem on the loop geometry can be solved efficiently. In particular, in the random Ising context we propose two complementary methods based respectively on fixed point equations and on a one-parameter log likelihood function minimization. Numerical experiments confirm the effectiveness of this approach both for the direct and inverse MRF inference. Heterogeneous problems of size up to 10^5 are addressed in a reasonable computational time, notably with better convergence properties than ordinary belief propagation.
Inferring Alcoholism SNPs and Regulatory Chemical Compounds Based on Ensemble Bayesian Network.
Chen, Huan; Sun, Jiatong; Jiang, Hong; Wang, Xianyue; Wu, Lingxiang; Wu, Wei; Wang, Qh
2017-01-01
Disturbance of consciousness is one of the most common symptoms in those who have alcoholism and may cause disability and mortality. Previous studies indicated that several single nucleotide polymorphisms (SNPs) increase susceptibility to alcoholism. In this study, we utilized the Ensemble Bayesian Network (EBN) method to identify causal SNPs of alcoholism based on the verified Genetic Analysis Workshop 14 (GAW14) data. We built a Bayesian network combining a random process and greedy search, using the GAW14 dataset to establish an EBN of SNPs. Then we predicted the association between SNPs and alcoholism by determining Bayes' prior probability. Thirteen out of eighteen SNPs directly connected with alcoholism were found to be concordant with potential risk regions of alcoholism in the OMIM database. As many SNPs have been found to contribute to alterations in gene expression, known as expression quantitative trait loci (eQTLs), we further sought to identify chemical compounds acting as regulators of alcoholism genes captured by causal SNPs. Chloroprene and valproic acid were identified as the expression regulators for genes C11orf66 and SALL3, respectively, which were captured by alcoholism SNPs. Copyright© Bentham Science Publishers.
Application of fuzzy inference system by Sugeno method on estimating of salt production
Yulianto, Tony; Komariyah, Siti; Ulfaniyah, Nurita
2017-08-01
Salt is one of the most important needs in everyday life. Traditional salt making is done largely by smallholder farmers, in addition to manufacturers of industrial salt. Factors that affect salt production include seawater, soil, the influence of water, and weather conditions such as rainfall, wind speed, and solar radiation or a long, erratic dry season; these conditions affect the quantities of salt produced by salt farmers. In this study, fuzzy logic is applied in a Sugeno fuzzy inference system to estimate salt production from the variables that affect it. The study aims to estimate production by applying a zero-order Sugeno fuzzy inference system based on the variables wind speed, solar radiation, rainfall and amount of production. Data were obtained from the Air Quality Meteorology and Geophysics and from salt farmers in Pamekasan, District of Pademawu, Village of Majungan. Data were taken weekly over 2 years, from June to December of 2014 and 2015. The Sugeno fuzzy logic model in this study uses constant outputs (consequents), i.e. a zero-order Sugeno model. The lowest error value obtained was 0.0917, which can be said to be close to zero.
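A zero-order Sugeno system of the kind described can be sketched in a few lines. The triangular membership functions, the use of `min` as the fuzzy AND, and the rule constants in the usage note are assumptions for illustration only; the abstract does not give the actual rule base.

```python
def triangular(x, left, peak, right):
    """Triangular membership function with support (left, right), requiring left < peak < right."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def sugeno_zero_order(inputs, rules):
    """rules is a list of (antecedents, constant) pairs, where antecedents maps a
    variable name to its (left, peak, right) membership function.
    The output is the firing-strength-weighted average of the rule constants."""
    num = den = 0.0
    for antecedents, constant in rules:
        # Firing strength of the rule: fuzzy AND (min) over its antecedents.
        strength = min(triangular(inputs[var], *mf) for var, mf in antecedents.items())
        num += strength * constant
        den += strength
    return num / den if den > 0 else None
```

With two hypothetical rules, "moderate wind and low rain gives 20" and "moderate wind and moderate rain gives 5", encoded as `[({"wind": (0, 5, 10), "rain": (-50, 0, 50)}, 20.0), ({"wind": (0, 5, 10), "rain": (0, 50, 100)}, 5.0)]`, an input of wind 5 and rain 25 fires both rules with strength 0.5 and the estimate is their average, 12.5.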
A Neural Network Approach to Infer Optical Depth of Thick Ice Clouds at Night
Minnis, P.; Hong, G.; Sun-Mack, S.; Chen, Yan; Smith, W. L., Jr.
2016-01-01
One of the roadblocks to continuously monitoring cloud properties is the tendency of clouds to become optically black at cloud optical depths (COD) of 6 or less. This constraint dramatically reduces the quantitative information content at night. A recent study found that because of their diffuse nature, ice clouds remain optically gray, to some extent, up to COD of 100 at certain wavelengths. Taking advantage of this weak dependency and the availability of COD retrievals from CloudSat, an artificial neural network algorithm was developed to estimate COD values up to 70 from common satellite imager infrared channels. The method was trained using matched 2007 CloudSat and Aqua MODIS data and is tested using similar data from 2008. The results show a significant improvement over the use of default values at night with high correlation. This paper summarizes the results and suggests paths for future improvement.
Artificial neural network intelligent method for prediction
Trifonov, Roumen; Yoshinov, Radoslav; Pavlova, Galya; Tsochev, Georgi
2017-09-01
Accounting and financial classification and prediction problems are highly challenging, and researchers use different methods to solve them. Methods and instruments for short-term prediction of financial operations using artificial neural networks are considered. The paper describes the methods used for prediction of financial data as well as the forecasting system developed with a neural network. The architecture of a neural network that uses four different technical indicators, based on the raw data, together with the current day of the week, is presented. The network is used for forecasting the movement of stock prices one day ahead and consists of an input layer, one hidden layer and an output layer. The training method is the error back-propagation algorithm. The main advantage of the developed system is self-determination of the optimal topology of the neural network, which makes it flexible and more precise. The proposed neural network system is universal and can be applied to various financial instruments using only basic technical indicators as input data.
Perspectives of Probabilistic Inferences: Reinforcement Learning and an Adaptive Network Compared
Rieskamp, Jorg
2006-01-01
The assumption that people possess a strategy repertoire for inferences has been raised repeatedly. The strategy selection learning theory specifies how people select strategies from this repertoire. The theory assumes that individuals select strategies proportional to their subjective expectations of how well the strategies solve particular…
NETWORK ECONOMY INNOVATIVE POTENTIAL EVALUATION METHOD
Directory of Open Access Journals (Sweden)
E. V. Loguinova
2011-01-01
Existing methodological approaches to assessing innovation potential are analyzed, and a method for identifying and characterizing the innovative potential of a network system is proposed. The method makes it possible to assess the potential's qualitative and quantitative components and to determine their consistency with the objectives of forming and developing a national innovative system. Four stages are recommended for assessing the innovative potential of the network economy. The main structural elements of this potential are the resource, institutional, infrastructural and resulting factor totalities.
2013-01-01
Background We consider the user task of designing clinical trial protocols and propose a method that discovers and outputs the most appropriate eligibility criteria from a potentially huge set of candidates. Each document d in our collection D is a clinical trial protocol which itself contains a set of eligibility criteria. Given a small set of sample documents D′, |D′| ≪ |D|, a user has initially identified as relevant, e.g., via a user query interface, our scoring method automatically suggests eligibility criteria from D, D ⊃ D′, by ranking them according to how appropriate they are to the clinical trial protocol currently being designed. The appropriateness is measured by the degree to which they are consistent with the user-supplied sample documents D′. Method We propose a novel three-step method called LDALR which views documents as a mixture of latent topics. First, we infer the latent topics in the sample documents using Latent Dirichlet Allocation (LDA). Next, we use logistic regression models to compute the probability that a given candidate criterion belongs to a particular topic. Lastly, we score each criterion by computing its expected value, the probability-weighted sum of the topic proportions inferred from the set of sample documents. Intuitively, the greater the probability that a candidate criterion belongs to the topics that are dominant in the samples, the higher its expected value or score. Results Our experiments have shown that LDALR is 8 and 9 times better (resp., for inclusion and exclusion criteria) than randomly choosing from a set of candidates obtained from relevant documents. In user simulation experiments using LDALR, we were able to automatically construct eligibility criteria that are on the average 75% and 70% (resp., for inclusion and exclusion criteria) similar to the correct eligibility criteria. Conclusions We have proposed LDALR, a practical method for discovering and inferring appropriate eligibility criteria in clinical
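The third LDALR step, scoring a criterion by the probability-weighted sum of topic proportions, can be sketched as follows. This is a minimal illustration only: the LDA and logistic regression steps are not reproduced, so the topic proportions and per-criterion topic probabilities are assumed to be given, and the function names and example criteria are hypothetical.

```python
def expected_score(topic_probs, topic_proportions):
    """Probability-weighted sum of the sample topic proportions.

    topic_probs[k]       : assumed P(criterion belongs to topic k)
    topic_proportions[k] : assumed proportion of topic k in the sample documents
    """
    return sum(p * t for p, t in zip(topic_probs, topic_proportions))


def rank_criteria(candidates, topic_proportions):
    """Return candidate names ordered from highest to lowest score."""
    scored = [
        (expected_score(probs, topic_proportions), name)
        for name, probs in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


# Hypothetical inputs: three topics, three candidate criteria.
SAMPLE_TOPIC_PROPORTIONS = [0.7, 0.2, 0.1]
CANDIDATES = {
    "age >= 18": [0.8, 0.1, 0.1],
    "HbA1c below threshold": [0.3, 0.6, 0.1],
    "not pregnant": [0.1, 0.1, 0.8],
}
```

A criterion that most likely belongs to the dominant sample topic receives the highest score, which matches the intuition stated in the abstract.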
Homotopy methods for counting reaction network equilibria
Craciun, Gheorghe; Helton, J. William; Williams, Ruth J
2007-01-01
Dynamical system models of complex biochemical reaction networks are usually high-dimensional, nonlinear, and contain many unknown parameters. In some cases the reaction network structure dictates that positive equilibria must be unique for all values of the parameters in the model. In other cases multiple equilibria exist if and only if special relationships between these parameters are satisfied. We describe methods based on homotopy invariance of degree which allow us to determine the numb...
Jockusch, Elizabeth L; Martínez-Solano, Iñigo; Timpe, Elizabeth K
2015-01-01
Species tree methods are now widely used to infer the relationships among species from multilocus data sets. Many methods have been developed, which differ in whether gene and species trees are estimated simultaneously or sequentially, and in how gene trees are used to infer the species tree. While these methods perform well on simulated data, less is known about what impacts their performance on empirical data. We used a data set including five nuclear genes and one mitochondrial gene for 22 species of Batrachoseps to compare the effects of method of analysis, within-species sampling and gene sampling on species tree inferences. For this data set, the choice of inference method had the largest effect on the species tree topology. Exclusion of individual loci had large effects in *BEAST and STEM, but not in MP-EST. Different loci carried the greatest leverage in these different methods, showing that the causes of their disproportionate effects differ. Even though substantial information was present in the nuclear loci, the mitochondrial gene dominated the *BEAST species tree. This leverage is inherent to the mtDNA locus and results from its high variation and lower assumed ploidy. This mtDNA leverage may be problematic when mtDNA has undergone introgression, as is likely in this data set. By contrast, the leverage of RAG1 in STEM analyses does not reflect properties inherent to the locus, but rather results from a gene tree that is strongly discordant with all others, and is best explained by introgression between distantly related species. Within-species sampling was also important, especially in *BEAST analyses, as shown by differences in tree topology across 100 subsampled data sets. Despite the sensitivity of the species tree methods to multiple factors, five species groups, the relationships among these, and some relationships within them, are generally consistently resolved for Batrachoseps. © The Author(s) 2014. Published by Oxford University Press, on
General Methods for Evolutionary Quantitative Genetic Inference from Generalized Mixed Models.
de Villemereuil, Pierre; Schielzeth, Holger; Nakagawa, Shinichi; Morrissey, Michael
2016-11-01
Methods for inference and interpretation of evolutionary quantitative genetic parameters, and for prediction of the response to selection, are best developed for traits with normal distributions. Many traits of evolutionary interest, including many life history and behavioral traits, have inherently nonnormal distributions. The generalized linear mixed model (GLMM) framework has become a widely used tool for estimating quantitative genetic parameters for nonnormal traits. However, whereas GLMMs provide inference on a statistically convenient latent scale, it is often desirable to express quantitative genetic parameters on the scale upon which traits are measured. The parameters of fitted GLMMs, despite being on a latent scale, fully determine all quantities of potential interest on the scale on which traits are expressed. We provide expressions for deriving each of such quantities, including population means, phenotypic (co)variances, variance components including additive genetic (co)variances, and parameters such as heritability. We demonstrate that fixed effects have a strong impact on those parameters and show how to deal with this by averaging or integrating over fixed effects. The expressions require integration of quantities determined by the link function, over distributions of latent values. In general cases, the required integrals must be solved numerically, but efficient methods are available and we provide an implementation in an R package, QGglmm. We show that known formulas for quantities such as heritability of traits with binomial and Poisson distributions are special cases of our expressions. Additionally, we show how fitted GLMM can be incorporated into existing methods for predicting evolutionary trajectories. We demonstrate the accuracy of the resulting method for evolutionary prediction by simulation and apply our approach to data from a wild pedigreed vertebrate population. Copyright © 2016 de Villemereuil et al.
FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods.
Zierke, Stephanie; Bakos, Jason D
2010-04-12
Maximum likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF) is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA)-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10x speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs).
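To make the "loop with no conditional behavior or dependencies between iterations" concrete, here is a plain software sketch of the conditional-likelihood update at one internal tree node, in the style of Felsenstein's pruning algorithm. It is an assumption-laden illustration, not the MrBayes or FPGA implementation: the function name and the list-of-lists data layout are hypothetical, and rate categories and scaling are omitted.

```python
def conditional_likelihoods(cl_left, cl_right, p_left, p_right):
    """Combine two child likelihood arrays into the parent's array.

    cl_left, cl_right : per-site lists of per-state likelihoods for each child
    p_left, p_right   : transition matrices, p[i][j] = P(child state j | parent state i)
    """
    n_states = len(p_left)
    parent = []
    # Each site is processed independently, so iterations can run in parallel.
    for left, right in zip(cl_left, cl_right):
        site = []
        for i in range(n_states):
            from_left = sum(p_left[i][j] * left[j] for j in range(n_states))
            from_right = sum(p_right[i][j] * right[j] for j in range(n_states))
            site.append(from_left * from_right)
        parent.append(site)
    return parent
```

The inner body is a fixed sequence of multiply-accumulate operations, which is the property that makes deep pipelining attractive.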
Garcia-Huidobro, Diego; Michael Oakes, J
2017-04-01
Randomised controlled trials (RCTs) are typically viewed as the gold standard for causal inference. This is because effects of interest can be identified with the fewest assumptions, especially regarding imbalance in background characteristics. Yet because conducting RCTs is expensive, time consuming and sometimes unethical, observational studies are frequently used to study causal associations. In these studies, imbalance, or confounding, is usually controlled with multiple regression, which entails strong assumptions. The purpose of this manuscript is to describe strengths and weaknesses of several methods to control for confounding in observational studies, and to demonstrate their use in a cross-sectional dataset that uses patient registration data from the Juan Pablo II Primary Care Clinic in La Pintana, Chile. The dataset contains responses from 5855 families who provided complete information on family socio-demographics, family functioning and health problems among their family members. We employ regression adjustment, stratification, restriction, matching, propensity score matching, standardisation and inverse probability weighting to illustrate the approaches to better causal inference in non-experimental data and compare results. By applying study design and data analysis techniques that control for confounding in different ways than regression adjustment, researchers may strengthen the scientific relevance of observational studies. © 2016 International Union of Psychological Science.
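Of the methods listed, inverse probability weighting is easy to show in a few lines. The sketch below assumes a single binary confounder and estimates the propensity score as the treated fraction within each confounder stratum; the toy rows are invented for illustration and have nothing to do with the clinic dataset.

```python
from collections import defaultdict


def propensity_by_stratum(rows):
    """Estimate P(treated | x) as the treated fraction within each stratum x."""
    counts = defaultdict(lambda: [0, 0])  # x -> [n_treated, n_total]
    for x, treated, _ in rows:
        counts[x][0] += treated
        counts[x][1] += 1
    return {x: n_treated / n_total for x, (n_treated, n_total) in counts.items()}


def naive_difference(rows):
    """Unadjusted difference in mean outcome, treated minus control."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_ate(rows):
    """Inverse-probability-weighted estimate of the average treatment effect.

    Assumes every stratum contains both treated and control units.
    """
    e = propensity_by_stratum(rows)
    n = len(rows)
    treated_mean = sum(t * y / e[x] for x, t, y in rows) / n
    control_mean = sum((1 - t) * y / (1 - e[x]) for x, t, y in rows) / n
    return treated_mean - control_mean


# Toy rows (confounder x, treatment t, outcome y); the effect is 2 in each stratum.
TOY_ROWS = [
    (0, 1, 3.0), (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0),
    (1, 1, 5.0), (1, 1, 5.0), (1, 1, 5.0), (1, 0, 3.0),
]
```

In this toy data the confounder raises both the chance of treatment and the outcome, so the unadjusted contrast overstates the effect while the weighted estimate recovers it.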
Directory of Open Access Journals (Sweden)
Tingting Zheng
2017-09-01
Background: A range of computational methods that rely on the analysis of genome-wide expression datasets have been developed and successfully used for drug repositioning. The success of these methods is based on the hypothesis that introducing a factor (in this case, a drug molecule) that could reverse the disease gene expression signature will lead to a therapeutic effect. However, it has also been shown that globally reversing the disease expression signature is not a prerequisite for drug activity. On the other hand, the basic idea of significant anti-correlation in expression profiles could have great value for establishing diet-disease associations and could provide new insights into the role of dietary interventions in disease. Methods: We performed an integrated analysis of publicly available gene expression profiles for foods, diseases and drugs, by calculating pairwise similarity scores for diet and disease gene expression signatures and characterizing their topological features in protein-protein interaction networks. Results: We identified 485 diet-disease pairs where diet could positively influence disease development and 472 pairs where specific diets should be avoided in a disease state. Multiple lines of evidence suggest that orange, whey and coconut fat could be beneficial for psoriasis, lung adenocarcinoma and macular degeneration, respectively. On the other hand, a fructose-rich diet should be restricted in patients with chronic intermittent hypoxia and ovarian cancer. Since humans normally do not consume foods in isolation, we also applied different algorithms to predict synergism; as a result, 58 food pairs were predicted. Interestingly, the diets identified as anti-correlated with diseases showed a topological proximity to the disease proteins similar to that of the corresponding drugs. Conclusions: In conclusion, we provide a computational framework for establishing diet-disease associations and additional information on the role of
Smits, Samuel A; Ouverney, Cleber C
2010-10-07
Comparative sequence analysis of the 16S rRNA gene is frequently used to characterize the microbial diversity of environmental samples. However, sequence similarities do not always imply functional or evolutionary relatedness due to many factors, including unequal rates of change and convergence. Thus, relying on top BLASTN hits for phylogenetic studies may misrepresent the diversity of these constituents. Furthermore, attempts to circumvent this issue by including a large number of BLASTN hits per sequence in one tree to explore their relatedness present other problems. For instance, the multiple sequence alignment will be poor and computationally costly if not relying on manual alignment, and it may be difficult to derive meaningful relationships from the resulting tree. Analyzing sequence relationship networks within collective BLASTN results, however, reveals sequences that are closely related despite low rank. We have developed a web application, Phylometrics, that relies on networks of collective BLASTN results (rather than single BLASTN hits) to facilitate the process of building phylogenetic trees in an automated, high-throughput fashion while offering novel tools to find sequences that are of significant phylogenetic interest with minimal human involvement. The application, which can be installed locally in a laboratory or hosted remotely, utilizes a simple wizard-style format to guide the user through the pipeline without necessitating a background in programming. Furthermore, Phylometrics implements an independent job queuing system that enables users to continue to use the system while jobs are run with little or no degradation in performance. Phylometrics provides a novel data mining method to screen supplied DNA sequences and to identify sequences that are of significant phylogenetic interest using powerful analytical tools. Sequences that are identified as being similar to a number of supplied sequences may provide key insights into their functional
Directory of Open Access Journals (Sweden)
Ignat Drozdov
Small intestinal (SI) neuroendocrine tumors (NET) are increasing in incidence; however, little is known about their biology. High throughput techniques such as inference of gene regulatory networks from microarray experiments can objectively define signaling machinery in this disease. Genome-wide co-expression analysis was used to infer gene relevance network in SI-NETs. The network was confirmed to be non-random, scale-free, and highly modular. Functional analysis of gene co-expression modules revealed processes including 'Nervous system development', 'Immune response', and 'Cell-cycle'. Importantly, gene network topology and differential expression analysis identified over-expression of the GPCR signaling regulators, the cAMP synthetase, ADCY2, and the protein kinase A, PRKAR1A. Seven CREB response element (CRE) transcripts associated with proliferation and secretion: BEX1, BICD1, CHGB, CPE, GABRB3, SCG2 and SCG3 as well as ADCY2 and PRKAR1A were measured in an independent SI dataset (n = 10 NETs; n = 8 normal preparations). All were up-regulated (p<0.035) with the exception of SCG3 which was not differently expressed. Forskolin (a direct cAMP activator, 10(-5) M) significantly stimulated transcription of pCREB and 3/7 CREB targets, isoproterenol (a selective ß-adrenergic receptor agonist and cAMP activator, 10(-5) M) stimulated pCREB and 4/7 targets while BIM-53061 (a dopamine D(2) and Serotonin [5-HT(2)] receptor agonist, 10(-6) M) stimulated 100% of targets as well as pCREB; CRE transcription correlated with the levels of cAMP accumulation and PKA activity; BIM-53061 stimulated the highest levels of cAMP and PKA (2.8-fold and 2.5-fold vs. 1.8-2-fold for isoproterenol and forskolin). Gene network inference and graph topology analysis in SI NETs suggests that SI NETs express neural GPCRs that activate different CRE targets associated with proliferation and secretion. In vitro studies, in a model NET cell system, confirmed that transcriptional
Inferring the physical connectivity of complex networks from their functional dynamics
Directory of Open Access Journals (Sweden)
Holm Liisa
2010-05-01
Background Biological networks, such as protein-protein interactions, metabolic, signalling, transcription-regulatory networks and neural synapses, are representations of large-scale dynamic systems. The relationship between the network structure and functions remains one of the central problems in current multidisciplinary research. Significant progress has been made toward understanding the implication of topological features for the network dynamics and functions, especially in biological networks. Given observations of a network system's behaviours or measurements of its functional dynamics, what can we conclude about the details of physical connectivity of the underlying structure? Results We modelled the network system by employing a scale-free network of coupled phase oscillators. Pairwise phase coherence (PPC) was calculated for all the pairs of oscillators to present functional dynamics induced by the system. In the regime of global incoherence, we observed significant pairwise synchronization only between two nodes that are physically connected. Right after the onset of global synchronization, disconnected nodes begin to oscillate in a correlated fashion and the PPC of two nodes, either connected or disconnected, depends on their degrees. Based on the observation of PPCs, we built a weighted network of synchronization (WNS), an all-to-all functionally connected network where each link is weighted by the PPC of two oscillators at the ends of the link. In the regime of strong coupling, we observed a significant similarity in the organization of WNSs induced by systems sharing the same substrate network but different configurations of initial phases and intrinsic frequencies of oscillators. We reconstruct the physical network from the WNS by choosing the links whose weights are higher than a given threshold. We observed an optimal reconstruction just before the onset of global synchronization. Finally, we correlated the topology of the
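The thresholding step can be sketched in a few lines. The abstract does not give the formula for pairwise phase coherence, so the sketch assumes one common definition, the magnitude of the time-averaged unit phasor of the phase difference; the function names and the threshold are hypothetical, and no oscillator simulation is included.

```python
import cmath


def pairwise_phase_coherence(theta_i, theta_j):
    """|mean over time of exp(i * (theta_i - theta_j))|, a value in [0, 1]."""
    n = len(theta_i)
    phasor_sum = sum(cmath.exp(1j * (a - b)) for a, b in zip(theta_i, theta_j))
    return abs(phasor_sum) / n


def reconstruct_links(phases, threshold):
    """Keep the node pairs whose coherence exceeds the threshold.

    phases : list of phase time series, one per oscillator
    """
    links = set()
    for i in range(len(phases)):
        for j in range(i + 1, len(phases)):
            if pairwise_phase_coherence(phases[i], phases[j]) > threshold:
                links.add((i, j))
    return links
```

Two oscillators with a constant phase lag have coherence 1, while a pair whose phase difference drifts uniformly around the circle has coherence near 0, so only the first pair survives the threshold.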
Kwak, Sooyeong; Bae, Guntae; Kim, Manbae; Byun, Hyeran
2008-02-01
In this paper, we propose a method for detecting unusual human behavior using a stationary monocular camera. Our system is composed of three modules: moving object detection, tracking, and event recognition. The key part is the event recognition module. We define unusual events as two simple events (dropping off luggage, unattended luggage) and two complex events (abandoned luggage and stealing luggage). To detect the simple events, we construct a Bayesian network for each unusual event. We extract evidence from bounding-box properties: the location of moving objects, speed, the distance between the person and the other moving object (such as a bag), and existing time. We then use a finite state automaton, which captures the temporal relation of two simple events, to detect complex events. To evaluate performance, we compare the frame number at which an event is triggered in our results with the ground truth. The proposed algorithm showed good results in a real-world environment and also ran at real-time speed.
Azeez, Dhifaf; Ali, Mohd Alauddin Mohd; Gan, Kok Beng; Saiboon, Ismail
2013-01-01
Unexpected disease outbreaks and disasters are becoming primary issues facing our world. The first points of contact, either at disaster scenes or in the emergency department, expose frontline workers and medical physicians to the risk of infection. Therefore, there is a persuasive demand for the integration and exploitation of heterogeneous biomedical information to improve clinical practice, medical research and point of care. In this paper, a primary triage model was designed using two different methods: an adaptive neuro-fuzzy inference system (ANFIS) and an artificial neural network (ANN). When the patient presents at the triage counter, the system captures their vital signs and chief complaints as well as the physiological state and general appearance of the patient. These data are managed and analyzed in the data server and the patient's emergency status is reported immediately. The proposed method will help to reduce the queue time at the triage counter and the emergency physician's burden, especially during disease outbreaks and serious disasters. The models were built with 2223 data sets extracted from the Emergency Department of the Universiti Kebangsaan Malaysia Medical Centre to predict the primary triage category. A multilayer feed-forward network with one hidden layer of 12 neurons was used for the ANN architecture. Fuzzy subtractive clustering was used to find the fuzzy rules for the ANFIS model. The results showed that the RMSE, %RME and the accuracy, evaluated by measuring specificity and sensitivity for binary classification of the training data, were 0.14, 5.7 and 99 respectively for the ANN model and 0.85, 32.00 and 96.00 respectively for the ANFIS model. For unseen data, the root mean square error, percentage root mean square error and accuracy were 0.18, 7.16 and 96.7 respectively for the ANN model and 1.30, 49.84 and 94 respectively for the ANFIS model. The ANN model performed better for both training and unseen data than the ANFIS model in
An alternative empirical likelihood method in missing response problems and causal inference.
Ren, Kaili; Drummond, Christopher A; Brewster, Pamela S; Haller, Steven T; Tian, Jiang; Cooper, Christopher J; Zhang, Biao
2016-11-30
Missing responses are common problems in medical, social, and economic studies. When responses are missing at random, a complete case data analysis may result in biases. A popular debias method is inverse probability weighting proposed by Horvitz and Thompson. To improve efficiency, Robins et al. proposed an augmented inverse probability weighting method. The augmented inverse probability weighting estimator has a double-robustness property and achieves the semiparametric efficiency lower bound when the regression model and propensity score model are both correctly specified. In this paper, we introduce an empirical likelihood-based estimator as an alternative to Qin and Zhang (2007). Our proposed estimator is also doubly robust and locally efficient. Simulation results show that the proposed estimator has better performance when the propensity score is correctly modeled. Moreover, the proposed method can be applied in the estimation of average treatment effect in observational causal inferences. Finally, we apply our method to an observational study of smoking, using data from the Cardiovascular Outcomes in Renal Atherosclerotic Lesions clinical trial. Copyright © 2016 John Wiley & Sons, Ltd.
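The double-robustness property mentioned for the augmented inverse probability weighting (AIPW) estimator can be seen on a toy example. The sketch below implements the standard IPW and AIPW estimators of a mean with missing responses, not the empirical likelihood estimator the paper proposes; the data, the deliberately wrong propensity model and the outcome model are all invented for illustration.

```python
def ipw_mean(rows, pi):
    """Horvitz-Thompson style estimate: observed responses weighted by 1 / pi(x)."""
    return sum((y / pi(x) if r else 0.0) for x, r, y in rows) / len(rows)


def aipw_mean(rows, pi, m):
    """Augmented IPW: the IPW term corrected with an outcome model m(x).

    rows : (covariate x, response indicator r, response y or None if missing)
    pi   : assumed propensity model, pi(x) = P(r = 1 | x)
    m    : assumed regression model, m(x) = E[y | x]
    """
    total = 0.0
    for x, r, y in rows:
        weighted = y / pi(x) if r else 0.0
        total += weighted - (r - pi(x)) / pi(x) * m(x)
    return total / len(rows)


# Toy data with y = 2 * x; the response for x = 2 is missing. True mean is 5.
TOY_MISSING = [(1, 1, 2.0), (2, 0, None), (3, 1, 6.0), (4, 1, 8.0)]


def wrong_propensity(x):
    return 0.5  # deliberately misspecified


def correct_outcome(x):
    return 2.0 * x
```

With the misspecified propensity, plain IPW is biased on this data, while AIPW still returns the true mean because the outcome model is correct.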
Inferring Association between Compound and Pathway with an Improved Ensemble Learning Method.
Song, Meiyue; Jiang, Zhenran
2015-11-01
The emergence of compound molecular data coupled to pathway information offers the possibility of using machine learning methods to infer compound-pathway associations. To provide insights into the global relationship between compounds and their affected pathways, an improved Rotation Forest ensemble learning method called RGRF (Relief & GBSSL - Rotation Forest) was proposed to predict their potential associations. The main characteristic of RGRF lies in using the Relief algorithm for feature extraction and the Graph-Based Semi-Supervised Learning method as the classifier. By incorporating chemical structure information, drug mode of action information and genomic space information, our method can achieve better precision and flexibility in compound-pathway prediction. Moreover, several new compound-pathway associations that have the potential for further clinical investigation have been identified by database searching. Finally, a prediction tool was developed using the RGRF algorithm, which can predict the interactions between pathways and all of the compounds in the cMap database. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Exploration Knowledge Sharing Networks Using Social Network Analysis Methods
Directory of Open Access Journals (Sweden)
Győző Attila Szilágyi
2017-10-01
Knowledge sharing within an organization is one of the key factors for success. An organization in which knowledge sharing takes place faster and more efficiently is able to adapt to changes in the market environment more successfully and, as a result, may obtain a competitive advantage. Knowledge sharing in an organization is carried out through formal and informal human communication contacts during work. This forms a multi-level complex network whose quantitative and topological characteristics largely determine how quickly and to what extent knowledge travels within the organization. Through a case study, the study presents how different networks of knowledge sharing in an organization can be explored by means of network analysis methods, and which role the properties of these networks play in the fast and sufficient spread of knowledge in organizations. The study also demonstrates the practical applications of our research results. Namely, educational strategies can be developed in an organization on the basis of knowledge sharing, and the competitiveness of an organization may increase when those strategies are applied.
Inference in Belief Network using Logic Sampling and Likelihood Weighing algorithms
Directory of Open Access Journals (Sweden)
K. S. JASMINE
2013-11-01
Over the course of computational history, belief networks have become an increasingly popular mechanism for dealing with uncertainty in systems. It is known that identifying the probability values of belief network nodes given a set of evidence is not tractable in general. Many different simulation algorithms for approximating a solution to this problem have been proposed and implemented. This paper details the implementation of such algorithms; in particular, two belief network algorithms, logic sampling and likelihood weighting, are discussed. A detailed description of the algorithms is given with observed results. These algorithms play crucial roles in dynamic decision making in any situation of uncertainty.
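The difference between the two sampling schemes is easiest to see on a tiny network. The sketch below uses a hypothetical two-node network, Rain → Wet, with made-up probabilities, and estimates P(Rain | Wet = true). It is a generic textbook-style illustration, not the implementation described in the paper.

```python
import random

# Hypothetical conditional probability tables for Rain -> Wet.
P_RAIN = 0.2
P_WET_GIVEN_RAIN = {True: 0.9, False: 0.1}


def logic_sampling(n_samples, rng):
    """Sample every node, then discard samples that contradict Wet = true."""
    kept = 0
    rain_count = 0
    for _ in range(n_samples):
        rain = rng.random() < P_RAIN
        wet = rng.random() < P_WET_GIVEN_RAIN[rain]
        if wet:
            kept += 1
            rain_count += rain
    return rain_count / kept if kept else float("nan")


def likelihood_weighting(n_samples, rng):
    """Fix Wet = true and weight each sample by P(Wet = true | sampled Rain)."""
    weighted_rain = 0.0
    total_weight = 0.0
    for _ in range(n_samples):
        rain = rng.random() < P_RAIN
        weight = P_WET_GIVEN_RAIN[rain]
        total_weight += weight
        if rain:
            weighted_rain += weight
    return weighted_rain / total_weight


def exact_posterior():
    """Closed-form P(Rain | Wet = true) by Bayes' rule, for comparison."""
    joint_rain = P_RAIN * P_WET_GIVEN_RAIN[True]
    joint_no_rain = (1 - P_RAIN) * P_WET_GIVEN_RAIN[False]
    return joint_rain / (joint_rain + joint_no_rain)
```

Logic sampling wastes every sample in which the evidence does not occur, whereas likelihood weighting uses all samples; both estimates approach the exact posterior as the sample count grows.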
National Research Council Canada - National Science Library
Sasaki, Masao S; Tachibana, Akira; Takeda, Shunichi
2014-01-01
.... To deal with these difficulties, a novel nonparametric statistics based on the ‘integrate-and-fire’ algorithm of artificial neural networks was developed and tested in cancer databases established by the Radiation Effects Research Foundation...
Czech Academy of Sciences Publication Activity Database
Djordjilović, V.; Chiogna, M.; Vomlel, Jiří
2017-01-01
Vol. 88, No. 1 (2017), pp. 602-613 ISSN 0888-613X R&D Projects: GA ČR(CZ) GA16-12010S Institutional support: RVO:67985556 Keywords: Bayesian networks * Structure learning * Reverse engineering * Gene networks Subject RIV: JD - Computer Applications, Robotics Impact factor: 2.845, year: 2016 http://library.utia.cas.cz/separaty/2017/MTR/vomlel-0477168.pdf
Network structure from relational data: measurement and inference in four operational models
Bradley, Raymond Trevor; Roberts, Nancy C.
1989-01-01
An empirically-based assessment of the operational procedures routinely used in network analysis reveals serious measurement deficiencies that render spurious images of network structure. Based on explicit, exhaustive measurement along three basic relational dimensions, an alternative approach is described that resolves these problems. The three dimensions (type of relation, the relation’s existential status, and level of analysis) combine to create a general f...
Clare, John; McKinney, Shawn T.; DePue, John E.; Loftin, Cynthia S.
2017-01-01
individuals more readily than passive hair catches. Inability to photographically distinguish individual sex did not appear to induce negative bias in camera density estimates; instead, hair catches appeared to produce detection competition between individuals that may have been a source of negative bias. Our model reformulations broaden the range of circumstances in which analyses incorporating multiple sources of information can be robustly used, and our empirical results demonstrate that using multiple field-methods can enhance inferences regarding ecological parameters of interest and improve understanding of how reliably survey methods sample these parameters.
Inferring and analysis of social networks using RFID check-in data in China
Liu, Tao; Liu, Shouyin; Ge, Shuangkui
2017-01-01
Social networks play an important role in our daily lives. However, social ties are rather elusive to quantify, especially for large groups of subjects over prolonged periods of time. In this work, we first propose a methodology for extracting social ties from long spatio-temporal data streams, where the subjects are 17,795 undergraduates from a university in China and the data streams are the 9,147,106 time-stamped RFID check-in records left behind by them during one academic year. Using several metrics, we then analyze the structure of the social network. Our results center around three main observations. First, we characterize the global structure of the network, and we confirm the small-world phenomenon on a global scale. Second, we find that the network shows clear community structure, and we observe that younger students at lower levels tend to form large communities, while students at higher levels mostly form smaller communities. Third, we characterize the assortativity patterns by studying the basic demographic and network properties of users. We observe clear degree assortativity on a global scale. Furthermore, we find a strong effect of grade and school on tie formation preference, but we do not find any strong region homophily. Our research may help to elucidate the structural characteristics and the preferences behind the formation of social ties in college students’ social networks. PMID:28570586
Genome-wide inferring gene-phenotype relationship by walking on the heterogeneous network.
Li, Yongjin; Patra, Jagdish C
2010-05-01
Clinical diseases are characterized by distinct phenotypes. To identify disease genes is to elucidate the gene-phenotype relationships. Mutations in functionally related genes may result in similar phenotypes. It is reasonable to predict disease-causing genes by integrating phenotypic data and genomic data. Some genetic diseases are genetically or phenotypically similar and may share common pathogenetic mechanisms. Identifying the relationships between diseases will facilitate a better understanding of the pathogenetic mechanisms of diseases. In this article, we constructed a heterogeneous network by connecting the gene network and phenotype network using the phenotype-gene relationship information from the OMIM database. We extended the random walk with restart algorithm to the heterogeneous network. The algorithm prioritizes the genes and phenotypes simultaneously. We use leave-one-out cross-validation to evaluate the ability to find gene-phenotype relationships. The results showed improved performance over previous work. We also used the algorithm to disclose hidden disease associations that cannot be found by the gene network or phenotype network alone. We identified 18 hidden disease associations, most of which were supported by literature evidence. The MATLAB code of the program is available at http://www3.ntu.edu.sg/home/aspatra/research/Yongjin_BI2010.zip.
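For orientation, the basic random walk with restart can be sketched on a single small undirected graph. This is not the authors' MATLAB program or their heterogeneous extension (which additionally needs transitions between the gene and phenotype networks); the graph, node names, and restart probability below are illustrative assumptions.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    adj maps every node to a list of its neighbours; W is the
    column-normalised adjacency matrix, applied implicitly.
    Assumes every node has at least one neighbour.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share  # u spreads its mass evenly over neighbours
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Hypothetical path graph g1 - g2 - g3 - g4 with g1 as the known seed.
graph = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2", "g4"], "g4": ["g3"]}
scores = random_walk_with_restart(graph, seeds={"g1"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Nodes closer to the seed receive higher steady-state probability, which is what makes the scores usable as a prioritization ranking.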
Chakraborty, Arindom; Jiang, Guanglong; Boustani, Malaz; Liu, Yunlong; Skaar, Todd; Li, Lang
2013-01-01
Genome-wide association studies (GWAS) have identified hundreds of genetic variants associated with complex human diseases, clinical conditions and traits. Genetic mapping of expression quantitative trait loci (eQTLs) is providing us with novel functional effects of thousands of single nucleotide polymorphisms (SNPs). In a classical quantitative trait loci (QTL) mapping problem, multiple tests are done to assess whether one trait is associated with a number of loci. In contrast to QTL studies, thousands of traits are measured along with thousands of gene expressions in an eQTL study. For such a study, a huge number of tests have to be performed (~10^6). This extreme multiplicity gives rise to many computational and statistical problems. In this paper we have tried to address these issues using two closely related inferential approaches: an empirical Bayes method that bears the Bayesian flavor without requiring much a priori knowledge, and the frequentist method of false discovery rates. A three-component t-mixture model has been used for the parametric empirical Bayes (PEB) method. Inferences have been obtained using the Expectation/Conditional Maximization Either (ECME) algorithm. A simulation study has also been performed and has been compared with a nonparametric empirical Bayes (NPEB) alternative. The results show that PEB has an edge over NPEB. The proposed methodology has been applied to human liver cohort (LHC) data. Our method enables to discover more significant SNPs with FDRmethods based on p-values, the empirical Bayes method uses local false discovery rate (lfdr) as the threshold. This method controls the false positive rate.
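The abstract contrasts its local-fdr threshold with p-value-based false discovery rate methods. As a point of reference only, here is a minimal sketch of the standard Benjamini-Hochberg step-up procedure, which is a p-value-based method and not the paper's empirical Bayes approach; the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Finds the largest rank k with p_(k) <= k / m * q and rejects the
    k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

# Illustrative p-values for five SNP-trait tests (not real data).
decisions = benjamini_hochberg([0.041, 0.001, 0.205, 0.008, 0.06], q=0.05)
```

With around 10^6 tests, a procedure of this kind is applied to the full vector of p-values at once, so the sort dominates the cost.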
Haga, Tatsuya; Fukayama, Osamu; Takayama, Yuzo; Hoshino, Takayuki; Mabuchi, Kunihiko
2013-09-30
Overlapping of extracellularly recorded neural spike waveforms causes the original spike waveforms to become hidden and merged, confounding the real-time detection and sorting of these spikes. Methods proposed for solving this problem include using a multi-trode or placing a restriction on the complexity of overlaps. In this paper, we propose a rapid sequential method for the robust detection and sorting of arbitrarily overlapped spikes recorded with arbitrary types of electrodes. In our method, the probabilities of possible spike trains, including those that are overlapping, are evaluated by sequential Bayesian inference based on probabilistic models of spike-train generation and extracellular voltage recording. To reduce the high computational cost inherent in an exhaustive evaluation, candidates with low probabilities are considered as impossible candidates and are abolished at each sampling time to limit the number of candidates in the next evaluation. In addition, the data from a few subsequent sampling times are considered and used to calculate the "look-ahead probability", resulting in improved calculation efficiency due to a more rapid elimination of candidates. These sufficiently reduce computational time to enable real-time calculation without impairing performance. We assessed the performance of our method using simulated neural signals and actual neural signals recorded in primary cortical neurons cultured on a multi-electrode array. Our results demonstrated that our computational method could be applied in real-time with a delay of less than 10 ms. The estimation accuracy was higher than that of a conventional spike sorting method, particularly for signals with multiple overlapping spikes. Copyright © 2013 Elsevier B.V. All rights reserved.
Identifiability and inference of pathway motifs by epistasis analysis
Phenix, Hilary; Perkins, Theodore; Kærn, Mads
2013-06-01
The accuracy of genetic network inference is limited by the assumptions used to determine if one hypothetical model is better than another in explaining experimental observations. Most previous work on epistasis analysis—in which one attempts to infer pathway relationships by determining equivalences among traits following mutations—has been based on Boolean or linear models. Here, we delineate the ultimate limits of epistasis-based inference by systematically surveying all two-gene network motifs and use symbolic algebra with arbitrary regulation functions to examine trait equivalences. Our analysis divides the motifs into equivalence classes, where different genetic perturbations result in indistinguishable experimental outcomes. We demonstrate that this partitioning can reveal important information about network architecture, and show, using simulated data, that it greatly improves the accuracy of genetic network inference methods. Because of the minimal assumptions involved, equivalence partitioning has broad applicability for gene network inference.
Directory of Open Access Journals (Sweden)
Chen Chang-Han
2007-02-01
Background: The significant advances in microarray and proteomics analyses have resulted in an exponential increase in potential new targets and have promised to shed light on the identification of disease markers and cellular pathways. We aim to collect and decipher the HCC-related genes at the systems level. Results: Here, we build an integrative platform, the Encyclopedia of Hepatocellular Carcinoma genes Online, dubbed EHCO (http://ehco.iis.sinica.edu.tw), to systematically collect, organize and compare the pileup of unsorted HCC-related studies by using natural language processing and softbots. Among the eight gene set collections, ranging across PubMed, SAGE, microarray, and proteomics data, there are 2,906 genes in total; however, more than 77% of the genes are included only once, suggesting that tremendous efforts need to be exerted to characterize the relationship between HCC and these genes. Of these HCC inventories, protein binding represents the largest proportion (~25%) from Gene Ontology analysis. In fact, many differentially expressed gene sets in EHCO could form interaction networks (e.g. the HBV-associated HCC network) by using available human protein-protein interaction datasets. To further highlight the potential new targets in the inferred network from EHCO, we combine comparative genomics and interactomics approaches to analyze 120 evolutionarily conserved and overexpressed genes in HCC. Of the 120 queries, 47 can form a highly interactive network with 18 queries serving as hubs. Conclusion: This architectural map may represent the first step toward deciphering hepatocarcinogenesis at the systems level. Targeting hubs and/or disrupting network formation might reveal novel strategies for HCC treatment.
Directory of Open Access Journals (Sweden)
Eli Shlizerman
2014-08-01
The antennal lobe (AL), the olfactory processing center in insects, is able to process stimuli into distinct neural activity patterns, called olfactory neural codes. To model their dynamics we perform multichannel recordings from the projection neurons in the AL driven by different odorants. We then derive a dynamic neuronal network from the electrophysiological data. The network consists of lateral-inhibitory neurons and excitatory neurons (modeled as firing-rate units), and is capable of producing unique olfactory neural codes for the tested odorants. To construct the network, we (i) design a projection, an odor space, for the neural recording from the AL, which discriminates between distinct odorant trajectories, (ii) characterize scent recognition, i.e., decision-making based on olfactory signals, and (iii) infer the wiring of the neural circuit, the connectome of the AL. We show that the constructed model is consistent with biological observations, such as contrast enhancement and robustness to noise. The study suggests a data-driven approach to answer a key biological question in identifying how lateral inhibitory neurons can be wired to excitatory neurons to permit robust activity patterns.
Shlizerman, Eli; Riffell, Jeffrey A; Kutz, J Nathan
2014-01-01
The antennal lobe (AL), the olfactory processing center in insects, is able to process stimuli into distinct neural activity patterns, called olfactory neural codes. To model their dynamics we perform multichannel recordings from the projection neurons in the AL driven by different odorants. We then derive a dynamic neuronal network from the electrophysiological data. The network consists of lateral-inhibitory neurons and excitatory neurons (modeled as firing-rate units), and is capable of producing unique olfactory neural codes for the tested odorants. To construct the network, we (1) design a projection, an odor space, for the neural recording from the AL, which discriminates between distinct odorant trajectories, (2) characterize scent recognition, i.e., decision-making based on olfactory signals, and (3) infer the wiring of the neural circuit, the connectome of the AL. We show that the constructed model is consistent with biological observations, such as contrast enhancement and robustness to noise. The study suggests a data-driven approach to answer a key biological question in identifying how lateral inhibitory neurons can be wired to excitatory neurons to permit robust activity patterns.
Spectral methods for network community detection and graph partitioning
Newman, M.E.J.
2013-01-01
We consider three distinct and well studied problems concerning network structure: community detection by modularity maximization, community detection by statistical inference, and normalized-cut graph partitioning. Each of these problems can be tackled using spectral algorithms that make use of the eigenvectors of matrix representations of the network. We show that with certain choices of the free parameters appearing in these spectral algorithms the algorithms for all three problems are, in...
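To make the modularity objective mentioned above concrete, the following sketch evaluates the modularity Q of a given partition of an undirected, unweighted graph, using the per-community form Q = sum over c of [L_c/m - (d_c/2m)^2]. It only scores a partition; the spectral algorithms discussed in the abstract search for a good partition and are not reproduced here. The example graph is hypothetical.

```python
def modularity(edges, community):
    """Modularity of a partition.

    edges: list of undirected (u, v) pairs without duplicates.
    community: dict mapping each node to a community label.
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Fraction of edges that fall inside communities.
    internal = sum(1 for u, v in edges if community[u] == community[v])
    # Total degree per community, for the expected-fraction term.
    degree_sum = {}
    for node, k in degree.items():
        label = community[node]
        degree_sum[label] = degree_sum.get(label, 0) + k
    expected = sum((d / (2 * m)) ** 2 for d in degree_sum.values())
    return internal / m - expected

# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q_split = modularity(edges, {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"})
q_single = modularity(edges, {n: "A" for n in range(6)})
```

Splitting the graph at the bridge gives Q = 5/14, while placing every vertex in one community gives Q = 0, which is why the split is preferred.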
Wang, Liping; Ding, Xue; Xiao, Jiajing; Jiménez-Góngora, Tamara; Liu, Renyi; Lozano-Durán, Rosa
2017-09-25
Viruses reshape the intracellular environment of their hosts, largely through protein-protein interactions, to co-opt processes necessary for viral infection and interference with antiviral defences. Due to genome size constraints and the concomitant limited coding capacity of viruses, viral proteins are generally multifunctional and have evolved to target diverse host proteins. Inference of the virus-host interaction network can be instrumental for understanding how viruses manipulate the host machinery and how re-wiring of specific pathways can contribute to disease. Here, we use affinity purification and mass spectrometry analysis (AP-MS) to define the global landscape of interactions between the geminivirus Tomato yellow leaf curl virus (TYLCV) and its host Nicotiana benthamiana. For this purpose, we expressed tagged versions of each of TYLCV-encoded proteins (C1/Rep, C2/TrAP, C3/REn, C4, V2, and CP) in planta in the presence of the virus. Using a quantitative scoring system, 728 high-confidence plant interactors were identified, and the interaction network of each viral protein was inferred; TYLCV-targeted proteins are more connected than average, and connect with other proteins through shorter paths, which would allow the virus to exert large effects with few interactions. Comparative analyses of divergence patterns between N. benthamiana and potato, a non-host Solanaceae, showed evolutionary constraints on TYLCV-targeted proteins. Our results provide a comprehensive overview of plant proteins targeted by TYLCV during the viral infection, which may contribute to uncovering the underlying molecular mechanisms of plant viral diseases and provide novel potential targets for anti-viral strategies and crop engineering. Interestingly, some of the TYLCV-interacting proteins appear to be convergently targeted by other pathogen effectors, which suggests a central role for these proteins in plant-pathogen interactions, and pinpoints them as potential targets to
DEFF Research Database (Denmark)
Meng, Weizhi; Li, Wenjuan; Xiang, Yang
2017-01-01
With the increasing digitization of the healthcare industry, a wide range of devices (including traditionally non-networked medical devices) are Internet- and inter-connected. Mobile devices (e.g. smartphones) are one common device used in the healthcare industry to improve the quality of service...
A Bayesian network approach for causal inferences in pesticide risk assessment and management
Pesticide risk assessment and management must balance societal benefits and ecosystem protection, based on quantified risks and the strength of the causal linkages between uses of the pesticide and socioeconomic and ecological endpoints of concern. A Bayesian network (BN) is a gr...
Computer methods in electric network analysis
Energy Technology Data Exchange (ETDEWEB)
Saver, P.; Hajj, I.; Pai, M.; Trick, T.
1983-06-01
The computational algorithms utilized in power system analysis have more than just a minor overlap with those used in electronic circuit computer aided design. This paper describes the computer methods that are common to both areas and highlights the differences in application through brief examples. Recognizing this commonality has stimulated the exchange of useful techniques in both areas and has the potential of fostering new approaches to electric network analysis through the interchange of ideas.
de Queiroz, K; Poe, S
2001-06-01
Advocates of cladistic parsimony methods have invoked the philosophy of Karl Popper in an attempt to argue for the superiority of those methods over phylogenetic methods based on Ronald Fisher's statistical principle of likelihood. We argue that the concept of likelihood in general, and its application to problems of phylogenetic inference in particular, are highly compatible with Popper's philosophy. Examination of Popper's writings reveals that his concept of corroboration is, in fact, based on likelihood. Moreover, because probabilistic assumptions are necessary for calculating the probabilities that define Popper's corroboration, likelihood methods of phylogenetic inference--with their explicit probabilistic basis--are easily reconciled with his concept. In contrast, cladistic parsimony methods, at least as described by certain advocates of those methods, are less easily reconciled with Popper's concept of corroboration. If those methods are interpreted as lacking probabilistic assumptions, then they are incompatible with corroboration. Conversely, if parsimony methods are to be considered compatible with corroboration, then they must be interpreted as carrying implicit probabilistic assumptions. Thus, the non-probabilistic interpretation of cladistic parsimony favored by some advocates of those methods is contradicted by an attempt by the same authors to justify parsimony methods in terms of Popper's concept of corroboration. In addition to being compatible with Popperian corroboration, the likelihood approach to phylogenetic inference permits researchers to test the assumptions of their analytical methods (models) in a way that is consistent with Popper's ideas about the provisional nature of background knowledge.
Spectral Analysis Methods of Social Networks
Directory of Open Access Journals (Sweden)
P. G. Klyucharev
2017-01-01
Online social networks (such as Facebook, Twitter, VKontakte, etc.), being an important channel for disseminating information, are often used to influence public consciousness for various purposes, from advertising products or services to full-scale information warfare, which makes them a very relevant object of research. The paper reviews methods for analysing social networks (primarily online ones) based on the spectral theory of graphs. Such methods use the spectrum of the social graph, i.e. the set of eigenvalues of its adjacency matrix, and also the eigenvectors of the adjacency matrix. The paper describes measures of centrality (in particular, eigenvector centrality and PageRank), which reflect the degree of influence a given user of the social network has. The very popular PageRank measure uses, as the centrality of the graph vertices, the limiting (stationary) probabilities of a Markov chain whose matrix of transition probabilities is calculated from the adjacency matrix of the social graph. The vector of limiting probabilities is an eigenvector of the matrix of transition probabilities. The paper presents a method of dividing the graph vertices into two groups; it is based on maximizing the network modularity by computing the leading eigenvector of the modularity matrix. It also considers a method for detecting bots based on a non-randomness measure of a graph, computed using the spectral coordinates of vertices, i.e. sets of eigenvector components of the adjacency matrix of the social graph. In general, there are a number of algorithms to analyse social networks based on the spectral theory of graphs. These algorithms show very good results, but their disadvantage is the relatively high (albeit polynomial) computational complexity for large graphs. At the same time, it is obvious that the practical potential of spectral graph theory methods is still underestimated, and it may be used as a basis to develop new methods. The work
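The description of PageRank as the limiting distribution of a Markov chain can be illustrated with plain power iteration. This is a generic sketch, not code from the reviewed paper; the damping factor of 0.85 and the tiny follower graph are assumptions chosen for illustration.

```python
def pagerank(out_links, damping=0.85, tol=1e-12, max_iter=200):
    """Power iteration for PageRank.

    out_links maps every node to the list of nodes it links to; every
    node must appear as a key. Dangling nodes spread their rank uniformly.
    """
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        dangling = sum(rank[u] for u in nodes if not out_links[u])
        nxt = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            for v in out_links[u]:
                nxt[v] += damping * rank[u] / len(out_links[u])
        delta = sum(abs(nxt[u] - rank[u]) for u in nodes)
        rank = nxt
        if delta < tol:
            break
    return rank

# Hypothetical graph: three accounts link to "hub"; "hub" links back to "a".
links = {"hub": ["a"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
ranks = pagerank(links)
most_central = max(ranks, key=ranks.get)
```

The converged vector is the eigenvector of the damped transition matrix with eigenvalue 1, which is the sense in which PageRank is a spectral method.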
Methods and applications for detecting structure in complex networks
Leicht, Elizabeth A.
The use of networks to represent systems of interacting components is now common in many fields including the biological, physical, and social sciences. Network models are widely applicable due to their relatively simple framework of vertices and edges. Network structure, patterns of connection between vertices, impacts both the functioning of networks and processes occurring on networks. However, many aspects of network structure are still poorly understood. This dissertation presents a set of network analysis methods and applications to real-world as well as simulated networks. The methods are divided into two main types: linear algebra formulations and probabilistic mixture model techniques. Network models lend themselves to compact mathematical representation as matrices, making linear algebra techniques useful probes of network structure. We present methods for the detection of two distinct, but related, network structural forms. First, we derive a measure of vertex similarity based upon network structure. The method builds on existing ideas concerning calculation of vertex similarity, but generalizes and extends the scope to large networks. Second, we address the detection of communities or modules in a specific class of networks, directed networks. We propose a method for detecting community structure in directed networks, which is an extension of a community detection method previously only known for undirected networks. Moving away from linear algebra formulations, we propose two methods for network structure detection based on probabilistic techniques. In the first method, we use the machinery of the expectation-maximization (EM) algorithm to probe patterns of connection among vertices in static networks. The technique allows for the detection of a broad range of types of structure in networks. The second method focuses on time evolving networks. We propose an application of the EM algorithm to evolving networks that can reveal significant structural
Inference of Network Dynamics and Metabolic Interactions in the Gut Microbiome.
Directory of Open Access Journals (Sweden)
Steven N Steinway
2015-05-01
We present a novel methodology to construct a Boolean dynamic model from time series metagenomic information and integrate this modeling with genome-scale metabolic network reconstructions to identify metabolic underpinnings for microbial interactions. We apply this in the context of a critical health issue: clindamycin antibiotic treatment and opportunistic Clostridium difficile infection. Our model recapitulates known dynamics of clindamycin antibiotic treatment and C. difficile infection and predicts therapeutic probiotic interventions to suppress C. difficile infection. Genome-scale metabolic network reconstructions reveal metabolic differences between community members and are used to explore the role of metabolism in the observed microbial interactions. In vitro experimental data validate a key result of our computational model, that B. intestinihominis can in fact slow C. difficile growth.
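To show what a Boolean dynamic model computes, here is a minimal synchronous-update sketch. The three nodes and their update rules are invented for illustration and are not the rules inferred in the paper; only the general mechanism (each node's next state is a Boolean function of the current state, iterated until an attractor is reached) is the point.

```python
def step(state, rules):
    """Synchronous update: every node is recomputed from the current state."""
    return {node: rule(state) for node, rule in rules.items()}

def run_to_attractor(state, rules, max_steps=100):
    """Iterate until a state repeats; return the repeating cycle of states."""
    seen = []
    for _ in range(max_steps):
        if state in seen:
            return seen[seen.index(state):]
        seen.append(state)
        state = step(state, rules)
    return None  # no attractor found within max_steps

# Hypothetical rules (assumptions, not taken from the paper):
rules = {
    "antibiotic": lambda s: s["antibiotic"],                   # external input
    "commensal": lambda s: s["commensal"] and not s["antibiotic"],
    "pathogen": lambda s: not s["commensal"],                  # blooms if unopposed
}

healthy = run_to_attractor(
    {"antibiotic": False, "commensal": True, "pathogen": False}, rules)
treated = run_to_attractor(
    {"antibiotic": True, "commensal": True, "pathogen": False}, rules)
```

In this toy system the untreated state is already a fixed point, while sustained antibiotic exposure drives the system to a fixed point with the pathogen present.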
Inferring long memory processes in the climate network via ordinal pattern analysis
Barreiro, Marcelo; Masoller, Cristina
2010-01-01
We use ordinal patterns and symbolic analysis to construct global climate networks and uncover long and short term memory processes. The data analyzed is the monthly averaged surface air temperature (SAT field) and the results suggest that the time variability of the SAT field is determined by patterns of oscillatory behavior that repeat from time to time, with a periodicity related to intraseasonal oscillations and to El Niño on seasonal-to-interannual time scales.
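The symbolization step behind ordinal pattern analysis can be sketched as follows. Each window of consecutive values is replaced by the permutation that sorts it, so only the relative ordering of values is retained. The pattern length of 3 and the toy series are assumptions; how the authors build network links from these symbols is not described in the abstract and is not shown.

```python
from collections import Counter

def ordinal_patterns(series, order=3):
    """Map each length-`order` window to the index permutation that sorts it.

    Ties are broken by order of appearance (Python's sort is stable).
    """
    patterns = []
    for i in range(len(series) - order + 1):
        window = series[i:i + order]
        patterns.append(tuple(sorted(range(order), key=lambda k: window[k])))
    return patterns

def pattern_distribution(series, order=3):
    """Relative frequency of each ordinal pattern in the series."""
    counts = Counter(ordinal_patterns(series, order))
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()}

symbols = ordinal_patterns([1, 2, 3, 2, 1])
dist = pattern_distribution([1, 2, 3, 4, 5, 6])
```

A strictly increasing series yields only the pattern (0, 1, 2), so its pattern distribution is concentrated on a single symbol.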
Inferring social status and rich club effects in enterprise communication networks.
Dong, Yuxiao; Tang, Jie; Chawla, Nitesh V; Lou, Tiancheng; Yang, Yang; Wang, Bai
2015-01-01
Social status, defined as the relative rank or position that an individual holds in a social hierarchy, is known to be among the most important motivating forces in social behaviors. In this paper, we consider the notion of status from the perspective of a position or title held by a person in an enterprise. We study the intersection of social status and social networks in an enterprise. We study whether enterprise communication logs can help reveal how social interactions and individual status manifest themselves in social networks. To that end, we use two enterprise datasets with three communication channels--voice call, short message, and email--to demonstrate the social-behavioral differences among individuals with different status. We have several interesting findings and based on these findings we also develop a model to predict social status. On the individual level, high-status individuals are more likely to be spanned as structural holes by linking to people in parts of the enterprise networks that are otherwise not well connected to one another. On the community level, the principle of homophily, social balance and clique theory generally indicate a "rich club" maintained by high-status individuals, in the sense that this community is much more connected, balanced and dense. Our model can predict social status of individuals with 93% accuracy.
Mc Mahon, Siobhan S; Sim, Aaron; Filippi, Sarah; Johnson, Robert; Liepe, Juliane; Smith, Dominic; Stumpf, Michael P H
2014-11-01
Sensing and responding to the environment are two essential functions that all biological organisms need to master for survival and successful reproduction. Developmental processes are marshalled by a diverse set of signalling and control systems, ranging from systems with simple chemical inputs and outputs to complex molecular and cellular networks with non-linear dynamics. Information theory provides a powerful and convenient framework in which such systems can be studied; but it also provides the means to reconstruct the structure and dynamics of molecular interaction networks underlying physiological and developmental processes. Here we supply a brief description of its basic concepts and introduce some useful tools for systems and developmental biologists. Along with a brief but thorough theoretical primer, we demonstrate the wide applicability and biological application-specific nuances by way of different illustrative vignettes. In particular, we focus on the characterisation of biological information processing efficiency, examining cell-fate decision making processes, gene regulatory network reconstruction, and efficient signal transduction experimental design. Copyright © 2014 Elsevier Ltd. All rights reserved.
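One of the basic information-theoretic quantities used for network reconstruction is mutual information. The sketch below is a plug-in estimate from paired discrete samples, written as an illustration rather than as the estimator used in the paper; continuous expression data would first need binning or a different estimator.

```python
from collections import Counter
from math import log2

def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in bits from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )

# Perfectly coupled binary variables carry 1 bit; independent ones carry 0.
mi_coupled = mutual_information([(0, 0), (1, 1)] * 50)
mi_independent = mutual_information([(0, 0), (0, 1), (1, 0), (1, 1)] * 25)
```

In a reconstruction setting, a value of this kind would be computed for every pair of genes and large values taken as candidate interactions.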
Tumor Diagnosis Using Backpropagation Neural Network Method
Ma, Lixing; Looney, Carl; Sukuta, Sydney; Bruch, Reinhard; Afanasyeva, Natalia
1998-05-01
For characterization of skin cancer, an artificial neural network (ANN) method has been developed to diagnose normal tissue, benign tumor and melanoma. The pattern recognition is based on a three-layer neural network fuzzy learning system. In this study, the input neuron data set is the Fourier Transform infrared (FT-IR) spectrum obtained by a new Fiberoptic Evanescent Wave Fourier Transform Infrared (FEW-FTIR) spectroscopy method in the range of 1480 to 1850 cm^-1. Ten input features are extracted from the absorbance values in this region. A single hidden layer of neural nodes with sigmoid activation functions clusters the feature space into small subclasses, and the output nodes are separated into different nonconvex classes to permit nonlinear discrimination of disease states. The output is classified into three classes: normal tissue, benign tumor and melanoma. The results obtained from the neural network pattern recognition are shown to be consistent with traditional medical diagnosis. Input features have also been extracted from the absorbance spectra using chemical factor analysis. These abstract features or factors are also used in the classification.
A Machine-learning Method to Infer Fundamental Stellar Parameters from Photometric Light Curves
Miller, A. A.; Bloom, J. S.; Richards, J. W.; Lee, Y. S.; Starr, D. L.; Butler, N. R.; Tokarz, S.; Smith, N.; Eisner, J. A.
2015-01-01
A fundamental challenge for wide-field imaging surveys is obtaining follow-up spectroscopic observations: there are >10^9 photometrically cataloged sources, yet modern spectroscopic surveys are limited to ~few × 10^6 targets. As we approach the Large Synoptic Survey Telescope era, new algorithmic solutions are required to cope with the data deluge. Here we report the development of a machine-learning framework capable of inferring fundamental stellar parameters (T_eff, log g, and [Fe/H]) using photometric-brightness variations and color alone. A training set is constructed from a systematic spectroscopic survey of variables with Hectospec/Multi-Mirror Telescope. In sum, the training set includes ~9000 spectra, for which stellar parameters are measured using the SEGUE Stellar Parameters Pipeline (SSPP). We employed the random forest algorithm to perform a non-parametric regression that predicts T_eff, log g, and [Fe/H] from photometric time-domain observations. Our final optimized model produces a cross-validated rms error (RMSE) of 165 K, 0.39 dex, and 0.33 dex for T_eff, log g, and [Fe/H], respectively. Examining the subset of sources for which the SSPP measurements are most reliable, the RMSE reduces to 125 K, 0.37 dex, and 0.27 dex, respectively, comparable to what is achievable via low-resolution spectroscopy. For variable stars this represents a ≈12%-20% improvement in RMSE relative to models trained with single-epoch photometric colors. As an application of our method, we estimate stellar parameters for ~54,000 known variables. We argue that this method may convert photometric time-domain surveys into pseudo-spectrographic engines, enabling the construction of extremely detailed maps of the Milky Way, its structure, and history.
Bae, Jonghoon; Cha, Young-Jae; Lee, Hyungsuk; Lee, Boyun; Baek, Sojung; Choi, Semin; Jang, Dayk
2017-01-01
This study examines whether the way that a person makes inferences about unknown events is associated with his or her social relations, more precisely, those characterized by ego network density that reflects the structure of a person's immediate social relation. From the analysis of individual predictions over the Go match between AlphaGo and Sedol Lee in March 2016 in Seoul, Korea, this study shows that the low-density group scored higher than the high-density group in the accuracy of the prediction over a future state of a social event, i.e., the outcome of the first game. We corroborated this finding with three replication tests that asked the participants to predict the following: film awards, President Park's impeachment in Korea, and the counterfactual assessment of the US presidential election. Taken together, this study suggests that network density is negatively associated with vision advantage, i.e., the ability to discover and forecast an unknown aspect of a social event.
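Ego network density is commonly defined as the fraction of possible ties among a person's contacts (alters) that actually exist. The abstract does not give the exact operationalization used in the study, so the sketch below assumes that common definition, with an undirected network stored as a dict of neighbour sets and made-up node names.

```python
from itertools import combinations

def ego_network_density(adj, ego):
    """Share of alter-alter pairs that are tied, excluding ties to the ego.

    adj maps each node to the set of its neighbours (undirected network).
    Returns 0.0 when the ego has fewer than two alters.
    """
    alters = adj[ego]
    k = len(alters)
    if k < 2:
        return 0.0
    ties = sum(1 for u, v in combinations(sorted(alters), 2) if v in adj[u])
    return ties / (k * (k - 1) / 2)

# Hypothetical ego "e" with three alters; only a and b know each other.
network = {
    "e": {"a", "b", "c"},
    "a": {"e", "b"},
    "b": {"e", "a"},
    "c": {"e"},
}
density = ego_network_density(network, "e")
```

A low value means the ego's contacts are mostly unconnected to one another, which is the "low-density group" condition in the study.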
Directory of Open Access Journals (Sweden)
Jonghoon Bae
This study examines whether the way that a person makes inferences about unknown events is associated with his or her social relations, more precisely, those characterized by ego network density that reflects the structure of a person's immediate social relation. From the analysis of individual predictions over the Go match between AlphaGo and Sedol Lee in March 2016 in Seoul, Korea, this study shows that the low-density group scored higher than the high-density group in the accuracy of the prediction over a future state of a social event, i.e., the outcome of the first game. We corroborated this finding with three replication tests that asked the participants to predict the following: film awards, President Park's impeachment in Korea, and the counterfactual assessment of the US presidential election. Taken together, this study suggests that network density is negatively associated with vision advantage, i.e., the ability to discover and forecast an unknown aspect of a social event.
Global dynamic evolution of the cold plasma inferred with neural networks
Zhelavskaya, I. S.; Shprits, Y. Y.; Spasojevic, M.
2016-12-01
The electron number density is a fundamental parameter of plasmas and a critical parameter in the wave-particle interactions. However, the distribution of cold plasma and its dynamic dependence on solar wind conditions remains poorly quantified. Existing empirical models provide us with statistical averages based on static geomagnetic parameters, but cannot reflect the dynamics of the highly structured and quickly varying plasmasphere environment, especially during times of high geomagnetic activity. Global imaging provides insights on the dynamics but does not provide quantitative estimates of number density. Accurately calculating the evolving distribution from first principles has also proven elusive due to the sheer number of physical processes involved. In this study, we propose an empirical model for reconstruction of global dynamics of the cold plasma density distribution based only on solar wind data and geomagnetic indices. We develop a neural network that is capable of globally reconstructing the dynamics of the cold plasma density distribution for L shells from 2 to 6 and all local times. First, we derive a plasma density database by using the NURD algorithm to identify the upper hybrid resonance band in plasma wave observations from Van Allen Probes [Zhelavskaya et al., 2016]. Then, we utilize the density database in conjunction with solar wind data and geomagnetic indices to train the neural network. To validate and test the model, we choose validation and test sets independently from the density database. We validate and test the neural network by measuring its performance on these sets and also by comparing the model predicted global evolution with global images of the He+ distribution in the Earth's plasmasphere from the IMAGE extreme ultraviolet (EUV) instrument. The present study demonstrates how we can reconstruct the global dynamics from local in-situ observations by using machine learning tools. We describe aspects of the validation process in
Methods for extracting social network data from chatroom logs
Osesina, O. Isaac; McIntire, John P.; Havig, Paul R.; Geiselman, Eric E.; Bartley, Cecilia; Tudoreanu, M. Eduard
2012-06-01
Identifying social network (SN) links within computer-mediated communication platforms without explicit relations among users poses challenges to researchers. Our research aims to extract SN links in internet chat with multiple users engaging in synchronous overlapping conversations all displayed in a single stream. We approached this problem using three methods that build on previous research: response-time analysis, which builds on the temporal proximity of chat messages; word context usage, which builds on keyword analysis; and direct addressing, which infers links by identifying the intended message recipient from the screen name (nickname) referenced in the message [1]. Our analysis of word usage within the chat stream also provides contexts for the extracted SN links. To test the capability of our methods, we used publicly available data from Internet Relay Chat (IRC), a real-time computer-mediated communication (CMC) tool used by millions of people around the world. The extraction performances of individual methods and their hybrids were assessed relative to a ground truth (determined a priori via manual scoring).
Generalized method of moments for estimating parameters of stochastic reaction networks.
Lück, Alexander; Wolf, Verena
2016-10-21
Discrete-state stochastic models have become a well-established approach to describe biochemical reaction networks that are influenced by the inherent randomness of cellular events. In recent years, several methods for accurately approximating the statistical moments of such models have become very popular since they allow an efficient analysis of complex networks. We propose a generalized method of moments approach for inferring the parameters of reaction networks based on a sophisticated matching of the statistical moments of the corresponding stochastic model and the sample moments of population snapshot data. The proposed parameter estimation method exploits recently developed moment-based approximations and provides estimators with desirable statistical properties when a large number of samples is available. We demonstrate the usefulness and efficiency of the inference method on two case studies. The generalized method of moments provides accurate and fast estimations of unknown parameters of reaction networks. The accuracy increases when moments of order higher than two are also considered. In addition, the variance of the estimator decreases when more samples are given or when higher-order moments are included.
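As a rough illustration of the moment-matching idea (not the authors' implementation), the Python sketch below assumes a simple birth-death network whose stationary copy-number distribution is Poisson with mean lam, and estimates lam by minimizing the squared distance between model moments and sample moments. The function names, the identity weighting matrix, the search bracket, and the golden-section search are all assumptions made for this example.

```python
import math

def sample_moments(data, max_order):
    """Raw sample moments of orders 1..max_order from snapshot data."""
    n = len(data)
    return [sum(x ** r for x in data) / n for r in range(1, max_order + 1)]

def poisson_moments(lam, max_order):
    """Raw moments of Poisson(lam); only orders 1 to 3 are provided here."""
    formulas = [lam, lam + lam ** 2, lam + 3 * lam ** 2 + lam ** 3]
    return formulas[:max_order]

def gmm_objective(lam, data_moments):
    """Quadratic form g' W g with W = identity (a simplifying assumption)."""
    model = poisson_moments(lam, len(data_moments))
    return sum((m - s) ** 2 for m, s in zip(model, data_moments))

def golden_section_min(f, lo, hi, tol=1e-9):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2

def estimate_rate(data, max_order=2, lo=0.0, hi=50.0):
    """Moment-matching estimate of the Poisson mean from snapshot counts."""
    moments = sample_moments(data, max_order)
    return golden_section_min(lambda lam: gmm_objective(lam, moments), lo, hi)

# Hypothetical snapshot of molecule counts across twelve cells.
counts = [2, 3, 1, 4, 2, 3, 5, 2, 3, 3, 1, 4]
estimate_first_order = estimate_rate(counts, max_order=1)
estimate_two_moments = estimate_rate(counts, max_order=2)
```

With only the first moment the estimate reduces to the sample mean. Adding the second moment pulls the estimate toward the value implied by the sample second moment. A practical estimator would replace the identity weights with an estimate of the inverse covariance of the moment conditions.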
Brain networks for confidence weighting and hierarchical inference during probabilistic learning.
Meyniel, Florent; Dehaene, Stanislas
2017-05-09
Learning is difficult when the world fluctuates randomly and ceaselessly. Classical learning algorithms, such as the delta rule with constant learning rate, are not optimal. Mathematically, the optimal learning rule requires weighting prior knowledge and incoming evidence according to their respective reliabilities. This "confidence weighting" implies the maintenance of an accurate estimate of the reliability of what has been learned. Here, using fMRI and an ideal-observer analysis, we demonstrate that the brain's learning algorithm relies on confidence weighting. While in the fMRI scanner, human adults attempted to learn the transition probabilities underlying an auditory or visual sequence, and reported their confidence in those estimates. They knew that these transition probabilities could change simultaneously at unpredicted moments, and therefore that the learning problem was inherently hierarchical. Subjective confidence reports tightly followed the predictions derived from the ideal observer. In particular, subjects managed to attach distinct levels of confidence to each learned transition probability, as required by Bayes-optimal inference. Distinct brain areas tracked the likelihood of new observations given current predictions, and the confidence in those predictions. Both signals were combined in the right inferior frontal gyrus, where they operated in agreement with the confidence-weighting model. This brain region also presented signatures of a hierarchical process that disentangles distinct sources of uncertainty. Together, our results provide evidence that the sense of confidence is an essential ingredient of probabilistic learning in the human brain, and that the right inferior frontal gyrus hosts a confidence-based statistical learning algorithm for auditory and visual sequences.
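The contrast between a constant learning rate and a confidence-dependent one can be sketched in a few lines of Python. The sketch makes a strong simplifying assumption that the study does not: the probability being learned is stationary, so a Beta-Bernoulli posterior stands in for the ideal observer. The function names are hypothetical.

```python
def delta_rule(observations, alpha, p0=0.5):
    """Delta rule with a constant learning rate alpha."""
    p = p0
    for x in observations:
        p += alpha * (x - p)
    return p

def confidence_weighted(observations, a=1.0, b=1.0):
    """Beta-Bernoulli posterior mean written as a delta rule.

    The effective learning rate 1 / (n + 1) shrinks as the pseudo-count n
    grows, so each new observation moves a confident estimate less.
    """
    p = a / (a + b)
    n = a + b
    rates = []
    for x in observations:
        rate = 1.0 / (n + 1.0)
        p += rate * (x - p)
        n += 1.0
        rates.append(rate)
    return p, rates

# Hypothetical binary sequence (1 = the transition occurred).
sequence = [1, 1, 0, 1]
fixed_estimate = delta_rule(sequence, alpha=0.5)
bayes_estimate, learning_rates = confidence_weighted(sequence)
```

In a changing environment such as the one studied here, confidence would also have to drop after a suspected change point, which this stationary sketch does not model.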
Convective rain cell contours inferred from a very dense gauge network
Teschl, Reinhard; Teschl, Franz; Fuchsberger, Jürgen
2017-04-01
Statistical information on the size of rain cells is of interest to a variety of disciplines, from meteorology and hydrology to microwave propagation, e.g. for planning satellite communication systems. Rain cell size distributions are often based on weather radar data because of the high spatial and temporal resolution. The measuring accuracy of ground-based in situ sensors such as rain gauges is admittedly higher; however, typical rain gauge networks have too coarse a grid to adequately capture the spatial variability of precipitation, especially of convective cells. In the present work, data from a very dense rain-gauge network were used: WegenerNet is a climate station network in Styria, Austria, consisting of 153 stations within an area of about 20 km × 15 km. The network has provided well-serviced and supervised datasets since January 2007. Multilevel quality flags are used to indicate integrity and plausibility of the data. Based on the point measurements of rainfall, interpolations on a 200 m × 200 m grid are provided. Rain cells were detected in the gridded data by identifying contiguous areas where the rain rate is equal to or higher than a specified threshold value. Once a connected area of a defined magnitude was identified, its dimension was determined and the equivalent circular diameter of the rain cell was calculated. Only rain cells with contours higher than 5 mm per 5 minutes were considered, because the study area of about 300 square kilometers often did not allow the complete detection of more widespread rainfall events associated with lower intensity contours. In any case, it was ensured that rain cells that were only partially detected did not distort the results. The period of observation so far comprises a 7-year timespan from 2010 to 2016. An extension of the period back to 2007 is planned in order to take advantage of a full 10 years of high-resolution data. For the analysis only intervals
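The detection step described in this abstract (thresholding the gridded rain rate, grouping contiguous cells, and converting the area to an equivalent circular diameter) can be sketched as follows. This is a minimal Python illustration, not the authors' code: 4-connectivity, the function names, and the toy grid are assumptions, and the handling of cells cut off by the domain boundary is omitted.

```python
import math
from collections import deque

CELL_AREA_KM2 = 0.2 * 0.2  # one 200 m x 200 m grid cell, as in the abstract

def find_rain_cells(grid, threshold):
    """Return the sizes (in grid cells) of 4-connected regions >= threshold."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] < threshold:
                continue
            # Breadth-first flood fill over one connected rain area.
            queue = deque([(r, c)])
            seen[r][c] = True
            count = 0
            while queue:
                y, x = queue.popleft()
                count += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and grid[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(count)
    return sizes

def equivalent_circular_diameter(n_cells, cell_area=CELL_AREA_KM2):
    """Diameter (km) of the circle whose area equals that of the rain cell."""
    return 2.0 * math.sqrt(n_cells * cell_area / math.pi)

# Hypothetical rain rates in mm per 5 minutes on a tiny grid.
rain = [
    [0, 6, 6, 0, 0],
    [0, 6, 0, 0, 7],
    [0, 0, 0, 0, 8],
    [5, 0, 0, 0, 0],
]
cell_sizes = find_rain_cells(rain, threshold=5)
diameters = [equivalent_circular_diameter(n) for n in cell_sizes]
```

Whether diagonal neighbours should count as contiguous is a design choice the abstract does not state. Switching to 8-connectivity only requires adding the four diagonal offsets.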
The research of elevator health diagnosis method based on Bayesian network
Liu, Chang; Zhang, Xinzheng; Liu, Xindong; Chen, Can
2017-08-01
Because an elevator is a complex mechanical system, it is hard to determine the factors that affect the status of its components. To address this characteristic, this paper proposes an Elevator Fault Diagnosis Model based on a Bayesian network. The method uses different samples of the elevator and adopts a Monte Carlo inference mechanism for structure and parameter learning of the Bayesian network model. Eventually, an elevator fault diagnosis model based on a Bayesian network is established, which accords with the theory of elevator operation. We use different kinds of fault data samples to test the method. Experimental results demonstrate the higher accuracy of our method. This paper thus provides a useful auxiliary method for fault prediction and health diagnosis of elevator systems.
Global dynamic evolution of the cold plasma inferred with neural networks
Zhelavskaya, Irina; Shprits, Yuri; Spasojevic, Maria
2017-04-01
The electron number density is a fundamental parameter of plasmas and is critical for the wave-particle interactions. Despite its global importance, the distribution of cold plasma and its dynamic dependence on solar wind conditions remains poorly quantified. Existing empirical models present statistical averages based on static geomagnetic parameters, but cannot reflect the dynamics of the highly structured and quickly varying plasmasphere environment, especially during times of high geomagnetic activity. Global imaging provides insights on the dynamics but quantitative inversion to electron number density has been lacking. We propose an empirical model for reconstruction of global dynamics of the cold plasma density distribution based only on solar wind data and geomagnetic indices. We develop a neural network that is capable of globally reconstructing the dynamics of the cold plasma density distribution for L shells from 2 to 6 and all local times. We utilize the density database obtained using the NURD algorithm [Zhelavskaya et al., 2016] in conjunction with solar wind data and geomagnetic indices to train the neural network. This study demonstrates how the global dynamics can be reconstructed from local in-situ observations by using machine learning tools. We describe aspects of the validation process in detail and discuss the selected inputs to the model and their physical implication.
Alaimo, Salvatore; Bonnici, Vincenzo; Cancemi, Damiano; Ferro, Alfredo; Giugno, Rosalba; Pulvirenti, Alfredo
2015-01-01
The identification of drug-target interactions (DTI) is a costly and time-consuming step in drug discovery and design. Computational methods capable of predicting reliable DTI play an important role in the field. Algorithms may aim to design new therapies based on a single approved drug or a combination of them. Recently, recommendation methods relying on network-based inference in connection with knowledge coming from the specific domain have been proposed. Here we propose a web-based interface to the DT-Hybrid algorithm, which applies a recommendation technique based on bipartite network projection implementing resources transfer within the network. This technique combined with domain-specific knowledge expressing drugs and targets similarity is used to compute recommendations for each drug. Our web interface allows the users: (i) to browse all the predictions inferred by the algorithm; (ii) to upload their custom data on which they wish to obtain a prediction through a DT-Hybrid based pipeline; (iii) to help in the early stages of drug combinations, repositioning, substitution, or resistance studies by finding drugs that can act simultaneously on multiple targets in a multi-pathway environment. Our system is periodically synchronized with DrugBank and updated accordingly. The website is free, open to all users, and available at http://alpha.dmi.unict.it/dtweb/. Our web interface allows users to search and visualize information on drugs and targets eventually providing their own data to compute a list of predictions. The user can visualize information about the characteristics of each drug, a list of predicted and validated targets, associated enzymes and transporters. A table containing key information and GO classification allows the users to perform their own analysis on our data. A special interface for data submission allows the execution of a pipeline, based on DT-Hybrid, predicting new targets with the corresponding p-values expressing the reliability of
Localization value of seizure semiology analyzed by the conditional inference tree method.
Kim, Dong Wook; Jung, Ki-Young; Chu, Kon; Park, So-Hee; Lee, Seo-Young; Lee, Sang Kun
2015-09-01
Although accurate interpretation of seizures is important for the management of patients with epilepsy, studies on the localizing value of seizure semiology and the reliability of the semiology descriptions are scarce. The objective of our study is to investigate the accuracy of video-recorded seizure semiology in the classification and localization of epileptic seizures. We also evaluated the reliability of the semiology descriptions provided by the patients or their caregivers. Video-recorded clinical seizures from 831 consecutive patients (391 females; 31.7 ± 11.6 years) were analyzed retrospectively. Epileptic seizures were classified as generalized and partial seizures, and patients with partial seizures were further divided into five ictal onset areas. In order to analyze the diagnostic value of individual semiologic features for clinical diagnosis, we used the conditional inference tree method. Generalized and partial seizures were differentiated with high accuracy (97.1%), but the accuracy of localization among the five ictal onset areas was relatively low (56.1%), which was largely attributed to the difficulty in the discrimination between mesial and lateral temporal onset seizures. Lateralization of the ictal onset area in partial seizures was possible in 427 (55.1%) patients based on video analysis, nevertheless it was possible in only 158 (20.4%) patients based on historical semiology descriptions. The results of our study suggest that careful observation of seizure semiology may be useful for the differentiation of ictal onset areas. However, the semiologic differentiation between mesial and lateral temporal onset seizures is difficult, and historical semiologic descriptions should be interpreted carefully because of their low reliability. Copyright © 2015 Elsevier B.V. All rights reserved.
Tejedor, E.; Saz, M. A.; Esper, J.; Cuadrat, J. M.; de Luis, M.
2017-08-01
Drought recurrence in the Mediterranean is regarded as a fundamental factor for socioeconomic development and the resilience of natural systems in the context of global change. However, knowledge of past droughts has been hampered by the absence of high-resolution proxies. We present a drought reconstruction for the northeast of the Iberian Peninsula based on a new dendrochronology network considering the Standardized Precipitation Evapotranspiration Index (SPEI). A total of 774 latewood width series from 387 trees of P. sylvestris and P. uncinata were combined in an interregional chronology. The new chronology, calibrated against gridded climate data, reveals a robust relationship with the SPEI representing drought conditions of July and August. We developed a summer drought reconstruction for the period 1734-2013 representative for the northeastern and central Iberian Peninsula. We identified 16 extremely dry and 17 extremely wet summers and four decadal scale dry and wet periods, including 2003-2013 as the driest episode of the reconstruction.
Mourier, Johann; Bass, Nathan Charles; Guttridge, Tristan L; Day, Joanna; Brown, Culum
2017-09-01
Accurately estimating contacts between animals can be critical in ecological studies such as examining social structure, predator-prey interactions or transmission of information and disease. While biotelemetry has been used successfully for such studies in terrestrial systems, it is still under development in the aquatic environment. Acoustic telemetry represents an attractive tool to investigate spatio-temporal behaviour of marine fish and has recently been suggested for monitoring underwater animal interactions. To evaluate the effectiveness of acoustic telemetry in recording interindividual contacts, we compared co-occurrence matrices deduced from three types of acoustic receivers varying in detection range in a benthic shark species. Our results demonstrate that (i) associations produced by acoustic receivers with a large detection range (i.e. Vemco VR2W) were significantly different from those produced by receivers with smaller ranges (i.e. Sonotronics miniSUR receivers and proximity loggers) and (ii) the position of individuals within their network, or centrality, also differed. These findings suggest that acoustic receivers with a large detection range may not be the best option to represent true social networks in the case of a benthic marine animal. While acoustic receivers are increasingly used by marine ecologists, we recommend users first evaluate the influence of detection range to depict accurate individual interactions before using these receivers for social or predator-prey studies. We also advocate for combining multiple receiver types depending on the ecological question being asked and the development of multi-sensor tags or testing of new automated proximity loggers, such as the Encounternet system, to improve the precision and accuracy of social and predator-prey interaction studies.
Comparative analysis of quantitative efficiency evaluation methods for transportation networks.
He, Yuxin; Qin, Jin; Hong, Jian
2017-01-01
An effective evaluation of transportation network efficiency could offer guidance for the optimal control of urban traffic. After introducing three quantitative evaluation methods for transportation network efficiency and analyzing them mathematically, this paper compares the information they measure, including network structure, traffic demand, travel choice behavior and other factors that affect network efficiency. Accordingly, the applicability of the various evaluation methods is discussed. Analysis of different transportation network examples shows that the Q-H method reflects well the influence of network structure, traffic demand and user route choice behavior on transportation network efficiency. In addition, the transportation network efficiency measured by this method and Braess's Paradox can explain each other, which indicates a better evaluation of the real operating condition of the transportation network. Analysis of the network efficiency calculated by the Q-H method also shows that a specific appropriate demand exists for a given transportation network. Meanwhile, under fixed demand, both the critical network structure that guarantees the stability and basic operation of the network and the specific network structure that yields the largest value of transportation network efficiency can be identified.
Wang, Hue-Yu; Wen, Ching-Feng; Chiu, Yu-Hsien; Lee, I-Nong; Kao, Hao-Yun; Lee, I-Chen; Ho, Wen-Hsien
2013-01-01
An adaptive-network-based fuzzy inference system (ANFIS) was compared with an artificial neural network (ANN) in terms of accuracy in predicting the combined effects of temperature (10.5 to 24.5°C), pH level (5.5 to 7.5), sodium chloride level (0.25% to 6.25%) and sodium nitrite level (0 to 200 ppm) on the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. The ANFIS and ANN models were compared in terms of six statistical indices calculated by comparing their prediction results with actual data: mean absolute percentage error (MAPE), root mean square error (RMSE), standard error of prediction percentage (SEP), bias factor (Bf), accuracy factor (Af), and absolute fraction of variance (R²). Graphical plots were also used for model comparison. The learning-based systems obtained encouraging prediction results. Sensitivity analyses of the four environmental factors showed that temperature and, to a lesser extent, NaCl had the most influence on accuracy in predicting the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. The observed effectiveness of ANFIS for modeling microbial kinetic parameters confirms its potential use as a supplemental tool in predictive mycology. Comparisons between growth rates predicted by ANFIS and actual experimental data also confirmed the high accuracy of the Gaussian membership function in ANFIS. Comparisons of the six statistical indices under both aerobic and anaerobic conditions also showed that the ANFIS model was better than all ANN models in predicting the four kinetic parameters. Therefore, the ANFIS model is a valuable tool for quickly predicting the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions.
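The six indices named in this abstract can be computed as in the Python sketch below. The abstract does not give the formulas, so the definitions used here are the ones commonly seen in the predictive-microbiology and neural-network literature and should be treated as assumptions. In particular, the absolute fraction of variance is taken as one minus the ratio of squared errors to squared predictions, and other papers normalize differently. All observed and predicted values must be positive for the bias and accuracy factors.

```python
import math

def prediction_indices(observed, predicted):
    """Return MAPE, RMSE, SEP, Bf, Af and R2 under the assumed definitions."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    log_ratios = [math.log10(p / o) for o, p in zip(observed, predicted)]

    mape = 100.0 / n * sum(abs(e / o) for e, o in zip(errors, observed))
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # SEP: RMSE expressed as a percentage of the mean observed value.
    sep = 100.0 * rmse / (sum(observed) / n)
    # Bias factor: geometric mean of predicted/observed (1 means unbiased).
    bf = 10 ** (sum(log_ratios) / n)
    # Accuracy factor: average multiplicative deviation, always >= 1.
    af = 10 ** (sum(abs(r) for r in log_ratios) / n)
    # Absolute fraction of variance (assumed normalization by predictions).
    r2 = 1.0 - sum(e * e for e in errors) / sum(p * p for p in predicted)
    return {"MAPE": mape, "RMSE": rmse, "SEP": sep, "Bf": bf, "Af": af, "R2": r2}

# Hypothetical growth rates: observed values and two sets of predictions.
observed_rates = [1.0, 2.0, 4.0]
perfect = prediction_indices(observed_rates, [1.0, 2.0, 4.0])
doubled = prediction_indices(observed_rates, [2.0, 4.0, 8.0])
mixed = prediction_indices(observed_rates, [0.5, 4.0, 4.0])
```

The `mixed` case shows why both factors are reported: over- and under-predictions cancel in the bias factor but not in the accuracy factor.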
Gogoshin, Grigoriy; Boerwinkle, Eric; Rodin, Andrei S
2017-04-01
Bayesian network (BN) reconstruction is a prototypical systems biology data analysis approach that has been successfully used to reverse engineer and model networks reflecting different layers of biological organization (ranging from genetic to epigenetic to cellular pathway to metabolomic). It is especially relevant in the context of modern (ongoing and prospective) studies that generate heterogeneous high-throughput omics datasets. However, there are both theoretical and practical obstacles to the seamless application of BN modeling to such big data, including computational inefficiency of optimal BN structure search algorithms, ambiguity in data discretization, mixing data types, imputation and validation, and, in general, limited scalability in both reconstruction and visualization of BNs. To overcome these and other obstacles, we present BNOmics, an improved algorithm and software toolkit for inferring and analyzing BNs from omics datasets. BNOmics aims at comprehensive systems biology-type data exploration, including both generating new biological hypotheses and testing and validating the existing ones. Novel aspects of the algorithm center around increasing scalability and applicability to varying data types (with different explicit and implicit distributional assumptions) within the same analysis framework. An output and visualization interface to widely available graph-rendering software is also included. Three diverse applications are detailed. BNOmics was originally developed in the context of genetic epidemiology data and is being continuously optimized to keep pace with the ever-increasing inflow of available large-scale omics datasets. As such, software scalability and usability on less-than-exotic computer hardware are a priority, as well as the applicability of the algorithm and software to heterogeneous datasets containing many data types: single-nucleotide polymorphisms and other genetic/epigenetic/transcriptome variables, metabolite
Energy Technology Data Exchange (ETDEWEB)
Djukanovic, M.B. [Inst. Nikola Tesla, Belgrade (Yugoslavia). Dept. of Power Systems; Calovic, M.S. [Univ. of Belgrade (Yugoslavia). Dept. of Electrical Engineering; Vesovic, B.V. [Inst. Mihajlo Pupin, Belgrade (Yugoslavia). Dept. of Automatic Control; Sobajic, D.J. [Electric Power Research Inst., Palo Alto, CA (United States)
1997-12-01
This paper presents an attempt at nonlinear, multivariable control of low-head hydropower plants by using an adaptive-network-based fuzzy inference system (ANFIS). The new design technique enhances fuzzy controllers with self-learning capability for achieving prescribed control objectives in a near-optimal manner. The controller has the flexibility to accept more sensory information, with the main goal of improving the generator unit transients by adjusting the exciter input and the wicket gate and runner blade positions. The developed ANFIS controller, whose control signals are adjusted by using incomplete on-line measurements, can offer better damping of generator oscillations over a wide range of operating conditions than conventional controllers. Digital simulations of a hydropower plant equipped with a low-head Kaplan turbine are performed, and comparisons of conventional excitation-governor control, state-feedback optimal control and ANFIS-based output feedback control are presented. To demonstrate the effectiveness of the proposed control scheme and the robustness of the acquired neuro-fuzzy controller, the controller has been implemented on a complex high-order non-linear hydrogenerator model.
Dewan, Mohammad W.; Huggett, Daniel J.; Liao, T. Warren; Wahab, Muhammad A.; Okeil, Ayman M.
2015-01-01
Friction-stir-welding (FSW) is a solid-state joining process where joint properties are dependent on welding process parameters. In the current study, three critical process parameters, namely spindle speed, plunge force, and welding speed, are considered key factors in the determination of ultimate tensile strength (UTS) of welded aluminum alloy joints. A total of 73 weld schedules were welded and tensile properties were subsequently obtained experimentally. It is observed that all three process parameters have direct influence on UTS of the welded joints. Utilizing experimental data, an optimized adaptive neuro-fuzzy inference system (ANFIS) model has been developed to predict UTS of FSW joints. A total of 1200 models were developed by varying the number of membership functions (MFs), type of MFs, and combination of four input variables (spindle speed, welding speed, plunge force, and EFI) utilizing a MATLAB platform. Note that EFI denotes an empirical force index derived from the three process parameters. For comparison, optimized artificial neural network (ANN) models were also developed to predict UTS from FSW process parameters. By comparing ANFIS and ANN predicted results, it was found that optimized ANFIS models provide better results than ANN. This newly developed best ANFIS model could be utilized for prediction of UTS of FSW joints.
Salehi, Mohammad Reza; Noori, Leila; Abiri, Ebrahim
2016-11-01
In this paper, a subsystem consisting of a microstrip bandpass filter and a microstrip low noise amplifier (LNA) is designed for WLAN applications. The proposed filter has a small implementation area (49 mm²), small insertion loss (0.08 dB) and wide fractional bandwidth (FBW) (61%). To design the proposed LNA, compact microstrip cells, a field-effect transistor, and only one lumped capacitor are used. It has a low supply voltage and a low return loss (-40 dB) at the operation frequency. The matching condition of the proposed subsystem is predicted using subsystem analysis, artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS). To design the proposed filter, the transmission matrix of the proposed resonator is obtained and analysed. The performance of the proposed ANN and ANFIS models is tested using the numerical data by four performance measures, namely the correlation coefficient (CC), the mean absolute error (MAE), the average percentage error (APE) and the root mean square error (RMSE). The obtained results show that these models are in good agreement with the numerical data, and a small error between the predicted values and numerical solution is obtained.
Energy Technology Data Exchange (ETDEWEB)
Metin Ertunc, H. [Department of Mechatronics Engineering, Kocaeli University, Umuttepe, 41380 Kocaeli (Turkey); Hosoz, Murat [Department of Mechanical Education, Kocaeli University, Umuttepe, 41380 Kocaeli (Turkey)
2008-12-15
This study deals with predicting the performance of an evaporative condenser using both artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS) techniques. For this aim, an experimental evaporative condenser consisting of a copper tube condensing coil along with air and water circuit elements was developed and equipped with instruments used for temperature, pressure and flow rate measurements. After the condenser was connected to an R134a vapour-compression refrigeration circuit, it was operated at steady state conditions, while varying both dry and wet bulb temperatures of the air stream entering the condenser, air and water flow rates as well as pressure, temperature and flow rate of the entering refrigerant. Using some of the experimental data for training, ANN and ANFIS models for the evaporative condenser were developed. These models were used for predicting the condenser heat rejection rate, refrigerant temperature leaving the condenser along with dry and wet bulb temperatures of the leaving air stream. Although it was observed that both ANN and ANFIS models yielded a good statistical prediction performance in terms of correlation coefficient, mean relative error, root mean square error and absolute fraction of variance, the accuracies of ANFIS predictions were usually slightly better than those of ANN predictions. This study reveals that, having an extended prediction capability compared to ANN, the ANFIS technique can also be used for predicting the performance of evaporative condensers. (author)
Strong Inference in Mathematical Modeling: A Method for Robust Science in the Twenty-First Century.
Ganusov, Vitaly V
2016-01-01
While there are many opinions on what mathematical modeling in biology is, in essence, modeling is a mathematical tool, like a microscope, which allows consequences to logically follow from a set of assumptions. Only when this tool is applied appropriately, as when a microscope is used to look at small items, may it allow us to understand the importance of specific mechanisms/assumptions in biological processes. Mathematical modeling can be less useful or even misleading if used inappropriately, for example, when a microscope is used to study stars. According to some philosophers (Oreskes et al., 1994), the best use of mathematical models is not when a model is used to confirm a hypothesis but rather when a model shows inconsistency of the model (defined by a specific set of assumptions) and data. Following the principle of strong inference for experimental sciences proposed by Platt (1964), I suggest "strong inference in mathematical modeling" as an effective and robust way of using mathematical modeling to understand mechanisms driving dynamics of biological systems. The major steps of strong inference in mathematical modeling are (1) to develop multiple alternative models for the phenomenon in question; (2) to compare the models with available experimental data and to determine which of the models are not consistent with the data; (3) to determine reasons why rejected models failed to explain the data; and (4) to suggest experiments which would allow one to discriminate between remaining alternative models. The use of strong inference is likely to provide better robustness of predictions of mathematical models and it should be strongly encouraged in mathematical modeling-based publications in the Twenty-First century.
Directory of Open Access Journals (Sweden)
Yasser Abduallah
2017-01-01
Gene regulation is a series of processes that control gene expression and its extent. The connections among genes and their regulatory molecules, usually transcription factors, and a descriptive model of such connections are known as gene regulatory networks (GRNs). Elucidating GRNs is crucial to understand the inner workings of the cell and the complexity of gene interactions. To date, numerous algorithms have been developed to infer gene regulatory networks. However, as the number of identified genes increases and the complexity of their interactions is uncovered, networks and their regulatory mechanisms become cumbersome to test. Furthermore, prodding through experimental results requires an enormous amount of computation, resulting in slow data processing. Therefore, new approaches are needed to expeditiously analyze copious amounts of experimental data resulting from cellular GRNs. To meet this need, cloud computing is promising as reported in the literature. Here, we propose new MapReduce algorithms for inferring gene regulatory networks on a Hadoop cluster in a cloud environment. These algorithms employ an information-theoretic approach to infer GRNs using time-series microarray data. Experimental results show that our MapReduce program is much faster than an existing tool while achieving slightly better prediction accuracy than the existing tool.
Abduallah, Yasser; Turki, Turki; Byron, Kevin; Du, Zongxuan; Cervantes-Cervantes, Miguel; Wang, Jason T L
2017-01-01
Gene regulation is a series of processes that control gene expression and its extent. The connections among genes and their regulatory molecules, usually transcription factors, and a descriptive model of such connections are known as gene regulatory networks (GRNs). Elucidating GRNs is crucial to understand the inner workings of the cell and the complexity of gene interactions. To date, numerous algorithms have been developed to infer gene regulatory networks. However, as the number of identified genes increases and the complexity of their interactions is uncovered, networks and their regulatory mechanisms become cumbersome to test. Furthermore, prodding through experimental results requires an enormous amount of computation, resulting in slow data processing. Therefore, new approaches are needed to expeditiously analyze copious amounts of experimental data resulting from cellular GRNs. To meet this need, cloud computing is promising as reported in the literature. Here, we propose new MapReduce algorithms for inferring gene regulatory networks on a Hadoop cluster in a cloud environment. These algorithms employ an information-theoretic approach to infer GRNs using time-series microarray data. Experimental results show that our MapReduce program is much faster than an existing tool while achieving slightly better prediction accuracy than the existing tool.
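To make the map/reduce decomposition of an information-theoretic inference concrete, the Python sketch below scores every gene pair by mutual information in a "map" step and keeps the high-scoring pairs as edges in a "reduce" step. It is an illustration under stated assumptions, not the authors' algorithm: the expression values are assumed to be already discretized, plain Python generators stand in for Hadoop, and the function names, toy data, and threshold are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(xs, ys):
    """Mutual information (bits) between two equally long discrete series."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2(p(x,y) / (p(x) p(y))), written with raw counts.
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

def map_phase(expression):
    """Mapper: emit one ((gene_a, gene_b), score) record per unordered pair."""
    for a, b in combinations(sorted(expression), 2):
        yield (a, b), mutual_information(expression[a], expression[b])

def reduce_phase(records, threshold):
    """Reducer: keep pairs whose score reaches the threshold as edges."""
    return sorted(pair for pair, score in records if score >= threshold)

# Hypothetical discretized time series (0 = low, 1 = high expression).
expression = {
    "g1": [0, 1, 0, 1, 0, 1, 0, 1],
    "g2": [0, 1, 0, 1, 0, 1, 0, 1],
    "g3": [0, 0, 1, 1, 0, 0, 1, 1],
}
edges = reduce_phase(map_phase(expression), threshold=0.5)
```

Because each pair is scored independently, the map step parallelizes naturally across workers, which is the property a cluster implementation would exploit.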
Method and tool for network vulnerability analysis
Swiler, Laura Painton [Albuquerque, NM]; Phillips, Cynthia A [Albuquerque, NM]
2006-03-14
A computer system analysis tool and method that will allow for qualitative and quantitative assessment of security attributes and vulnerabilities in systems including computer networks. The invention is based on generation of attack graphs wherein each node represents a possible attack state and each edge represents a change in state caused by a single action taken by an attacker or unwitting assistant. Edges are weighted using metrics such as attacker effort, likelihood of attack success, or time to succeed. Generation of an attack graph is accomplished by matching information about attack requirements (specified in "attack templates") to information about computer system configuration (contained in a configuration file that can be updated to reflect system changes occurring during the course of an attack) and assumed attacker capabilities (reflected in "attacker profiles"). High risk attack paths, which correspond to those considered suited to application of attack countermeasures given limited resources for applying countermeasures, are identified by finding "epsilon optimal paths."
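The weighted attack graph described above lends itself to a shortest-path search. The following is a minimal sketch, not the patented method: it runs plain Dijkstra to find the single lowest-effort path, whereas the abstract speaks of a set of "epsilon optimal paths". The state names and effort weights are hypothetical values chosen only for illustration.

```python
import heapq

def cheapest_attack_path(graph, start, goal):
    """Dijkstra search; graph[u] is a list of (next_state, cost) edges."""
    best = {start: 0.0}          # lowest known cost to reach each state
    prev = {}                    # predecessor on the lowest-cost path
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue             # stale heap entry
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if goal not in best:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1], best[goal]

# Hypothetical attack states; edge weights stand in for attacker effort.
attack_graph = {
    "outside": [("web_user", 2.0), ("vpn_user", 5.0)],
    "web_user": [("web_root", 4.0)],
    "vpn_user": [("db_admin", 1.0)],
    "web_root": [("db_admin", 3.0)],
}
path, effort = cheapest_attack_path(attack_graph, "outside", "db_admin")
print(path, effort)
```

Edge weights could equally encode time to succeed or a negative log of success likelihood, as long as they are non-negative, which Dijkstra requires.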
Control and estimation methods over communication networks
Mahmoud, Magdi S
2014-01-01
This book provides a rigorous framework in which to study problems in the analysis, stability and design of networked control systems. Four dominant sources of difficulty are considered: packet dropouts, communication bandwidth constraints, parametric uncertainty, and time delays. Past methods and results are reviewed from a contemporary perspective, present trends are examined, and future possibilities proposed. Emphasis is placed on robust and reliable design methods. New control strategies for improving the efficiency of sensor data processing and reducing associated time delay are presented. The coverage provided features: · an overall assessment of recent and current fault-tolerant control algorithms; · treatment of several issues arising at the junction of control and communications; · key concepts followed by their proofs and efficient computational methods for their implementation; and · simulation examples (including TrueTime simulations) to...
MBVCNN: Joint convolutional neural networks method for image recognition
Tong, Tong; Mu, Xiaodong; Zhang, Li; Yi, Zhaoxiang; Hu, Pei
2017-05-01
Objects in images are often rectangular, whereas the input to a convolutional neural network must be square. To address this mismatch, an object recognition model is proposed that uses the BING method for object estimation and vectorization of convolutional neural networks to feed square images into the network, thereby building a joint convolutional neural network that accepts image input of multiple sizes. Experiments show that the accuracy of multi-object image recognition improves by 6.70% compared with single vectorization of convolutional neural networks. The joint convolutional neural network method can therefore enhance image recognition accuracy, especially for rectangular targets.
National Research Council Canada - National Science Library
Haitao Zhang; Chenxue Wu; Zewei Chen; Zhao Liu; Yunhong Zhu
2017-01-01
...) application servers can benefit some LBS applications. However, such analyses can allow adversaries to make inference attacks that cannot be handled by spatial-temporal k-anonymity methods or other methods for protecting sensitive knowledge...
An improved method for network congestion control
Qiao, Xiaolin
2013-03-01
The rapid progress of wireless network technology has brought great convenience to people's life and work. However, because of its openness, the mobility of terminals and its changing topology, the wireless network is more susceptible to security attacks. Authentication and key agreement are the basis of network security. The authentication and key agreement mechanism can prevent unauthorized users from accessing the network, prevent malicious networks from deceiving legitimate users, encrypt session data using the exchanged key, and identify the origin of the data. Based on the characteristics of the wireless network, this paper proposes a key agreement protocol for wireless networks. The authentication in the protocol is based on elliptic curve cryptosystems and Diffie-Hellman.
Directory of Open Access Journals (Sweden)
Chien-Lin Huang
2015-01-01
This study aims to construct a typhoon precipitation forecast model providing forecasts one to six hours in advance using optimal model parameters and structures retrieved from a combination of the adaptive network-based fuzzy inference system (ANFIS) and artificial intelligence. To enhance the accuracy of the precipitation forecast, two structures were then used to establish the precipitation forecast model for a specific lead-time: a single-model structure and a dual-model hybrid structure where the forecast models of higher and lower precipitation were integrated. In order to rapidly, automatically, and accurately retrieve the optimal parameters and structures of the ANFIS-based precipitation forecast model, a tabu search was applied to identify the adjacent radius in subtractive clustering when constructing the ANFIS structure. The coupled structure was also employed to establish a precipitation forecast model across short and long lead-times in order to improve the accuracy of long-term precipitation forecasts. The study area is the Shimen Reservoir, and the analyzed period is from 2001 to 2009. Results showed that the optimal initial ANFIS parameters selected by the tabu search, combined with the dual-model hybrid method and the coupled structure, provided computationally efficient and highly reliable typhoon precipitation forecasts across short to long lead-times.
Directory of Open Access Journals (Sweden)
Zhongrong Zhang
2016-01-01
Wind energy has increasingly played a vital role in mitigating conventional resource shortages. Nevertheless, the stochastic nature of wind poses a great challenge when attempting to find an accurate forecasting model for wind power. Therefore, precise wind power forecasts are of primary importance to solve operational, planning and economic problems in the growing wind power scenario. Previous research has focused efforts on the deterministic forecast of wind power values, but less attention has been paid to providing information about wind energy. Based on an optimal Adaptive-Network-Based Fuzzy Inference System (ANFIS) and Singular Spectrum Analysis (SSA), this paper develops a hybrid uncertainty forecasting model, IFASF (Interval Forecast-ANFIS-SSA-Firefly Algorithm), to obtain the upper and lower bounds of daily average wind power, which is beneficial for the practical operation of both the grid company and independent power producers. To strengthen the practical ability of this developed model, this paper presents a comparison between IFASF and other benchmarks, which provides a general reference for this aspect for statistical or artificially intelligent interval forecast methods. The comparison results show that the developed model outperforms eight benchmarks and has a satisfactory forecasting effectiveness in three different wind farms with two time horizons.
Mekanik, F.; Imteaz, M. A.; Talei, A.
2016-05-01
Accurate seasonal rainfall forecasting is an important step in the development of reliable runoff forecast models. The large scale climate modes affecting rainfall in Australia have recently been proven useful in rainfall prediction problems. In this study, adaptive network-based fuzzy inference systems (ANFIS) models are developed for the first time for southeast Australia in order to forecast spring rainfall. The models are applied in east, center and west Victoria as case studies. Large scale climate signals comprising El Nino Southern Oscillation (ENSO), Indian Ocean Dipole (IOD) and Inter-decadal Pacific Ocean (IPO) are selected as rainfall predictors. Eight models are developed based on single climate modes (ENSO, IOD, and IPO) and combined climate modes (ENSO-IPO and ENSO-IOD). Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Pearson correlation coefficient (r) and root mean square error in probability (RMSEP) skill score are used to evaluate the performance of the proposed models. The predictions demonstrate that ANFIS models based on the individual IOD index perform better in terms of RMSE, MAE and r than the models based on individual ENSO indices. It is further discovered that IPO is not an effective predictor for the region and the combined ENSO-IOD and ENSO-IPO predictors did not improve the predictions. In order to evaluate the effectiveness of the proposed models a comparison is conducted between ANFIS models and the conventional Artificial Neural Network (ANN), the Predictive Ocean Atmosphere Model for Australia (POAMA) and climatology forecasts. POAMA is the official dynamic model used by the Australian Bureau of Meteorology. The ANFIS predictions show superior performance for most of the region compared to ANN and climatology forecasts. POAMA performs better with regard to RMSE and MAE in east and part of central Victoria; however, compared to ANFIS it shows weaker results in west Victoria in terms of prediction errors and RMSEP skill
Sensor Network Information Analytical Methods: Analysis of Similarities and Differences
Directory of Open Access Journals (Sweden)
Chen Jian
2014-04-01
In the Sensor Network information engineering literature, few references focus on the definition and design of Sensor Network information analytical methods. Among those that do are Munson, et al. and the ISO standards on functional size analysis. To avoid inconsistent vocabulary and potentially incorrect interpretation of data, Sensor Network information analytical methods must be better designed, including definitions, analysis principles, analysis rules, and base units. This paper analyzes the similarities and differences across three different views of analytical methods, and uses a process proposed for the design of Sensor Network information analytical methods to analyze two examples of such methods selected from the literature.
Deza, J. I.; Barreiro, M.; Masoller, C.
2013-06-01
We study global climate networks constructed by means of ordinal time series analysis. Climate interdependencies among the nodes are quantified by the mutual information, computed from time series of monthly-averaged surface air temperature anomalies, and from their symbolic ordinal representation (OP). This analysis allows identifying topological changes in the network when varying the time-interval of the ordinal pattern. We consider intra-season time-intervals (e.g., the patterns are formed by anomalies in consecutive months) and inter-annual time-intervals (e.g., the patterns are formed by anomalies in consecutive years). We discuss how the network density and topology change with these time scales, and provide evidence of correlations between geographically distant regions that occur at specific time scales. In particular, we find that an increase in the ordinal pattern spacing (i.e., an increase in the timescale of the ordinal analysis) results in climate networks with increased connectivity in the equatorial Pacific area. In contrast, the number of significant links decreases when the ordinal analysis is done with a shorter timescale (by comparing consecutive months); we interpret this effect as due to more stochasticity in the time series at the short timescale. As the equatorial Pacific is known to be dominated by El Niño-Southern Oscillation (ENSO) on scales longer than several months, our methodology allows constructing climate networks where the effect of ENSO goes from mild (monthly OP) to intense (yearly OP), independently of the length of the ordinal pattern and of the thresholding method employed.
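The two ingredients named in this abstract, ordinal-pattern symbols and mutual information between symbol sequences, can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names, the default pattern length of 3, and the plug-in (frequency-count) estimator are choices made here, not details given in the abstract. The `lag` parameter plays the role of the pattern spacing (for monthly data, 1 for consecutive months, 12 for consecutive years).

```python
from collections import Counter
from math import log

def ordinal_symbols(x, order=3, lag=1):
    """Replace each window of `order` points spaced `lag` apart by its rank ordering."""
    n = len(x) - (order - 1) * lag
    symbols = []
    for t in range(n):
        window = [x[t + i * lag] for i in range(order)]
        # Indices of the window sorted by value, e.g. (0, 2, 1)
        symbols.append(tuple(sorted(range(order), key=window.__getitem__)))
    return symbols

def mutual_information(a, b):
    """Plug-in mutual information (in nats) of two equal-length symbol sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(c / n * log(c * n / (pa[u] * pb[v])) for (u, v), c in pab.items())

series_a = [1, 3, 2, 4, 0, 5, 2, 6, 1, 7]
series_b = [2, 5, 3, 6, 1, 7, 3, 8, 2, 9]
link_strength = mutual_information(ordinal_symbols(series_a), ordinal_symbols(series_b))
print(link_strength)
```

In a network construction of this kind, a link between two nodes would typically be kept only when the mutual information exceeds a significance threshold; the thresholding rule is left out of this sketch.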
Dynamic analysis of biochemical network using complex network method
Directory of Open Access Journals (Sweden)
Wang Shuqiang
2015-01-01
In this study, a stochastic biochemical reaction model is proposed based on the law of mass action and complex network theory. The dynamics of the biochemical reaction system are presented as a set of non-linear differential equations and analyzed at the molecular scale. Given the initial state and the evolution rules of the biochemical reaction system, the system can achieve homeostasis. Compared with a random graph, the biochemical reaction network has larger information capacity and is more efficient in information transmission. This is consistent with the theory of evolution.
Modern Community Detection Methods in Social Networks
Directory of Open Access Journals (Sweden)
V. O. Chesnokov
2017-01-01
Social network structure is not homogeneous. Groups of vertices which have a lot of links between them are called communities. A survey of algorithms discovering such groups is presented in the article. A popular approach to community detection is to use a graph clustering algorithm. Methods based on inner metric optimization are common. 5 groups of algorithms are listed: based on optimization, joining vertices into clusters by some closeness measure, special subgraphs discovery, partitioning graph by deleting edges, and based on a dynamic process or generative model. Overlapping community detection algorithms are usually just modified graph clustering algorithms. Other approaches do exist, e.g. ones based on edge clustering or constructing communities around randomly chosen vertices. Methods based on nonnegative matrix factorization are also used, but they have high computational complexity. Algorithms based on label propagation lack this disadvantage. Methods based on the affiliation model are promising. This model claims that communities define the structure of a graph. Algorithms which use node attributes are considered: ones based on latent Dirichlet allocation, initially used for text clustering, and CODICIL, where edges of node content relevance are added to the original edge set. 6 classes are listed for algorithms for graphs with node attributes: changing edges' weights, changing vertex distance function, building augmented graph with nodes and attributes, based on stochastic models, partitioning attribute space and others. Overlapping community detection algorithms which effectively use node attributes have only just started to appear. Methods based on partitioning attribute space, latent Dirichlet allocation, stochastic models and nonnegative matrix factorization are considered. The most effective algorithm on real datasets is CESNA. It is based on the affiliation model. However, it gives results which are far from ground truth
Inferring synaptic structure in presence of neural interaction time scales.
Directory of Open Access Journals (Sweden)
Cristiano Capone
Biological networks display a variety of activity patterns reflecting a web of interactions that is complex both in space and time. Yet inference methods have mainly focused on reconstructing, from the network's activity, the spatial structure, by assuming equilibrium conditions or, more recently, a probabilistic dynamics with a single arbitrary time-step. Here we show that, under this latter assumption, the inference procedure fails to reconstruct the synaptic matrix of a network of integrate-and-fire neurons when the chosen time scale of interaction does not closely match the synaptic delay or when no single time scale for the interaction can be identified; such failure, moreover, exposes a distinctive bias of the inference method that can lead to infer as inhibitory the excitatory synapses with interaction time scales longer than the model's time-step. We therefore introduce a new two-step method, that first infers through cross-correlation profiles the delay-structure of the network and then reconstructs the synaptic matrix, and successfully test it on networks with different topologies and in different activity regimes. Although step one is able to accurately recover the delay-structure of the network, thus getting rid of any a priori guess about the time scales of the interaction, the inference method introduces nonetheless an arbitrary time scale, the time-bin dt used to binarize the spike trains. We therefore analytically and numerically study how the choice of dt affects the inference in our network model, finding that the relationship between the inferred couplings and the real synaptic efficacies, albeit being quadratic in both cases, depends critically on dt for the excitatory synapses only, whilst being basically independent of it for the inhibitory ones.
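The first step of the two-step method, reading the interaction delay off a cross-correlation profile, can be illustrated with a toy sketch. It is a deliberate simplification and not the authors' procedure: it counts raw spike coincidences between two binarized trains at each candidate lag and picks the lag with the most coincidences, with no normalization or significance test. The function name and the example trains are hypothetical.

```python
def best_lag(pre, post, max_lag):
    """Return the lag (in time bins) at which pre[t] and post[t + lag] coincide most often."""
    def coincidences(lag):
        # zip pairs pre[t] with post[t + lag] and stops at the shorter sequence
        return sum(p * q for p, q in zip(pre, post[lag:]))
    return max(range(max_lag + 1), key=coincidences)

# Binarized spike trains (1 = spike in that time bin); post echoes pre two bins later.
pre  = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
post = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0]
delay = best_lag(pre, post, max_lag=3)
print(delay)
```

The bin width used to binarize the trains corresponds to the time-bin dt discussed in the abstract, so the recovered delay is only resolved in multiples of dt.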
Improved security monitoring method for network bordary
Gao, Liting; Wang, Lixia; Wang, Zhenyan; Qi, Aihua
2013-03-01
This paper proposes a network boundary security monitoring system based on PKI. The design uses multiple security technologies and deeply analyzes the association between network data flow and system logs, so that it can detect intrusion activities and locate the intrusion source accurately and in time. The experimental results show that it can effectively reduce the rate of false alarms and missed alarms for security incidents.
An effective method for network module extraction from microarray data
Directory of Open Access Journals (Sweden)
Mahanta Priyakshi
2012-08-01
Background: The development of high-throughput microarray technologies has provided various opportunities to systematically characterize diverse types of computational biological networks. Co-expression networks have become popular in the analysis of microarray data, such as for detecting functional gene modules. Results: This paper presents a method to build a co-expression network (CEN) and to detect network modules from the built network. We use an effective gene expression similarity measure called NMRS (Normalized mean residue similarity) to construct the CEN. We have tested our method on five publicly available benchmark microarray datasets. The network modules extracted by our algorithm have been biologically validated in terms of Q value and p value. Conclusions: Our results show that the technique is capable of detecting biologically significant network modules from the co-expression network. Biologists can use this technique to find groups of genes with similar functionality based on their expression information.
A Method for Upper Bounding on Network Access Speed
DEFF Research Database (Denmark)
Knudsen, Thomas Phillip; Patel, A.; Pedersen, Jens Myrup
2004-01-01
This paper presents a method for calculating an upper bound on network access speed growth and gives guidelines for further research experiments and simulations. The method is aimed at providing a basis for simulation of long term network development and resource management.
A novel community detection method in bipartite networks
Zhou, Cangqi; Feng, Liang; Zhao, Qianchuan
2018-02-01
Community structure is a common and important feature in many complex networks, including bipartite networks, which are used as a standard model for many empirical networks comprised of two types of nodes. In this paper, we propose a two-stage method for detecting community structure in bipartite networks. Firstly, we extend the widely-used Louvain algorithm to bipartite networks. The effectiveness and efficiency of the Louvain algorithm have been proved by many applications. However, a Louvain-like algorithm specially modified for bipartite networks has been lacking. Based on bipartite modularity, a measure that extends unipartite modularity and that quantifies the strength of partitions in bipartite networks, we fill the gap by developing the Bi-Louvain algorithm that iteratively groups the nodes in each part by turns. This algorithm in bipartite networks often produces a balanced network structure with equal numbers of two types of nodes. Secondly, for the balanced network yielded by the first algorithm, we use an agglomerative clustering method to further cluster the network. We demonstrate that the calculation of the gain of modularity of each aggregation, and the operation of joining two communities can be compactly calculated by matrix operations for all pairs of communities simultaneously. Finally, a complete hierarchical community structure is unfolded. We apply our method to two benchmark data sets and a large-scale data set from an e-commerce company, showing that it effectively identifies community structure in bipartite networks.
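To make the notion of bipartite modularity concrete, here is a small sketch. The abstract does not state which definition is used; the code assumes Barber's form, Q = (1/m) Σ (A_uv − k_u d_v / m) δ(c_u, c_v), summed over pairs of one top node u and one bottom node v, where m is the number of edges and k, d are the degrees. The node names and partition are hypothetical, and this is only the quality measure, not the Bi-Louvain algorithm itself.

```python
from collections import Counter

def bipartite_modularity(edges, top_comm, bottom_comm):
    """Barber-style bipartite modularity for edges given as (top, bottom) pairs.

    Assumes no duplicate edges. top_comm and bottom_comm map each node
    to its community label.
    """
    m = len(edges)
    k = Counter(u for u, _ in edges)   # degrees of top nodes
    d = Counter(v for _, v in edges)   # degrees of bottom nodes
    edge_set = set(edges)
    q = 0.0
    for u, cu in top_comm.items():
        for v, cv in bottom_comm.items():
            if cu == cv:
                # Observed edge (0 or 1) minus the expected value under the null model
                q += ((u, v) in edge_set) - k[u] * d[v] / m
    return q / m

edges = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"), ("c", "z")]
top = {"a": 0, "b": 0, "c": 1}
bottom = {"x": 0, "y": 0, "z": 1}
print(bipartite_modularity(edges, top, bottom))
```

A Louvain-style procedure would repeatedly move nodes between communities whenever the move increases this quantity; placing every node in one community gives Q = 0, which is a convenient sanity check.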
Miyashita, Naoyuki; Yonezawa, Yasushige
2017-09-01
Robust and reliable analyses of long trajectories from molecular dynamics simulations are important for investigations of functions and mechanisms of proteins. Structural fitting is necessary for various analyses of protein dynamics, thus removing time-dependent translational and rotational movements. However, the fitting is often difficult for highly flexible molecules. Thus, to address the issues, we proposed a fitting algorithm that uses the Bayesian inference method in combination with rotational fitting-weight improvements, and the well-studied globular protein systems trpcage and lysozyme were used for investigations. The present method clearly identified rigid core regions that fluctuate less than other regions and also separated core regions from highly fluctuating regions with greater accuracy than conventional methods. Our method also provided simultaneous variance-covariance matrix elements composed of atomic coordinates, allowing us to perform principal component analysis and prepare domain cross-correlation maps during molecular dynamics simulations in an on-the-fly manner.
Inferring biochemical reaction pathways: the case of the gemcitabine pharmacokinetics.
Lecca, Paola; Morpurgo, Daniele; Fantaccini, Gianluca; Casagrande, Alessandro; Priami, Corrado
2012-05-28
The representation of a biochemical system as a network is the precursor of any mathematical model of the processes driving the dynamics of that system. Pharmacokinetics uses mathematical models to describe the interactions between drug, and drug metabolites and targets and through the simulation of these models predicts drug levels and/or dynamic behaviors of drug entities in the body. Therefore, the development of computational techniques for inferring the interaction network of the drug entities and its kinetic parameters from observational data is raising great interest in the scientific community of pharmacologists. In fact, the network inference is a set of mathematical procedures deducing the structure of a model from the experimental data associated to the nodes of the network of interactions. In this paper, we deal with the inference of a pharmacokinetic network from the concentrations of the drug and its metabolites observed at discrete time points. The method of network inference presented in this paper is inspired by the theory of time-lagged correlation inference with regard to the deduction of the interaction network, and on a maximum likelihood approach with regard to the estimation of the kinetic parameters of the network. Both network inference and parameter estimation have been designed specifically to identify systems of biotransformations, at the biochemical level, from noisy time-resolved experimental data. We use our inference method to deduce the metabolic pathway of gemcitabine. The inputs to our inference algorithm are the experimental time series of the concentration of gemcitabine and its metabolites. The output is the set of reactions of the metabolic network of gemcitabine. Time-lagged correlation based inference, paired with a probabilistic model of parameter inference from metabolite time series, allows the identification of the microscopic pharmacokinetics and pharmacodynamics of a drug with a minimal a priori knowledge. In
Inferring biochemical reaction pathways: the case of the gemcitabine pharmacokinetics
Directory of Open Access Journals (Sweden)
Lecca Paola
2012-05-01
Background: The representation of a biochemical system as a network is the precursor of any mathematical model of the processes driving the dynamics of that system. Pharmacokinetics uses mathematical models to describe the interactions between drug, and drug metabolites and targets and through the simulation of these models predicts drug levels and/or dynamic behaviors of drug entities in the body. Therefore, the development of computational techniques for inferring the interaction network of the drug entities and its kinetic parameters from observational data is raising great interest in the scientific community of pharmacologists. In fact, the network inference is a set of mathematical procedures deducing the structure of a model from the experimental data associated to the nodes of the network of interactions. In this paper, we deal with the inference of a pharmacokinetic network from the concentrations of the drug and its metabolites observed at discrete time points. Results: The method of network inference presented in this paper is inspired by the theory of time-lagged correlation inference with regard to the deduction of the interaction network, and on a maximum likelihood approach with regard to the estimation of the kinetic parameters of the network. Both network inference and parameter estimation have been designed specifically to identify systems of biotransformations, at the biochemical level, from noisy time-resolved experimental data. We use our inference method to deduce the metabolic pathway of gemcitabine. The inputs to our inference algorithm are the experimental time series of the concentration of gemcitabine and its metabolites. The output is the set of reactions of the metabolic network of gemcitabine. Conclusions: Time-lagged correlation based inference, paired with a probabilistic model of parameter inference from metabolite time series, allows the identification of the microscopic pharmacokinetics and
A novel method for inferring RFID tag reader recordings into clinical events.
Chang, Yung-Ting; Syed-Abdul, Shabbir; Tsai, Chung-You; Li, Yu-Chuan
2011-12-01
Nosocomial infections (NIs) are among the important indicators used for evaluating patients' safety and hospital performance during accreditation of hospitals. NI rate is higher in Intensive Care Units (ICUs) than in the general wards because patients require intense care involving both invasive and non-invasive clinical procedures. The emergence of Superbugs is motivating health providers to enhance infection control measures. Contact behavior between health caregivers and patients is one of the main causes of cross infections. In this technology driven era remote monitoring of patients and caregivers in the hospital setting can be performed reliably, and thus is in demand. Proximity sensing using radio frequency identification (RFID) technology can be helpful in capturing and keeping track on all contact history between health caregivers and patients for example. This study intended to extend the use of proximity sensing of radio frequency identification technology by proposing a model for inferring RFID tag reader recordings into clinical events. The aims of the study are twofold. The first aim is to set up a Contact History Inferential Model (CHIM) between health caregivers and patients. The second is to verify CHIM with real-time observation done at the ICU ward. A pre-study was conducted followed by two study phases. During the pre-study proximity sensing of RFID was tested, and deployment of the RFID in the Clinical Skill Center in one of the medical centers in Taiwan was done. We simulated clinical events and developed CHIM using variables such as duration of time, frequency, and identity (tag) numbers assigned to caregivers. All clinical proximity events are classified into close-in events, contact events and invasive events. During the first phase three observers were recruited to do real time recordings of all clinical events in the Clinical Skill Center with the deployed automated RFID interaction recording system. The observations were used to verify
Efficient Optimization Methods for Communication Network Planning and Assessment
Kiese, Moritz
2010-01-01
In this work, we develop efficient mathematical planning methods to design communication networks. First, we examine future technologies for optical backbone networks. As new, more intelligent nodes cause higher dynamics in the transport networks, fast planning methods are required. To this end, we develop a heuristic planning algorithm. The evaluation of the cost-efficiency of new, adaptive transmission techniques comprises the second topic of this section. In the second part of this work, ...
Approximation and inference methods for stochastic biochemical kinetics - a tutorial review
Schnoerr, David; Sanguinetti, Guido; Grima, Ramon
2017-01-01
Stochastic fluctuations of molecule numbers are ubiquitous in biological systems. Important examples include gene expression and enzymatic processes in living cells. Such systems are typically modelled as chemical reaction networks whose dynamics are governed by the Chemical Master Equation. Despite its simple structure, no analytic solutions to the Chemical Master Equation are known for most systems. Moreover, stochastic simulations are computationally expensive, making systematic analysis a...
Kang, Jin Kyu; Hong, Hyung Gil; Park, Kang Ryoung
2017-07-08
A number of studies have been conducted to enhance the pedestrian detection accuracy of intelligent surveillance systems. However, detecting pedestrians under outdoor conditions is a challenging problem due to the varying lighting, shadows, and occlusions. In recent times, a growing number of studies have been performed on visible light camera-based pedestrian detection systems using a convolutional neural network (CNN) in order to make the pedestrian detection process more resilient to such conditions. However, visible light cameras still cannot detect pedestrians during nighttime, and are easily affected by shadows and lighting. There are many studies on CNN-based pedestrian detection through the use of far-infrared (FIR) light cameras (i.e., thermal cameras) to address such difficulties. However, when the solar radiation increases and the background temperature reaches the same level as the body temperature, it remains difficult for the FIR light camera to detect pedestrians due to the insignificant difference between the pedestrian and non-pedestrian features within the images. Researchers have been trying to solve this issue by inputting both the visible light and the FIR camera images into the CNN as the input. This, however, takes a longer time to process, and makes the system structure more complex as the CNN needs to process both camera images. This research adaptively selects a more appropriate candidate between two pedestrian images from visible light and FIR cameras based on a fuzzy inference system (FIS), and the selected candidate is verified with a CNN. Three types of databases were tested, taking into account various environmental factors using visible light and FIR cameras. The results showed that the proposed method performs better than the previously reported methods.
A method for under-sampled ecological network data analysis: plant-pollination as case study
Directory of Open Access Journals (Sweden)
Peter B. Sorensen
2012-01-01
In this paper, we develop a method, termed the Interaction Distribution (ID) method, for analysis of quantitative ecological network data. In many cases, quantitative network data sets are under-sampled, i.e. many interactions are poorly sampled or remain unobserved. Hence, the output of statistical analyses may fail to differentiate between patterns that are statistical artefacts and those which are real characteristics of ecological networks. The ID method can support assessment and inference of under-sampled ecological network data. In the current paper, we illustrate and discuss the ID method based on the properties of plant-animal pollination data sets of flower visitation frequencies. However, the ID method may be applied to other types of ecological networks. The method can supplement existing network analyses based on two definitions of the underlying probabilities for each combination of pollinator and plant species: (1) p_{i,j}: the probability for a visit made by the i’th pollinator species to take place on the j’th plant species; (2) q_{i,j}: the probability for a visit received by the j’th plant species to be made by the i’th pollinator. The method applies the Dirichlet distribution to estimate these two probabilities, based on a given empirical data set. The estimated mean values for p_{i,j} and q_{i,j} reflect the relative differences between recorded numbers of visits for different pollinator and plant species, and the estimated uncertainty of p_{i,j} and q_{i,j} decreases with higher numbers of recorded visits.
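The Dirichlet-based estimate described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the symmetric prior `alpha=1.0`, the function name `dirichlet_posterior`, and the visit counts are assumptions made here for illustration. It shows how the posterior mean follows the recorded visit proportions while the posterior variance shrinks as more visits are recorded.

```python
def dirichlet_posterior(counts, alpha=1.0):
    """Posterior mean and variance of each visit probability for one pollinator.

    counts: recorded visits from one pollinator species to each plant species.
    alpha:  symmetric Dirichlet prior concentration (an assumption of this sketch).
    """
    a = [c + alpha for c in counts]          # posterior Dirichlet parameters
    a0 = sum(a)
    means = [ai / a0 for ai in a]
    variances = [ai * (a0 - ai) / (a0 ** 2 * (a0 + 1)) for ai in a]
    return means, variances


# Hypothetical counts: same visit proportions, 100 times more observations.
few_means, few_vars = dirichlet_posterior([3, 1, 0])
many_means, many_vars = dirichlet_posterior([300, 100, 0])
```

With the larger sample, the variance of every estimated probability is smaller, matching the statement that uncertainty decreases with higher numbers of recorded visits.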
Multilevel method for modeling large-scale networks.
Energy Technology Data Exchange (ETDEWEB)
Safro, I. M. (Mathematics and Computer Science)
2012-02-24
Understanding the behavior of real complex networks is of great theoretical and practical significance. It includes developing accurate artificial models whose topological properties are similar to the real networks, generating the artificial networks at different scales under special conditions, investigating network dynamics, reconstructing missing data, predicting network response, detecting anomalies and other tasks. Network generation, reconstruction, and prediction of its future topology are central issues of this field. In this project, we address the questions related to the understanding of the network modeling, investigating its structure and properties, and generating artificial networks. Most of the modern network generation methods are based either on various random graph models (reinforced by a set of properties such as power law distribution of node degrees, graph diameter, and number of triangles) or on the principle of replicating an existing model with elements of randomization such as R-MAT generator and Kronecker product modeling. Hierarchical models operate at different levels of network hierarchy but with the same finest elements of the network. However, in many cases the methods that include randomization and replication elements on the finest relationships between network nodes and modeling that addresses the problem of preserving a set of simplified properties do not fit accurately enough the real networks. Among the unsatisfactory features are numerically inadequate results, instability of algorithms on real (artificial) data that have been tested on artificial (real) data, and incorrect behavior at different scales. One reason is that randomization and replication of existing structures can create conflicts between fine and coarse scales of the real network geometry. Moreover, the randomization and satisfying of some attribute at the same time can abolish those topological attributes that have been undefined or hidden from
A Method for Automated Planning of FTTH Access Network Infrastructures
DEFF Research Database (Denmark)
Riaz, Muhammad Tahir; Pedersen, Jens Myrup; Madsen, Ole Brun
2005-01-01
In this paper a method for automated planning of Fiber to the Home (FTTH) access networks is proposed. We introduced a systematic approach for planning access network infrastructure. The GIS data and a set of algorithms were employed to make the planning process more automatic. The method explains...
DETECTING NETWORK ATTACKS IN COMPUTER NETWORKS BY USING DATA MINING METHODS
Platonov, V. V.; Semenov, P. O.
2016-01-01
The article describes an approach to the development of an intrusion detection system for computer networks. It is shown that using several data mining methods and tools can improve the efficiency of protecting computer networks against network attacks, owing to the combined benefits of signature detection and anomaly detection and the opportunity to adapt the system to the hardware and software structure of the computer network.
Directory of Open Access Journals (Sweden)
Zohdy Sarah
2012-03-01
provided insight into the previously unseen parasite movement between lemurs, but also allowed us to infer social interactions between them. As lice are known pathogen vectors, our method also allowed us to identify the lemurs most likely to facilitate louse-mediated epidemics. Our approach demonstrates the potential to uncover otherwise inaccessible parasite-host, and host social interaction data in any trappable species parasitized by sucking lice.
Anomaly-based Network Intrusion Detection Methods
Directory of Open Access Journals (Sweden)
Pavel Nevlud
2013-01-01
The article deals with detection of network anomalies. Network anomalies include everything that differs markedly from normal operation. Machine learning systems were used to detect anomalies. Machine learning can be considered as a support or a limited type of artificial intelligence. A machine learning system usually starts with some knowledge and a corresponding knowledge organization so that it can interpret, analyse, and test the knowledge acquired. There are several machine learning techniques available. We tested decision tree learning and Bayesian networks. The open source data-mining framework WEKA was the tool we used for testing the classification, clustering, and association algorithms and for visualization of our results. WEKA is a collection of machine learning algorithms for data mining tasks.
Mean field methods for cortical network dynamics
DEFF Research Database (Denmark)
Hertz, J.; Lerchner, Alexander; Ahmadi, M.
2004-01-01
We review the use of mean field theory for describing the dynamics of dense, randomly connected cortical circuits. For a simple network of excitatory and inhibitory leaky integrate-and-fire neurons, we can show how the firing irregularity, as measured by the Fano factor, increases ... with the strength of the synapses in the network and with the value to which the membrane potential is reset after a spike. Generalizing the model to include conductance-based synapses gives insight into the connection between the firing statistics and the high-conductance state observed experimentally in visual...
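As a small illustration of the irregularity measure named in this abstract, the Fano factor of a set of spike counts is their variance divided by their mean. The sketch below is a generic computation, not code from the paper; the function name and the example counts are assumptions, and the population variance is used here as a simplifying choice.

```python
from statistics import mean, pvariance


def fano_factor(spike_counts):
    """Variance-to-mean ratio of spike counts taken over equal-length windows."""
    m = mean(spike_counts)
    if m == 0:
        raise ValueError("Fano factor is undefined when the mean count is zero")
    return pvariance(spike_counts) / m


# Hypothetical counts: perfectly regular firing versus bursty firing.
regular = fano_factor([5, 5, 5, 5])
bursty = fano_factor([0, 10, 0, 10])
```

A value near 1 corresponds to Poisson-like firing, values below 1 to more regular firing, and values above 1 to more irregular firing.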
A combined evidence Bayesian method for human ancestry inference applied to Afro-Colombians.
Rishishwar, Lavanya; Conley, Andrew B; Vidakovic, Brani; Jordan, I King
2015-12-15
Uniparental genetic markers, mitochondrial DNA (mtDNA) and Y chromosomal DNA, are widely used for the inference of human ancestry. However, the resolution of ancestral origins based on mtDNA haplotypes is limited by the fact that such haplotypes are often found to be distributed across wide geographical regions. We have addressed this issue here by combining two sources of ancestry information that have typically been considered separately: historical records regarding population origins and genetic information on mtDNA haplotypes. To combine these distinct data sources, we applied a Bayesian approach that considers historical records, in the form of prior probabilities, together with data on the geographical distribution of mtDNA haplotypes, formulated as likelihoods, to yield ancestry assignments from posterior probabilities. This combined evidence Bayesian approach to ancestry assignment was evaluated for its ability to accurately assign sub-continental African ancestral origins to Afro-Colombians based on their mtDNA haplotypes. We demonstrate that the incorporation of historical prior probabilities via this analytical framework can provide for substantially increased resolution in sub-continental African ancestry assignment for members of this population. In addition, a personalized approach to ancestry assignment that involves the tuning of priors to individual mtDNA haplotypes yields even greater resolution for individual ancestry assignment. Despite the fact that Colombia has a large population of Afro-descendants, the ancestry of this community has been understudied relative to populations with primarily European and Native American ancestry. Thus, the application of the kind of combined evidence approach developed here to the study of ancestry in the Afro-Colombian population has the potential to be impactful. The formal Bayesian analytical framework we propose for combining historical and genetic information also has the potential to be widely applied
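The combination of historical priors and haplotype likelihoods described above is an application of Bayes' rule. A minimal sketch, under assumed inputs, looks like the following; the region names and all numbers are hypothetical and are not taken from the study.

```python
def posterior(prior, likelihood):
    """Combine prior probabilities and likelihoods over the same regions."""
    unnormalized = {r: prior[r] * likelihood[r] for r in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("prior and likelihood have no overlapping support")
    return {r: v / total for r, v in unnormalized.items()}


# Hypothetical prior from historical records and haplotype likelihoods.
prior = {"region_A": 0.6, "region_B": 0.3, "region_C": 0.1}
likelihood = {"region_A": 0.2, "region_B": 0.2, "region_C": 0.6}
post = posterior(prior, likelihood)
```

In this toy case a haplotype that is most common in `region_C` is still assigned to `region_A` with the highest posterior probability, because the historical prior for `region_A` is much stronger.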
Gaebler, P. J.; Ceranna, L.
2016-12-01
All nuclear explosions - on the Earth's surface, underground, underwater or in the atmosphere - are banned by the Comprehensive Nuclear-Test-Ban Treaty (CTBT). As part of this treaty, a verification regime was put into place to detect, locate and characterize nuclear explosion tests at any time, by anyone and everywhere on the Earth. The International Monitoring System (IMS) plays a key role in the verification regime of the CTBT. Out of the different monitoring techniques used in the IMS, the seismic waveform approach is the most effective technology for monitoring underground nuclear testing and for identifying and characterizing potential nuclear events. This study introduces a method of seismic threshold monitoring to assess an upper magnitude limit of a potential seismic event in a given geographical region. The method is based on ambient seismic background noise measurements at the individual IMS seismic stations as well as on global distance correction terms for body wave magnitudes, which are calculated using the seismic reflectivity method. From our investigations we conclude that a global detection threshold of around mb 4.0 can be achieved using only stations from the primary seismic network; a clear latitudinal dependence of the detection threshold can be observed between the northern and southern hemispheres. Including the seismic stations of the auxiliary seismic IMS network results in a slight improvement of global detection capability. However, including wave arrivals from distances greater than 120 degrees, mainly PKP-wave arrivals, leads to a significant improvement in average global detection capability. In particular, this leads to an improvement of the detection threshold in the southern hemisphere. We further investigate the dependence of the detection capability on spatial (latitude and longitude) and temporal (time) parameters, as well as on parameters such as source type and percentage of operational IMS stations.
Directory of Open Access Journals (Sweden)
David Metzgar
BACKGROUND: Group A Streptococcus pyogenes (GAS) exhibits a high degree of clinically relevant phenotypic diversity. Strains vary widely in terms of antibiotic resistance (AbR), clinical severity, and transmission rate. Currently, strain identification is achieved by emm typing (direct sequencing of the genomic segment coding for the antigenic portion of the M protein) or by multilocus genotyping methods. Phenotype analysis, including critical AbR typing, is generally achieved by much slower and more laborious direct culture-based methods. METHODOLOGY/PRINCIPAL FINDINGS: We compare genotype identification (by emm typing and PCR/ESI-MS) with directly measured phenotypes (AbR and outbreak associations) for 802 clinical isolates of GAS collected from symptomatic patients over a period of 6 years at 10 military facilities in the United States. All independent strain characterization methods are highly correlated. This shows that recombination, horizontal transfer, and other forms of reassortment are rare in GAS insofar as housekeeping genes, primary virulence and antibiotic resistance determinants, and the emm gene are concerned. Therefore, genotyping methods offer an efficient way to predict emm type and the associated AbR and virulence phenotypes. CONCLUSIONS/SIGNIFICANCE: The data presented here, combined with much historical data, suggest that emm typing assays and faster molecular methods that infer emm type from genomic signatures could be used to efficiently infer critical phenotypic characteristics based on robust genotype: phenotype correlations. This, in turn, would enable faster and better-targeted responses during identified outbreaks of constitutively resistant or particularly virulent emm types.
Kim, Daesang
2016-01-06
A new Bayesian inference method has been developed and applied to Furan shock tube experimental data for efficient statistical inference of the Arrhenius parameters of two OH radical consumption reactions. The collected experimental data, which consist of time series signals of OH radical concentrations from 14 shock tube experiments, may require several days of MCMC computation even with the support of a fast surrogate of the combustion simulation model, while the new method reduces this to several hours by splitting the process into two MCMC steps: first the inference of rate constants, and second the inference of the Arrhenius parameters. Each step has a low-dimensional parameter space, and the second step does not need to execute the combustion simulation. Furthermore, the new approach has more flexibility in choosing the ranges of the inference parameters, and the higher speed and flexibility enable more accurate inferences and analyses of how errors in the measured temperatures and in the alignment of the experimental time propagate to the inference results.
A simulation-based evaluation of methods for inferring linear barriers to gene flow
Christopher Blair; Dana E. Weigel; Matthew Balazik; Annika T. H. Keeley; Faith M. Walker; Erin Landguth; Sam Cushman; Melanie Murphy; Lisette Waits; Niko Balkenhol
2012-01-01
Different analytical techniques used on the same data set may lead to different conclusions about the existence and strength of genetic structure. Therefore, reliable interpretation of the results from different methods depends on the efficacy and reliability of different statistical methods. In this paper, we evaluated the performance of multiple analytical methods to...
An image segmentation method based on network clustering model
Jiao, Yang; Wu, Jianshe; Jiao, Licheng
2018-01-01
Network clustering phenomena are ubiquitous in nature and human society. In this paper, a method involving a network clustering model is proposed for mass segmentation in mammograms. First, the watershed transform is used to divide an image into regions, and features of the image are computed. Then a graph is constructed from the obtained regions and features. The network clustering model is applied to realize clustering of nodes in the graph. Compared with two classic methods, the algorithm based on the network clustering model performs more effectively in experiments.
Mixed Methods Analysis of Enterprise Social Networks
DEFF Research Database (Denmark)
Behrendt, Sebastian; Richter, Alexander; Trier, Matthias
2014-01-01
The increasing use of enterprise social networks (ESN) generates vast amounts of data, giving researchers and managerial decision makers unprecedented opportunities for analysis. However, more transparency about the available data dimensions and how these can be combined is needed to yield accurate...
Dynamic baseline detection method for power data network service
Chen, Wei
2017-08-01
This paper proposes a dynamic baseline traffic detection method based on historical traffic data for the power data network. The method uses Cisco's NetFlow acquisition tool to collect the original historical traffic data from network elements at fixed intervals. It uses information in three dimensions: communication port, time, and traffic (number of bytes or number of packets). By filtering, removing deviation values, calculating the dynamic baseline value, and comparing the actual value with the baseline value, the method can detect whether the current network traffic is abnormal.
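The pipeline summarized in this abstract (remove deviation values, compute a baseline, compare the current value against it) can be sketched as follows. The abstract does not give the formulas, so everything specific here is an assumption: outliers are dropped with a median-absolute-deviation rule, the baseline is the mean of the remaining samples, and the tolerance is `k` standard deviations. The function names and sample numbers are hypothetical.

```python
from statistics import mean, median, pstdev


def dynamic_baseline(history, k=3.0):
    """Baseline and tolerance from historical byte counts for one port and time slot."""
    med = median(history)
    mad = median([abs(x - med) for x in history])
    # Remove deviation values: samples more than k MADs away from the median.
    kept = [x for x in history if mad == 0 or abs(x - med) <= k * mad]
    return mean(kept), k * pstdev(kept)


def is_abnormal(current, history, k=3.0):
    """Flag the current traffic value if it leaves the baseline tolerance band."""
    baseline, tolerance = dynamic_baseline(history, k)
    return abs(current - baseline) > tolerance


# Hypothetical byte counts for the same port and hour on previous days.
history = [100, 102, 98, 101, 99, 500]
normal_flag = is_abnormal(101, history)
spike_flag = is_abnormal(150, history)
```

Because the baseline is recomputed from the history of each port and time slot, the threshold adapts to regular daily or weekly traffic patterns instead of relying on one fixed limit.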
A new method for constructing networks from binary data
van Borkulo, Claudia D.; Borsboom, Denny; Epskamp, Sacha; Blanken, Tessa F.; Boschloo, Lynn; Schoevers, Robert A.; Waldorp, Lourens J.
2014-08-01
Network analysis is entering fields where network structures are unknown, such as psychology and the educational sciences. A crucial step in the application of network models lies in the assessment of network structure. Current methods either have serious drawbacks or are only suitable for Gaussian data. In the present paper, we present a method for assessing network structures from binary data. Although models for binary data are infamous for their computational intractability, we present a computationally efficient model for estimating network structures. The approach, which is based on Ising models as used in physics, combines logistic regression with model selection based on a Goodness-of-Fit measure to identify relevant relationships between variables that define connections in a network. A validation study shows that this method succeeds in revealing the most relevant features of a network for realistic sample sizes. We apply our proposed method to estimate the network of depression and anxiety symptoms from symptom scores of 1108 subjects. Possible extensions of the model are discussed.
The research on user behavior evaluation method for network state
Zhang, Chengyuan; Xu, Haishui
2017-08-01
This paper proposes a method of user behavior evaluation based on network state, building on the correlation between user behavior and network running state. Drawing on analysis and evaluation methods from other fields of study, we introduce the theory and tools of data mining. Using the network status information provided by the trusted network view, the user behavior data and the network state data are analysed. Finally, we construct the user behavior evaluation index and weight; on this basis, the degree of influence of specific behaviors of different users on changes in network running state can be accurately quantified, providing a basis for user behavior control decisions.
Evolutionary method for finding communities in bipartite networks
Zhan, Weihua; Zhang, Zhongzhi; Guan, Jihong; Zhou, Shuigeng
2011-06-01
An important step in unveiling the relation between network structure and dynamics defined on networks is to detect communities, and numerous methods have been developed separately to identify community structure in different classes of networks, such as unipartite networks, bipartite networks, and directed networks. Here, we show that the finding of communities in such networks can be unified in a general framework—detection of community structure in bipartite networks. Moreover, we propose an evolutionary method for efficiently identifying communities in bipartite networks. To this end, we show that both unipartite and directed networks can be represented as bipartite networks, and their modularity is completely consistent with that for bipartite networks, the detection of modular structure on which can be reformulated as modularity maximization. To optimize the bipartite modularity, we develop a modified adaptive genetic algorithm (MAGA), which is shown to be especially efficient for community structure detection. The high efficiency of the MAGA is based on the following three improvements we make. First, we introduce a different measure for the informativeness of a locus instead of the standard deviation, which can exactly determine which loci mutate. This measure is the bias between the distribution of a locus over the current population and the uniform distribution of the locus, i.e., the Kullback-Leibler divergence between them. Second, we develop a reassignment technique for differentiating the informative state a locus has attained from the random state in the initial phase. Third, we present a modified mutation rule which by incorporating related operations can guarantee the convergence of the MAGA to the global optimum and can speed up the convergence process. Experimental results show that the MAGA outperforms existing methods in terms of modularity for both bipartite and unipartite networks.
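The informativeness measure mentioned in this abstract, the Kullback-Leibler divergence between the distribution of a locus over the current population and the uniform distribution, can be illustrated with a short sketch. This is not the MAGA implementation: the function name, the direction of the divergence (observed relative to uniform), and the example populations are assumptions made for illustration.

```python
from collections import Counter
from math import log


def locus_informativeness(values, num_states):
    """KL divergence D(P || U) of a locus's observed distribution from uniform.

    values:     the state of this locus in every individual of the population.
    num_states: number of states the locus can take (e.g. candidate communities).
    """
    n = len(values)
    counts = Counter(values)
    # D(P || U) = sum_s p_s * log(p_s / (1 / num_states)); absent states add 0.
    return sum((c / n) * log((c / n) * num_states) for c in counts.values())


# Hypothetical populations of four individuals and four possible states.
undecided = locus_informativeness([0, 1, 2, 3], num_states=4)
converged = locus_informativeness([2, 2, 2, 2], num_states=4)
```

A locus whose values are spread evenly over the population scores 0, while a locus on which the whole population agrees reaches the maximum, `log(num_states)`.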
Bae, Jonghoon; Cha, Young-Jae; Lee, Hyungsuk; Lee, Boyun; Baek, Sojung; Choi, Semin; Jang, Dayk
2017-01-01
This study examines whether the way that a person makes inferences about unknown events is associated with his or her social relations, more precisely, those characterized by ego network density that reflects the structure of a person's immediate social relation. From the analysis of individual predictions over the Go match between AlphaGo and Sedol Lee in March 2016 in Seoul, Korea, this study shows that the low-density group scored higher than the high-density group in the accuracy of the p...
Identifying the multiple dysregulated oncoproteins that contribute to tumorigenesis in a given patient is crucial for developing personalized treatment plans. However, accurate inference of aberrant protein activity in biological samples is still challenging as genetic alterations are only partially predictive and direct measurements of protein activity are generally not feasible.
Reduction Method for Active Distribution Networks
DEFF Research Database (Denmark)
Raboni, Pietro; Chen, Zhe
2013-01-01
On-line security assessment is traditionally performed by Transmission System Operators at the transmission level, ignoring the effective response of distributed generators and small loads. On the other hand the required computation time and amount of real time data for including Distribution Net... by comparing the results obtained in PSCAD® with the detailed network model and with the reduced one. Moreover the control schemes of a wind turbine and a photovoltaic plant included in the detailed network model are described.
Classification Method in Integrated Information Network Using Vector Image Comparison
Directory of Open Access Journals (Sweden)
Zhou Yuan
2014-05-01
Wireless Integrated Information Network (WMN) consists of integrated information that can get data from its surroundings, such as images and voice. Transmitting this information requires substantial resources, which decreases the service time of the network. In this paper we present a Classification Approach based on Vector Image Comparison (VIC) for WMN that improves the service time of the network. Methods for sub-region selection and conversion are also proposed.
Spectral Methods for Immunization of Large Networks
Directory of Open Access Journals (Sweden)
Muhammad Ahmad
2017-11-01
Given a network of nodes, minimizing the spread of a contagion using a limited budget is a well-studied problem with applications in network security, viral marketing, social networks, and public health. In real graphs, a virus may infect a node, which in turn infects its neighbour nodes, and this may trigger an epidemic in the whole graph. The goal thus is to select the best k nodes (budget constraint) that are immunized (vaccinated, screened, filtered) so that the remaining graph is less prone to the epidemic. It is known that the problem is, in all practical models, computationally intractable even for moderate sized graphs. In this paper we employ ideas from spectral graph theory to define relevance and importance of nodes. Using novel graph theoretic techniques, we then design an efficient approximation algorithm to immunize the graph. Theoretical guarantees on the running time of our algorithm show that it is more efficient than any other known solution in the literature. We test the performance of our algorithm on several real world graphs. Experiments show that our algorithm scales well for large graphs and outperforms state of the art algorithms both in quality (containment of epidemic) and efficiency (runtime and space complexity).
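To make the idea of spectral node importance concrete, here is a deliberately simple sketch that ranks nodes by their entry in the leading eigenvector of the adjacency matrix and immunizes the top k. This is a generic baseline heuristic chosen for illustration, not the approximation algorithm of the paper, whose details the abstract does not give. The function names and the toy graph are hypothetical, and the graph is assumed to be undirected, connected, and non-empty.

```python
def leading_eigenvector(adj, iters=200):
    """Power iteration on A + I; the shift avoids oscillation on bipartite graphs."""
    x = {v: 1.0 for v in adj}
    for _ in range(iters):
        y = {v: x[v] + sum(x[u] for u in adj[v]) for v in adj}
        norm = sum(val * val for val in y.values()) ** 0.5
        x = {v: val / norm for v, val in y.items()}
    return x


def pick_nodes_to_immunize(adj, k):
    """Return the k nodes with the largest leading-eigenvector scores."""
    scores = leading_eigenvector(adj)
    return sorted(adj, key=lambda v: scores[v], reverse=True)[:k]


# Hypothetical star graph: one hub connected to four leaves.
star = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub"], "b": ["hub"], "c": ["hub"], "d": ["hub"],
}
chosen = pick_nodes_to_immunize(star, 1)
```

On the star graph the hub is selected, since removing it disconnects every leaf. Stronger methods re-evaluate scores after each removal or target the drop in the largest eigenvalue directly.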
Semigroup methods for evolution equations on networks
Mugnolo, Delio
2014-01-01
This concise text is based on a series of lectures held only a few years ago and originally intended as an introduction to known results on linear hyperbolic and parabolic equations. Yet the topic of differential equations on graphs, ramified spaces, and more general network-like objects has recently gained significant momentum and, well beyond the confines of mathematics, there is a lively interdisciplinary discourse on all aspects of so-called complex networks. Such network-like structures can be found in virtually all branches of science, engineering and the humanities, and future research thus calls for solid theoretical foundations. This book is specifically devoted to the study of evolution equations – i.e., of time-dependent differential equations such as the heat equation, the wave equation, or the Schrödinger equation (quantum graphs) – bearing in mind that the majority of the literature in the last ten years on the subject of differential equations of graphs has been devoted to ellip...
Diagrammatic perturbation methods in networks and sports ranking combinatorics
Park, Juyong
2010-04-01
Analytic and computational tools developed in statistical physics are being increasingly applied to the study of complex networks. Here we present recent developments in the diagrammatic perturbation methods for the exponential random graph models, and apply them to the combinatoric problem of determining the ranking of nodes in directed networks that represent pairwise competitions.
Directory of Open Access Journals (Sweden)
Alaoui Youssef Lamrani
2017-12-01
Managing operational risk efficiently is a critical factor for microfinance institutions (MFIs) to achieve a financial and social return. The purpose of this paper is to identify, assess and prioritize the root causes of failure within the microfinance lending process (MLP), especially in Moroccan microfinance institutions. Considering the limitations of the traditional failure mode and effect analysis (FMEA) method in assessing and classifying risks, the methodology adopted in this study focuses on developing a fuzzy logic inference system (FLIS) based on FMEA. This approach can take into account the subjectivity of risk indicators and the insufficiency of statistical data. The results show that the Moroccan MFIs need to focus more on customer relationship management and give more importance to their staff training, to client screening, and to their business analysis.
Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.
2011-01-01
Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. © 2011 by the authors; licensee MDPI, Basel, Switzerland.
Łęski, Szymon; Pettersen, Klas H; Tunstall, Beth; Einevoll, Gaute T; Gigg, John; Wójcik, Daniel K
2011-12-01
The recent development of large multielectrode recording arrays has made it affordable for an increasing number of laboratories to record from multiple brain regions simultaneously. The development of analytical tools for array data, however, lags behind these technological advances in hardware. In this paper, we present a method based on forward modeling for estimating current source density from electrophysiological signals recorded on a two-dimensional grid using multi-electrode rectangular arrays. This new method, which we call two-dimensional inverse Current Source Density (iCSD 2D), is based upon and extends our previous one- and three-dimensional techniques. We test several variants of our method, both on surrogate data generated from a collection of Gaussian sources, and on model data from a population of layer 5 neocortical pyramidal neurons. We also apply the method to experimental data from the rat subiculum. The main advantages of the proposed method are the explicit specification of its assumptions, the possibility to include system-specific information as it becomes available, the ability to estimate CSD at the grid boundaries, and lower reconstruction errors when compared to the traditional approach. These features make iCSD 2D a substantial improvement over the approaches used so far and a powerful new tool for the analysis of multielectrode array data. We also provide a free GUI-based MATLAB toolbox to analyze and visualize our test data as well as user datasets.
Comparison of two methods for inferring total columnar ozone amount and aerosol optical depth
Martinez-Lozano, Jose A.; Utrillas, M. P.; Tena, Fernando; Cachorro, Victoria E.
1995-12-01
Mean daily values of the total atmospheric optical depth have been obtained from measurements of spectral solar irradiance at ground level in Valencia, Spain. These measurements were taken during ten days in the years 1993 and 1994. The total columnar ozone amount and aerosol optical depths have been calculated using both the King and Byrne method and the Flittner et al. method. The results obtained show that these algorithms lead to large errors if they are employed to determine instantaneous values of total ozone content. If they are used to calculate mean daily values, both methods give similar results for both the total ozone content and the aerosol optical depth, with quite acceptable errors. Considering the errors introduced by either of the two methods, King's algorithm leads to higher imprecision in the aerosol optical depth determinations. This imprecision is particularly significant when the curve of the aerosol optical depth as a function of wavelength differs from the exponential law proposed by Angstrom.
A Novel Circular-Array Method to Infer Rayleigh-to-Love Power Partition Ratios Using Ambient Noise
Tada, T.; Cho, I.; Shinozaki, Y.
2009-12-01
The spatial autocorrelation (SPAC) method, a popular technique of ambient noise (microtremor) exploration that employs circular arrays, provides the possibility to simultaneously infer (1) phase velocities of Rayleigh waves (cR), (2) phase velocities of Love waves (cL), and (3) ratios of power partition between Rayleigh and Love waves (γ) using three-component records of ambient noise (Okada and Matsushima, 1989; Ferrazzini et al., 1991). In doing so, a nonlinear set of simultaneous equations has to be solved for three unknown parameters, so that the solution process can be fairly complicated. We have developed, by expanding the SPAC method, a novel technique that allows one to infer cL and γ by simple inversion of an observational equation, thereby obviating the need to solve simultaneous equations (Tada et al., 2009, BSSA October issue). Just like in the case of the SPAC method, records of ambient noise around a circle and at its center are all that is required as the input. Two-component horizontal-motion records suffice for the estimation of cL, whereas vertical-motion and one-component horizontal-motion records are necessary for the estimation of γ. How cL can be inferred using real data from the field is illustrated in our aforementioned paper, so in the present talk we focus on field illustrations of the γ estimation method. We analyzed real ambient noise data from site KSKB (Kasukabe), located in the northern suburbs of the Tokyo megalopolis (see Tada et al. [2009] for details). For data analysis, we used BIDO, a software package which we have developed on our own. BIDO is a versatile analysis tool that incorporates not only Tada et al.'s (2009) new methods, but also the traditional SPAC method and the whole range of new circular-array analysis methods which we have developed so far (Cho et al., 2006, GJI; Cho et al., 2006, JGR; Tada et al., 2007). We are offering access to BIDO and its user's manual on our URL (cited below; user registration solicited
Energy Technology Data Exchange (ETDEWEB)
Zhang, Guannan [ORNL]; Webster, Clayton G [ORNL]; Gunzburger, Max D [ORNL]
2012-09-01
Although Bayesian analysis has become vital to the quantification of prediction uncertainty in groundwater modeling, its application has been hindered due to the computational cost associated with numerous model executions needed for exploring the posterior probability density function (PPDF) of model parameters. This is particularly the case when the PPDF is estimated using Markov Chain Monte Carlo (MCMC) sampling. In this study, we develop a new approach that improves computational efficiency of Bayesian inference by constructing a surrogate system based on an adaptive sparse-grid high-order stochastic collocation (aSG-hSC) method. Unlike previous works using first-order hierarchical basis, we utilize a compactly supported higher-order hierarchical basis to construct the surrogate system, resulting in a significant reduction in the number of computational simulations required. In addition, we use hierarchical surplus as an error indicator to determine adaptive sparse grids. This allows local refinement in the uncertain domain and/or anisotropic detection with respect to the random model parameters, which further improves computational efficiency. Finally, we incorporate a global optimization technique and propose an iterative algorithm for building the surrogate system for the PPDF with multiple significant modes. Once the surrogate system is determined, the PPDF can be evaluated by sampling the surrogate system directly with very little computational cost. The developed method is evaluated first using a simple analytical density function with multiple modes and then using two synthetic groundwater reactive transport models. The groundwater models represent different levels of complexity; the first example involves coupled linear reactions and the second example simulates nonlinear uranium surface complexation. The results show that the aSG-hSC is an effective and efficient tool for Bayesian inference in groundwater modeling in comparison with conventional
Eide, Eric R.; Showalter, Mark H.
2012-01-01
Professors Richard J. Murnane and John B. Willett set out to capitalize on recent developments in education data and methodology by attempting to answer the following questions: How can new methods and data be applied most effectively in educational and social science research? What kinds of research designs are most appropriate? What kinds of…
A new method to infer vegetation boundary movement from 'snapshot' data
Eppinga, M.B.; Pucko, C.A.; Baudena, M.; Beckage, B.; Molofsky, J.
2012-01-01
Global change may induce shifts in plant community distributions at multiple spatial scales. At the ecosystem scale, such shifts may result in movement of ecotones or vegetation boundaries. Most indicators for ecosystem change require timeseries data, but here a new method is proposed enabling
Quantitative methods for ecological network analysis.
Ulanowicz, Robert E
2004-12-01
The analysis of networks of ecological trophic transfers is a useful complement to simulation modeling in the quest for understanding whole-ecosystem dynamics. Trophic networks can be studied in quantitative and systematic fashion at several levels. Indirect relationships between any two individual taxa in an ecosystem, which often differ in either nature or magnitude from their direct influences, can be assayed using techniques from linear algebra. The same mathematics can also be employed to ascertain where along the trophic continuum any individual taxon is operating, or to map the web of connections into a virtual linear chain that summarizes trophodynamic performance by the system. Backtracking algorithms with pruning have been written which identify pathways for the recycle of materials and energy within the system. The pattern of such cycling often reveals modes of control or types of functions exhibited by various groups of taxa. The performance of the system as a whole at processing material and energy can be quantified using information theory. In particular, the complexity of process interactions can be parsed into separate terms that distinguish organized, efficient performance from the capacity for further development and recovery from disturbance. Finally, the sensitivities of the information-theoretic system indices appear to identify the dynamical bottlenecks in ecosystem functioning.
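As a rough illustration of the information-theoretic step mentioned in this abstract, the sketch below computes the average mutual information (AMI) of a matrix of flows between compartments. The abstract gives no formulas, so the choice of AMI as the index, the function name, and the toy flow matrices are all assumptions made here for illustration.

```python
import math

def average_mutual_information(flows):
    """AMI (in bits) of a square flow matrix; flows[i][j] is the flow from i to j.

    Illustrative sketch only: the index and its form are assumed, not taken
    from the abstract.
    """
    total = sum(sum(row) for row in flows)
    out_sums = [sum(row) for row in flows]        # total outflow of each compartment
    in_sums = [sum(col) for col in zip(*flows)]   # total inflow of each compartment
    ami = 0.0
    for i, row in enumerate(flows):
        for j, t in enumerate(row):
            if t > 0:  # zero flows contribute nothing
                ami += (t / total) * math.log2(t * total / (out_sums[i] * in_sums[j]))
    return ami

# Hypothetical toy networks: a strict 3-compartment cycle and a fully uniform web.
cycle = [[0, 10, 0], [0, 0, 10], [10, 0, 0]]
uniform = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(average_mutual_information(cycle), average_mutual_information(uniform))
```

In this toy case the strict cycle, where every flow is fully constrained, scores higher than the uniform web, where knowing the source says nothing about the destination.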
Decision support systems and methods for complex networks
Huang, Zhenyu [Richland, WA]; Wong, Pak Chung [Richland, WA]; Ma, Jian [Richland, WA]; Mackey, Patrick S [Richland, WA]; Chen, Yousu [Richland, WA]; Schneider, Kevin P [Seattle, WA]
2012-02-28
Methods and systems for automated decision support in analyzing operation data from a complex network. Embodiments of the present invention utilize these algorithms and techniques not only to characterize the past and present condition of a complex network, but also to predict future conditions to help operators anticipate deteriorating and/or problem situations. In particular, embodiments of the present invention characterize network conditions from operation data using a state estimator. Contingency scenarios can then be generated based on those network conditions. For at least a portion of all of the contingency scenarios, risk indices are determined that describe the potential impact of each of those scenarios. Contingency scenarios with risk indices are presented visually as graphical representations in the context of a visual representation of the complex network. Analysis of the historical risk indices based on the graphical representations can then provide trends that allow for prediction of future network conditions.
Network Forensics Method Based on Evidence Graph and Vulnerability Reasoning
Directory of Open Access Journals (Sweden)
Jingsha He
2016-11-01
As the Internet becomes larger in scale, more complex in structure and more diversified in traffic, the number of crimes that utilize computer technologies is also increasing at a phenomenal rate. To react to the increasing number of computer crimes, the field of computer and network forensics has emerged. The general purpose of network forensics is to find malicious users or activities by gathering and dissecting firm evidence about computer crimes, e.g., hacking. However, due to the large volume of Internet traffic, not all the traffic captured and analyzed is valuable for investigation or confirmation. After analyzing some existing network forensics methods to identify common shortcomings, we propose in this paper a new network forensics method that combines network vulnerability information with a network evidence graph. In our proposed method, we use vulnerability evidence and a reasoning algorithm to reconstruct attack scenarios and then backtrack the network packets to find the original evidence. Our proposed method can reconstruct attack scenarios effectively and then identify multi-staged attacks through evidential reasoning. Results of experiments show that the evidence graph constructed using our method is more complete and credible while possessing the reasoning capability.
Semantic Security Methods for Software-Defined Networks
Directory of Open Access Journals (Sweden)
Ekaterina Ju. Antoshina
2017-01-01
Software-defined networking is a promising technology for constructing communication networks in which network management is performed by software that configures network devices. This contrasts with the traditional approach, in which network behaviour is updated by manually uploading configurations to the devices under control. The software controller allows dynamic routing configuration inside the network depending on the quality of service. However, there must be a proof that every network flow is secure. For example, a security policy can be defined as follows: confidential nodes cannot send data to the public segment of the network. The paper shows how this problem can be solved by using a semantic security model. We propose a method that allows us to construct semantics that capture the security properties the network must satisfy. This involves a specification that states the allowed and forbidden network flows. The specification is then modeled as a decision tree that may be reduced. We use the decision tree to construct semantics that capture the security requirements. The semantics can be implemented as a module of the controller software, so the correctness of the control plane of the network can be ensured on-the-fly.
A New Method to Infer Causal Phenotype Networks Using QTL and Phenotypic Information
Wang, H.; Eeuwijk, van F.
2014-01-01
In the context of genetics and breeding research on multiple phenotypic traits, reconstructing the directional or causal structure between phenotypic traits is a prerequisite for quantifying the effects of genetic interventions on the traits. Current approaches mainly exploit the genetic effects at
Analysis on the reconstruction accuracy of the Fitch method for inferring ancestral states
Directory of Open Access Journals (Sweden)
Grünewald Stefan
2011-01-01
Background: As one of the most widely used parsimony methods for ancestral reconstruction, the Fitch method minimizes the total number of hypothetical substitutions along all branches of a tree to explain the evolution of a character. Due to the extensive usage of this method, it has become a scientific endeavor in recent years to study the reconstruction accuracies of the Fitch method. However, most studies are restricted to 2-state evolutionary models, and a study for higher-state models is needed since DNA sequences take the format of 4-state series and protein sequences even have 20 states. Results: In this paper, the ambiguous and unambiguous reconstruction accuracies of the Fitch method are studied for N-state evolutionary models. Given an arbitrary phylogenetic tree, a recurrence system is first presented to calculate the two accuracies iteratively. As the complete binary tree and the comb-shaped tree are the two extremal evolutionary tree topologies according to balance, we focus on the reconstruction accuracies on these two topologies and analyze their asymptotic properties. Then, 1000 Yule trees with 1024 leaves are generated and analyzed to simulate real evolutionary scenarios. It is known that more taxa do not necessarily increase the reconstruction accuracies under 2-state models. The result under N-state models is also tested. Conclusions: In a large tree with many leaves, the reconstruction accuracies of using all taxa are sometimes less than those of using a leaf subset under N-state models. For complete binary trees, there always exists an equilibrium interval [a, b] of conservation probability, in which the limiting ambiguous reconstruction accuracy equals the probability of randomly picking a state. The value b decreases with the increase of the number of states, and it seems to converge. When the conservation probability is greater than b, the reconstruction accuracies of the Fitch method increase rapidly. The reconstruction
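For readers unfamiliar with the Fitch method analysed in the record above, a minimal sketch of its bottom-up pass on a rooted binary tree follows. The nested-tuple tree encoding, the function name, and the leaf labels are assumptions chosen here for illustration; the sketch does not reproduce the paper's recurrence system for reconstruction accuracy.

```python
def fitch(tree, leaf_states):
    """Bottom-up Fitch pass.

    tree: a leaf name (str) or a pair (left_subtree, right_subtree).
    leaf_states: dict mapping leaf name to its observed character state.
    Returns (root_state_set, minimum_number_of_substitutions).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra substitution needed.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1

# Hypothetical 4-leaf tree with DNA states.
tree = (("a", "b"), ("c", "d"))
unambiguous = fitch(tree, {"a": "A", "b": "C", "c": "A", "d": "A"})
ambiguous = fitch(tree, {"a": "A", "b": "C", "c": "C", "d": "A"})
print(unambiguous, ambiguous)
```

A root set with a single state corresponds to an unambiguous reconstruction; a root set with several states corresponds to an ambiguous one, from which a state would have to be picked.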
Wang, Mao; Maeda, Yoichiro; TAKAHASHI, Yasutake
2014-01-01
Intention recognition can use multiple factors as inputs, such as gestures, face images and eye gaze position. On the other hand, eye tracking technology, with its special advantages in Human-Computer Interaction (HCI), can be utilized to develop assistant systems for people with mobility difficulties. In this paper, we propose using estimated gaze position as the input to fuzzy inference to achieve intention recognition based on object recognition, and we construct an assistant system...
A model reduction method for biochemical reaction networks
National Research Council Canada - National Science Library
Rao, Shodhan; van der Schaft, Arjan; van Eunen, Karen; Bakker, Barbara; Jayawardhana, Bayu
2014-01-01
Background: In this paper we propose a model reduction method for biochemical reaction networks governed by a variety of reversible and irreversible enzyme kinetic rate laws, including reversible Michaelis-Menten and Hill kinetics...
Chen, Lei; Pan, Hongying; Zhang, Yu-Hang; Feng, Kaiyan; Kong, XiangYin; Huang, Tao; Cai, Yu-Dong
2017-10-02
Bone and dental diseases are serious public health problems. Most current clinical treatments for these diseases can produce side effects. Regeneration is a promising therapy for bone and dental diseases, yielding natural tissue recovery with few side effects. Because soft tissues inside the bone and dentin are densely populated with nerves and vessels, the study of bone and dentin regeneration should also consider the co-regeneration of nerves and vessels. In this study, a network-based method to identify co-regeneration genes for bone, dentin, nerve and vessel was constructed based on an extensive network of protein-protein interactions. Three procedures were applied in the network-based method. The first procedure, searching, sought the shortest paths connecting regeneration genes of one tissue type with regeneration genes of other tissues, thereby extracting possible co-regeneration genes. The second procedure, testing, employed a permutation test to evaluate whether possible genes were false discoveries; these genes were excluded by the testing procedure. The last procedure, screening, employed two rules, the betweenness ratio rule and interaction score rule, to select the most essential genes. A total of seventeen genes were inferred by the method, which were deemed to contribute to co-regeneration of at least two tissues. All these seventeen genes were extensively discussed to validate the utility of the method.
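The searching procedure described in this abstract can be sketched as follows, assuming an unweighted interaction network stored as an adjacency dictionary. The graph, the gene names, and both function names are hypothetical; the original work may weight edges by interaction score and enumerate all shortest paths, whereas this sketch follows only one shortest path per gene pair.

```python
from collections import deque

def shortest_path(graph, source, target):
    """Return one shortest path from source to target (BFS), or None if unreachable."""
    if source == target:
        return [source]
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in parent:
                parent[neighbour] = node
                if neighbour == target:
                    # Walk back through the parents to rebuild the path.
                    path = [neighbour]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1]
                queue.append(neighbour)
    return None

def candidate_genes(graph, genes_a, genes_b):
    """Collect interior nodes of shortest paths linking two sets of known genes."""
    known = set(genes_a) | set(genes_b)
    found = set()
    for a in genes_a:
        for b in genes_b:
            path = shortest_path(graph, a, b)
            if path:
                found.update(n for n in path[1:-1] if n not in known)
    return found

# Hypothetical toy network: A1 is a known gene for one tissue, B1 for another.
ppi = {"A1": ["X"], "X": ["A1", "B1", "Y"], "B1": ["X"], "Y": ["X"]}
print(candidate_genes(ppi, ["A1"], ["B1"]))
```

The candidates returned here would then go to the testing and screening procedures, which are not sketched.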
Incorrect likelihood methods were used to infer scaling laws of marine predator search behaviour.
Edwards, Andrew M; Freeman, Mervyn P; Breed, Greg A; Jonsen, Ian D
2012-01-01
Ecologists are collecting extensive data concerning movements of animals in marine ecosystems. Such data need to be analysed with valid statistical methods to yield meaningful conclusions. We demonstrate methodological issues in two recent studies that reached similar conclusions concerning movements of marine animals (Nature 451:1098; Science 332:1551). The first study analysed vertical movement data to conclude that diverse marine predators (Atlantic cod, basking sharks, bigeye tuna, leatherback turtles and Magellanic penguins) exhibited "Lévy-walk-like behaviour", close to a hypothesised optimal foraging strategy. By reproducing the original results for the bigeye tuna data, we show that the likelihood of tested models was calculated from residuals of regression fits (an incorrect method), rather than from the likelihood equations of the actual probability distributions being tested. This resulted in erroneous Akaike Information Criteria, and the testing of models that do not correspond to valid probability distributions. We demonstrate how this led to overwhelming support for a model that has no biological justification and that is statistically spurious because its probability density function goes negative. Re-analysis of the bigeye tuna data, using standard likelihood methods, overturns the original result and conclusion for that data set. The second study observed Lévy walk movement patterns by mussels. We demonstrate several issues concerning the likelihood calculations (including the aforementioned residuals issue). Re-analysis of the data rejects the original Lévy walk conclusion. We consequently question the claimed existence of scaling laws of the search behaviour of marine predators and mussels, since such conclusions were reached using incorrect methods. We discourage the suggested potential use of "Lévy-like walks" when modelling consequences of fishing and climate change, and caution that any resulting advice to managers of marine ecosystems
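To make concrete what computing a likelihood from the probability distribution itself involves, here is a minimal sketch for an unbounded power law with density f(x) = (mu - 1) * xmin^(mu - 1) * x^(-mu) for x >= xmin. The choice of this particular distribution, the function name, and the toy data are assumptions for illustration and are not taken from the studies discussed.

```python
import math

def fit_power_law(data, xmin):
    """Maximum-likelihood fit of the exponent mu of an unbounded power law.

    Returns (mu_hat, log_likelihood, aic). Values below xmin are ignored.
    Assumes at least one value is strictly greater than xmin.
    """
    xs = [x for x in data if x >= xmin]
    n = len(xs)
    log_ratio_sum = sum(math.log(x / xmin) for x in xs)
    mu = 1.0 + n / log_ratio_sum  # closed-form maximum-likelihood estimate
    log_lik = (n * math.log(mu - 1.0)
               + n * (mu - 1.0) * math.log(xmin)
               - mu * sum(math.log(x) for x in xs))
    aic = 2 * 1 - 2 * log_lik  # one fitted parameter (mu)
    return mu, log_lik, aic

# Toy data, chosen only so the arithmetic is easy to follow.
mu_hat, log_lik, aic = fit_power_law([math.e] * 4, xmin=1.0)
print(mu_hat, log_lik, aic)
```

The AIC here comes from the log-likelihood of the distribution being tested, rather than from the residuals of a regression fit.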
Incorrect likelihood methods were used to infer scaling laws of marine predator search behaviour.
Directory of Open Access Journals (Sweden)
Andrew M Edwards
BACKGROUND: Ecologists are collecting extensive data concerning movements of animals in marine ecosystems. Such data need to be analysed with valid statistical methods to yield meaningful conclusions. PRINCIPAL FINDINGS: We demonstrate methodological issues in two recent studies that reached similar conclusions concerning movements of marine animals (Nature 451:1098; Science 332:1551). The first study analysed vertical movement data to conclude that diverse marine predators (Atlantic cod, basking sharks, bigeye tuna, leatherback turtles and Magellanic penguins) exhibited "Lévy-walk-like behaviour", close to a hypothesised optimal foraging strategy. By reproducing the original results for the bigeye tuna data, we show that the likelihood of tested models was calculated from residuals of regression fits (an incorrect method), rather than from the likelihood equations of the actual probability distributions being tested. This resulted in erroneous Akaike Information Criteria, and the testing of models that do not correspond to valid probability distributions. We demonstrate how this led to overwhelming support for a model that has no biological justification and that is statistically spurious because its probability density function goes negative. Re-analysis of the bigeye tuna data, using standard likelihood methods, overturns the original result and conclusion for that data set. The second study observed Lévy walk movement patterns by mussels. We demonstrate several issues concerning the likelihood calculations (including the aforementioned residuals issue). Re-analysis of the data rejects the original Lévy walk conclusion. CONCLUSIONS: We consequently question the claimed existence of scaling laws of the search behaviour of marine predators and mussels, since such conclusions were reached using incorrect methods. We discourage the suggested potential use of "Lévy-like walks" when modelling consequences of fishing and climate change, and caution
Smoothed Particle Inference: A Kilo-Parametric Method for X-ray Galaxy Cluster Modeling
Energy Technology Data Exchange (ETDEWEB)
Peterson, John R.; Marshall, P.J.; /KIPAC, Menlo Park; Andersson, K.; /Stockholm U. /SLAC
2005-08-05
We propose an ambitious new method that models the intracluster medium in clusters of galaxies as a set of X-ray emitting smoothed particles of plasma. Each smoothed particle is described by a handful of parameters including temperature, location, size, and elemental abundances. Hundreds to thousands of these particles are used to construct a model cluster of galaxies, with the appropriate complexity estimated from the data quality. This model is then compared iteratively with X-ray data in the form of adaptively binned photon lists via a two-sample likelihood statistic and iterated via Markov Chain Monte Carlo. The complex cluster model is propagated through the X-ray instrument response using direct sampling Monte Carlo methods. Using this approach the method can reproduce many of the features observed in the X-ray emission in a less assumption-dependent way than traditional analyses, and it allows for a more detailed characterization of the density, temperature, and metal abundance structure of clusters. Multi-instrument X-ray analyses and simultaneous X-ray, Sunyaev-Zeldovich (SZ), and lensing analyses are a straightforward extension of this methodology. Significant challenges still exist in understanding the degeneracy in these models and the statistical noise induced by the complexity of the models.
PHYLOViZ: phylogenetic inference and data visualization for sequence based typing methods
Directory of Open Access Journals (Sweden)
Francisco Alexandre P
2012-05-01
Background: With the d