Xu, Zhiqiang
2017-02-16
Attributed graph clustering, also known as community detection on attributed graphs, attracts much interests recently due to the ubiquity of attributed graphs in real life. Many existing algorithms have been proposed for this problem, which are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed, that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approaches for attributed graph clustering with automatic model selection. The first approach is a popular Bayesian nonparametric method, while the second approach is an asymptotic method based on a recently proposed model selection criterion, factorized information criterion. Experimental results on both synthetic and real datasets demonstrate that our approaches for attributed graph clustering with automatic model selection significantly outperform the state-of-the-art algorithm.
Xu, Zhiqiang; Cheng, James; Xiao, Xiaokui; Fujimaki, Ryohei; Muraoka, Yusuke
2017-01-01
Attributed graph clustering, also known as community detection on attributed graphs, attracts much interests recently due to the ubiquity of attributed graphs in real life. Many existing algorithms have been proposed for this problem, which are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed, that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approaches for attributed graph clustering with automatic model selection. The first approach is a popular Bayesian nonparametric method, while the second approach is an asymptotic method based on a recently proposed model selection criterion, factorized information criterion. Experimental results on both synthetic and real datasets demonstrate that our approaches for attributed graph clustering with automatic model selection significantly outperform the state-of-the-art algorithm.
Energy Technology Data Exchange (ETDEWEB)
Winlaw, Manda [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); De Sterck, Hans [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Sanders, Geoffrey [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
2015-10-26
In very simple terms a network can be de ned as a collection of points joined together by lines. Thus, networks can be used to represent connections between entities in a wide variety of elds including engi- neering, science, medicine, and sociology. Many large real-world networks share a surprising number of properties, leading to a strong interest in model development research and techniques for building synthetic networks have been developed, that capture these similarities and replicate real-world graphs. Modeling these real-world networks serves two purposes. First, building models that mimic the patterns and prop- erties of real networks helps to understand the implications of these patterns and helps determine which patterns are important. If we develop a generative process to synthesize real networks we can also examine which growth processes are plausible and which are not. Secondly, high-quality, large-scale network data is often not available, because of economic, legal, technological, or other obstacles [7]. Thus, there are many instances where the systems of interest cannot be represented by a single exemplar network. As one example, consider the eld of cybersecurity, where systems require testing across diverse threat scenarios and validation across diverse network structures. In these cases, where there is no single exemplar network, the systems must instead be modeled as a collection of networks in which the variation among them may be just as important as their common features. By developing processes to build synthetic models, so-called graph generators, we can build synthetic networks that capture both the essential features of a system and realistic variability. Then we can use such synthetic graphs to perform tasks such as simulations, analysis, and decision making. We can also use synthetic graphs to performance test graph analysis algorithms, including clustering algorithms and anomaly detection algorithms.
A cluster algorithm for graphs
S. van Dongen
2000-01-01
textabstractA cluster algorithm for graphs called the emph{Markov Cluster algorithm (MCL~algorithm) is introduced. The algorithm provides basically an interface to an algebraic process defined on stochastic matrices, called the MCL~process. The graphs may be both weighted (with nonnegative weight)
CORECLUSTER: A Degeneracy Based Graph Clustering Framework
Giatsidis , Christos; Malliaros , Fragkiskos; Thilikos , Dimitrios M. ,; Vazirgiannis , Michalis
2014-01-01
International audience; Graph clustering or community detection constitutes an important task forinvestigating the internal structure of graphs, with a plethora of applications in several domains. Traditional tools for graph clustering, such asspectral methods, typically suffer from high time and space complexity. In thisarticle, we present \\textsc{CoreCluster}, an efficient graph clusteringframework based on the concept of graph degeneracy, that can be used along withany known graph clusteri...
A new cluster algorithm for graphs
S. van Dongen
1998-01-01
textabstractA new cluster algorithm for graphs called the emph{Markov Cluster algorithm ($MCL$ algorithm) is introduced. The graphs may be both weighted (with nonnegative weight) and directed. Let~$G$~be such a graph. The $MCL$ algorithm simulates flow in $G$ by first identifying $G$ in a
Graph coarsening and clustering on the GPU
Fagginger Auer, B.O.; Bisseling, R.H.
2013-01-01
Agglomerative clustering is an effective greedy way to quickly generate graph clusterings of high modularity in a small amount of time. In an effort to use the power offered by multi-core CPU and GPU hardware to solve the clustering problem, we introduce a fine-grained sharedmemory parallel graph
Bayesian Decision Theoretical Framework for Clustering
Chen, Mo
2011-01-01
In this thesis, we establish a novel probabilistic framework for the data clustering problem from the perspective of Bayesian decision theory. The Bayesian decision theory view justifies the important questions: what is a cluster and what a clustering algorithm should optimize. We prove that the spectral clustering (to be specific, the…
Performance criteria for graph clustering and Markov cluster experiments
S. van Dongen
2000-01-01
textabstractIn~[1] a cluster algorithm for graphs was introduced called the Markov cluster algorithm or MCL~algorithm. The algorithm is based on simulation of (stochastic) flow in graphs by means of alternation of two operators, expansion and inflation. The results in~[2] establish an intrinsic
Finding the optimal Bayesian network given a constraint graph
Directory of Open Access Journals (Sweden)
Jacob M. Schreiber
2017-07-01
Full Text Available Despite recent algorithmic improvements, learning the optimal structure of a Bayesian network from data is typically infeasible past a few dozen variables. Fortunately, domain knowledge can frequently be exploited to achieve dramatic computational savings, and in many cases domain knowledge can even make structure learning tractable. Several methods have previously been described for representing this type of structural prior knowledge, including global orderings, super-structures, and constraint rules. While super-structures and constraint rules are flexible in terms of what prior knowledge they can encode, they achieve savings in memory and computational time simply by avoiding considering invalid graphs. We introduce the concept of a “constraint graph” as an intuitive method for incorporating rich prior knowledge into the structure learning task. We describe how this graph can be used to reduce the memory cost and computational time required to find the optimal graph subject to the encoded constraints, beyond merely eliminating invalid graphs. In particular, we show that a constraint graph can break the structure learning task into independent subproblems even in the presence of cyclic prior knowledge. These subproblems are well suited to being solved in parallel on a single machine or distributed across many machines without excessive communication cost.
A local search for a graph clustering problem
Navrotskaya, Anna; Il'ev, Victor
2016-10-01
In the clustering problems one has to partition a given set of objects (a data set) into some subsets (called clusters) taking into consideration only similarity of the objects. One of most visual formalizations of clustering is graph clustering, that is grouping the vertices of a graph into clusters taking into consideration the edge structure of the graph whose vertices are objects and edges represent similarities between the objects. In the graph k-clustering problem the number of clusters does not exceed k and the goal is to minimize the number of edges between clusters and the number of missing edges within clusters. This problem is NP-hard for any k ≥ 2. We propose a polynomial time (2k-1)-approximation algorithm for graph k-clustering. Then we apply a local search procedure to the feasible solution found by this algorithm and hold experimental research of obtained heuristics.
Image interpolation via graph-based Bayesian label propagation.
Xianming Liu; Debin Zhao; Jiantao Zhou; Wen Gao; Huifang Sun
2014-03-01
In this paper, we propose a novel image interpolation algorithm via graph-based Bayesian label propagation. The basic idea is to first create a graph with known and unknown pixels as vertices and with edge weights encoding the similarity between vertices, then the problem of interpolation converts to how to effectively propagate the label information from known points to unknown ones. This process can be posed as a Bayesian inference, in which we try to combine the principles of local adaptation and global consistency to obtain accurate and robust estimation. Specially, our algorithm first constructs a set of local interpolation models, which predict the intensity labels of all image samples, and a loss term will be minimized to keep the predicted labels of the available low-resolution (LR) samples sufficiently close to the original ones. Then, all of the losses evaluated in local neighborhoods are accumulated together to measure the global consistency on all samples. Moreover, a graph-Laplacian-based manifold regularization term is incorporated to penalize the global smoothness of intensity labels, such smoothing can alleviate the insufficient training of the local models and make them more robust. Finally, we construct a unified objective function to combine together the global loss of the locally linear regression, square error of prediction bias on the available LR samples, and the manifold regularization term. It can be solved with a closed-form solution as a convex optimization problem. Experimental results demonstrate that the proposed method achieves competitive performance with the state-of-the-art image interpolation algorithms.
Chemical graph-theoretic cluster expansions
International Nuclear Information System (INIS)
Klein, D.J.
1986-01-01
A general computationally amenable chemico-graph-theoretic cluster expansion method is suggested as a paradigm for incorporation of chemical structure concepts in a systematic manner. The cluster expansion approach is presented in a formalism general enough to cover a variety of empirical, semiempirical, and even ab initio applications. Formally such approaches for the utilization of chemical structure-related concepts may be viewed as discrete analogues of Taylor series expansions. The efficacy of the chemical structure concepts then is simply bound up in the rate of convergence of the cluster expansions. In many empirical applications, e.g., boiling points, chromatographic separation coefficients, and biological activities, this rate of convergence has been observed to be quite rapid. More note will be made here of quantum chemical applications. Relations to questions concerning size extensivity of energies and size consistency of wave functions are addressed
Statistical mechanics of semi-supervised clustering in sparse graphs
International Nuclear Information System (INIS)
Ver Steeg, Greg; Galstyan, Aram; Allahverdyan, Armen E
2011-01-01
We theoretically study semi-supervised clustering in sparse graphs in the presence of pair-wise constraints on the cluster assignments of nodes. We focus on bi-cluster graphs and study the impact of semi-supervision for varying constraint density and overlap between the clusters. Recent results for unsupervised clustering in sparse graphs indicate that there is a critical ratio of within-cluster and between-cluster connectivities below which clusters cannot be recovered with better than random accuracy. The goal of this paper is to examine the impact of pair-wise constraints on the clustering accuracy. Our results suggest that the addition of constraints does not provide automatic improvement over the unsupervised case. When the density of the constraints is sufficiently small, their only impact is to shift the detection threshold while preserving the criticality. Conversely, if the density of (hard) constraints is above the percolation threshold, the criticality is suppressed and the detection threshold disappears
Graph-based clustering and data visualization algorithms
Vathy-Fogarassy, Ágnes
2013-01-01
This work presents a data visualization technique that combines graph-based topology representation and dimensionality reduction methods to visualize the intrinsic data structure in a low-dimensional vector space. The application of graphs in clustering and visualization has several advantages. A graph of important edges (where edges characterize relations and weights represent similarities or distances) provides a compact representation of the entire complex data set. This text describes clustering and visualization methods that are able to utilize information hidden in these graphs, based on
Bayesian Nonparametric Clustering for Positive Definite Matrices.
Cherian, Anoop; Morellas, Vassilios; Papanikolopoulos, Nikolaos
2016-05-01
Symmetric Positive Definite (SPD) matrices emerge as data descriptors in several applications of computer vision such as object tracking, texture recognition, and diffusion tensor imaging. Clustering these data matrices forms an integral part of these applications, for which soft-clustering algorithms (K-Means, expectation maximization, etc.) are generally used. As is well-known, these algorithms need the number of clusters to be specified, which is difficult when the dataset scales. To address this issue, we resort to the classical nonparametric Bayesian framework by modeling the data as a mixture model using the Dirichlet process (DP) prior. Since these matrices do not conform to the Euclidean geometry, rather belongs to a curved Riemannian manifold,existing DP models cannot be directly applied. Thus, in this paper, we propose a novel DP mixture model framework for SPD matrices. Using the log-determinant divergence as the underlying dissimilarity measure to compare these matrices, and further using the connection between this measure and the Wishart distribution, we derive a novel DPM model based on the Wishart-Inverse-Wishart conjugate pair. We apply this model to several applications in computer vision. Our experiments demonstrate that our model is scalable to the dataset size and at the same time achieves superior accuracy compared to several state-of-the-art parametric and nonparametric clustering algorithms.
GraphAlignment: Bayesian pairwise alignment of biological networks
Directory of Open Access Journals (Sweden)
Kolář Michal
2012-11-01
Full Text Available Abstract Background With increased experimental availability and accuracy of bio-molecular networks, tools for their comparative and evolutionary analysis are needed. A key component for such studies is the alignment of networks. Results We introduce the Bioconductor package GraphAlignment for pairwise alignment of bio-molecular networks. The alignment incorporates information both from network vertices and network edges and is based on an explicit evolutionary model, allowing inference of all scoring parameters directly from empirical data. We compare the performance of our algorithm to an alternative algorithm, Græmlin 2.0. On simulated data, GraphAlignment outperforms Græmlin 2.0 in several benchmarks except for computational complexity. When there is little or no noise in the data, GraphAlignment is slower than Græmlin 2.0. It is faster than Græmlin 2.0 when processing noisy data containing spurious vertex associations. Its typical case complexity grows approximately as O(N2.6. On empirical bacterial protein-protein interaction networks (PIN and gene co-expression networks, GraphAlignment outperforms Græmlin 2.0 with respect to coverage and specificity, albeit by a small margin. On large eukaryotic PIN, Græmlin 2.0 outperforms GraphAlignment. Conclusions The GraphAlignment algorithm is robust to spurious vertex associations, correctly resolves paralogs, and shows very good performance in identification of homologous vertices defined by high vertex and/or interaction similarity. The simplicity and generality of GraphAlignment edge scoring makes the algorithm an appropriate choice for global alignment of networks.
Cluster tails for critical power-law inhomogeneous random graphs
van der Hofstad, R.; Kliem, S.; van Leeuwaarden, J.S.H.
2018-01-01
Recently, the scaling limit of cluster sizes for critical inhomogeneous random graphs of rank-1 type having finite variance but infinite third moment degrees was obtained in Bhamidi et al. (Ann Probab 40:2299–2361, 2012). It was proved that when the degrees obey a power law with exponent τ∈ (3 , 4)
Locating sources within a dense sensor array using graph clustering
Gerstoft, P.; Riahi, N.
2017-12-01
We develop a model-free technique to identify weak sources within dense sensor arrays using graph clustering. No knowledge about the propagation medium is needed except that signal strengths decay to insignificant levels within a scale that is shorter than the aperture. We then reinterpret the spatial coherence matrix of a wave field as a matrix whose support is a connectivity matrix of a graph with sensors as vertices. In a dense network, well-separated sources induce clusters in this graph. The geographic spread of these clusters can serve to localize the sources. The support of the covariance matrix is estimated from limited-time data using a hypothesis test with a robust phase-only coherence test statistic combined with a physical distance criterion. The latter criterion ensures graph sparsity and thus prevents clusters from forming by chance. We verify the approach and quantify its reliability on a simulated dataset. The method is then applied to data from a dense 5200 element geophone array that blanketed of the city of Long Beach (CA). The analysis exposes a helicopter traversing the array and oil production facilities.
GraphAlignment: Bayesian pairwise alignment of biological networks
Czech Academy of Sciences Publication Activity Database
Kolář, Michal; Meier, J.; Mustonen, V.; Lässig, M.; Berg, J.
2012-01-01
Roč. 6, November 21 (2012) ISSN 1752-0509 Grant - others:Deutsche Forschungsgemeinschaft(DE) SFB 680; Deutsche Forschungsgemeinschaft(DE) SFB-TR12; Deutsche Forschungsgemeinschaft(DE) BE 2478/2-1 Institutional research plan: CEZ:AV0Z50520514 Keywords : Graph alignment * Biological networks * Parameter estimation * Bioconductor Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 2.982, year: 2012
Personalized PageRank Clustering: A graph clustering algorithm based on random walks
A. Tabrizi, Shayan; Shakery, Azadeh; Asadpour, Masoud; Abbasi, Maziar; Tavallaie, Mohammad Ali
2013-11-01
Graph clustering has been an essential part in many methods and thus its accuracy has a significant effect on many applications. In addition, exponential growth of real-world graphs such as social networks, biological networks and electrical circuits demands clustering algorithms with nearly-linear time and space complexity. In this paper we propose Personalized PageRank Clustering (PPC) that employs the inherent cluster exploratory property of random walks to reveal the clusters of a given graph. We combine random walks and modularity to precisely and efficiently reveal the clusters of a graph. PPC is a top-down algorithm so it can reveal inherent clusters of a graph more accurately than other nearly-linear approaches that are mainly bottom-up. It also gives a hierarchy of clusters that is useful in many applications. PPC has a linear time and space complexity and has been superior to most of the available clustering algorithms on many datasets. Furthermore, its top-down approach makes it a flexible solution for clustering problems with different requirements.
A cluster expansion approach to exponential random graph models
International Nuclear Information System (INIS)
Yin, Mei
2012-01-01
The exponential family of random graphs are among the most widely studied network models. We show that any exponential random graph model may alternatively be viewed as a lattice gas model with a finite Banach space norm. The system may then be treated using cluster expansion methods from statistical mechanics. In particular, we derive a convergent power series expansion for the limiting free energy in the case of small parameters. Since the free energy is the generating function for the expectations of other random variables, this characterizes the structure and behavior of the limiting network in this parameter region
Modification of MSDR algorithm and ITS implementation on graph clustering
Prastiwi, D.; Sugeng, K. A.; Siswantining, T.
2017-07-01
Maximum Standard Deviation Reduction (MSDR) is a graph clustering algorithm to minimize the distance variation within a cluster. In this paper we propose a modified MSDR by replacing one technical step in MSDR which uses polynomial regression, with a new and simpler step. This leads to our new algorithm called Modified MSDR (MMSDR). We implement the new algorithm to separate a domestic flight network of an Indonesian airline into two large clusters. Further analysis allows us to discover a weak link in the network, which should be improved by adding more flights.
Non-parametric Bayesian graph models reveal community structure in resting state fMRI
DEFF Research Database (Denmark)
Andersen, Kasper Winther; Madsen, Kristoffer H.; Siebner, Hartwig Roman
2014-01-01
Modeling of resting state functional magnetic resonance imaging (rs-fMRI) data using network models is of increasing interest. It is often desirable to group nodes into clusters to interpret the communication patterns between nodes. In this study we consider three different nonparametric Bayesian...... models for node clustering in complex networks. In particular, we test their ability to predict unseen data and their ability to reproduce clustering across datasets. The three generative models considered are the Infinite Relational Model (IRM), Bayesian Community Detection (BCD), and the Infinite...... between clusters. BCD restricts the between-cluster link probabilities to be strictly lower than within-cluster link probabilities to conform to the community structure typically seen in social networks. IDM only models a single between-cluster link probability, which can be interpreted as a background...
Cluster Tails for Critical Power-Law Inhomogeneous Random Graphs
van der Hofstad, Remco; Kliem, Sandra; van Leeuwaarden, Johan S. H.
2018-04-01
Recently, the scaling limit of cluster sizes for critical inhomogeneous random graphs of rank-1 type having finite variance but infinite third moment degrees was obtained in Bhamidi et al. (Ann Probab 40:2299-2361, 2012). It was proved that when the degrees obey a power law with exponent τ \\in (3,4), the sequence of clusters ordered in decreasing size and multiplied through by n^{-(τ -2)/(τ -1)} converges as n→ ∞ to a sequence of decreasing non-degenerate random variables. Here, we study the tails of the limit of the rescaled largest cluster, i.e., the probability that the scaling limit of the largest cluster takes a large value u, as a function of u. This extends a related result of Pittel (J Combin Theory Ser B 82(2):237-269, 2001) for the Erdős-Rényi random graph to the setting of rank-1 inhomogeneous random graphs with infinite third moment degrees. We make use of delicate large deviations and weak convergence arguments.
Learning Bayesian network structure: towards the essential graph by integer linear programming tools
Czech Academy of Sciences Publication Activity Database
Studený, Milan; Haws, D.
2014-01-01
Roč. 55, č. 4 (2014), s. 1043-1071 ISSN 0888-613X R&D Projects: GA ČR GA13-20012S Institutional support: RVO:67985556 Keywords : learning Bayesian network structure * integer linear programming * characteristic imset * essential graph Subject RIV: BA - General Mathematics Impact factor: 2.451, year: 2014 http://library.utia.cas.cz/separaty/2014/MTR/studeny-0427002.pdf
Unsupervised active learning based on hierarchical graph-theoretic clustering.
Hu, Weiming; Hu, Wei; Xie, Nianhua; Maybank, Steve
2009-10-01
Most existing active learning approaches are supervised. Supervised active learning has the following problems: inefficiency in dealing with the semantic gap between the distribution of samples in the feature space and their labels, lack of ability in selecting new samples that belong to new categories that have not yet appeared in the training samples, and lack of adaptability to changes in the semantic interpretation of sample categories. To tackle these problems, we propose an unsupervised active learning framework based on hierarchical graph-theoretic clustering. In the framework, two promising graph-theoretic clustering algorithms, namely, dominant-set clustering and spectral clustering, are combined in a hierarchical fashion. Our framework has some advantages, such as ease of implementation, flexibility in architecture, and adaptability to changes in the labeling. Evaluations on data sets for network intrusion detection, image classification, and video classification have demonstrated that our active learning framework can effectively reduce the workload of manual classification while maintaining a high accuracy of automatic classification. It is shown that, overall, our framework outperforms the support-vector-machine-based supervised active learning, particularly in terms of dealing much more efficiently with new samples whose categories have not yet appeared in the training samples.
Seismic Signal Compression Using Nonparametric Bayesian Dictionary Learning via Clustering
Directory of Open Access Journals (Sweden)
Xin Tian
2017-06-01
Full Text Available We introduce a seismic signal compression method based on nonparametric Bayesian dictionary learning method via clustering. The seismic data is compressed patch by patch, and the dictionary is learned online. Clustering is introduced for dictionary learning. A set of dictionaries could be generated, and each dictionary is used for one cluster’s sparse coding. In this way, the signals in one cluster could be well represented by their corresponding dictionaries. A nonparametric Bayesian dictionary learning method is used to learn the dictionaries, which naturally infers an appropriate dictionary size for each cluster. A uniform quantizer and an adaptive arithmetic coding algorithm are adopted to code the sparse coefficients. With comparisons to other state-of-the art approaches, the effectiveness of the proposed method could be validated in the experiments.
Bayesian analysis for exponential random graph models using the adaptive exchange sampler
Jin, Ick Hoon
2013-01-01
Exponential random graph models have been widely used in social network analysis. However, these models are extremely difficult to handle from a statistical viewpoint, because of the existence of intractable normalizing constants. In this paper, we consider a fully Bayesian analysis for exponential random graph models using the adaptive exchange sampler, which solves the issue of intractable normalizing constants encountered in Markov chain Monte Carlo (MCMC) simulations. The adaptive exchange sampler can be viewed as a MCMC extension of the exchange algorithm, and it generates auxiliary networks via an importance sampling procedure from an auxiliary Markov chain running in parallel. The convergence of this algorithm is established under mild conditions. The adaptive exchange sampler is illustrated using a few social networks, including the Florentine business network, molecule synthetic network, and dolphins network. The results indicate that the adaptive exchange algorithm can produce more accurate estimates than approximate exchange algorithms, while maintaining the same computational efficiency.
Multilayer Spectral Graph Clustering via Convex Layer Aggregation: Theory and Algorithms
Chen, Pin-Yu; Hero, Alfred O.
2017-01-01
Multilayer graphs are commonly used for representing different relations between entities and handling heterogeneous data processing tasks. Non-standard multilayer graph clustering methods are needed for assigning clusters to a common multilayer node set and for combining information from each layer. This paper presents a multilayer spectral graph clustering (SGC) framework that performs convex layer aggregation. Under a multilayer signal plus noise model, we provide a phase transition analys...
Wang, Yang; Wu, Lin
2018-07-01
Low-Rank Representation (LRR) is arguably one of the most powerful paradigms for Multi-view spectral clustering, which elegantly encodes the multi-view local graph/manifold structures into an intrinsic low-rank self-expressive data similarity embedded in high-dimensional space, to yield a better graph partition than their single-view counterparts. In this paper we revisit it with a fundamentally different perspective by discovering LRR as essentially a latent clustered orthogonal projection based representation winged with an optimized local graph structure for spectral clustering; each column of the representation is fundamentally a cluster basis orthogonal to others to indicate its members, which intuitively projects the view-specific feature representation to be the one spanned by all orthogonal basis to characterize the cluster structures. Upon this finding, we propose our technique with the following: (1) We decompose LRR into latent clustered orthogonal representation via low-rank matrix factorization, to encode the more flexible cluster structures than LRR over primal data objects; (2) We convert the problem of LRR into that of simultaneously learning orthogonal clustered representation and optimized local graph structure for each view; (3) The learned orthogonal clustered representations and local graph structures enjoy the same magnitude for multi-view, so that the ideal multi-view consensus can be readily achieved. The experiments over multi-view datasets validate its superiority, especially over recent state-of-the-art LRR models. Copyright © 2018 Elsevier Ltd. All rights reserved.
Low-Complexity Bayesian Estimation of Cluster-Sparse Channels
Ballal, Tarig; Al-Naffouri, Tareq Y.; Ahmed, Syed
2015-01-01
This paper addresses the problem of channel impulse response estimation for cluster-sparse channels under the Bayesian estimation framework. We develop a novel low-complexity minimum mean squared error (MMSE) estimator by exploiting the sparsity of the received signal profile and the structure of the measurement matrix. It is shown that due to the banded Toeplitz/circulant structure of the measurement matrix, a channel impulse response, such as underwater acoustic channel impulse responses, can be partitioned into a number of orthogonal or approximately orthogonal clusters. The orthogonal clusters, the sparsity of the channel impulse response and the structure of the measurement matrix, all combined, result in a computationally superior realization of the MMSE channel estimator. The MMSE estimator calculations boil down to simpler in-cluster calculations that can be reused in different clusters. The reduction in computational complexity allows for a more accurate implementation of the MMSE estimator. The proposed approach is tested using synthetic Gaussian channels, as well as simulated underwater acoustic channels. Symbol-error-rate performance and computation time confirm the superiority of the proposed method compared to selected benchmark methods in systems with preamble-based training signals transmitted over clustersparse channels.
Low-Complexity Bayesian Estimation of Cluster-Sparse Channels
Ballal, Tarig
2015-09-18
This paper addresses the problem of channel impulse response estimation for cluster-sparse channels under the Bayesian estimation framework. We develop a novel low-complexity minimum mean squared error (MMSE) estimator by exploiting the sparsity of the received signal profile and the structure of the measurement matrix. It is shown that due to the banded Toeplitz/circulant structure of the measurement matrix, a channel impulse response, such as underwater acoustic channel impulse responses, can be partitioned into a number of orthogonal or approximately orthogonal clusters. The orthogonal clusters, the sparsity of the channel impulse response and the structure of the measurement matrix, all combined, result in a computationally superior realization of the MMSE channel estimator. The MMSE estimator calculations boil down to simpler in-cluster calculations that can be reused in different clusters. The reduction in computational complexity allows for a more accurate implementation of the MMSE estimator. The proposed approach is tested using synthetic Gaussian channels, as well as simulated underwater acoustic channels. Symbol-error-rate performance and computation time confirm the superiority of the proposed method compared to selected benchmark methods in systems with preamble-based training signals transmitted over clustersparse channels.
Spectral clustering and biclustering learning large graphs and contingency tables
Bolla, Marianna
2013-01-01
Explores regular structures in graphs and contingency tables by spectral theory and statistical methods This book bridges the gap between graph theory and statistics by giving answers to the demanding questions which arise when statisticians are confronted with large weighted graphs or rectangular arrays. Classical and modern statistical methods applicable to biological, social, communication networks, or microarrays are presented together with the theoretical background and proofs. This book is suitable for a one-semester course for graduate students in data mining, mult
Identifying Key Drivers of Return Reversal with Dynamical Bayesian Factor Graph.
Directory of Open Access Journals (Sweden)
Shuai Zhao
Full Text Available In the stock market, return reversal occurs when investors sell overbought stocks and buy oversold stocks, reversing the stocks' price trends. In this paper, we develop a new method to identify key drivers of return reversal by incorporating a comprehensive set of factors derived from different economic theories into one unified dynamical Bayesian factor graph. We then use the model to depict factor relationships and their dynamics, from which we make some interesting discoveries about the mechanism behind return reversals. Through extensive experiments on the US stock market, we conclude that among the various factors, the liquidity factors consistently emerge as key drivers of return reversal, which is in support of the theory of liquidity effect. Specifically, we find that stocks with high turnover rates or high Amihud illiquidity measures have a greater probability of experiencing return reversals. Apart from the consistent drivers, we find other drivers of return reversal that generally change from year to year, and they serve as important characteristics for evaluating the trends of stock returns. Besides, we also identify some seldom discussed yet enlightening inter-factor relationships, one of which shows that stocks in Finance and Insurance industry are more likely to have high Amihud illiquidity measures in comparison with those in other industries. These conclusions are robust for return reversals under different thresholds.
Anderson, Eric C; Ng, Thomas C
2016-02-01
We develop a computational framework for addressing pedigree inference problems using small numbers (80-400) of single nucleotide polymorphisms (SNPs). Our approach relaxes the assumptions, which are commonly made, that sampling is complete with respect to the pedigree and that there is no genotyping error. It relies on representing the inferred pedigree as a factor graph and invoking the Sum-Product algorithm to compute and store quantities that allow the joint probability of the data to be rapidly computed under a large class of rearrangements of the pedigree structure. This allows efficient MCMC sampling over the space of pedigrees, and, hence, Bayesian inference of pedigree structure. In this paper we restrict ourselves to inference of pedigrees without loops using SNPs assumed to be unlinked. We present the methodology in general for multigenerational inference, and we illustrate the method by applying it to the inference of full sibling groups in a large sample (n=1157) of Chinook salmon typed at 95 SNPs. The results show that our method provides a better point estimate and estimate of uncertainty than the currently best-available maximum-likelihood sibling reconstruction method. Extensions of this work to more complex scenarios are briefly discussed. Published by Elsevier Inc.
Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.
Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A
2018-01-30
Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs
Directory of Open Access Journals (Sweden)
Theis Fabian J
2010-10-01
Full Text Available Abstract Background Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type. Results Since entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a k-partite graph partitioning algorithm that allows for overlapping (fuzzy clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted k-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2. Conclusions In order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. To this end, we presented a novel fuzzy k-partite graph partitioning
Advances in Bayesian Model Based Clustering Using Particle Learning
Energy Technology Data Exchange (ETDEWEB)
Merl, D M
2009-11-19
Recent work by Carvalho, Johannes, Lopes and Polson and Carvalho, Lopes, Polson and Taddy introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference for a large class of dynamic models. The basis of SMC techniques involves representing the underlying inference problem as one of state space estimation, thus giving way to inference via particle filtering. The key insight of Carvalho et al was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually only accessible in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of [recursively defined] sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning as it is being called, is especially well suited for the analysis of streaming data do to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data is arriving, allowing at any instant during the observation process direct quantification of uncertainty surrounding underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals. In Section 3 we describe several novel extensions to the original
Co-clustering directed graphs to discover asymmetries and directional communities.
Rohe, Karl; Qin, Tai; Yu, Bin
2016-10-21
In directed graphs, relationships are asymmetric and these asymmetries contain essential structural information about the graph. Directed relationships lead to a new type of clustering that is not feasible in undirected graphs. We propose a spectral co-clustering algorithm called di-sim for asymmetry discovery and directional clustering. A Stochastic co-Blockmodel is introduced to show favorable properties of di-sim To account for the sparse and highly heterogeneous nature of directed networks, di-sim uses the regularized graph Laplacian and projects the rows of the eigenvector matrix onto the sphere. A nodewise asymmetry score and di-sim are used to analyze the clustering asymmetries in the networks of Enron emails, political blogs, and the Caenorhabditis elegans chemical connectome. In each example, a subset of nodes have clustering asymmetries; these nodes send edges to one cluster, but receive edges from another cluster. Such nodes yield insightful information (e.g., communication bottlenecks) about directed networks, but are missed if the analysis ignores edge direction.
Bond percolation on a class of correlated and clustered random graphs
International Nuclear Information System (INIS)
Allard, A; Hébert-Dufresne, L; Noël, P-A; Marceau, V; Dubé, L J
2012-01-01
We introduce a formalism for computing bond percolation properties of a class of correlated and clustered random graphs. This class of graphs is a generalization of the configuration model where nodes of different types are connected via different types of hyperedges, edges that can link more than two nodes. We argue that the multitype approach coupled with the use of clustered hyperedges can reproduce a wide spectrum of complex patterns, and thus enhances our capability to model real complex networks. As an illustration of this claim, we use our formalism to highlight unusual behaviours of the size and composition of the components (small and giant) in a synthetic, albeit realistic, social network. (paper)
Clustering cliques for graph-based summarization of the biomedical research literature
DEFF Research Database (Denmark)
Zhang, Han; Fiszman, Marcelo; Shin, Dongwook
2013-01-01
Background: Graph-based notions are increasingly used in biomedical data mining and knowledge discovery tasks. In this paper, we present a clique-clustering method to automatically summarize graphs of semantic predications produced from PubMed citations (titles and abstracts).Results: Sem......Rep is used to extract semantic predications from the citations returned by a PubMed search. Cliques were identified from frequently occurring predications with highly connected arguments filtered by degree centrality. Themes contained in the summary were identified with a hierarchical clustering algorithm...
An effective trust-based recommendation method using a novel graph clustering algorithm
Moradi, Parham; Ahmadian, Sajad; Akhlaghian, Fardin
2015-10-01
Recommender systems are programs that aim to provide personalized recommendations to users for specific items (e.g. music, books) in online sharing communities or on e-commerce sites. Collaborative filtering methods are important and widely accepted types of recommender systems that generate recommendations based on the ratings of like-minded users. On the other hand, these systems confront several inherent issues such as data sparsity and cold start problems, caused by fewer ratings against the unknowns that need to be predicted. Incorporating trust information into the collaborative filtering systems is an attractive approach to resolve these problems. In this paper, we present a model-based collaborative filtering method by applying a novel graph clustering algorithm and also considering trust statements. In the proposed method first of all, the problem space is represented as a graph and then a sparsest subgraph finding algorithm is applied on the graph to find the initial cluster centers. Then, the proposed graph clustering algorithm is performed to obtain the appropriate users/items clusters. Finally, the identified clusters are used as a set of neighbors to recommend unseen items to the current active user. Experimental results based on three real-world datasets demonstrate that the proposed method outperforms several state-of-the-art recommender system methods.
Graph analysis of cell clusters forming vascular networks
Alves, A. P.; Mesquita, O. N.; Gómez-Gardeñes, J.; Agero, U.
2018-03-01
This manuscript describes the experimental observation of vasculogenesis in chick embryos by means of network analysis. The formation of the vascular network was observed in the area opaca of embryos from 40 to 55 h of development. In the area opaca endothelial cell clusters self-organize as a primitive and approximately regular network of capillaries. The process was observed by bright-field microscopy in control embryos and in embryos treated with Bevacizumab (Avastin), an antibody that inhibits the signalling of the vascular endothelial growth factor (VEGF). The sequence of images of the vascular growth were thresholded, and used to quantify the forming network in control and Avastin-treated embryos. This characterization is made by measuring vessels density, number of cell clusters and the largest cluster density. From the original images, the topology of the vascular network was extracted and characterized by means of the usual network metrics such as: the degree distribution, average clustering coefficient, average short path length and assortativity, among others. This analysis allows to monitor how the largest connected cluster of the vascular network evolves in time and provides with quantitative evidence of the disruptive effects that Avastin has on the tree structure of vascular networks.
GraphCrunch 2: Software tool for network modeling, alignment and clustering.
Kuchaiev, Oleksii; Stevanović, Aleksandar; Hayes, Wayne; Pržulj, Nataša
2011-01-19
Recent advancements in experimental biotechnology have produced large amounts of protein-protein interaction (PPI) data. The topology of PPI networks is believed to have a strong link to their function. Hence, the abundance of PPI data for many organisms stimulates the development of computational techniques for the modeling, comparison, alignment, and clustering of networks. In addition, finding representative models for PPI networks will improve our understanding of the cell just as a model of gravity has helped us understand planetary motion. To decide if a model is representative, we need quantitative comparisons of model networks to real ones. However, exact network comparison is computationally intractable and therefore several heuristics have been used instead. Some of these heuristics are easily computable "network properties," such as the degree distribution, or the clustering coefficient. An important special case of network comparison is the network alignment problem. Analogous to sequence alignment, this problem asks to find the "best" mapping between regions in two networks. It is expected that network alignment might have as strong an impact on our understanding of biology as sequence alignment has had. Topology-based clustering of nodes in PPI networks is another example of an important network analysis problem that can uncover relationships between interaction patterns and phenotype. We introduce the GraphCrunch 2 software tool, which addresses these problems. It is a significant extension of GraphCrunch which implements the most popular random network models and compares them with the data networks with respect to many network properties. Also, GraphCrunch 2 implements the GRAph ALigner algorithm ("GRAAL") for purely topological network alignment. GRAAL can align any pair of networks and exposes large, dense, contiguous regions of topological and functional similarities far larger than any other existing tool. Finally, GraphCruch 2 implements an
GraphCrunch 2: Software tool for network modeling, alignment and clustering
Directory of Open Access Journals (Sweden)
Hayes Wayne
2011-01-01
Full Text Available Abstract Background Recent advancements in experimental biotechnology have produced large amounts of protein-protein interaction (PPI data. The topology of PPI networks is believed to have a strong link to their function. Hence, the abundance of PPI data for many organisms stimulates the development of computational techniques for the modeling, comparison, alignment, and clustering of networks. In addition, finding representative models for PPI networks will improve our understanding of the cell just as a model of gravity has helped us understand planetary motion. To decide if a model is representative, we need quantitative comparisons of model networks to real ones. However, exact network comparison is computationally intractable and therefore several heuristics have been used instead. Some of these heuristics are easily computable "network properties," such as the degree distribution, or the clustering coefficient. An important special case of network comparison is the network alignment problem. Analogous to sequence alignment, this problem asks to find the "best" mapping between regions in two networks. It is expected that network alignment might have as strong an impact on our understanding of biology as sequence alignment has had. Topology-based clustering of nodes in PPI networks is another example of an important network analysis problem that can uncover relationships between interaction patterns and phenotype. Results We introduce the GraphCrunch 2 software tool, which addresses these problems. It is a significant extension of GraphCrunch which implements the most popular random network models and compares them with the data networks with respect to many network properties. Also, GraphCrunch 2 implements the GRAph ALigner algorithm ("GRAAL" for purely topological network alignment. GRAAL can align any pair of networks and exposes large, dense, contiguous regions of topological and functional similarities far larger than any other
DEFF Research Database (Denmark)
Guillot, Gilles; Carpentier-Skandalis, Alexandra
2011-01-01
We study the accuracy of a Bayesian supervised method used to cluster individuals into genetically homogeneous groups on the basis of dominant or codominant molecular markers. We provide a formula relating an error criterion to the number of loci used and the number of clusters. This formula...
Bayesian analysis for exponential random graph models using the adaptive exchange sampler
Jin, Ick Hoon; Liang, Faming; Yuan, Ying
2013-01-01
Exponential random graph models have been widely used in social network analysis. However, these models are extremely difficult to handle from a statistical viewpoint, because of the existence of intractable normalizing constants. In this paper, we
GraphTeams: a method for discovering spatial gene clusters in Hi-C sequencing data.
Schulz, Tizian; Stoye, Jens; Doerr, Daniel
2018-05-08
Hi-C sequencing offers novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of genes with associated functionality that exhibit close proximity to each other in the spatial conformation of chromosomes across several related species. We present the first gene cluster model capable of handling spatial data. Our model generalizes a popular computational model for gene cluster prediction, called δ-teams, from sequences to graphs. Following previous lines of research, we subsequently extend our model to allow for several vertices being associated with the same label. The model, called δ-teams with families, is particular suitable for our application as it enables handling of gene duplicates. We develop algorithmic solutions for both models. We implemented the algorithm for discovering δ-teams with families and integrated it into a fully automated workflow for discovering gene clusters in Hi-C data, called GraphTeams. We applied it to human and mouse data to find intra- and interchromosomal gene cluster candidates. The results include intrachromosomal clusters that seem to exhibit a closer proximity in space than on their chromosomal DNA sequence. We further discovered interchromosomal gene clusters that contain genes from different chromosomes within the human genome, but are located on a single chromosome in mouse. By identifying δ-teams with families, we provide a flexible model to discover gene cluster candidates in Hi-C data. Our analysis of Hi-C data from human and mouse reveals several known gene clusters (thus validating our approach), but also few sparsely studied or possibly unknown gene cluster candidates that could be the source of further experimental investigations.
Learning Based Approach for Optimal Clustering of Distributed Program's Call Flow Graph
Abofathi, Yousef; Zarei, Bager; Parsa, Saeed
Optimal clustering of call flow graph for reaching maximum concurrency in execution of distributable components is one of the NP-Complete problems. Learning automatas (LAs) are search tools which are used for solving many NP-Complete problems. In this paper a learning based algorithm is proposed to optimal clustering of call flow graph and appropriate distributing of programs in network level. The algorithm uses learning feature of LAs to search in state space. It has been shown that the speed of reaching to solution increases remarkably using LA in search process, and it also prevents algorithm from being trapped in local minimums. Experimental results show the superiority of proposed algorithm over others.
Manual hierarchical clustering of regional geochemical data using a Bayesian finite mixture model
International Nuclear Information System (INIS)
Ellefsen, Karl J.; Smith, David B.
2016-01-01
Interpretation of regional scale, multivariate geochemical data is aided by a statistical technique called “clustering.” We investigate a particular clustering procedure by applying it to geochemical data collected in the State of Colorado, United States of America. The clustering procedure partitions the field samples for the entire survey area into two clusters. The field samples in each cluster are partitioned again to create two subclusters, and so on. This manual procedure generates a hierarchy of clusters, and the different levels of the hierarchy show geochemical and geological processes occurring at different spatial scales. Although there are many different clustering methods, we use Bayesian finite mixture modeling with two probability distributions, which yields two clusters. The model parameters are estimated with Hamiltonian Monte Carlo sampling of the posterior probability density function, which usually has multiple modes. Each mode has its own set of model parameters; each set is checked to ensure that it is consistent both with the data and with independent geologic knowledge. The set of model parameters that is most consistent with the independent geologic knowledge is selected for detailed interpretation and partitioning of the field samples. - Highlights: • We evaluate a clustering procedure by applying it to geochemical data. • The procedure generates a hierarchy of clusters. • Different levels of the hierarchy show geochemical processes at different spatial scales. • The clustering method is Bayesian finite mixture modeling. • Model parameters are estimated with Hamiltonian Monte Carlo sampling.
Graphic Symbol Recognition using Graph Based Signature and Bayesian Network Classifier
Luqman, Muhammad Muzzamil; Brouard, Thierry; Ramel, Jean-Yves
2010-01-01
We present a new approach for recognition of complex graphic symbols in technical documents. Graphic symbol recognition is a well known challenge in the field of document image analysis and is at heart of most graphic recognition systems. Our method uses structural approach for symbol representation and statistical classifier for symbol recognition. In our system we represent symbols by their graph based signatures: a graphic symbol is vectorized and is converted to an attributed relational g...
FUZZY CLUSTERING BASED BAYESIAN FRAMEWORK TO PREDICT MENTAL HEALTH PROBLEMS AMONG CHILDREN
Directory of Open Access Journals (Sweden)
M R Sumathi
2017-04-01
Full Text Available According to World Health Organization, 10-20% of children and adolescents all over the world are experiencing mental disorders. Correct diagnosis of mental disorders at an early stage improves the quality of life of children and avoids complicated problems. Various expert systems using artificial intelligence techniques have been developed for diagnosing mental disorders like Schizophrenia, Depression, Dementia, etc. This study focuses on predicting basic mental health problems of children, like Attention problem, Anxiety problem, Developmental delay, Attention Deficit Hyperactivity Disorder (ADHD, Pervasive Developmental Disorder(PDD, etc. using the machine learning techniques, Bayesian Networks and Fuzzy clustering. The focus of the article is on learning the Bayesian network structure using a novel Fuzzy Clustering Based Bayesian network structure learning framework. The performance of the proposed framework was compared with the other existing algorithms and the experimental results have shown that the proposed framework performs better than the earlier algorithms.
Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.
2011-01-01
Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. ?? 2011 by the authors; licensee MDPI, Basel, Switzerland.
Joint Bayesian variable and graph selection for regression models with network-structured predictors
Peterson, C. B.; Stingo, F. C.; Vannucci, M.
2015-01-01
In this work, we develop a Bayesian approach to perform selection of predictors that are linked within a network. We achieve this by combining a sparse regression model relating the predictors to a response variable with a graphical model describing conditional dependencies among the predictors. The proposed method is well-suited for genomic applications since it allows the identification of pathways of functionally related genes or proteins which impact an outcome of interest. In contrast to previous approaches for network-guided variable selection, we infer the network among predictors using a Gaussian graphical model and do not assume that network information is available a priori. We demonstrate that our method outperforms existing methods in identifying network-structured predictors in simulation settings, and illustrate our proposed model with an application to inference of proteins relevant to glioblastoma survival. PMID:26514925
Semiparametric Bayesian analysis of accelerated failure time models with cluster structures.
Li, Zhaonan; Xu, Xinyi; Shen, Junshan
2017-11-10
In this paper, we develop a Bayesian semiparametric accelerated failure time model for survival data with cluster structures. Our model allows distributional heterogeneity across clusters and accommodates their relationships through a density ratio approach. Moreover, a nonparametric mixture of Dirichlet processes prior is placed on the baseline distribution to yield full distributional flexibility. We illustrate through simulations that our model can greatly improve estimation accuracy by effectively pooling information from multiple clusters, while taking into account the heterogeneity in their random error distributions. We also demonstrate the implementation of our method using analysis of Mayo Clinic Trial in Primary Biliary Cirrhosis. Copyright © 2017 John Wiley & Sons, Ltd.
Constructing a graph of connections in clustering algorithm of complex objects
Directory of Open Access Journals (Sweden)
Татьяна Шатовская
2015-05-01
Full Text Available The article describes the results of modifying the algorithm Chameleon. Hierarchical multi-level algorithm consists of several phases: the construction of the count, coarsening, the separation and recovery. Each phase can be used various approaches and algorithms. The main aim of the work is to study the quality of the clustering of different sets of data using a set of algorithms combinations at different stages of the algorithm and improve the stage of construction by the optimization algorithm of k choice in the graph construction of k of nearest neighbors
A NOVEL TECHNIQUE TO IMPROVE PHOTOMETRY IN CONFUSED IMAGES USING GRAPHS AND BAYESIAN PRIORS
International Nuclear Information System (INIS)
Safarzadeh, Mohammadtaher; Ferguson, Henry C.; Lu, Yu; Inami, Hanae; Somerville, Rachel S.
2015-01-01
We present a new technique for overcoming confusion noise in deep far-infrared Herschel space telescope images making use of prior information from shorter λ < 2 μm wavelengths. For the deepest images obtained by Herschel, the flux limit due to source confusion is about a factor of three brighter than the flux limit due to instrumental noise and (smooth) sky background. We have investigated the possibility of de-confusing simulated Herschel PACS 160 μm images by using strong Bayesian priors on the positions and weak priors on the flux of sources. We find the blended sources and group them together and simultaneously fit their fluxes. We derive the posterior probability distribution function of fluxes subject to these priors through Monte Carlo Markov Chain (MCMC) sampling by fitting the image. Assuming we can predict the FIR flux of sources based on the ultraviolet-optical part of their SEDs to within an order of magnitude, the simulations show that we can obtain reliable fluxes and uncertainties at least a factor of three fainter than the confusion noise limit of 3σ c = 2.7 mJy in our simulated PACS-160 image. This technique could in principle be used to mitigate the effects of source confusion in any situation where one has prior information of positions and plausible fluxes of blended sources. For Herschel, application of this technique will improve our ability to constrain the dust content in normal galaxies at high redshift
Yau, Christopher; Holmes, Chris
2011-07-01
We propose a hierarchical Bayesian nonparametric mixture model for clustering when some of the covariates are assumed to be of varying relevance to the clustering problem. This can be thought of as an issue in variable selection for unsupervised learning. We demonstrate that by defining a hierarchical population based nonparametric prior on the cluster locations scaled by the inverse covariance matrices of the likelihood we arrive at a 'sparsity prior' representation which admits a conditionally conjugate prior. This allows us to perform full Gibbs sampling to obtain posterior distributions over parameters of interest including an explicit measure of each covariate's relevance and a distribution over the number of potential clusters present in the data. This also allows for individual cluster specific variable selection. We demonstrate improved inference on a number of canonical problems.
A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data.
Mo, Qianxing; Shen, Ronglai; Guo, Cui; Vannucci, Marina; Chan, Keith S; Hilsenbeck, Susan G
2018-01-01
Identification of clinically relevant tumor subtypes and omics signatures is an important task in cancer translational research for precision medicine. Large-scale genomic profiling studies such as The Cancer Genome Atlas (TCGA) Research Network have generated vast amounts of genomic, transcriptomic, epigenomic, and proteomic data. While these studies have provided great resources for researchers to discover clinically relevant tumor subtypes and driver molecular alterations, there are few computationally efficient methods and tools for integrative clustering analysis of these multi-type omics data. Therefore, the aim of this article is to develop a fully Bayesian latent variable method (called iClusterBayes) that can jointly model omics data of continuous and discrete data types for identification of tumor subtypes and relevant omics features. Specifically, the proposed method uses a few latent variables to capture the inherent structure of multiple omics data sets to achieve joint dimension reduction. As a result, the tumor samples can be clustered in the latent variable space and relevant omics features that drive the sample clustering are identified through Bayesian variable selection. This method significantly improve on the existing integrative clustering method iClusterPlus in terms of statistical inference and computational speed. By analyzing TCGA and simulated data sets, we demonstrate the excellent performance of the proposed method in revealing clinically meaningful tumor subtypes and driver omics features. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
AGGLOMERATIVE CLUSTERING OF SOUND RECORD SPEECH SEGMENTS BASED ON BAYESIAN INFORMATION CRITERION
Directory of Open Access Journals (Sweden)
O. Yu. Kydashev
2013-01-01
Full Text Available This paper presents the detailed description of agglomerative clustering system implementation for speech segments based on Bayesian information criterion. Numerical experiment results with different acoustic features, as well as the full and diagonal covariance matrices application are given. The error rate DER equal to 6.4% for audio records of radio «Svoboda» was achieved by means of designed system.
Determining open cluster membership. A Bayesian framework for quantitative member classification
Stott, Jonathan J.
2018-01-01
Aims: My goal is to develop a quantitative algorithm for assessing open cluster membership probabilities. The algorithm is designed to work with single-epoch observations. In its simplest form, only one set of program images and one set of reference images are required. Methods: The algorithm is based on a two-stage joint astrometric and photometric assessment of cluster membership probabilities. The probabilities were computed within a Bayesian framework using any available prior information. Where possible, the algorithm emphasizes simplicity over mathematical sophistication. Results: The algorithm was implemented and tested against three observational fields using published survey data. M 67 and NGC 654 were selected as cluster examples while a third, cluster-free, field was used for the final test data set. The algorithm shows good quantitative agreement with the existing surveys and has a false-positive rate significantly lower than the astrometric or photometric methods used individually.
Directory of Open Access Journals (Sweden)
Wills Rachael A
2009-05-01
Full Text Available Abstract Background The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones, rather than objective reality. Bayesian analysis is (arguably a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.
Yoo, Illhoi; Hu, Xiaohua; Song, Il-Yeol
2007-11-27
A huge amount of biomedical textual information has been produced and collected in MEDLINE for decades. In order to easily utilize biomedical information in the free text, document clustering and text summarization together are used as a solution for text information overload problem. In this paper, we introduce a coherent graph-based semantic clustering and summarization approach for biomedical literature. Our extensive experimental results show the approach shows 45% cluster quality improvement and 72% clustering reliability improvement, in terms of misclassification index, over Bisecting K-means as a leading document clustering approach. In addition, our approach provides concise but rich text summary in key concepts and sentences. Our coherent biomedical literature clustering and summarization approach that takes advantage of ontology-enriched graphical representations significantly improves the quality of document clusters and understandability of documents through summaries.
Satra, P.; Carsky, J.
2018-04-01
Our research is looking at the travel behaviour from a macroscopic view, taking one municipality as a basic unit. The travel behaviour of one municipality as a whole is becoming one piece of a data in the research of travel behaviour of a larger area, perhaps a country. A data pre-processing is used to cluster the municipalities in groups, which show similarities in their travel behaviour. Such groups can be then researched for reasons of their prevailing pattern of travel behaviour without any distortion caused by municipalities with a different pattern. This paper deals with actual settings of the clustering process, which is based on Bayesian statistics, particularly the mixture model. An optimization of the settings parameters based on correlation of pointer model parameters and relative number of data in clusters is helpful, however not fully reliable method. Thus, method for graphic representation of clusters needs to be developed in order to check their quality. A training of the setting parameters in 2D has proven to be a beneficial method, because it allows visual control of the produced clusters. The clustering better be applied on separate groups of municipalities, where competition of only identical transport modes can be found.
Directory of Open Access Journals (Sweden)
Klaudius eKalcher
2015-12-01
Full Text Available Identifying venous voxels in fMRI datasets is important to increase the specificity of fMRI analyses to microvasculature in the vicinity of the neural processes triggering the BOLD response. This is, however, difficult to achieve in particular in typical studies where magnitude images of BOLD EPI are the only data available. In this study, voxelwise functional connectivity graphs were computed on minimally preprocessed low TR (333 ms multiband resting-state fMRI data, using both high positive and negative correlations to define edges between nodes (voxels. A high correlation threshold for binarization ensures that most edges in the resulting sparse graph reflect the high coherence of signals in medium to large veins. Graph clustering based on the optimization of modularity was then employed to identify clusters of coherent voxels in this graph, and all clusters of 50 or more voxels were then interpreted as corresponding to medium to large veins. Indeed, a comparison with SWI reveals that 75.6 ± 5.9% of voxels within these large clusters overlap with veins visible in the SWI image or lie outside the brain parenchyma. Some of the remainingdifferences between the two modalities can be explained by imperfect alignment or geometric distortions between the two images. Overall, the graph clustering based method for identifying venous voxels has a high specificity as well as the additional advantages of being computed in the same voxel grid as the fMRI dataset itself and not needingany additional data beyond what is usually acquired (and exported in standard fMRI experiments.
DEFF Research Database (Denmark)
Jensen, Finn Verner; Nielsen, Thomas Dyhre
2016-01-01
Mathematically, a Bayesian graphical model is a compact representation of the joint probability distribution for a set of variables. The most frequently used type of Bayesian graphical models are Bayesian networks. The structural part of a Bayesian graphical model is a graph consisting of nodes...
Directory of Open Access Journals (Sweden)
Deborah A Striegel
2015-08-01
Full Text Available Pancreatic islets of Langerhans consist of endocrine cells, primarily α, β and δ cells, which secrete glucagon, insulin, and somatostatin, respectively, to regulate plasma glucose. β cells form irregular locally connected clusters within islets that act in concert to secrete insulin upon glucose stimulation. Due to the central functional significance of this local connectivity in the placement of β cells in an islet, it is important to characterize it quantitatively. However, quantification of the seemingly stochastic cytoarchitecture of β cells in an islet requires mathematical methods that can capture topological connectivity in the entire β-cell population in an islet. Graph theory provides such a framework. Using large-scale imaging data for thousands of islets containing hundreds of thousands of cells in human organ donor pancreata, we show that quantitative graph characteristics differ between control and type 2 diabetic islets. Further insight into the processes that shape and maintain this architecture is obtained by formulating a stochastic theory of β-cell rearrangement in whole islets, just as the normal equilibrium distribution of the Ornstein-Uhlenbeck process can be viewed as the result of the interplay between a random walk and a linear restoring force. Requiring that rearrangements maintain the observed quantitative topological graph characteristics strongly constrained possible processes. Our results suggest that β-cell rearrangement is dependent on its connectivity in order to maintain an optimal cluster size in both normal and T2D islets.
Striegel, Deborah A; Hara, Manami; Periwal, Vipul
2015-08-01
Pancreatic islets of Langerhans consist of endocrine cells, primarily α, β and δ cells, which secrete glucagon, insulin, and somatostatin, respectively, to regulate plasma glucose. β cells form irregular locally connected clusters within islets that act in concert to secrete insulin upon glucose stimulation. Due to the central functional significance of this local connectivity in the placement of β cells in an islet, it is important to characterize it quantitatively. However, quantification of the seemingly stochastic cytoarchitecture of β cells in an islet requires mathematical methods that can capture topological connectivity in the entire β-cell population in an islet. Graph theory provides such a framework. Using large-scale imaging data for thousands of islets containing hundreds of thousands of cells in human organ donor pancreata, we show that quantitative graph characteristics differ between control and type 2 diabetic islets. Further insight into the processes that shape and maintain this architecture is obtained by formulating a stochastic theory of β-cell rearrangement in whole islets, just as the normal equilibrium distribution of the Ornstein-Uhlenbeck process can be viewed as the result of the interplay between a random walk and a linear restoring force. Requiring that rearrangements maintain the observed quantitative topological graph characteristics strongly constrained possible processes. Our results suggest that β-cell rearrangement is dependent on its connectivity in order to maintain an optimal cluster size in both normal and T2D islets.
Bayesian investigation of isochrone consistency using the old open cluster NGC 188
Energy Technology Data Exchange (ETDEWEB)
Hills, Shane; Courteau, Stéphane [Department of Physics, Engineering Physics and Astronomy, Queen’s University, Kingston, ON K7L 3N6 Canada (Canada); Von Hippel, Ted [Department of Physical Sciences, Embry-Riddle Aeronautical University, Daytona Beach, FL 32114 (United States); Geller, Aaron M., E-mail: shane.hills@queensu.ca, E-mail: courteau@astro.queensu.ca, E-mail: ted.vonhippel@erau.edu, E-mail: a-geller@northwestern.edu [Center for Interdisciplinary Exploration and Research in Astrophysics (CIERA) and Department of Physics and Astronomy, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208 (United States)
2015-03-01
This paper provides a detailed comparison of the differences in parameters derived for a star cluster from its color–magnitude diagrams (CMDs) depending on the filters and models used. We examine the consistency and reliability of fitting three widely used stellar evolution models to 15 combinations of optical and near-IR photometry for the old open cluster NGC 188. The optical filter response curves match those of theoretical systems and are thus not the source of fit inconsistencies. NGC 188 is ideally suited to this study thanks to a wide variety of high-quality photometry and available proper motions and radial velocities that enable us to remove non-cluster members and many binaries. Our Bayesian fitting technique yields inferred values of age, metallicity, distance modulus, and absorption as a function of the photometric band combinations and stellar models. We show that the historically favored three-band combinations of UBV and VRI can be meaningfully inconsistent with each other and with longer baseline data sets such as UBVRIJHK{sub S}. Differences among model sets can also be substantial. For instance, fitting Yi et al. (2001) and Dotter et al. (2008) models to UBVRIJHK{sub S} photometry for NGC 188 yields the following cluster parameters: age = (5.78 ± 0.03, 6.45 ± 0.04) Gyr, [Fe/H] = (+0.125 ± 0.003, −0.077 ± 0.003) dex, (m−M){sub V} = (11.441 ± 0.007, 11.525 ± 0.005) mag, and A{sub V} = (0.162 ± 0.003, 0.236 ± 0.003) mag, respectively. Within the formal fitting errors, these two fits are substantially and statistically different. Such differences among fits using different filters and models are a cautionary tale regarding our current ability to fit star cluster CMDs. Additional modeling of this kind, with more models and star clusters, and future Gaia parallaxes are critical for isolating and quantifying the most relevant uncertainties in stellar evolutionary models.
Directory of Open Access Journals (Sweden)
Ali Dashti
Full Text Available This paper presents an implementation of the brute-force exact k-Nearest Neighbor Graph (k-NNG construction for ultra-large high-dimensional data cloud. The proposed method uses Graphics Processing Units (GPUs and is scalable with multi-levels of parallelism (between nodes of a cluster, between different GPUs on a single node, and within a GPU. The method is applicable to homogeneous computing clusters with a varying number of nodes and GPUs per node. We achieve a 6-fold speedup in data processing as compared with an optimized method running on a cluster of CPUs and bring a hitherto impossible [Formula: see text]-NNG generation for a dataset of twenty million images with 15 k dimensionality into the realm of practical possibility.
Dipnall, J F; Pasco, J A; Berk, M; Williams, L J; Dodd, S; Jacka, F N; Meyer, D
2017-01-01
Key lifestyle-environ risk factors are operative for depression, but it is unclear how risk factors cluster. Machine-learning (ML) algorithms exist that learn, extract, identify and map underlying patterns to identify groupings of depressed individuals without constraints. The aim of this research was to use a large epidemiological study to identify and characterise depression clusters through "Graphing lifestyle-environs using machine-learning methods" (GLUMM). Two ML algorithms were implemented: unsupervised Self-organised mapping (SOM) to create GLUMM clusters and a supervised boosted regression algorithm to describe clusters. Ninety-six "lifestyle-environ" variables were used from the National health and nutrition examination study (2009-2010). Multivariate logistic regression validated clusters and controlled for possible sociodemographic confounders. The SOM identified two GLUMM cluster solutions. These solutions contained one dominant depressed cluster (GLUMM5-1, GLUMM7-1). Equal proportions of members in each cluster rated as highly depressed (17%). Alcohol consumption and demographics validated clusters. Boosted regression identified GLUMM5-1 as more informative than GLUMM7-1. Members were more likely to: have problems sleeping; unhealthy eating; ≤2 years in their home; an old home; perceive themselves underweight; exposed to work fumes; experienced sex at ≤14 years; not perform moderate recreational activities. A positive relationship between GLUMM5-1 (OR: 7.50, Pdepression was found, with significant interactions with those married/living with partner (P=0.001). Using ML based GLUMM to form ordered depressive clusters from multitudinous lifestyle-environ variables enabled a deeper exploration of the heterogeneous data to uncover better understandings into relationships between the complex mental health factors. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Cucchi, K.; Kawa, N.; Hesse, F.; Rubin, Y.
2017-12-01
In order to reduce uncertainty in the prediction of subsurface flow and transport processes, practitioners should use all data available. However, classic inverse modeling frameworks typically only make use of information contained in in-situ field measurements to provide estimates of hydrogeological parameters. Such hydrogeological information about an aquifer is difficult and costly to acquire. In this data-scarce context, the transfer of ex-situ information coming from previously investigated sites can be critical for improving predictions by better constraining the estimation procedure. Bayesian inverse modeling provides a coherent framework to represent such ex-situ information by virtue of the prior distribution and combine them with in-situ information from the target site. In this study, we present an innovative data-driven approach for defining such informative priors for hydrogeological parameters at the target site. Our approach consists in two steps, both relying on statistical and machine learning methods. The first step is data selection; it consists in selecting sites similar to the target site. We use clustering methods for selecting similar sites based on observable hydrogeological features. The second step is data assimilation; it consists in assimilating data from the selected similar sites into the informative prior. We use a Bayesian hierarchical model to account for inter-site variability and to allow for the assimilation of multiple types of site-specific data. We present the application and validation of the presented methods on an established database of hydrogeological parameters. Data and methods are implemented in the form of an open-source R-package and therefore facilitate easy use by other practitioners.
Dexter, Alex; Race, Alan M; Steven, Rory T; Barnes, Jennifer R; Hulme, Heather; Goodwin, Richard J A; Styles, Iain B; Bunch, Josephine
2017-11-07
Clustering is widely used in MSI to segment anatomical features and differentiate tissue types, but existing approaches are both CPU and memory-intensive, limiting their application to small, single data sets. We propose a new approach that uses a graph-based algorithm with a two-phase sampling method that overcomes this limitation. We demonstrate the algorithm on a range of sample types and show that it can segment anatomical features that are not identified using commonly employed algorithms in MSI, and we validate our results on synthetic MSI data. We show that the algorithm is robust to fluctuations in data quality by successfully clustering data with a designed-in variance using data acquired with varying laser fluence. Finally, we show that this method is capable of generating accurate segmentations of large MSI data sets acquired on the newest generation of MSI instruments and evaluate these results by comparison with histopathology.
Co-Clustering by Bipartite Spectral Graph Partitioning for Out-of-Tutor Prediction
Trivedi, Shubhendu; Pardos, Zachary A.; Sarkozy, Gabor N.; Heffernan, Neil T.
2012-01-01
Learning a more distributed representation of the input feature space is a powerful method to boost the performance of a given predictor. Often this is accomplished by partitioning the data into homogeneous groups by clustering so that separate models could be trained on each cluster. Intuitively each such predictor is a better representative of…
Theofilatos, Konstantinos; Pavlopoulou, Niki; Papasavvas, Christoforos; Likothanassis, Spiros; Dimitrakopoulos, Christos; Georgopoulos, Efstratios; Moschopoulos, Charalampos; Mavroudi, Seferina
2015-03-01
Proteins are considered to be the most important individual components of biological systems and they combine to form physical protein complexes which are responsible for certain molecular functions. Despite the large availability of protein-protein interaction (PPI) information, not much information is available about protein complexes. Experimental methods are limited in terms of time, efficiency, cost and performance constraints. Existing computational methods have provided encouraging preliminary results, but they phase certain disadvantages as they require parameter tuning, some of them cannot handle weighted PPI data and others do not allow a protein to participate in more than one protein complex. In the present paper, we propose a new fully unsupervised methodology for predicting protein complexes from weighted PPI graphs. The proposed methodology is called evolutionary enhanced Markov clustering (EE-MC) and it is a hybrid combination of an adaptive evolutionary algorithm and a state-of-the-art clustering algorithm named enhanced Markov clustering. EE-MC was compared with state-of-the-art methodologies when applied to datasets from the human and the yeast Saccharomyces cerevisiae organisms. Using public available datasets, EE-MC outperformed existing methodologies (in some datasets the separation metric was increased by 10-20%). Moreover, when applied to new human datasets its performance was encouraging in the prediction of protein complexes which consist of proteins with high functional similarity. In specific, 5737 protein complexes were predicted and 72.58% of them are enriched for at least one gene ontology (GO) function term. EE-MC is by design able to overcome intrinsic limitations of existing methodologies such as their inability to handle weighted PPI networks, their constraint to assign every protein in exactly one cluster and the difficulties they face concerning the parameter tuning. This fact was experimentally validated and moreover, new
DEFF Research Database (Denmark)
Glover, Kevin A.; Hansen, Michael Møller; Skaala, Oystein
2009-01-01
44 cages located on 26 farms in the Hardangerfjord, western Norway. This fjord represents one of the major salmon farming areas in Norway, with a production of 57,000 t in 2007. Based upon genetic data from 17 microsatellite markers, significant but highly variable differentiation was observed among....... Accuracy of assignment varied greatly among the individual samples. For the Bayesian clustered data set consisting of five genetic groups, overall accuracy of self-assignment was 99%, demonstrating the effectiveness of this strategy to significantly increase accuracy of assignment, albeit at the expense...
Jing Chen
2015-01-01
This study takes the concept of food logistics distribution as the breakthrough point, by means of the aim of optimization of food logistics distribution routes and analysis of the optimization model of food logistics route, as well as the interpretation of the genetic algorithm, it discusses the optimization of food logistics distribution route based on genetic and cluster scheme algorithm.
Graph embedding with rich information through heterogeneous graph
Sun, Guolei
2017-01-01
Graph embedding, aiming to learn low-dimensional representations for nodes in graphs, has attracted increasing attention due to its critical application including node classification, link prediction and clustering in social network analysis. Most
Generating Realistic Labelled, Weighted Random Graphs
Directory of Open Access Journals (Sweden)
Michael Charles Davis
2015-12-01
Full Text Available Generative algorithms for random graphs have yielded insights into the structure and evolution of real-world networks. Most networks exhibit a well-known set of properties, such as heavy-tailed degree distributions, clustering and community formation. Usually, random graph models consider only structural information, but many real-world networks also have labelled vertices and weighted edges. In this paper, we present a generative model for random graphs with discrete vertex labels and numeric edge weights. The weights are represented as a set of Beta Mixture Models (BMMs with an arbitrary number of mixtures, which are learned from real-world networks. We propose a Bayesian Variational Inference (VI approach, which yields an accurate estimation while keeping computation times tractable. We compare our approach to state-of-the-art random labelled graph generators and an earlier approach based on Gaussian Mixture Models (GMMs. Our results allow us to draw conclusions about the contribution of vertex labels and edge weights to graph structure.
Zhang, Linlin; Guindani, Michele; Versace, Francesco; Vannucci, Marina
2014-07-15
In this paper we present a novel wavelet-based Bayesian nonparametric regression model for the analysis of functional magnetic resonance imaging (fMRI) data. Our goal is to provide a joint analytical framework that allows to detect regions of the brain which exhibit neuronal activity in response to a stimulus and, simultaneously, infer the association, or clustering, of spatially remote voxels that exhibit fMRI time series with similar characteristics. We start by modeling the data with a hemodynamic response function (HRF) with a voxel-dependent shape parameter. We detect regions of the brain activated in response to a given stimulus by using mixture priors with a spike at zero on the coefficients of the regression model. We account for the complex spatial correlation structure of the brain by using a Markov random field (MRF) prior on the parameters guiding the selection of the activated voxels, therefore capturing correlation among nearby voxels. In order to infer association of the voxel time courses, we assume correlated errors, in particular long memory, and exploit the whitening properties of discrete wavelet transforms. Furthermore, we achieve clustering of the voxels by imposing a Dirichlet process (DP) prior on the parameters of the long memory process. For inference, we use Markov Chain Monte Carlo (MCMC) sampling techniques that combine Metropolis-Hastings schemes employed in Bayesian variable selection with sampling algorithms for nonparametric DP models. We explore the performance of the proposed model on simulated data, with both block- and event-related design, and on real fMRI data. Copyright © 2014 Elsevier Inc. All rights reserved.
A nonparametric Bayesian approach for clustering bisulfate-based DNA methylation profiles.
Zhang, Lin; Meng, Jia; Liu, Hui; Huang, Yufei
2012-01-01
DNA methylation occurs in the context of a CpG dinucleotide. It is an important epigenetic modification, which can be inherited through cell division. The two major types of methylation include hypomethylation and hypermethylation. Unique methylation patterns have been shown to exist in diseases including various types of cancer. DNA methylation analysis promises to become a powerful tool in cancer diagnosis, treatment and prognostication. Large-scale methylation arrays are now available for studying methylation genome-wide. The Illumina methylation platform simultaneously measures cytosine methylation at more than 1500 CpG sites associated with over 800 cancer-related genes. Cluster analysis is often used to identify DNA methylation subgroups for prognosis and diagnosis. However, due to the unique non-Gaussian characteristics, traditional clustering methods may not be appropriate for DNA and methylation data, and the determination of optimal cluster number is still problematic. A Dirichlet process beta mixture model (DPBMM) is proposed that models the DNA methylation expressions as an infinite number of beta mixture distribution. The model allows automatic learning of the relevant parameters such as the cluster mixing proportion, the parameters of beta distribution for each cluster, and especially the number of potential clusters. Since the model is high dimensional and analytically intractable, we proposed a Gibbs sampling "no-gaps" solution for computing the posterior distributions, hence the estimates of the parameters. The proposed algorithm was tested on simulated data as well as methylation data from 55 Glioblastoma multiform (GBM) brain tissue samples. To reduce the computational burden due to the high data dimensionality, a dimension reduction method is adopted. The two GBM clusters yielded by DPBMM are based on data of different number of loci (P-value < 0.1), while hierarchical clustering cannot yield statistically significant clusters.
Novokmet, Natalija; Galov, Ana; Marjanović, Damir; Škaro, Vedrana; Projić, Petar; Lauc, Gordan; Primorac, Dragan; Rudan, Pavao
2015-01-01
The European Roma represent a transnational mosaic of minority population groups with different migration histories and contrasting experiences in their interactions with majority populations across the European continent. Although historical genetic contributions of European lineages to the Roma pool were investigated before, the extent of contemporary genetic admixture between Bayash Roma and non-Romani majority population remains elusive. The aim of this study was to assess the genetic structure of the Bayash Roma population from northwestern Croatia and the general Croatian population and to investigate the extent of admixture between them. A set of genetic data from two original studies (100 Bayash Roma from northwestern Croatia and 195 individuals from the general Croatian population) was analyzed by Bayesian clustering implemented in STRUCTURE software. By re-analyzing published data we intended to focus for the first time on genetic differentiation and structure and in doing so we clearly pointed to the importance of considering social phenomena in understanding genetic structuring. Our results demonstrated that two population clusters best explain the genetic structure, which is consistent with social exclusion of Roma and the demographic history of Bayash Roma who have settled in NW Croatia only about 150 years ago and mostly applied rules of endogamy. The presence of admixture was revealed, while the percentage of non-Croatian individuals in general Croatian population was approximately twofold higher than the percentage of non-Romani individuals in Roma population corroborating the presence of ethnomimicry in Roma.
Hensman, James; Lawrence, Neil D; Rattray, Magnus
2013-08-20
Time course data from microarrays and high-throughput sequencing experiments require simple, computationally efficient and powerful statistical models to extract meaningful biological signal, and for tasks such as data fusion and clustering. Existing methodologies fail to capture either the temporal or replicated nature of the experiments, and often impose constraints on the data collection process, such as regularly spaced samples, or similar sampling schema across replications. We propose hierarchical Gaussian processes as a general model of gene expression time-series, with application to a variety of problems. In particular, we illustrate the method's capacity for missing data imputation, data fusion and clustering.The method can impute data which is missing both systematically and at random: in a hold-out test on real data, performance is significantly better than commonly used imputation methods. The method's ability to model inter- and intra-cluster variance leads to more biologically meaningful clusters. The approach removes the necessity for evenly spaced samples, an advantage illustrated on a developmental Drosophila dataset with irregular replications. The hierarchical Gaussian process model provides an excellent statistical basis for several gene-expression time-series tasks. It has only a few additional parameters over a regular GP, has negligible additional complexity, is easily implemented and can be integrated into several existing algorithms. Our experiments were implemented in python, and are available from the authors' website: http://staffwww.dcs.shef.ac.uk/people/J.Hensman/.
Bayesian clustering of DNA sequences using Markov chains and a stochastic partition model.
Jääskinen, Väinö; Parkkinen, Ville; Cheng, Lu; Corander, Jukka
2014-02-01
In many biological applications it is necessary to cluster DNA sequences into groups that represent underlying organismal units, such as named species or genera. In metagenomics this grouping needs typically to be achieved on the basis of relatively short sequences which contain different types of errors, making the use of a statistical modeling approach desirable. Here we introduce a novel method for this purpose by developing a stochastic partition model that clusters Markov chains of a given order. The model is based on a Dirichlet process prior and we use conjugate priors for the Markov chain parameters which enables an analytical expression for comparing the marginal likelihoods of any two partitions. To find a good candidate for the posterior mode in the partition space, we use a hybrid computational approach which combines the EM-algorithm with a greedy search. This is demonstrated to be faster and yield highly accurate results compared to earlier suggested clustering methods for the metagenomics application. Our model is fairly generic and could also be used for clustering of other types of sequence data for which Markov chains provide a reasonable way to compress information, as illustrated by experiments on shotgun sequence type data from an Escherichia coli strain.
Endriss, U.; Grandi, U.
Graph aggregation is the process of computing a single output graph that constitutes a good compromise between several input graphs, each provided by a different source. One needs to perform graph aggregation in a wide variety of situations, e.g., when applying a voting rule (graphs as preference
Ebrahimian, Hossein; Jalayer, Fatemeh
2017-08-29
In the immediate aftermath of a strong earthquake and in the presence of an ongoing aftershock sequence, scientific advisories in terms of seismicity forecasts play quite a crucial role in emergency decision-making and risk mitigation. Epidemic Type Aftershock Sequence (ETAS) models are frequently used for forecasting the spatio-temporal evolution of seismicity in the short-term. We propose robust forecasting of seismicity based on ETAS model, by exploiting the link between Bayesian inference and Markov Chain Monte Carlo Simulation. The methodology considers the uncertainty not only in the model parameters, conditioned on the available catalogue of events occurred before the forecasting interval, but also the uncertainty in the sequence of events that are going to happen during the forecasting interval. We demonstrate the methodology by retrospective early forecasting of seismicity associated with the 2016 Amatrice seismic sequence activities in central Italy. We provide robust spatio-temporal short-term seismicity forecasts with various time intervals in the first few days elapsed after each of the three main events within the sequence, which can predict the seismicity within plus/minus two standard deviations from the mean estimate within the few hours elapsed after the main event.
Directory of Open Access Journals (Sweden)
Arturo Medrano-Soto
2004-12-01
Full Text Available Based on mixture models, we present a Bayesian method (called BClass to classify biological entities (e.g. genes when variables of quite heterogeneous nature are analyzed. Various statistical distributions are used to model the continuous/categorical data commonly produced by genetic experiments and large-scale genomic projects. We calculate the posterior probability of each entry to belong to each element (group in the mixture. In this way, an original set of heterogeneous variables is transformed into a set of purely homogeneous characteristics represented by the probabilities of each entry to belong to the groups. The number of groups in the analysis is controlled dynamically by rendering the groups as 'alive' and 'dormant' depending upon the number of entities classified within them. Using standard Metropolis-Hastings and Gibbs sampling algorithms, we constructed a sampler to approximate posterior moments and grouping probabilities. Since this method does not require the definition of similarity measures, it is especially suitable for data mining and knowledge discovery in biological databases. We applied BClass to classify genes in RegulonDB, a database specialized in information about the transcriptional regulation of gene expression in the bacterium Escherichia coli. The classification obtained is consistent with current knowledge and allowed prediction of missing values for a number of genes. BClass is object-oriented and fully programmed in Lisp-Stat. The output grouping probabilities are analyzed and interpreted using graphical (dynamically linked plots and query-based approaches. We discuss the advantages of using Lisp-Stat as a programming language as well as the problems we faced when the data volume increased exponentially due to the ever-growing number of genomic projects.
Directory of Open Access Journals (Sweden)
Urbi Garay
2016-03-01
Full Text Available We define a dynamic and self-adjusting mixture of Gaussian Graphical Models to cluster financial returns, and provide a new method for extraction of nonparametric estimates of dynamic alphas (excess return and betas (to a choice set of explanatory factors in a multivariate setting. This approach, as well as the outputs, has a dynamic, nonstationary and nonparametric form, which circumvents the problem of model risk and parametric assumptions that the Kalman filter and other widely used approaches rely on. The by-product of clusters, used for shrinkage and information borrowing, can be of use to determine relationships around specific events. This approach exhibits a smaller Root Mean Squared Error than traditionally used benchmarks in financial settings, which we illustrate through simulation. As an illustration, we use hedge fund index data, and find that our estimated alphas are, on average, 0.13% per month higher (1.6% per year than alphas estimated through Ordinary Least Squares. The approach exhibits fast adaptation to abrupt changes in the parameters, as seen in our estimated alphas and betas, which exhibit high volatility, especially in periods which can be identified as times of stressful market events, a reflection of the dynamic positioning of hedge fund portfolio managers.
Zhang, L.-C.; Patone, M.
2017-01-01
We synthesise the existing theory of graph sampling. We propose a formal definition of sampling in finite graphs, and provide a classification of potential graph parameters. We develop a general approach of Horvitz–Thompson estimation to T-stage snowball sampling, and present various reformulations of some common network sampling methods in the literature in terms of the outlined graph sampling theory.
Hendrix, William; Jenkins, John; Padmanabhan, Kanchana; Chakraborty, Arpan
2014-01-01
Practical Graph Mining with R presents a "do-it-yourself" approach to extracting interesting patterns from graph data. It covers many basic and advanced techniques for the identification of anomalous or frequently recurring patterns in a graph, the discovery of groups or clusters of nodes that share common patterns of attributes and relationships, the extraction of patterns that distinguish one category of graphs from another, and the use of those patterns to predict the category of new graphs. Hands-On Application of Graph Data Mining Each chapter in the book focuses on a graph mining task, such as link analysis, cluster analysis, and classification. Through applications using real data sets, the book demonstrates how computational techniques can help solve real-world problems. The applications covered include network intrusion detection, tumor cell diagnostics, face recognition, predictive toxicology, mining metabolic and protein-protein interaction networks, and community detection in social networks. De...
Bayesian networks and food security - An introduction
Stein, A.
2004-01-01
This paper gives an introduction to Bayesian networks. Networks are defined and put into a Bayesian context. Directed acyclical graphs play a crucial role here. Two simple examples from food security are addressed. Possible uses of Bayesian networks for implementation and further use in decision
Cluster consensus in discrete-time networks of multiagents with inter-cluster nonidentical inputs.
Han, Yujuan; Lu, Wenlian; Chen, Tianping
2013-04-01
In this paper, cluster consensus of multiagent systems is studied via inter-cluster nonidentical inputs. Here, we consider general graph topologies, which might be time-varying. The cluster consensus is defined by two aspects: intracluster synchronization, the state at which differences between each pair of agents in the same cluster converge to zero, and inter-cluster separation, the state at which agents in different clusters are separated. For intra-cluster synchronization, the concepts and theories of consensus, including the spanning trees, scramblingness, infinite stochastic matrix product, and Hajnal inequality, are extended. As a result, it is proved that if the graph has cluster spanning trees and all vertices self-linked, then the static linear system can realize intra-cluster synchronization. For the time-varying coupling cases, it is proved that if there exists T > 0 such that the union graph across any T-length time interval has cluster spanning trees and all graphs has all vertices self-linked, then the time-varying linear system can also realize intra-cluster synchronization. Under the assumption of common inter-cluster influence, a sort of inter-cluster nonidentical inputs are utilized to realize inter-cluster separation, such that each agent in the same cluster receives the same inputs and agents in different clusters have different inputs. In addition, the boundedness of the infinite sum of the inputs can guarantee the boundedness of the trajectory. As an application, we employ a modified non-Bayesian social learning model to illustrate the effectiveness of our results.
Brouwer, A.E.; Haemers, W.H.; Brouwer, A.E.; Haemers, W.H.
2012-01-01
This chapter presents some simple results on graph spectra.We assume the reader is familiar with elementary linear algebra and graph theory. Throughout, J will denote the all-1 matrix, and 1 is the all-1 vector.
Wang, Yishu; Zhao, Hongyu; Deng, Minghua; Fang, Huaying; Yang, Dejie
2017-08-24
Epistatic miniarrary profile (EMAP) studies have enabled the mapping of large-scale genetic interaction networks and generated large amounts of data in model organisms. It provides an incredible set of molecular tools and advanced technologies that should be efficiently understanding the relationship between the genotypes and phenotypes of individuals. However, the network information gained from EMAP cannot be fully exploited using the traditional statistical network models. Because the genetic network is always heterogeneous, for example, the network structure features for one subset of nodes are different from those of the left nodes. Exponentialfamily random graph models (ERGMs) are a family of statistical models, which provide a principled and flexible way to describe the structural features (e.g. the density, centrality and assortativity) of an observed network. However, the single ERGM is not enough to capture this heterogeneity of networks. In this paper, we consider a mixture ERGM (MixtureEGRM) networks, which model a network with several communities, where each community is described by a single EGRM.
Directory of Open Access Journals (Sweden)
Peixin Zhao
2014-01-01
Full Text Available This paper suggests a novel clustering method for analyzing the National Incident-Based Reporting System (NIBRS data, which include the determination of correlation of different crime types, the development of a likelihood index for crimes to occur in a jurisdiction, and the clustering of jurisdictions based on crime type. The method was tested by using the 2005 assault data from 121 jurisdictions in Virginia as a test case. The analyses of these data show that some different crime types are correlated and some different crime parameters are correlated with different crime types. The analyses also show that certain jurisdictions within Virginia share certain crime patterns. This information assists with constructing a pattern for a specific crime type and can be used to determine whether a jurisdiction may be more likely to see this type of crime occur in their area.
DEFF Research Database (Denmark)
Vestergaard, Preben Dahl; Hartnell, Bert L.
2006-01-01
There are many results dealing with the problem of decomposing a fixed graph into isomorphic subgraphs. There has also been work on characterizing graphs with the property that one can delete the edges of a number of edge disjoint copies of the subgraph and, regardless of how that is done, the gr...
Current trends in Bayesian methodology with applications
Upadhyay, Satyanshu K; Dey, Dipak K; Loganathan, Appaia
2015-01-01
Collecting Bayesian material scattered throughout the literature, Current Trends in Bayesian Methodology with Applications examines the latest methodological and applied aspects of Bayesian statistics. The book covers biostatistics, econometrics, reliability and risk analysis, spatial statistics, image analysis, shape analysis, Bayesian computation, clustering, uncertainty assessment, high-energy astrophysics, neural networking, fuzzy information, objective Bayesian methodologies, empirical Bayes methods, small area estimation, and many more topics.Each chapter is self-contained and focuses on
Directory of Open Access Journals (Sweden)
Susanne Altermann
Full Text Available The inclusion of molecular data is increasingly an integral part of studies assessing species boundaries. Analyses based on predefined groups may obscure patterns of differentiation, and population assignment tests provide an alternative for identifying population structure and barriers to gene flow. In this study, we apply population assignment tests implemented in the programs STRUCTURE and BAPS to single nucleotide polymorphisms from DNA sequence data generated for three previous studies of the lichenized fungal genus Letharia. Previous molecular work employing a gene genealogical approach circumscribed six species-level lineages within the genus, four putative lineages within the nominal taxon L. columbiana (Nutt. J.W. Thomson and two sorediate lineages. We show that Bayesian clustering implemented in the program STRUCTURE was generally able to recover the same six putative Letharia lineages. Population assignments were largely consistent across a range of scenarios, including: extensive amounts of missing data, the exclusion of SNPs from variable markers, and inferences based on SNPs from as few as three gene regions. While our study provided additional evidence corroborating the six candidate Letharia species, the equivalence of these genetic clusters with species-level lineages is uncertain due, in part, to limited phylogenetic signal. Furthermore, both the BAPS analysis and the ad hoc ΔK statistic from results of the STRUCTURE analysis suggest that population structure can possibly be captured with fewer genetic groups. Our findings also suggest that uneven sampling across taxa may be responsible for the contrasting inferences of population substructure. Our results consistently supported two distinct sorediate groups, 'L. lupina' and L. vulpina, and subtle morphological differences support this distinction. Similarly, the putative apotheciate species 'L. lucida' was also consistently supported as a distinct genetic cluster. However
DEFF Research Database (Denmark)
Seiller, Thomas
2016-01-01
Interaction graphs were introduced as a general, uniform, construction of dynamic models of linear logic, encompassing all Geometry of Interaction (GoI) constructions introduced so far. This series of work was inspired from Girard's hyperfinite GoI, and develops a quantitative approach that should...... be understood as a dynamic version of weighted relational models. Until now, the interaction graphs framework has been shown to deal with exponentials for the constrained system ELL (Elementary Linear Logic) while keeping its quantitative aspect. Adapting older constructions by Girard, one can clearly define...... "full" exponentials, but at the cost of these quantitative features. We show here that allowing interpretations of proofs to use continuous (yet finite in a measure-theoretic sense) sets of states, as opposed to earlier Interaction Graphs constructions were these sets of states were discrete (and finite...
Trudeau, Richard J
1994-01-01
Preface1. Pure Mathematics Introduction; Euclidean Geometry as Pure Mathematics; Games; Why Study Pure Mathematics?; What's Coming; Suggested Reading2. Graphs Introduction; Sets; Paradox; Graphs; Graph diagrams; Cautions; Common Graphs; Discovery; Complements and Subgraphs; Isomorphism; Recognizing Isomorphic Graphs; Semantics The Number of Graphs Having a Given nu; Exercises; Suggested Reading3. Planar Graphs Introduction; UG, K subscript 5, and the Jordan Curve Theorem; Are there More Nonplanar Graphs?; Expansions; Kuratowski's Theorem; Determining Whether a Graph is Planar or
Bell inequalities for graph states
International Nuclear Information System (INIS)
Toth, G.; Hyllus, P.; Briegel, H.J.; Guehne, O.
2005-01-01
Full text: In the last years graph states have attracted an increasing interest in the field of quantum information theory. Graph states form a family of multi-qubit states which comprises many popular states such as the GHZ states and the cluster states. They also play an important role in applications. For instance, measurement based quantum computation uses graph states as resources. From a theoretical point of view, it is remarkable that graph states allow for a simple description in terms of stabilizing operators. In this contribution, we investigate the non-local properties of graph states. We derive a family of Bell inequalities which require three measurement settings for each party and are maximally violated by graph states. In turn, any graph state violates at least one of the inequalities. We show that for certain types of graph states the violation of these inequalities increases exponentially with the number of qubits. We also discuss connections to other entanglement properties such as the positively of the partial transpose or the geometric measure of entanglement. (author)
Network reconstruction via graph blending
Estrada, Rolando
2016-05-01
Graphs estimated from empirical data are often noisy and incomplete due to the difficulty of faithfully observing all the components (nodes and edges) of the true graph. This problem is particularly acute for large networks where the number of components may far exceed available surveillance capabilities. Errors in the observed graph can render subsequent analyses invalid, so it is vital to develop robust methods that can minimize these observational errors. Errors in the observed graph may include missing and spurious components, as well fused (multiple nodes are merged into one) and split (a single node is misinterpreted as many) nodes. Traditional graph reconstruction methods are only able to identify missing or spurious components (primarily edges, and to a lesser degree nodes), so we developed a novel graph blending framework that allows us to cast the full estimation problem as a simple edge addition/deletion problem. Armed with this framework, we systematically investigate the viability of various topological graph features, such as the degree distribution or the clustering coefficients, and existing graph reconstruction methods for tackling the full estimation problem. Our experimental results suggest that incorporating any topological feature as a source of information actually hinders reconstruction accuracy. We provide a theoretical analysis of this phenomenon and suggest several avenues for improving this estimation problem.
Diestel, Reinhard
2017-01-01
This standard textbook of modern graph theory, now in its fifth edition, combines the authority of a classic with the engaging freshness of style that is the hallmark of active mathematics. It covers the core material of the subject with concise yet reliably complete proofs, while offering glimpses of more advanced methods in each field by one or two deeper results, again with proofs given in full detail. The book can be used as a reliable text for an introductory course, as a graduate text, and for self-study. From the reviews: “This outstanding book cannot be substituted with any other book on the present textbook market. It has every chance of becoming the standard textbook for graph theory.”Acta Scientiarum Mathematiciarum “Deep, clear, wonderful. This is a serious book about the heart of graph theory. It has depth and integrity. ”Persi Diaconis & Ron Graham, SIAM Review “The book has received a very enthusiastic reception, which it amply deserves. A masterly elucidation of modern graph theo...
Graph embedding with rich information through heterogeneous graph
Sun, Guolei
2017-11-12
Graph embedding, aiming to learn low-dimensional representations for nodes in graphs, has attracted increasing attention due to its critical application including node classification, link prediction and clustering in social network analysis. Most existing algorithms for graph embedding only rely on the topology information and fail to use the copious information in nodes as well as edges. As a result, their performance for many tasks may not be satisfactory. In this thesis, we proposed a novel and general framework for graph embedding with rich text information (GERI) through constructing a heterogeneous network, in which we integrate node and edge content information with graph topology. Specially, we designed a novel biased random walk to explore the constructed heterogeneous network with the notion of flexible neighborhood. Our sampling strategy can compromise between BFS and DFS local search on heterogeneous graph. To further improve our algorithm, we proposed semi-supervised GERI (SGERI), which learns graph embedding in an discriminative manner through heterogeneous network with label information. The efficacy of our method is demonstrated by extensive comparison experiments with 9 baselines over multi-label and multi-class classification on various datasets including Citeseer, Cora, DBLP and Wiki. It shows that GERI improves the Micro-F1 and Macro-F1 of node classification up to 10%, and SGERI improves GERI by 5% in Wiki.
Gould, Ronald
2012-01-01
This introduction to graph theory focuses on well-established topics, covering primary techniques and including both algorithmic and theoretical problems. The algorithms are presented with a minimum of advanced data structures and programming details. This thoroughly corrected 1988 edition provides insights to computer scientists as well as advanced undergraduates and graduate students of topology, algebra, and matrix theory. Fundamental concepts and notation and elementary properties and operations are the first subjects, followed by examinations of paths and searching, trees, and networks. S
Chartrand, Gary; Zhang, Ping
2010-01-01
Gary Chartrand has influenced the world of Graph Theory for almost half a century. He has supervised more than a score of Ph.D. dissertations and written several books on the subject. The most widely known of these texts, Graphs and Digraphs, … has much to recommend it, with clear exposition, and numerous challenging examples [that] make it an ideal textbook for the advanced undergraduate or beginning graduate course. The authors have updated their notation to reflect the current practice in this still-growing area of study. By the authors' estimation, the 5th edition is approximately 50% longer than the 4th edition. … the legendary Frank Harary, author of the second graph theory text ever produced, is one of the figures profiled. His book was the standard in the discipline for several decades. Chartrand, Lesniak and Zhang have produced a worthy successor.-John T. Saccoman, MAA Reviews, June 2012 (This book is in the MAA's basic library list.)As with the earlier editions, the current text emphasizes clear...
Bayes Academy - An Educational Game for Learning Bayesian Networks
Sotala, Kaj
2015-01-01
This thesis describes the development of 'Bayes Academy', an educational game which aims to teach an understanding of Bayesian networks. A Bayesian network is a directed acyclic graph describing a joint probability distribution function over n random variables, where each node in the graph represents a random variable. To find a way to turn this subject into an interesting game, this work draws on the theoretical background of meaningful play. Among other requirements, actions in the game...
Adaptation of Chain Event Graphs for use with Case-Control Studies in Epidemiology.
Keeble, Claire; Thwaites, Peter Adam; Barber, Stuart; Law, Graham Richard; Baxter, Paul David
2017-09-26
Case-control studies are used in epidemiology to try to uncover the causes of diseases, but are a retrospective study design known to suffer from non-participation and recall bias, which may explain their decreased popularity in recent years. Traditional analyses report usually only the odds ratio for given exposures and the binary disease status. Chain event graphs are a graphical representation of a statistical model derived from event trees which have been developed in artificial intelligence and statistics, and only recently introduced to the epidemiology literature. They are a modern Bayesian technique which enable prior knowledge to be incorporated into the data analysis using the agglomerative hierarchical clustering algorithm, used to form a suitable chain event graph. Additionally, they can account for missing data and be used to explore missingness mechanisms. Here we adapt the chain event graph framework to suit scenarios often encountered in case-control studies, to strengthen this study design which is time and financially efficient. We demonstrate eight adaptations to the graphs, which consist of two suitable for full case-control study analysis, four which can be used in interim analyses to explore biases, and two which aim to improve the ease and accuracy of analyses. The adaptations are illustrated with complete, reproducible, fully-interpreted examples, including the event tree and chain event graph. Chain event graphs are used here for the first time to summarise non-participation, data collection techniques, data reliability, and disease severity in case-control studies. We demonstrate how these features of a case-control study can be incorporated into the analysis to provide further insight, which can help to identify potential biases and lead to more accurate study results.
Chartrand, Gary; Rosen, Kenneth H
2008-01-01
Beginning with the origin of the four color problem in 1852, the field of graph colorings has developed into one of the most popular areas of graph theory. Introducing graph theory with a coloring theme, Chromatic Graph Theory explores connections between major topics in graph theory and graph colorings as well as emerging topics. This self-contained book first presents various fundamentals of graph theory that lie outside of graph colorings, including basic terminology and results, trees and connectivity, Eulerian and Hamiltonian graphs, matchings and factorizations, and graph embeddings. The remainder of the text deals exclusively with graph colorings. It covers vertex colorings and bounds for the chromatic number, vertex colorings of graphs embedded on surfaces, and a variety of restricted vertex colorings. The authors also describe edge colorings, monochromatic and rainbow edge colorings, complete vertex colorings, several distinguishing vertex and edge colorings, and many distance-related vertex coloring...
Lesaffre, Emmanuel
2012-01-01
The growth of biostatistics has been phenomenal in recent years and has been marked by considerable technical innovation in both methodology and computational practicality. One area that has experienced significant growth is Bayesian methods. The growing use of Bayesian methodology has taken place partly due to an increasing number of practitioners valuing the Bayesian paradigm as matching that of scientific discovery. In addition, computational advances have allowed for more complex models to be fitted routinely to realistic data sets. Through examples, exercises and a combination of introd
ZHOU, Lin
1996-01-01
In this paper I consider social choices under uncertainty. I prove that any social choice rule that satisfies independence of irrelevant alternatives, translation invariance, and weak anonymity is consistent with ex post Bayesian utilitarianism
Indian Academy of Sciences (India)
2017-09-27
Sep 27, 2017 ... Author for correspondence (zh4403701@126.com). MS received 15 ... lic clusters using density functional theory (DFT)-GGA of the DMOL3 package. ... In the process of geometric optimization, con- vergence thresholds ..... and Postgraduate Research & Practice Innovation Program of. Jiangsu Province ...
Indian Academy of Sciences (India)
environmental as well as technical problems during fuel gas utilization. ... adsorption on some alloys of Pd, namely PdAu, PdAg ... ried out on small neutral and charged Au24,26,27, Cu,28 ... study of Zanti et al.29 on Pdn (n = 1–9) clusters.
Using consensus bayesian network to model the reactive oxygen species regulatory pathway.
Directory of Open Access Journals (Sweden)
Liangdong Hu
Full Text Available Bayesian network is one of the most successful graph models for representing the reactive oxygen species regulatory pathway. With the increasing number of microarray measurements, it is possible to construct the bayesian network from microarray data directly. Although large numbers of bayesian network learning algorithms have been developed, when applying them to learn bayesian networks from microarray data, the accuracies are low due to that the databases they used to learn bayesian networks contain too few microarray data. In this paper, we propose a consensus bayesian network which is constructed by combining bayesian networks from relevant literatures and bayesian networks learned from microarray data. It would have a higher accuracy than the bayesian networks learned from one database. In the experiment, we validated the bayesian network combination algorithm on several classic machine learning databases and used the consensus bayesian network to model the Escherichia coli's ROS pathway.
Interactive Graph Layout of a Million Nodes
Directory of Open Access Journals (Sweden)
Peng Mi
2016-12-01
Full Text Available Sensemaking of large graphs, specifically those with millions of nodes, is a crucial task in many fields. Automatic graph layout algorithms, augmented with real-time human-in-the-loop interaction, can potentially support sensemaking of large graphs. However, designing interactive algorithms to achieve this is challenging. In this paper, we tackle the scalability problem of interactive layout of large graphs, and contribute a new GPU-based force-directed layout algorithm that exploits graph topology. This algorithm can interactively layout graphs with millions of nodes, and support real-time interaction to explore alternative graph layouts. Users can directly manipulate the layout of vertices in a force-directed fashion. The complexity of traditional repulsive force computation is reduced by approximating calculations based on the hierarchical structure of multi-level clustered graphs. We evaluate the algorithm performance, and demonstrate human-in-the-loop layout in two sensemaking case studies. Moreover, we summarize lessons learned for designing interactive large graph layout algorithms on the GPU.
Graph visualization (Invited talk)
Wijk, van J.J.; Kreveld, van M.J.; Speckmann, B.
2012-01-01
Black and white node link diagrams are the classic method to depict graphs, but these often fall short to give insight in large graphs or when attributes of nodes and edges play an important role. Graph visualization aims obtaining insight in such graphs using interactive graphical representations.
Bayesian networks improve causal environmental ...
Rule-based weight of evidence approaches to ecological risk assessment may not account for uncertainties and generally lack probabilistic integration of lines of evidence. Bayesian networks allow causal inferences to be made from evidence by including causal knowledge about the problem, using this knowledge with probabilistic calculus to combine multiple lines of evidence, and minimizing biases in predicting or diagnosing causal relationships. Too often, sources of uncertainty in conventional weight of evidence approaches are ignored that can be accounted for with Bayesian networks. Specifying and propagating uncertainties improve the ability of models to incorporate strength of the evidence in the risk management phase of an assessment. Probabilistic inference from a Bayesian network allows evaluation of changes in uncertainty for variables from the evidence. The network structure and probabilistic framework of a Bayesian approach provide advantages over qualitative approaches in weight of evidence for capturing the impacts of multiple sources of quantifiable uncertainty on predictions of ecological risk. Bayesian networks can facilitate the development of evidence-based policy under conditions of uncertainty by incorporating analytical inaccuracies or the implications of imperfect information, structuring and communicating causal issues through qualitative directed graph formulations, and quantitatively comparing the causal power of multiple stressors on value
Bayesian ARTMAP for regression.
Sasu, L M; Andonie, R
2013-10-01
Bayesian ARTMAP (BA) is a recently introduced neural architecture which uses a combination of Fuzzy ARTMAP competitive learning and Bayesian learning. Training is generally performed online, in a single-epoch. During training, BA creates input data clusters as Gaussian categories, and also infers the conditional probabilities between input patterns and categories, and between categories and classes. During prediction, BA uses Bayesian posterior probability estimation. So far, BA was used only for classification. The goal of this paper is to analyze the efficiency of BA for regression problems. Our contributions are: (i) we generalize the BA algorithm using the clustering functionality of both ART modules, and name it BA for Regression (BAR); (ii) we prove that BAR is a universal approximator with the best approximation property. In other words, BAR approximates arbitrarily well any continuous function (universal approximation) and, for every given continuous function, there is one in the set of BAR approximators situated at minimum distance (best approximation); (iii) we experimentally compare the online trained BAR with several neural models, on the following standard regression benchmarks: CPU Computer Hardware, Boston Housing, Wisconsin Breast Cancer, and Communities and Crime. Our results show that BAR is an appropriate tool for regression tasks, both for theoretical and practical reasons. Copyright © 2013 Elsevier Ltd. All rights reserved.
Pragmatic Graph Rewriting Modifications
Rodgers, Peter; Vidal, Natalia
1999-01-01
We present new pragmatic constructs for easing programming in visual graph rewriting programming languages. The first is a modification to the rewriting process for nodes the host graph, where nodes specified as 'Once Only' in the LHS of a rewrite match at most once with a corresponding node in the host graph. This reduces the previously common use of tags to indicate the progress of matching in the graph. The second modification controls the application of LHS graphs, where those specified a...
A model of language inflection graphs
Fukś, Henryk; Farzad, Babak; Cao, Yi
2014-01-01
Inflection graphs are highly complex networks representing relationships between inflectional forms of words in human languages. For so-called synthetic languages, such as Latin or Polish, they have particularly interesting structure due to the abundance of inflectional forms. We construct the simplest form of inflection graphs, namely a bipartite graph in which one group of vertices corresponds to dictionary headwords and the other group to inflected forms encountered in a given text. We, then, study projection of this graph on the set of headwords. The projection decomposes into a large number of connected components, to be called word groups. Distribution of sizes of word group exhibits some remarkable properties, resembling cluster distribution in a lattice percolation near the critical point. We propose a simple model which produces graphs of this type, reproducing the desired component distribution and other topological features.
DEFF Research Database (Denmark)
Mørup, Morten; Schmidt, Mikkel N
2012-01-01
Many networks of scientific interest naturally decompose into clusters or communities with comparatively fewer external than internal links; however, current Bayesian models of network communities do not exert this intuitive notion of communities. We formulate a nonparametric Bayesian model...... for community detection consistent with an intuitive definition of communities and present a Markov chain Monte Carlo procedure for inferring the community structure. A Matlab toolbox with the proposed inference procedure is available for download. On synthetic and real networks, our model detects communities...... consistent with ground truth, and on real networks, it outperforms existing approaches in predicting missing links. This suggests that community structure is an important structural property of networks that should be explicitly modeled....
Bapat, Ravindra B
2014-01-01
This new edition illustrates the power of linear algebra in the study of graphs. The emphasis on matrix techniques is greater than in other texts on algebraic graph theory. Important matrices associated with graphs (for example, incidence, adjacency and Laplacian matrices) are treated in detail. Presenting a useful overview of selected topics in algebraic graph theory, early chapters of the text focus on regular graphs, algebraic connectivity, the distance matrix of a tree, and its generalized version for arbitrary graphs, known as the resistance matrix. Coverage of later topics include Laplacian eigenvalues of threshold graphs, the positive definite completion problem and matrix games based on a graph. Such an extensive coverage of the subject area provides a welcome prompt for further exploration. The inclusion of exercises enables practical learning throughout the book. In the new edition, a new chapter is added on the line graph of a tree, while some results in Chapter 6 on Perron-Frobenius theory are reo...
Adaptive Graph Convolutional Neural Networks
Li, Ruoyu; Wang, Sheng; Zhu, Feiyun; Huang, Junzhou
2018-01-01
Graph Convolutional Neural Networks (Graph CNNs) are generalizations of classical CNNs to handle graph data such as molecular data, point could and social networks. Current filters in graph CNNs are built for fixed and shared graph structure. However, for most real data, the graph structures varies in both size and connectivity. The paper proposes a generalized and flexible graph CNN taking data of arbitrary graph structure as input. In that way a task-driven adaptive graph is learned for eac...
Directory of Open Access Journals (Sweden)
C. Dalfo
2015-10-01
Full Text Available We study a family of graphs related to the $n$-cube. The middle cube graph of parameter k is the subgraph of $Q_{2k-1}$ induced by the set of vertices whose binary representation has either $k-1$ or $k$ number of ones. The middle cube graphs can be obtained from the well-known odd graphs by doubling their vertex set. Here we study some of the properties of the middle cube graphs in the light of the theory of distance-regular graphs. In particular, we completely determine their spectra (eigenvalues and their multiplicities, and associated eigenvectors.
Soetevent, A.R.
2010-01-01
This paper extends Hotelling's model of price competition with quadratic transportation costs from a line to graphs. I propose an algorithm to calculate firm-level demand for any given graph, conditional on prices and firm locations. One feature of graph models of price competition is that spatial
Graphing Inequalities, Connecting Meaning
Switzer, J. Matt
2014-01-01
Students often have difficulty with graphing inequalities (see Filloy, Rojano, and Rubio 2002; Drijvers 2002), and J. Matt Switzer's students were no exception. Although students can produce graphs for simple inequalities, they often struggle when the format of the inequality is unfamiliar. Even when producing a correct graph of an…
Directory of Open Access Journals (Sweden)
Amine Labriji
2017-07-01
Full Text Available The topic of identifying the similarity of graphs was considered as highly recommended research field in the Web semantic, artificial intelligence, the shape recognition and information research. One of the fundamental problems of graph databases is finding similar graphs to a graph query. Existing approaches dealing with this problem are usually based on the nodes and arcs of the two graphs, regardless of parental semantic links. For instance, a common connection is not identified as being part of the similarity of two graphs in cases like two graphs without common concepts, the measure of similarity based on the union of two graphs, or the one based on the notion of maximum common sub-graph (SCM, or the distance of edition of graphs. This leads to an inadequate situation in the context of information research. To overcome this problem, we suggest a new measure of similarity between graphs, based on the similarity measure of Wu and Palmer. We have shown that this new measure satisfies the properties of a measure of similarities and we applied this new measure on examples. The results show that our measure provides a run time with a gain of time compared to existing approaches. In addition, we compared the relevance of the similarity values obtained, it appears that this new graphs measure is advantageous and offers a contribution to solving the problem mentioned above.
van Dam, Edwin R.; Koolen, Jack H.; Tanaka, Hajime
2016-01-01
This is a survey of distance-regular graphs. We present an introduction to distance-regular graphs for the reader who is unfamiliar with the subject, and then give an overview of some developments in the area of distance-regular graphs since the monograph 'BCN'[Brouwer, A.E., Cohen, A.M., Neumaier,
Fuzzy Graph Language Recognizability
Kalampakas , Antonios; Spartalis , Stefanos; Iliadis , Lazaros
2012-01-01
Part 5: Fuzzy Logic; International audience; Fuzzy graph language recognizability is introduced along the lines of the established theory of syntactic graph language recognizability by virtue of the algebraic structure of magmoids. The main closure properties of the corresponding class are investigated and several interesting examples of fuzzy graph languages are examined.
Brouwer, A.E.; Haemers, W.H.
2012-01-01
This book gives an elementary treatment of the basic material about graph spectra, both for ordinary, and Laplace and Seidel spectra. The text progresses systematically, by covering standard topics before presenting some new material on trees, strongly regular graphs, two-graphs, association
High Dimensional Spectral Graph Theory and Non-backtracking Random Walks on Graphs
Kempton, Mark
This thesis has two primary areas of focus. First we study connection graphs, which are weighted graphs in which each edge is associated with a d-dimensional rotation matrix for some fixed dimension d, in addition to a scalar weight. Second, we study non-backtracking random walks on graphs, which are random walks with the additional constraint that they cannot return to the immediately previous state at any given step. Our work in connection graphs is centered on the notion of consistency, that is, the product of rotations moving from one vertex to another is independent of the path taken, and a generalization called epsilon-consistency. We present higher dimensional versions of the combinatorial Laplacian matrix and normalized Laplacian matrix from spectral graph theory, and give results characterizing the consistency of a connection graph in terms of the spectra of these matrices. We generalize several tools from classical spectral graph theory, such as PageRank and effective resistance, to apply to connection graphs. We use these tools to give algorithms for sparsification, clustering, and noise reduction on connection graphs. In non-backtracking random walks, we address the question raised by Alon et. al. concerning how the mixing rate of a non-backtracking random walk to its stationary distribution compares to the mixing rate for an ordinary random walk. Alon et. al. address this question for regular graphs. We take a different approach, and use a generalization of Ihara's Theorem to give a new proof of Alon's result for regular graphs, and to extend the result to biregular graphs. Finally, we give a non-backtracking version of Polya's Random Walk Theorem for 2-dimensional grids.
Content-Agnostic Malware Detection in Heterogeneous Malicious Distribution Graph
Alabdulmohsin, Ibrahim; Han, Yufei; Shen, Yun; Zhang, Xiangliang
2016-01-01
graph has more than 4 million edges and 2.7 million nodes that differ in type, such as IPs, URLs, and files. We propose a novel Bayesian label propagation model to unify the multi-source information, including content-agnostic features of different node
Network evolution driven by dynamics applied to graph coloring
International Nuclear Information System (INIS)
Wu Jian-She; Li Li-Guang; Yu Xin; Jiao Li-Cheng; Wang Xiao-Hua
2013-01-01
An evolutionary network driven by dynamics is studied and applied to the graph coloring problem. From an initial structure, both the topology and the coupling weights evolve according to the dynamics. On the other hand, the dynamics of the network are determined by the topology and the coupling weights, so an interesting structure-dynamics co-evolutionary scheme appears. By providing two evolutionary strategies, a network described by the complement of a graph will evolve into several clusters of nodes according to their dynamics. The nodes in each cluster can be assigned the same color and nodes in different clusters assigned different colors. In this way, a co-evolution phenomenon is applied to the graph coloring problem. The proposed scheme is tested on several benchmark graphs for graph coloring
Hell, Pavol
2004-01-01
This is a book about graph homomorphisms. Graph theory is now an established discipline but the study of graph homomorphisms has only recently begun to gain wide acceptance and interest. The subject gives a useful perspective in areas such as graph reconstruction, products, fractional and circular colourings, and has applications in complexity theory, artificial intelligence, telecommunication, and, most recently, statistical physics.Based on the authors' lecture notes for graduate courses, this book can be used as a textbook for a second course in graph theory at 4th year or master's level an
Simplicial complexes of graphs
Jonsson, Jakob
2008-01-01
A graph complex is a finite family of graphs closed under deletion of edges. Graph complexes show up naturally in many different areas of mathematics, including commutative algebra, geometry, and knot theory. Identifying each graph with its edge set, one may view a graph complex as a simplicial complex and hence interpret it as a geometric object. This volume examines topological properties of graph complexes, focusing on homotopy type and homology. Many of the proofs are based on Robin Forman's discrete version of Morse theory. As a byproduct, this volume also provides a loosely defined toolbox for attacking problems in topological combinatorics via discrete Morse theory. In terms of simplicity and power, arguably the most efficient tool is Forman's divide and conquer approach via decision trees; it is successfully applied to a large number of graph and digraph complexes.
Introduction to quantum graphs
Berkolaiko, Gregory
2012-01-01
A "quantum graph" is a graph considered as a one-dimensional complex and equipped with a differential operator ("Hamiltonian"). Quantum graphs arise naturally as simplified models in mathematics, physics, chemistry, and engineering when one considers propagation of waves of various nature through a quasi-one-dimensional (e.g., "meso-" or "nano-scale") system that looks like a thin neighborhood of a graph. Works that currently would be classified as discussing quantum graphs have been appearing since at least the 1930s, and since then, quantum graphs techniques have been applied successfully in various areas of mathematical physics, mathematics in general and its applications. One can mention, for instance, dynamical systems theory, control theory, quantum chaos, Anderson localization, microelectronics, photonic crystals, physical chemistry, nano-sciences, superconductivity theory, etc. Quantum graphs present many non-trivial mathematical challenges, which makes them dear to a mathematician's heart. Work on qu...
Image-Based Edge Bundles : Simplified Visualization of Large Graphs
Telea, A.; Ersoy, O.
2010-01-01
We present a new approach aimed at understanding the structure of connections in edge-bundling layouts. We combine the advantages of edge bundles with a bundle-centric simplified visual representation of a graph's structure. For this, we first compute a hierarchical edge clustering of a given graph
Dense graph limits under respondent-driven sampling
Indian Academy of Sciences (India)
Siva Athreya
Types of network. Food. Political. Social. Linkedin. Protien. Professional. Source: IUPUI Network Sampling course. Page 3. Techniques to Analyse Networks. Network Characteristics: • Distribution: degree, clustering coefficient, hop-plot. ... Hypothesis Testing: H0: Graph is a linkedin network. H1: Graph is a facebook network ...
Bessiere, Pierre; Ahuactzin, Juan Manuel; Mekhnacha, Kamel
2013-01-01
Probability as an Alternative to Boolean LogicWhile logic is the mathematical foundation of rational reasoning and the fundamental principle of computing, it is restricted to problems where information is both complete and certain. However, many real-world problems, from financial investments to email filtering, are incomplete or uncertain in nature. Probability theory and Bayesian computing together provide an alternative framework to deal with incomplete and uncertain data. Decision-Making Tools and Methods for Incomplete and Uncertain DataEmphasizing probability as an alternative to Boolean
Graphing trillions of triangles.
Burkhardt, Paul
2017-07-01
The increasing size of Big Data is often heralded but how data are transformed and represented is also profoundly important to knowledge discovery, and this is exemplified in Big Graph analytics. Much attention has been placed on the scale of the input graph but the product of a graph algorithm can be many times larger than the input. This is true for many graph problems, such as listing all triangles in a graph. Enabling scalable graph exploration for Big Graphs requires new approaches to algorithms, architectures, and visual analytics. A brief tutorial is given to aid the argument for thoughtful representation of data in the context of graph analysis. Then a new algebraic method to reduce the arithmetic operations in counting and listing triangles in graphs is introduced. Additionally, a scalable triangle listing algorithm in the MapReduce model will be presented followed by a description of the experiments with that algorithm that led to the current largest and fastest triangle listing benchmarks to date. Finally, a method for identifying triangles in new visual graph exploration technologies is proposed.
GoFFish: A Sub-Graph Centric Framework for Large-Scale Graph Analytics1
Energy Technology Data Exchange (ETDEWEB)
Simmhan, Yogesh; Kumbhare, Alok; Wickramaarachchi, Charith; Nagarkar, Soonil; Ravi, Santosh; Raghavendra, Cauligi; Prasanna, Viktor
2014-08-25
Large scale graph processing is a major research area for Big Data exploration. Vertex centric programming models like Pregel are gaining traction due to their simple abstraction that allows for scalable execution on distributed systems naturally. However, there are limitations to this approach which cause vertex centric algorithms to under-perform due to poor compute to communication overhead ratio and slow convergence of iterative superstep. In this paper we introduce GoFFish a scalable sub-graph centric framework co-designed with a distributed persistent graph storage for large scale graph analytics on commodity clusters. We introduce a sub-graph centric programming abstraction that combines the scalability of a vertex centric approach with the flexibility of shared memory sub-graph computation. We map Connected Components, SSSP and PageRank algorithms to this model to illustrate its flexibility. Further, we empirically analyze GoFFish using several real world graphs and demonstrate its significant performance improvement, orders of magnitude in some cases, compared to Apache Giraph, the leading open source vertex centric implementation. We map Connected Components, SSSP and PageRank algorithms to this model to illustrate its flexibility. Further, we empirically analyze GoFFish using several real world graphs and demonstrate its significant performance improvement, orders of magnitude in some cases, compared to Apache Giraph, the leading open source vertex centric implementation.
Community detection by graph Voronoi diagrams
Deritei, Dávid; Lázár, Zsolt I.; Papp, István; Járai-Szabó, Ferenc; Sumi, Róbert; Varga, Levente; Ravasz Regan, Erzsébet; Ercsey-Ravasz, Mária
2014-06-01
Accurate and efficient community detection in networks is a key challenge for complex network theory and its applications. The problem is analogous to cluster analysis in data mining, a field rich in metric space-based methods. Common to these methods is a geometric, distance-based definition of clusters or communities. Here we propose a new geometric approach to graph community detection based on graph Voronoi diagrams. Our method serves as proof of principle that the definition of appropriate distance metrics on graphs can bring a rich set of metric space-based clustering methods to network science. We employ a simple edge metric that reflects the intra- or inter-community character of edges, and a graph density-based rule to identify seed nodes of Voronoi cells. Our algorithm outperforms most network community detection methods applicable to large networks on benchmark as well as real-world networks. In addition to offering a computationally efficient alternative for community detection, our method opens new avenues for adapting a wide range of data mining algorithms to complex networks from the class of centroid- and density-based clustering methods.
DEFF Research Database (Denmark)
Böcker, S.; Baumbach, Jan
2013-01-01
. The problem has been the inspiration for numerous algorithms in bioinformatics, aiming at clustering entities such as genes, proteins, phenotypes, or patients. In this paper, we review exact and heuristic methods that have been proposed for the Cluster Editing problem, and also applications......The Cluster Editing problem asks to transform a graph into a disjoint union of cliques using a minimum number of edge modifications. Although the problem has been proven NP-complete several times, it has nevertheless attracted much research both from the theoretical and the applied side...
Bayesian networks and boundedly rational expectations
Ran Spiegler
2014-01-01
I present a framework for analyzing decision makers with an imperfect understanding of their environment's correlation structure. The framework borrows the tool of "Bayesian networks", which is ubiquitous in statistics and artificial intelligence. In the model, a decision maker faces an objective multivariate probability distribution (his own action is one of the random variables). He is characterized by a directed acyclic graph over the set of random variables. His subjective belief filters ...
Characteristic imsets for learning Bayesian network structure
Czech Academy of Sciences Publication Activity Database
Hemmecke, R.; Lindner, S.; Studený, Milan
2012-01-01
Roč. 53, č. 9 (2012), s. 1336-1349 ISSN 0888-613X R&D Projects: GA MŠk(CZ) 1M0572; GA ČR GA201/08/0539 Institutional support: RVO:67985556 Keywords : learning Bayesian network structure * essential graph * standard imset * characteristic imset * LP relaxation of a polytope Subject RIV: BA - General Mathematics Impact factor: 1.729, year: 2012 http://library.utia.cas.cz/separaty/2012/MTR/studeny-0382596.pdf
Probability on graphs random processes on graphs and lattices
Grimmett, Geoffrey
2018-01-01
This introduction to some of the principal models in the theory of disordered systems leads the reader through the basics, to the very edge of contemporary research, with the minimum of technical fuss. Topics covered include random walk, percolation, self-avoiding walk, interacting particle systems, uniform spanning tree, random graphs, as well as the Ising, Potts, and random-cluster models for ferromagnetism, and the Lorentz model for motion in a random medium. This new edition features accounts of major recent progress, including the exact value of the connective constant of the hexagonal lattice, and the critical point of the random-cluster model on the square lattice. The choice of topics is strongly motivated by modern applications, and focuses on areas that merit further research. Accessible to a wide audience of mathematicians and physicists, this book can be used as a graduate course text. Each chapter ends with a range of exercises.
Topological structure of dictionary graphs
International Nuclear Information System (INIS)
Fuks, Henryk; Krzeminski, Mark
2009-01-01
We investigate the topological structure of the subgraphs of dictionary graphs constructed from WordNet and Moby thesaurus data. In the process of learning a foreign language, the learner knows only a subset of all words of the language, corresponding to a subgraph of a dictionary graph. When this subgraph grows with time, its topological properties change. We introduce the notion of the pseudocore and argue that the growth of the vocabulary roughly follows decreasing pseudocore numbers-that is, one first learns words with a high pseudocore number followed by smaller pseudocores. We also propose an alternative strategy for vocabulary growth, involving decreasing core numbers as opposed to pseudocore numbers. We find that as the core or pseudocore grows in size, the clustering coefficient first decreases, then reaches a minimum and starts increasing again. The minimum occurs when the vocabulary reaches a size between 10 3 and 10 4 . A simple model exhibiting similar behavior is proposed. The model is based on a generalized geometric random graph. Possible implications for language learning are discussed.
Chartrand, Gary
1984-01-01
Graph theory is used today in the physical sciences, social sciences, computer science, and other areas. Introductory Graph Theory presents a nontechnical introduction to this exciting field in a clear, lively, and informative style. Author Gary Chartrand covers the important elementary topics of graph theory and its applications. In addition, he presents a large variety of proofs designed to strengthen mathematical techniques and offers challenging opportunities to have fun with mathematics. Ten major topics - profusely illustrated - include: Mathematical Models, Elementary Concepts of Grap
Adriaan R. Soetevent
2010-01-01
This paper extends Hotelling's model of price competition with quadratic transportation costs from a line to graphs. I propose an algorithm to calculate firm-level demand for any given graph, conditional on prices and firm locations. One feature of graph models of price competition is that spatial discontinuities in firm-level demand may occur. I show that the existence result of D'Aspremont et al. (1979) does not extend to simple star graphs. I conjecture that this non-existence result holds...
Pim Heijnen; Adriaan Soetevent
2014-01-01
This paper extends Hotelling's model of price competition with quadratic transportation costs from a line to graphs. We derive an algorithm to calculate firm-level demand for any given graph, conditional on prices and firm locations. These graph models of price competition may lead to spatial discontinuities in firm-level demand. We show that the existence result of D'Aspremont et al. (1979) does not extend to simple star graphs and conjecture that this non-existence result holds more general...
Directory of Open Access Journals (Sweden)
Aleks Kissinger
2014-03-01
Full Text Available String diagrams are a powerful tool for reasoning about physical processes, logic circuits, tensor networks, and many other compositional structures. Dixon, Duncan and Kissinger introduced string graphs, which are a combinatoric representations of string diagrams, amenable to automated reasoning about diagrammatic theories via graph rewrite systems. In this extended abstract, we show how the power of such rewrite systems can be greatly extended by introducing pattern graphs, which provide a means of expressing infinite families of rewrite rules where certain marked subgraphs, called !-boxes ("bang boxes", on both sides of a rule can be copied any number of times or removed. After reviewing the string graph formalism, we show how string graphs can be extended to pattern graphs and how pattern graphs and pattern rewrite rules can be instantiated to concrete string graphs and rewrite rules. We then provide examples demonstrating the expressive power of pattern graphs and how they can be applied to study interacting algebraic structures that are central to categorical quantum mechanics.
Gelfand, I M; Shnol, E E
1969-01-01
The second in a series of systematic studies by a celebrated mathematician I. M. Gelfand and colleagues, this volume presents students with a well-illustrated sequence of problems and exercises designed to illuminate the properties of functions and graphs. Since readers do not have the benefit of a blackboard on which a teacher constructs a graph, the authors abandoned the customary use of diagrams in which only the final form of the graph appears; instead, the book's margins feature step-by-step diagrams for the complete construction of each graph. The first part of the book employs simple fu
Creating more effective graphs
Robbins, Naomi B
2012-01-01
A succinct and highly readable guide to creating effective graphs The right graph can be a powerful tool for communicating information, improving a presentation, or conveying your point in print. If your professional endeavors call for you to present data graphically, here's a book that can help you do it more effectively. Creating More Effective Graphs gives you the basic knowledge and techniques required to choose and create appropriate graphs for a broad range of applications. Using real-world examples everyone can relate to, the author draws on her years of experience in gr
Energy Technology Data Exchange (ETDEWEB)
Lothian, Joshua [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Powers, Sarah S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Sullivan, Blair D. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Baker, Matthew B. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Schrock, Jonathan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Poole, Stephen W. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
2013-10-01
The benchmarking effort within the Extreme Scale Systems Center at Oak Ridge National Laboratory seeks to provide High Performance Computing benchmarks and test suites of interest to the DoD sponsor. The work described in this report is a part of the effort focusing on graph generation. A previously developed benchmark, SystemBurn, allowed the emulation of different application behavior profiles within a single framework. To complement this effort, similar capabilities are desired for graph-centric problems. This report examines existing synthetic graph generator implementations in preparation for further study on the properties of their generated synthetic graphs.
DEFF Research Database (Denmark)
Mansutti, Alessio; Miculan, Marino; Peressotti, Marco
2017-01-01
We introduce loose graph simulations (LGS), a new notion about labelled graphs which subsumes in an intuitive and natural way subgraph isomorphism (SGI), regular language pattern matching (RLPM) and graph simulation (GS). Being a unification of all these notions, LGS allows us to express directly...... also problems which are “mixed” instances of previous ones, and hence which would not fit easily in any of them. After the definition and some examples, we show that the problem of finding loose graph simulations is NP-complete, we provide formal translation of SGI, RLPM, and GS into LGSs, and we give...
Directory of Open Access Journals (Sweden)
Alberto Apostolico
2009-08-01
Full Text Available The Web Graph is a large-scale graph that does not fit in main memory, so that lossless compression methods have been proposed for it. This paper introduces a compression scheme that combines efficient storage with fast retrieval for the information in a node. The scheme exploits the properties of the Web Graph without assuming an ordering of the URLs, so that it may be applied to more general graphs. Tests on some datasets of use achieve space savings of about 10% over existing methods.
Graph Theory. 1. Fragmentation of Structural Graphs
Directory of Open Access Journals (Sweden)
Lorentz JÄNTSCHI
2002-12-01
Full Text Available The investigation of structural graphs has many fields of applications in engineering, especially in applied sciences like as applied chemistry and physics, computer sciences and automation, electronics and telecommunication. The main subject of the paper is to express fragmentation criteria in graph using a new method of investigation: terminal paths. Using terminal paths are defined most of the fragmentation criteria that are in use in molecular topology, but the fields of applications are more generally than that, as I mentioned before. Graphical examples of fragmentation are given for every fragmentation criteria. Note that all fragmentation is made with a computer program that implements a routine for every criterion.[1] A web routine for tracing all terminal paths in graph can be found at the address: http://vl.academicdirect.ro/molecular_topology/tpaths/ [1] M. V. Diudea, I. Gutman, L. Jäntschi, Molecular Topology, Nova Science, Commack, New York, 2001, 2002.
Bayesian Methods for Radiation Detection and Dosimetry
Groer, Peter G
2002-01-01
We performed work in three areas: radiation detection, external and internal radiation dosimetry. In radiation detection we developed Bayesian techniques to estimate the net activity of high and low activity radioactive samples. These techniques have the advantage that the remaining uncertainty about the net activity is described by probability densities. Graphs of the densities show the uncertainty in pictorial form. Figure 1 below demonstrates this point. We applied stochastic processes for a method to obtain Bayesian estimates of 222Rn-daughter products from observed counting rates. In external radiation dosimetry we studied and developed Bayesian methods to estimate radiation doses to an individual with radiation induced chromosome aberrations. We analyzed chromosome aberrations after exposure to gammas and neutrons and developed a method for dose-estimation after criticality accidents. The research in internal radiation dosimetry focused on parameter estimation for compartmental models from observed comp...
Clustering execution in a processing system to increase power savings
Energy Technology Data Exchange (ETDEWEB)
Bose, Pradip; Buyuktosunoglu, Alper; Jacobson, Hans M.; Vega, Augusto J.
2018-04-03
Embodiments relate to clustering execution in a processing system. An aspect includes accessing a control flow graph that defines a data dependency and an execution sequence of a plurality of tasks of an application that executes on a plurality of system components. The execution sequence of the tasks in the control flow graph is modified as a clustered control flow graph that clusters active and idle phases of a system component while maintaining the data dependency. The clustered control flow graph is sent to an operating system, where the operating system utilizes the clustered control flow graph for scheduling the tasks.
Clustering execution in a processing system to increase power savings
Bose, Pradip; Buyuktosunoglu, Alper; Jacobson, Hans M.; Vega, Augusto J.
2018-03-20
Embodiments relate to clustering execution in a processing system. An aspect includes accessing a control flow graph that defines a data dependency and an execution sequence of a plurality of tasks of an application that executes on a plurality of system components. The execution sequence of the tasks in the control flow graph is modified as a clustered control flow graph that clusters active and idle phases of a system component while maintaining the data dependency. The clustered control flow graph is sent to an operating system, where the operating system utilizes the clustered control flow graph for scheduling the tasks.
A graph rewriting programming language for graph drawing
Rodgers, Peter
1998-01-01
This paper describes Grrr, a prototype visual graph drawing tool. Previously there were no visual languages for programming graph drawing algorithms despite the inherently visual nature of the process. The languages which gave a diagrammatic view of graphs were not computationally complete and so could not be used to implement complex graph drawing algorithms. Hence current graph drawing tools are all text based. Recent developments in graph rewriting systems have produced computationally com...
de Mol, M.J.; Rensink, Arend; Hunt, James J.
This paper introduces an approach for adding graph transformation-based functionality to existing JAVA programs. The approach relies on a set of annotations to identify the intended graph structure, as well as on user methods to manipulate that structure, within the user’s own JAVA class
Cohen, A.M.; Beineke, L.W.; Wilson, R.J.; Cameron, P.J.
2004-01-01
In this chapter we investigate the classification of distance-transitive graphs: these are graphs whose automorphism groups are transitive on each of the sets of pairs of vertices at distance i, for i = 0, 1,.... We provide an introduction into the field. By use of the classification of finite
Joyner, W David
2017-01-01
This textbook acts as a pathway to higher mathematics by seeking and illuminating the connections between graph theory and diverse fields of mathematics, such as calculus on manifolds, group theory, algebraic curves, Fourier analysis, cryptography and other areas of combinatorics. An overview of graph theory definitions and polynomial invariants for graphs prepares the reader for the subsequent dive into the applications of graph theory. To pique the reader’s interest in areas of possible exploration, recent results in mathematics appear throughout the book, accompanied with examples of related graphs, how they arise, and what their valuable uses are. The consequences of graph theory covered by the authors are complicated and far-reaching, so topics are always exhibited in a user-friendly manner with copious graphs, exercises, and Sage code for the computation of equations. Samples of the book’s source code can be found at github.com/springer-math/adventures-in-graph-theory. The text is geared towards ad...
Perepelitsa, VA; Sergienko, [No Value; Kochkarov, AM
1999-01-01
Definitions of prefractal and fractal graphs are introduced, and they are used to formulate mathematical models in different fields of knowledge. The topicality of fractal-graph recognition from the point of view, of fundamental improvement in the efficiency of the solution of algorithmic problems
DEFF Research Database (Denmark)
Husfeldt, Thore
2015-01-01
This chapter presents an introduction to graph colouring algorithms. The focus is on vertex-colouring algorithms that work for general classes of graphs with worst-case performance guarantees in a sequential model of computation. The presentation aims to demonstrate the breadth of available...
Packing Degenerate Graphs Greedily
Czech Academy of Sciences Publication Activity Database
Allen, P.; Böttcher, J.; Hladký, J.; Piguet, Diana
2017-01-01
Roč. 61, August (2017), s. 45-51 ISSN 1571-0653 R&D Projects: GA ČR GJ16-07822Y Institutional support: RVO:67985807 Keywords : tree packing conjecture * graph packing * graph processes Subject RIV: BA - General Mathematics OBOR OECD: Pure mathematics
Generalized connectivity of graphs
Li, Xueliang
2016-01-01
Noteworthy results, proof techniques, open problems and conjectures in generalized (edge-) connectivity are discussed in this book. Both theoretical and practical analyses for generalized (edge-) connectivity of graphs are provided. Topics covered in this book include: generalized (edge-) connectivity of graph classes, algorithms, computational complexity, sharp bounds, Nordhaus-Gaddum-type results, maximum generalized local connectivity, extremal problems, random graphs, multigraphs, relations with the Steiner tree packing problem and generalizations of connectivity. This book enables graduate students to understand and master a segment of graph theory and combinatorial optimization. Researchers in graph theory, combinatorics, combinatorial optimization, probability, computer science, discrete algorithms, complexity analysis, network design, and the information transferring models will find this book useful in their studies.
Approximation methods for efficient learning of Bayesian networks
Riggelsen, C
2008-01-01
This publication offers and investigates efficient Monte Carlo simulation methods in order to realize a Bayesian approach to approximate learning of Bayesian networks from both complete and incomplete data. For large amounts of incomplete data when Monte Carlo methods are inefficient, approximations are implemented, such that learning remains feasible, albeit non-Bayesian. The topics discussed are: basic concepts about probabilities, graph theory and conditional independence; Bayesian network learning from data; Monte Carlo simulation techniques; and, the concept of incomplete data. In order to provide a coherent treatment of matters, thereby helping the reader to gain a thorough understanding of the whole concept of learning Bayesian networks from (in)complete data, this publication combines in a clarifying way all the issues presented in the papers with previously unpublished work.
Autoregressive Moving Average Graph Filtering
Isufi, Elvin; Loukas, Andreas; Simonetto, Andrea; Leus, Geert
2016-01-01
One of the cornerstones of the field of signal processing on graphs are graph filters, direct analogues of classical filters, but intended for signals defined on graphs. This work brings forth new insights on the distributed graph filtering problem. We design a family of autoregressive moving average (ARMA) recursions, which (i) are able to approximate any desired graph frequency response, and (ii) give exact solutions for tasks such as graph signal denoising and interpolation. The design phi...
Subgraph detection using graph signals
Chepuri, Sundeep Prabhakar
2017-03-06
In this paper we develop statistical detection theory for graph signals. In particular, given two graphs, namely, a background graph that represents an usual activity and an alternative graph that represents some unusual activity, we are interested in answering the following question: To which of the two graphs does the observed graph signal fit the best? To begin with, we assume both the graphs are known, and derive an optimal Neyman-Pearson detector. Next, we derive a suboptimal detector for the case when the alternative graph is not known. The developed theory is illustrated with numerical experiments.
Subgraph detection using graph signals
Chepuri, Sundeep Prabhakar; Leus, Geert
2017-01-01
In this paper we develop statistical detection theory for graph signals. In particular, given two graphs, namely, a background graph that represents an usual activity and an alternative graph that represents some unusual activity, we are interested in answering the following question: To which of the two graphs does the observed graph signal fit the best? To begin with, we assume both the graphs are known, and derive an optimal Neyman-Pearson detector. Next, we derive a suboptimal detector for the case when the alternative graph is not known. The developed theory is illustrated with numerical experiments.
How Symmetric Are Real-World Graphs? A Large-Scale Study
Directory of Open Access Journals (Sweden)
Fabian Ball
2018-01-01
Full Text Available The analysis of symmetry is a main principle in natural sciences, especially physics. For network sciences, for example, in social sciences, computer science and data science, only a few small-scale studies of the symmetry of complex real-world graphs exist. Graph symmetry is a topic rooted in mathematics and is not yet well-received and applied in practice. This article underlines the importance of analyzing symmetry by showing the existence of symmetry in real-world graphs. An analysis of over 1500 graph datasets from the meta-repository networkrepository.com is carried out and a normalized version of the “network redundancy” measure is presented. It quantifies graph symmetry in terms of the number of orbits of the symmetry group from zero (no symmetries to one (completely symmetric, and improves the recognition of asymmetric graphs. Over 70% of the analyzed graphs contain symmetries (i.e., graph automorphisms, independent of size and modularity. Therefore, we conclude that real-world graphs are likely to contain symmetries. This contribution is the first larger-scale study of symmetry in graphs and it shows the necessity of handling symmetry in data analysis: The existence of symmetries in graphs is the cause of two problems in graph clustering we are aware of, namely, the existence of multiple equivalent solutions with the same value of the clustering criterion and, secondly, the inability of all standard partition-comparison measures of cluster analysis to identify automorphic partitions as equivalent.
Introduction to Bayesian statistics
Bolstad, William M
2017-01-01
There is a strong upsurge in the use of Bayesian methods in applied statistical analysis, yet most introductory statistics texts only present frequentist methods. Bayesian statistics has many important advantages that students should learn about if they are going into fields where statistics will be used. In this Third Edition, four newly-added chapters address topics that reflect the rapid advances in the field of Bayesian staistics. The author continues to provide a Bayesian treatment of introductory statistical topics, such as scientific data gathering, discrete random variables, robust Bayesian methods, and Bayesian approaches to inferenfe cfor discrete random variables, bionomial proprotion, Poisson, normal mean, and simple linear regression. In addition, newly-developing topics in the field are presented in four new chapters: Bayesian inference with unknown mean and variance; Bayesian inference for Multivariate Normal mean vector; Bayesian inference for Multiple Linear RegressionModel; and Computati...
Bollobas, Bela
2004-01-01
The ever-expanding field of extremal graph theory encompasses a diverse array of problem-solving methods, including applications to economics, computer science, and optimization theory. This volume, based on a series of lectures delivered to graduate students at the University of Cambridge, presents a concise yet comprehensive treatment of extremal graph theory.Unlike most graph theory treatises, this text features complete proofs for almost all of its results. Further insights into theory are provided by the numerous exercises of varying degrees of difficulty that accompany each chapter. A
Bayesian artificial intelligence
Korb, Kevin B
2010-01-01
Updated and expanded, Bayesian Artificial Intelligence, Second Edition provides a practical and accessible introduction to the main concepts, foundation, and applications of Bayesian networks. It focuses on both the causal discovery of networks and Bayesian inference procedures. Adopting a causal interpretation of Bayesian networks, the authors discuss the use of Bayesian networks for causal modeling. They also draw on their own applied research to illustrate various applications of the technology.New to the Second EditionNew chapter on Bayesian network classifiersNew section on object-oriente
Bayesian artificial intelligence
Korb, Kevin B
2003-01-01
As the power of Bayesian techniques has become more fully realized, the field of artificial intelligence has embraced Bayesian methodology and integrated it to the point where an introduction to Bayesian techniques is now a core course in many computer science programs. Unlike other books on the subject, Bayesian Artificial Intelligence keeps mathematical detail to a minimum and covers a broad range of topics. The authors integrate all of Bayesian net technology and learning Bayesian net technology and apply them both to knowledge engineering. They emphasize understanding and intuition but also provide the algorithms and technical background needed for applications. Software, exercises, and solutions are available on the authors' website.
Directory of Open Access Journals (Sweden)
Haynes Teresa W.
2014-08-01
Full Text Available A path π = (v1, v2, . . . , vk+1 in a graph G = (V,E is a downhill path if for every i, 1 ≤ i ≤ k, deg(vi ≥ deg(vi+1, where deg(vi denotes the degree of vertex vi ∈ V. The downhill domination number equals the minimum cardinality of a set S ⊆ V having the property that every vertex v ∈ V lies on a downhill path originating from some vertex in S. We investigate downhill domination numbers of graphs and give upper bounds. In particular, we show that the downhill domination number of a graph is at most half its order, and that the downhill domination number of a tree is at most one third its order. We characterize the graphs obtaining each of these bounds
Tailored Random Graph Ensembles
International Nuclear Information System (INIS)
Roberts, E S; Annibale, A; Coolen, A C C
2013-01-01
Tailored graph ensembles are a developing bridge between biological networks and statistical mechanics. The aim is to use this concept to generate a suite of rigorous tools that can be used to quantify and compare the topology of cellular signalling networks, such as protein-protein interaction networks and gene regulation networks. We calculate exact and explicit formulae for the leading orders in the system size of the Shannon entropies of random graph ensembles constrained with degree distribution and degree-degree correlation. We also construct an ergodic detailed balance Markov chain with non-trivial acceptance probabilities which converges to a strictly uniform measure and is based on edge swaps that conserve all degrees. The acceptance probabilities can be generalized to define Markov chains that target any alternative desired measure on the space of directed or undirected graphs, in order to generate graphs with more sophisticated topological features.
Alspach, BR
1985-01-01
This volume deals with a variety of problems involving cycles in graphs and circuits in digraphs. Leading researchers in this area present here 3 survey papers and 42 papers containing new results. There is also a collection of unsolved problems.
Wilson, Robin J
1985-01-01
Graph Theory has recently emerged as a subject in its own right, as well as being an important mathematical tool in such diverse subjects as operational research, chemistry, sociology and genetics. This book provides a comprehensive introduction to the subject.
Hyperbolicity in median graphs
Indian Academy of Sciences (India)
mic problems in hyperbolic spaces and hyperbolic graphs have been .... that in general the main obstacle is that we do not know the location of ...... [25] Jonckheere E and Lohsoonthorn P, A hyperbolic geometry approach to multipath routing,.
Macroscopic Models of Clique Tree Growth for Bayesian Networks
National Aeronautics and Space Administration — In clique tree clustering, inference consists of propagation in a clique tree compiled from a Bayesian network. In this paper, we develop an analytical approach to...
graphkernels: R and Python packages for graph comparison.
Sugiyama, Mahito; Ghisu, M Elisabetta; Llinares-López, Felipe; Borgwardt, Karsten
2018-02-01
Measuring the similarity of graphs is a fundamental step in the analysis of graph-structured data, which is omnipresent in computational biology. Graph kernels have been proposed as a powerful and efficient approach to this problem of graph comparison. Here we provide graphkernels, the first R and Python graph kernel libraries including baseline kernels such as label histogram based kernels, classic graph kernels such as random walk based kernels, and the state-of-the-art Weisfeiler-Lehman graph kernel. The core of all graph kernels is implemented in C ++ for efficiency. Using the kernel matrices computed by the package, we can easily perform tasks such as classification, regression and clustering on graph-structured samples. The R and Python packages including source code are available at https://CRAN.R-project.org/package=graphkernels and https://pypi.python.org/pypi/graphkernels. mahito@nii.ac.jp or elisabetta.ghisu@bsse.ethz.ch. Supplementary data are available online at Bioinformatics. © The Author(s) 2017. Published by Oxford University Press.
Bayesian networks in educational assessment
Almond, Russell G; Steinberg, Linda S; Yan, Duanli; Williamson, David M
2015-01-01
Bayesian inference networks, a synthesis of statistics and expert systems, have advanced reasoning under uncertainty in medicine, business, and social sciences. This innovative volume is the first comprehensive treatment exploring how they can be applied to design and analyze innovative educational assessments. Part I develops Bayes nets’ foundations in assessment, statistics, and graph theory, and works through the real-time updating algorithm. Part II addresses parametric forms for use with assessment, model-checking techniques, and estimation with the EM algorithm and Markov chain Monte Carlo (MCMC). A unique feature is the volume’s grounding in Evidence-Centered Design (ECD) framework for assessment design. This “design forward” approach enables designers to take full advantage of Bayes nets’ modularity and ability to model complex evidentiary relationships that arise from performance in interactive, technology-rich assessments such as simulations. Part III describes ECD, situates Bayes nets as ...
Uniform Single Valued Neutrosophic Graphs
Directory of Open Access Journals (Sweden)
S. Broumi
2017-09-01
Full Text Available In this paper, we propose a new concept named the uniform single valued neutrosophic graph. An illustrative example and some properties are examined. Next, we develop an algorithmic approach for computing the complement of the single valued neutrosophic graph. A numerical example is demonstrated for computing the complement of single valued neutrosophic graphs and uniform single valued neutrosophic graph.
Collective Rationality in Graph Aggregation
Endriss, U.; Grandi, U.; Schaub, T.; Friedrich, G.; O'Sullivan, B.
2014-01-01
Suppose a number of agents each provide us with a directed graph over a common set of vertices. Graph aggregation is the problem of computing a single “collective” graph that best represents the information inherent in this profile of individual graphs. We consider this aggregation problem from the
International Nuclear Information System (INIS)
Barra, F.; Gaspard, P.
2001-01-01
We consider the classical evolution of a particle on a graph by using a time-continuous Frobenius-Perron operator that generalizes previous propositions. In this way, the relaxation rates as well as the chaotic properties can be defined for the time-continuous classical dynamics on graphs. These properties are given as the zeros of some periodic-orbit zeta functions. We consider in detail the case of infinite periodic graphs where the particle undergoes a diffusion process. The infinite spatial extension is taken into account by Fourier transforms that decompose the observables and probability densities into sectors corresponding to different values of the wave number. The hydrodynamic modes of diffusion are studied by an eigenvalue problem of a Frobenius-Perron operator corresponding to a given sector. The diffusion coefficient is obtained from the hydrodynamic modes of diffusion and has the Green-Kubo form. Moreover, we study finite but large open graphs that converge to the infinite periodic graph when their size goes to infinity. The lifetime of the particle on the open graph is shown to correspond to the lifetime of a system that undergoes a diffusion process before it escapes
Bollobás, Béla
1998-01-01
The time has now come when graph theory should be part of the education of every serious student of mathematics and computer science, both for its own sake and to enhance the appreciation of mathematics as a whole. This book is an in-depth account of graph theory, written with such a student in mind; it reflects the current state of the subject and emphasizes connections with other branches of pure mathematics. The volume grew out of the author's earlier book, Graph Theory -- An Introductory Course, but its length is well over twice that of its predecessor, allowing it to reveal many exciting new developments in the subject. Recognizing that graph theory is one of several courses competing for the attention of a student, the book contains extensive descriptive passages designed to convey the flavor of the subject and to arouse interest. In addition to a modern treatment of the classical areas of graph theory such as coloring, matching, extremal theory, and algebraic graph theory, the book presents a detailed ...
International Nuclear Information System (INIS)
Barnes, J.; Dekel, A.; Efstathiou, G.; Frenk, C.S.; Yale Univ., New Haven, CT; California Univ., Santa Barbara; Cambridge Univ., England; Sussex Univ., Brighton, England)
1985-01-01
The cluster correlation function xi sub c(r) is compared with the particle correlation function, xi(r) in cosmological N-body simulations with a wide range of initial conditions. The experiments include scale-free initial conditions, pancake models with a coherence length in the initial density field, and hybrid models. Three N-body techniques and two cluster-finding algorithms are used. In scale-free models with white noise initial conditions, xi sub c and xi are essentially identical. In scale-free models with more power on large scales, it is found that the amplitude of xi sub c increases with cluster richness; in this case the clusters give a biased estimate of the particle correlations. In the pancake and hybrid models (with n = 0 or 1), xi sub c is steeper than xi, but the cluster correlation length exceeds that of the points by less than a factor of 2, independent of cluster richness. Thus the high amplitude of xi sub c found in studies of rich clusters of galaxies is inconsistent with white noise and pancake models and may indicate a primordial fluctuation spectrum with substantial power on large scales. 30 references
Characterization of inclusion neighbourhood in terms of the essential graph
Czech Academy of Sciences Publication Activity Database
Studený, Milan
2005-01-01
Roč. 38, č. 3 (2005), s. 283-309 ISSN 0888-613X R&D Projects: GA ČR GA201/04/0393; GA AV ČR IAA1075104 Institutional research plan: CEZ:AV0Z10750506 Keywords : learning Bayesian networks * inclusion neighbourhood * essential graph Subject RIV: BA - General Mathematics Impact factor: 0.959, year: 2005 http://library.utia.cas.cz/separaty/historie/studeny-0411275.pdf
Yuan, Ying; MacKinnon, David P.
2009-01-01
This article proposes Bayesian analysis of mediation effects. Compared to conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian mediation analysis, inference is straightforward and exact, which makes it appealing for studies with small samples. Third, the Bayesian approach is conceptua...
Marsman, M.; Wagenmakers, E.-J.
2017-01-01
We illustrate the Bayesian approach to data analysis using the newly developed statistical software program JASP. With JASP, researchers are able to take advantage of the benefits that the Bayesian framework has to offer in terms of parameter estimation and hypothesis testing. The Bayesian
Proxy Graph: Visual Quality Metrics of Big Graph Sampling.
Nguyen, Quan Hoang; Hong, Seok-Hee; Eades, Peter; Meidiana, Amyra
2017-06-01
Data sampling has been extensively studied for large scale graph mining. Many analyses and tasks become more efficient when performed on graph samples of much smaller size. The use of proxy objects is common in software engineering for analysis and interaction with heavy objects or systems. In this paper, we coin the term 'proxy graph' and empirically investigate how well a proxy graph visualization can represent a big graph. Our investigation focuses on proxy graphs obtained by sampling; this is one of the most common proxy approaches. Despite the plethora of data sampling studies, this is the first evaluation of sampling in the context of graph visualization. For an objective evaluation, we propose a new family of quality metrics for visual quality of proxy graphs. Our experiments cover popular sampling techniques. Our experimental results lead to guidelines for using sampling-based proxy graphs in visualization.
Learning a Health Knowledge Graph from Electronic Medical Records.
Rotmensch, Maya; Halpern, Yoni; Tlimat, Abdulhakim; Horng, Steven; Sontag, David
2017-07-20
Demand for clinical decision support systems in medicine and self-diagnostic symptom checkers has substantially increased in recent years. Existing platforms rely on knowledge bases manually compiled through a labor-intensive process or automatically derived using simple pairwise statistics. This study explored an automated process to learn high quality knowledge bases linking diseases and symptoms directly from electronic medical records. Medical concepts were extracted from 273,174 de-identified patient records and maximum likelihood estimation of three probabilistic models was used to automatically construct knowledge graphs: logistic regression, naive Bayes classifier and a Bayesian network using noisy OR gates. A graph of disease-symptom relationships was elicited from the learned parameters and the constructed knowledge graphs were evaluated and validated, with permission, against Google's manually-constructed knowledge graph and against expert physician opinions. Our study shows that direct and automated construction of high quality health knowledge graphs from medical records using rudimentary concept extraction is feasible. The noisy OR model produces a high quality knowledge graph reaching precision of 0.85 for a recall of 0.6 in the clinical evaluation. Noisy OR significantly outperforms all tested models across evaluation frameworks (p < 0.01).
On some covering graphs of a graph
Directory of Open Access Journals (Sweden)
Shariefuddin Pirzada
2016-10-01
Full Text Available For a graph $G$ with vertex set $V(G=\\{v_1, v_2, \\dots, v_n\\}$, let $S$ be the covering set of $G$ having the maximum degree over all the minimum covering sets of $G$. Let $N_S[v]=\\{u\\in S : uv \\in E(G \\}\\cup \\{v\\}$ be the closed neighbourhood of the vertex $v$ with respect to $S.$ We define a square matrix $A_S(G= (a_{ij},$ by $a_{ij}=1,$ if $\\left |N_S[v_i]\\cap N_S[v_j] \\right| \\geq 1, i\
Spanning Tree Based Attribute Clustering
DEFF Research Database (Denmark)
Zeng, Yifeng; Jorge, Cordero Hernandez
2009-01-01
Attribute clustering has been previously employed to detect statistical dependence between subsets of variables. We propose a novel attribute clustering algorithm motivated by research of complex networks, called the Star Discovery algorithm. The algorithm partitions and indirectly discards...... inconsistent edges from a maximum spanning tree by starting appropriate initial modes, therefore generating stable clusters. It discovers sound clusters through simple graph operations and achieves significant computational savings. We compare the Star Discovery algorithm against earlier attribute clustering...
Learning Local Components to Understand Large Bayesian Networks
DEFF Research Database (Denmark)
Zeng, Yifeng; Xiang, Yanping; Cordero, Jorge
2009-01-01
(domain experts) to extract accurate information from a large Bayesian network due to dimensional difficulty. We define a formulation of local components and propose a clustering algorithm to learn such local components given complete data. The algorithm groups together most inter-relevant attributes......Bayesian networks are known for providing an intuitive and compact representation of probabilistic information and allowing the creation of models over a large and complex domain. Bayesian learning and reasoning are nontrivial for a large Bayesian network. In parallel, it is a tough job for users...... in a domain. We evaluate its performance on three benchmark Bayesian networks and provide results in support. We further show that the learned components may represent local knowledge more precisely in comparison to the full Bayesian networks when working with a small amount of data....
Fundamentals of algebraic graph transformation
Ehrig, Hartmut; Prange, Ulrike; Taentzer, Gabriele
2006-01-01
Graphs are widely used to represent structural information in the form of objects and connections between them. Graph transformation is the rule-based manipulation of graphs, an increasingly important concept in computer science and related fields. This is the first textbook treatment of the algebraic approach to graph transformation, based on algebraic structures and category theory. Part I is an introduction to the classical case of graph and typed graph transformation. In Part II basic and advanced results are first shown for an abstract form of replacement systems, so-called adhesive high-level replacement systems based on category theory, and are then instantiated to several forms of graph and Petri net transformation systems. Part III develops typed attributed graph transformation, a technique of key relevance in the modeling of visual languages and in model transformation. Part IV contains a practical case study on model transformation and a presentation of the AGG (attributed graph grammar) tool envir...
The STAPL Parallel Graph Library
Harshvardhan,
2013-01-01
This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable distributed graph container and a collection of commonly used parallel graph algorithms. The library introduces pGraph pViews that separate algorithm design from the container implementation. It supports three graph processing algorithmic paradigms, level-synchronous, asynchronous and coarse-grained, and provides common graph algorithms based on them. Experimental results demonstrate improved scalability in performance and data size over existing graph libraries on more than 16,000 cores and on internet-scale graphs containing over 16 billion vertices and 250 billion edges. © Springer-Verlag Berlin Heidelberg 2013.
White, AT
1985-01-01
The field of topological graph theory has expanded greatly in the ten years since the first edition of this book appeared. The original nine chapters of this classic work have therefore been revised and updated. Six new chapters have been added, dealing with: voltage graphs, non-orientable imbeddings, block designs associated with graph imbeddings, hypergraph imbeddings, map automorphism groups and change ringing.Thirty-two new problems have been added to this new edition, so that there are now 181 in all; 22 of these have been designated as ``difficult'''' and 9 as ``unsolved''''. Three of the four unsolved problems from the first edition have been solved in the ten years between editions; they are now marked as ``difficult''''.
Ribes, Luis
2017-01-01
This book offers a detailed introduction to graph theoretic methods in profinite groups and applications to abstract groups. It is the first to provide a comprehensive treatment of the subject. The author begins by carefully developing relevant notions in topology, profinite groups and homology, including free products of profinite groups, cohomological methods in profinite groups, and fixed points of automorphisms of free pro-p groups. The final part of the book is dedicated to applications of the profinite theory to abstract groups, with sections on finitely generated subgroups of free groups, separability conditions in free and amalgamated products, and algorithms in free groups and finite monoids. Profinite Graphs and Groups will appeal to students and researchers interested in profinite groups, geometric group theory, graphs and connections with the theory of formal languages. A complete reference on the subject, the book includes historical and bibliographical notes as well as a discussion of open quest...
Subdominant pseudoultrametric on graphs
Energy Technology Data Exchange (ETDEWEB)
Dovgoshei, A A; Petrov, E A [Institute of Applied Mathematics and Mechanics, National Academy of Sciences of Ukraine, Donetsk (Ukraine)
2013-08-31
Let (G,w) be a weighted graph. We find necessary and sufficient conditions under which the weight w:E(G)→R{sup +} can be extended to a pseudoultrametric on V(G), and establish a criterion for the uniqueness of such an extension. We demonstrate that (G,w) is a complete k-partite graph, for k≥2, if and only if for any weight that can be extended to a pseudoultrametric, among all such extensions one can find the least pseudoultrametric consistent with w. We give a structural characterization of graphs for which the subdominant pseudoultrametric is an ultrametric for any strictly positive weight that can be extended to a pseudoultrametric. Bibliography: 14 titles.
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
2013-01-01
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
Quantitative graph theory mathematical foundations and applications
Dehmer, Matthias
2014-01-01
The first book devoted exclusively to quantitative graph theory, Quantitative Graph Theory: Mathematical Foundations and Applications presents and demonstrates existing and novel methods for analyzing graphs quantitatively. Incorporating interdisciplinary knowledge from graph theory, information theory, measurement theory, and statistical techniques, this book covers a wide range of quantitative-graph theoretical concepts and methods, including those pertaining to real and random graphs such as:Comparative approaches (graph similarity or distance)Graph measures to characterize graphs quantitat
Efficient Extraction of High Centrality Vertices in Distributed Graphs
Energy Technology Data Exchange (ETDEWEB)
Kumbhare, Alok [Univ. of Southern California, Los Angeles, CA (United States); Frincu, Marc [Univ. of Southern California, Los Angeles, CA (United States); Raghavendra, Cauligi S. [Univ. of Southern California, Los Angeles, CA (United States); Prasanna, Viktor K. [Univ. of Southern California, Los Angeles, CA (United States)
2014-09-09
Betweenness centrality (BC) is an important measure for identifying high value or critical vertices in graphs, in variety of domains such as communication networks, road networks, and social graphs. However, calculating betweenness values is prohibitively expensive and, more often, domain experts are interested only in the vertices with the highest centrality values. In this paper, we first propose a partition-centric algorithm (MS-BC) to calculate BC for a large distributed graph that optimizes resource utilization and improves overall performance. Further, we extend the notion of approximate BC by pruning the graph and removing a subset of edges and vertices that contribute the least to the betweenness values of other vertices (MSL-BC), which further improves the runtime performance. We evaluate the proposed algorithms using a mix of real-world and synthetic graphs on an HPC cluster and analyze its strengths and weaknesses. The experimental results show an improvement in performance of upto 12x for large sparse graphs as compared to the state-of-the-art, and at the same time highlights the need for better partitioning methods to enable a balanced workload across partitions for unbalanced graphs such as small-world or power-law graphs.
Cheung, King Sing
2014-01-01
Petri nets are a formal and theoretically rich model for the modelling and analysis of systems. A subclass of Petri nets, augmented marked graphs possess a structure that is especially desirable for the modelling and analysis of systems with concurrent processes and shared resources.This monograph consists of three parts: Part I provides the conceptual background for readers who have no prior knowledge on Petri nets; Part II elaborates the theory of augmented marked graphs; finally, Part III discusses the application to system integration. The book is suitable as a first self-contained volume
Dayal, Amit; Brock, David
2018-01-01
Prashant Chandrasekar, a lead developer for the Social Interactome project, has tasked the team with creating a graph representation of the data collected from the social networks involved in that project. The data is currently stored in a MySQL database. The client requested that the graph database be Cayley, but after a literature review, Neo4j was chosen. The reasons for this shift will be explained in the design section. Secondarily, the team was tasked with coming up with three scena...
Stevanovic, Dragan
2015-01-01
Spectral Radius of Graphs provides a thorough overview of important results on the spectral radius of adjacency matrix of graphs that have appeared in the literature in the preceding ten years, most of them with proofs, and including some previously unpublished results of the author. The primer begins with a brief classical review, in order to provide the reader with a foundation for the subsequent chapters. Topics covered include spectral decomposition, the Perron-Frobenius theorem, the Rayleigh quotient, the Weyl inequalities, and the Interlacing theorem. From this introduction, the
Using Graph and Vertex Entropy to Compare Empirical Graphs with Theoretical Graph Models
Directory of Open Access Journals (Sweden)
Tomasz Kajdanowicz
2016-09-01
Full Text Available Over the years, several theoretical graph generation models have been proposed. Among the most prominent are: the Erdős–Renyi random graph model, Watts–Strogatz small world model, Albert–Barabási preferential attachment model, Price citation model, and many more. Often, researchers working with real-world data are interested in understanding the generative phenomena underlying their empirical graphs. They want to know which of the theoretical graph generation models would most probably generate a particular empirical graph. In other words, they expect some similarity assessment between the empirical graph and graphs artificially created from theoretical graph generation models. Usually, in order to assess the similarity of two graphs, centrality measure distributions are compared. For a theoretical graph model this means comparing the empirical graph to a single realization of a theoretical graph model, where the realization is generated from the given model using an arbitrary set of parameters. The similarity between centrality measure distributions can be measured using standard statistical tests, e.g., the Kolmogorov–Smirnov test of distances between cumulative distributions. However, this approach is both error-prone and leads to incorrect conclusions, as we show in our experiments. Therefore, we propose a new method for graph comparison and type classification by comparing the entropies of centrality measure distributions (degree centrality, betweenness centrality, closeness centrality. We demonstrate that our approach can help assign the empirical graph to the most similar theoretical model using a simple unsupervised learning method.
Handbook of graph grammars and computing by graph transformation
Engels, G; Kreowski, H J; Rozenberg, G
1999-01-01
Graph grammars originated in the late 60s, motivated by considerations about pattern recognition and compiler construction. Since then, the list of areas which have interacted with the development of graph grammars has grown quite impressively. Besides the aforementioned areas, it includes software specification and development, VLSI layout schemes, database design, modeling of concurrent systems, massively parallel computer architectures, logic programming, computer animation, developmental biology, music composition, visual languages, and many others.The area of graph grammars and graph tran
Topics in graph theory graphs and their Cartesian product
Imrich, Wilfried; Rall, Douglas F
2008-01-01
From specialists in the field, you will learn about interesting connections and recent developments in the field of graph theory by looking in particular at Cartesian products-arguably the most important of the four standard graph products. Many new results in this area appear for the first time in print in this book. Written in an accessible way, this book can be used for personal study in advanced applications of graph theory or for an advanced graph theory course.
Learning Probabilistic Decision Graphs
DEFF Research Database (Denmark)
Jaeger, Manfred; Dalgaard, Jens; Silander, Tomi
2004-01-01
efficient representations than Bayesian networks. In this paper we present an algorithm for learning PDGs from data. First experiments show that the algorithm is capable of learning optimal PDG representations in some cases, and that the computational efficiency of PDG models learned from real-life data...
Bisseling, R.H.; Byrka, J.; Cerav-Erbas, S.; Gvozdenovic, N.; Lorenz, M.; Pendavingh, R.A.; Reeves, C.; Röger, M.; Verhoeven, A.; Berg, van den J.B.; Bhulai, S.; Hulshof, J.; Koole, G.; Quant, C.; Williams, J.F.
2006-01-01
Splitting a large software system into smaller and more manageable units has become an important problem for many organizations. The basic structure of a software system is given by a directed graph with vertices representing the programs of the system and arcs representing calls from one program to
Coloring geographical threshold graphs
Energy Technology Data Exchange (ETDEWEB)
Bradonjic, Milan [Los Alamos National Laboratory; Percus, Allon [Los Alamos National Laboratory; Muller, Tobias [EINDHOVEN UNIV. OF TECH
2008-01-01
We propose a coloring algorithm for sparse random graphs generated by the geographical threshold graph (GTG) model, a generalization of random geometric graphs (RGG). In a GTG, nodes are distributed in a Euclidean space, and edges are assigned according to a threshold function involving the distance between nodes as well as randomly chosen node weights. The motivation for analyzing this model is that many real networks (e.g., wireless networks, the Internet, etc.) need to be studied by using a 'richer' stochastic model (which in this case includes both a distance between nodes and weights on the nodes). Here, we analyze the GTG coloring algorithm together with the graph's clique number, showing formally that in spite of the differences in structure between GTG and RGG, the asymptotic behavior of the chromatic number is identical: {chi}1n 1n n / 1n n (1 + {omicron}(1)). Finally, we consider the leading corrections to this expression, again using the coloring algorithm and clique number to provide bounds on the chromatic number. We show that the gap between the lower and upper bound is within C 1n n / (1n 1n n){sup 2}, and specify the constant C.
Budhiraja, A.S.; Mukherjee, D.; Wu, R.
2017-01-01
We consider a variation of the supermarket model in which the servers can communicate with their neighbors and where the neighborhood relationships are described in terms of a suitable graph. Tasks with unit-exponential service time distributions arrive at each vertex as independent Poisson
DEFF Research Database (Denmark)
Kucharik, Marcel; Hofacker, Ivo; Stadler, Peter
2014-01-01
of the folding free energy landscape, however, can provide the relevant information. Results We introduce the basin hopping graph (BHG) as a novel coarse-grained model of folding landscapes. Each vertex of the BHG is a local minimum, which represents the corresponding basin in the landscape. Its edges connect...
The STAPL Parallel Graph Library
Harshvardhan,; Fidel, Adam; Amato, Nancy M.; Rauchwerger, Lawrence
2013-01-01
This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable
Groupies in multitype random graphs
Shang, Yilun
2016-01-01
A groupie in a graph is a vertex whose degree is not less than the average degree of its neighbors. Under some mild conditions, we show that the proportion of groupies is very close to 1/2 in multitype random graphs (such as stochastic block models), which include Erd?s-R?nyi random graphs, random bipartite, and multipartite graphs as special examples. Numerical examples are provided to illustrate the theoretical results.
Groupies in multitype random graphs.
Shang, Yilun
2016-01-01
A groupie in a graph is a vertex whose degree is not less than the average degree of its neighbors. Under some mild conditions, we show that the proportion of groupies is very close to 1/2 in multitype random graphs (such as stochastic block models), which include Erdős-Rényi random graphs, random bipartite, and multipartite graphs as special examples. Numerical examples are provided to illustrate the theoretical results.
Temporal Representation in Semantic Graphs
Energy Technology Data Exchange (ETDEWEB)
Levandoski, J J; Abdulla, G M
2007-08-07
A wide range of knowledge discovery and analysis applications, ranging from business to biological, make use of semantic graphs when modeling relationships and concepts. Most of the semantic graphs used in these applications are assumed to be static pieces of information, meaning temporal evolution of concepts and relationships are not taken into account. Guided by the need for more advanced semantic graph queries involving temporal concepts, this paper surveys the existing work involving temporal representations in semantic graphs.
Quantum walks on quotient graphs
International Nuclear Information System (INIS)
Krovi, Hari; Brun, Todd A.
2007-01-01
A discrete-time quantum walk on a graph Γ is the repeated application of a unitary evolution operator to a Hilbert space corresponding to the graph. If this unitary evolution operator has an associated group of symmetries, then for certain initial states the walk will be confined to a subspace of the original Hilbert space. Symmetries of the original graph, given by its automorphism group, can be inherited by the evolution operator. We show that a quantum walk confined to the subspace corresponding to this symmetry group can be seen as a different quantum walk on a smaller quotient graph. We give an explicit construction of the quotient graph for any subgroup H of the automorphism group and illustrate it with examples. The automorphisms of the quotient graph which are inherited from the original graph are the original automorphism group modulo the subgroup H used to construct it. The quotient graph is constructed by removing the symmetries of the subgroup H from the original graph. We then analyze the behavior of hitting times on quotient graphs. Hitting time is the average time it takes a walk to reach a given final vertex from a given initial vertex. It has been shown in earlier work [Phys. Rev. A 74, 042334 (2006)] that the hitting time for certain initial states of a quantum walks can be infinite, in contrast to classical random walks. We give a condition which determines whether the quotient graph has infinite hitting times given that they exist in the original graph. We apply this condition for the examples discussed and determine which quotient graphs have infinite hitting times. All known examples of quantum walks with hitting times which are short compared to classical random walks correspond to systems with quotient graphs much smaller than the original graph; we conjecture that the existence of a small quotient graph with finite hitting times is necessary for a walk to exhibit a quantum speedup
A generalization of total graphs
Indian Academy of Sciences (India)
M Afkhami
2018-04-12
Apr 12, 2018 ... product of any lower triangular matrix with the transpose of any element of U belongs to U. The ... total graph of R, which is denoted by T( (R)), is a simple graph with all elements of R as vertices, and ...... [9] Badawi A, On dot-product graph of a commutative ring, Communications in Algebra 43 (2015). 43–50.
Graph transformation tool contest 2008
Rensink, Arend; van Gorp, Pieter
This special section is the outcome of the graph transformation tool contest organised during the Graph-Based Tools (GraBaTs) 2008 workshop, which took place as a satellite event of the International Conference on Graph Transformation (ICGT) 2008. The contest involved two parts: three “off-line case
On dominator colorings in graphs
Indian Academy of Sciences (India)
colors required for a dominator coloring of G is called the dominator .... Theorem 1.3 shows that the complete graph Kn is the only connected graph of order n ... Conversely, if a graph G satisfies condition (i) or (ii), it is easy to see that χd(G) =.
Xuan, Junyu; Lu, Jie; Zhang, Guangquan; Luo, Xiangfeng
2015-12-01
Graph mining has been a popular research area because of its numerous application scenarios. Many unstructured and structured data can be represented as graphs, such as, documents, chemical molecular structures, and images. However, an issue in relation to current research on graphs is that they cannot adequately discover the topics hidden in graph-structured data which can be beneficial for both the unsupervised learning and supervised learning of the graphs. Although topic models have proved to be very successful in discovering latent topics, the standard topic models cannot be directly applied to graph-structured data due to the "bag-of-word" assumption. In this paper, an innovative graph topic model (GTM) is proposed to address this issue, which uses Bernoulli distributions to model the edges between nodes in a graph. It can, therefore, make the edges in a graph contribute to latent topic discovery and further improve the accuracy of the supervised and unsupervised learning of graphs. The experimental results on two different types of graph datasets show that the proposed GTM outperforms the latent Dirichlet allocation on classification by using the unveiled topics of these two models to represent graphs.
Understanding Computational Bayesian Statistics
Bolstad, William M
2011-01-01
A hands-on introduction to computational statistics from a Bayesian point of view Providing a solid grounding in statistics while uniquely covering the topics from a Bayesian perspective, Understanding Computational Bayesian Statistics successfully guides readers through this new, cutting-edge approach. With its hands-on treatment of the topic, the book shows how samples can be drawn from the posterior distribution when the formula giving its shape is all that is known, and how Bayesian inferences can be based on these samples from the posterior. These ideas are illustrated on common statistic
Bayesian statistics an introduction
Lee, Peter M
2012-01-01
Bayesian Statistics is the school of thought that combines prior beliefs with the likelihood of a hypothesis to arrive at posterior beliefs. The first edition of Peter Lee’s book appeared in 1989, but the subject has moved ever onwards, with increasing emphasis on Monte Carlo based techniques. This new fourth edition looks at recent techniques such as variational methods, Bayesian importance sampling, approximate Bayesian computation and Reversible Jump Markov Chain Monte Carlo (RJMCMC), providing a concise account of the way in which the Bayesian approach to statistics develops as wel
Algorithms for Planar Graphs and Graphs in Metric Spaces
DEFF Research Database (Denmark)
Wulff-Nilsen, Christian
structural properties that can be exploited. For instance, a road network or a wire layout on a microchip is typically (near-)planar and distances in the network are often defined w.r.t. the Euclidean or the rectilinear metric. Specialized algorithms that take advantage of such properties are often orders...... of magnitude faster than the corresponding algorithms for general graphs. The first and main part of this thesis focuses on the development of efficient planar graph algorithms. The most important contributions include a faster single-source shortest path algorithm, a distance oracle with subquadratic...... for geometric graphs and graphs embedded in metric spaces. Roughly speaking, the stretch factor is a real value expressing how well a (geo-)metric graph approximates the underlying complete graph w.r.t. distances. We give improved algorithms for computing the stretch factor of a given graph and for augmenting...
Market Segmentation Using Bayesian Model Based Clustering
Van Hattum, P.
2009-01-01
This dissertation deals with two basic problems in marketing, that are market segmentation, which is the grouping of persons who share common aspects, and market targeting, which is focusing your marketing efforts on one or more attractive market segments. For the grouping of persons who share
Harary, Frank
2015-01-01
Presented in 1962-63 by experts at University College, London, these lectures offer a variety of perspectives on graph theory. Although the opening chapters form a coherent body of graph theoretic concepts, this volume is not a text on the subject but rather an introduction to the extensive literature of graph theory. The seminar's topics are geared toward advanced undergraduate students of mathematics.Lectures by this volume's editor, Frank Harary, include ""Some Theorems and Concepts of Graph Theory,"" ""Topological Concepts in Graph Theory,"" ""Graphical Reconstruction,"" and other introduc
Spectral fluctuations of quantum graphs
International Nuclear Information System (INIS)
Pluhař, Z.; Weidenmüller, H. A.
2014-01-01
We prove the Bohigas-Giannoni-Schmit conjecture in its most general form for completely connected simple graphs with incommensurate bond lengths. We show that for graphs that are classically mixing (i.e., graphs for which the spectrum of the classical Perron-Frobenius operator possesses a finite gap), the generating functions for all (P,Q) correlation functions for both closed and open graphs coincide (in the limit of infinite graph size) with the corresponding expressions of random-matrix theory, both for orthogonal and for unitary symmetry
Dynamic Representations of Sparse Graphs
DEFF Research Database (Denmark)
Brodal, Gerth Stølting; Fagerberg, Rolf
1999-01-01
We present a linear space data structure for maintaining graphs with bounded arboricity—a large class of sparse graphs containing e.g. planar graphs and graphs of bounded treewidth—under edge insertions, edge deletions, and adjacency queries. The data structure supports adjacency queries in worst...... case O(c) time, and edge insertions and edge deletions in amortized O(1) and O(c+log n) time, respectively, where n is the number of nodes in the graph, and c is the bound on the arboricity....
Domination criticality in product graphs
Directory of Open Access Journals (Sweden)
M.R. Chithra
2015-07-01
Full Text Available A connected dominating set is an important notion and has many applications in routing and management of networks. Graph products have turned out to be a good model of interconnection networks. This motivated us to study the Cartesian product of graphs G with connected domination number, γc(G=2,3 and characterize such graphs. Also, we characterize the k−γ-vertex (edge critical graphs and k−γc-vertex (edge critical graphs for k=2,3 where γ denotes the domination number of G. We also discuss the vertex criticality in grids.
Graph Creation, Visualisation and Transformation
Directory of Open Access Journals (Sweden)
Maribel Fernández
2010-03-01
Full Text Available We describe a tool to create, edit, visualise and compute with interaction nets - a form of graph rewriting systems. The editor, called GraphPaper, allows users to create and edit graphs and their transformation rules using an intuitive user interface. The editor uses the functionalities of the TULIP system, which gives us access to a wealth of visualisation algorithms. Interaction nets are not only a formalism for the specification of graphs, but also a rewrite-based computation model. We discuss graph rewriting strategies and a language to express them in order to perform strategic interaction net rewriting.
Study of Chromatic parameters of Line, Total, Middle graphs and Graph operators of Bipartite graph
Nagarathinam, R.; Parvathi, N.
2018-04-01
Chromatic parameters have been explored on the basis of graph coloring process in which a couple of adjacent nodes receives different colors. But the Grundy and b-coloring executes maximum colors under certain restrictions. In this paper, Chromatic, b-chromatic and Grundy number of some graph operators of bipartite graph has been investigat
Graph Sampling for Covariance Estimation
Chepuri, Sundeep Prabhakar
2017-04-25
In this paper the focus is on subsampling as well as reconstructing the second-order statistics of signals residing on nodes of arbitrary undirected graphs. Second-order stationary graph signals may be obtained by graph filtering zero-mean white noise and they admit a well-defined power spectrum whose shape is determined by the frequency response of the graph filter. Estimating the graph power spectrum forms an important component of stationary graph signal processing and related inference tasks such as Wiener prediction or inpainting on graphs. The central result of this paper is that by sampling a significantly smaller subset of vertices and using simple least squares, we can reconstruct the second-order statistics of the graph signal from the subsampled observations, and more importantly, without any spectral priors. To this end, both a nonparametric approach as well as parametric approaches including moving average and autoregressive models for the graph power spectrum are considered. The results specialize for undirected circulant graphs in that the graph nodes leading to the best compression rates are given by the so-called minimal sparse rulers. A near-optimal greedy algorithm is developed to design the subsampling scheme for the non-parametric and the moving average models, whereas a particular subsampling scheme that allows linear estimation for the autoregressive model is proposed. Numerical experiments on synthetic as well as real datasets related to climatology and processing handwritten digits are provided to demonstrate the developed theory.
Canonical Labelling of Site Graphs
Directory of Open Access Journals (Sweden)
Nicolas Oury
2013-06-01
Full Text Available We investigate algorithms for canonical labelling of site graphs, i.e. graphs in which edges bind vertices on sites with locally unique names. We first show that the problem of canonical labelling of site graphs reduces to the problem of canonical labelling of graphs with edge colourings. We then present two canonical labelling algorithms based on edge enumeration, and a third based on an extension of Hopcroft's partition refinement algorithm. All run in quadratic worst case time individually. However, one of the edge enumeration algorithms runs in sub-quadratic time for graphs with "many" automorphisms, and the partition refinement algorithm runs in sub-quadratic time for graphs with "few" bisimulation equivalences. This suite of algorithms was chosen based on the expectation that graphs fall in one of those two categories. If that is the case, a combined algorithm runs in sub-quadratic worst case time. Whether this expectation is reasonable remains an interesting open problem.
Narrative Collage of Image Collections by Scene Graph Recombination.
Fang, Fei; Yi, Miao; Feng, Hui; Hu, Shenghong; Xiao, Chunxia
2017-10-04
Narrative collage is an interesting image editing art to summarize the main theme or storyline behind an image collection. We present a novel method to generate narrative images with plausible semantic scene structures. To achieve this goal, we introduce a layer graph and a scene graph to represent relative depth order and semantic relationship between image objects, respectively. We firstly cluster the input image collection to select representative images, and then extract a group of semantic salient objects from each representative image. Both Layer graphs and scene graphs are constructed and combined according to our specific rules for reorganizing the extracted objects in every image. We design an energy model to appropriately locate every object on the final canvas. Experiment results show that our method can produce competitive narrative collage result and works well on a wide range of image collections.
Learning heat diffusion graphs
Thanou, Dorina; Dong, Xiaowen; Kressner, Daniel; Frossard, Pascal
2016-01-01
Effective information analysis generally boils down to properly identifying the structure or geometry of the data, which is often represented by a graph. In some applications, this structure may be partly determined by design constraints or pre-determined sensing arrangements, like in road transportation networks for example. In general though, the data structure is not readily available and becomes pretty difficult to define. In particular, the global smoothness assumptions, that most of the...
Syed, M. Qasim; Lovatt, Ian
2014-01-01
This paper is an addition to the series of papers on the exponential function begun by Albert Bartlett. In particular, we ask how the graph of the exponential function y = e[superscript -t/t] would appear if y were plotted versus ln t rather than the normal practice of plotting ln y versus t. In answering this question, we find a new way to…
Understanding Charts and Graphs.
1987-07-28
Farenheit degrees, which have no Onaturalo zero ); finally, ratio scales have numbers that are ordered so that the magnitudes of differences are important and...system. They have to do with the very nature of how marks serve as meaningful symbols. In the ideal case, a chart or graph will be absolutely unambiguous...and these laws comprise this principle (see Stevens, 1974). Absolute discriminability: A minimal magnitude of a mark is necessary for it to be detected
Yuan, Ying; MacKinnon, David P.
2009-01-01
In this article, we propose Bayesian analysis of mediation effects. Compared with conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian…
Kleibergen, F.R.; Kleijn, R.; Paap, R.
2000-01-01
We propose a novel Bayesian test under a (noninformative) Jeffreys'priorspecification. We check whether the fixed scalar value of the so-calledBayesian Score Statistic (BSS) under the null hypothesis is aplausiblerealization from its known and standardized distribution under thealternative. Unlike
Reproducibility of graph metrics in fMRI networks
Directory of Open Access Journals (Sweden)
Qawi K Telesford
2010-12-01
Full Text Available The reliability of graph metrics calculated in network analysis is essential to the interpretation of complex network organization. These graph metrics are used to deduce the small-world properties in networks. In this study, we investigated the test-retest reliability of graph metrics from functional magnetic resonance imaging (fMRI data collected for two runs in 45 healthy older adults. Graph metrics were calculated on data for both runs and compared using intraclass correlation coefficient (ICC statistics and Bland-Altman (BA plots. ICC scores describe the level of absolute agreement between two measurements and provide a measure of reproducibility. For mean graph metrics, ICC scores were high for clustering coefficient (ICC=0.86, global efficiency (ICC=0.83, path length (ICC=0.79, and local efficiency (ICC=0.75; the ICC score for degree was found to be low (ICC=0.29. ICC scores were also used to generate reproducibility maps in brain space to test voxel-wise reproducibility for unsmoothed and smoothed data. Reproducibility was uniform across the brain for global efficiency and path length, but was only high in network hubs for clustering coefficient, local efficiency and degree. BA plots were used to test the measurement repeatability of all graph metrics. All graph metrics fell within the limits for repeatability. Together, these results suggest that with exception of degree, mean graph metrics are reproducible and suitable for clinical studies. Further exploration is warranted to better understand reproducibility across the brain on a voxel-wise basis.
von der Linden, Wolfgang; Dose, Volker; von Toussaint, Udo
2014-06-01
Preface; Part I. Introduction: 1. The meaning of probability; 2. Basic definitions; 3. Bayesian inference; 4. Combinatrics; 5. Random walks; 6. Limit theorems; 7. Continuous distributions; 8. The central limit theorem; 9. Poisson processes and waiting times; Part II. Assigning Probabilities: 10. Transformation invariance; 11. Maximum entropy; 12. Qualified maximum entropy; 13. Global smoothness; Part III. Parameter Estimation: 14. Bayesian parameter estimation; 15. Frequentist parameter estimation; 16. The Cramer-Rao inequality; Part IV. Testing Hypotheses: 17. The Bayesian way; 18. The frequentist way; 19. Sampling distributions; 20. Bayesian vs frequentist hypothesis tests; Part V. Real World Applications: 21. Regression; 22. Inconsistent data; 23. Unrecognized signal contributions; 24. Change point problems; 25. Function estimation; 26. Integral equations; 27. Model selection; 28. Bayesian experimental design; Part VI. Probabilistic Numerical Techniques: 29. Numerical integration; 30. Monte Carlo methods; 31. Nested sampling; Appendixes; References; Index.
Graphs cospectral with a friendship graph or its complement
Directory of Open Access Journals (Sweden)
Alireza Abdollahi
2013-12-01
Full Text Available Let $n$ be any positive integer and let $F_n$ be the friendship (or Dutch windmill graph with $2n+1$ vertices and $3n$ edges. Here we study graphs with the same adjacency spectrum as the $F_n$. Two graphs are called cospectral if the eigenvalues multiset of their adjacency matrices are the same. Let $G$ be a graph cospectral with $F_n$. Here we prove that if $G$ has no cycle of length $4$ or $5$, then $Gcong F_n$. Moreover if $G$ is connected and planar then $Gcong F_n$.All but one of connected components of $G$ are isomorphic to $K_2$.The complement $overline{F_n}$ of the friendship graph is determined by its adjacency eigenvalues, that is, if $overline{F_n}$ is cospectral with a graph $H$, then $Hcong overline{F_n}$.
Decomposing Oriented Graphs into Six Locally Irregular Oriented Graphs
DEFF Research Database (Denmark)
Bensmail, Julien; Renault, Gabriel
2016-01-01
An undirected graph G is locally irregular if every two of its adjacent vertices have distinct degrees. We say that G is decomposable into k locally irregular graphs if there exists a partition E1∪E2∪⋯∪Ek of the edge set E(G) such that each Ei induces a locally irregular graph. It was recently co...
X-Graphs: Language and Algorithms for Heterogeneous Graph Streams
2017-09-01
are widely used by academia and industry. 15. SUBJECT TERMS Data Analytics, Graph Analytics, High-Performance Computing 16. SECURITY CLASSIFICATION...form the core of the DeepDive Knowledge Construction System. 2 INTRODUCTION The goal of the X-Graphs project was to develop computational techniques...memory multicore machine. Ringo is based on Snap.py and SNAP, and uses Python . Ringo now allows the integration of Delite DSL Framework Graph
Graph theoretical model of a sensorimotor connectome in zebrafish.
Stobb, Michael; Peterson, Joshua M; Mazzag, Borbala; Gahtan, Ethan
2012-01-01
Mapping the detailed connectivity patterns (connectomes) of neural circuits is a central goal of neuroscience. The best quantitative approach to analyzing connectome data is still unclear but graph theory has been used with success. We present a graph theoretical model of the posterior lateral line sensorimotor pathway in zebrafish. The model includes 2,616 neurons and 167,114 synaptic connections. Model neurons represent known cell types in zebrafish larvae, and connections were set stochastically following rules based on biological literature. Thus, our model is a uniquely detailed computational representation of a vertebrate connectome. The connectome has low overall connection density, with 2.45% of all possible connections, a value within the physiological range. We used graph theoretical tools to compare the zebrafish connectome graph to small-world, random and structured random graphs of the same size. For each type of graph, 100 randomly generated instantiations were considered. Degree distribution (the number of connections per neuron) varied more in the zebrafish graph than in same size graphs with less biological detail. There was high local clustering and a short average path length between nodes, implying a small-world structure similar to other neural connectomes and complex networks. The graph was found not to be scale-free, in agreement with some other neural connectomes. An experimental lesion was performed that targeted three model brain neurons, including the Mauthner neuron, known to control fast escape turns. The lesion decreased the number of short paths between sensory and motor neurons analogous to the behavioral effects of the same lesion in zebrafish. This model is expandable and can be used to organize and interpret a growing database of information on the zebrafish connectome.
Graph theoretical model of a sensorimotor connectome in zebrafish.
Directory of Open Access Journals (Sweden)
Michael Stobb
Full Text Available Mapping the detailed connectivity patterns (connectomes of neural circuits is a central goal of neuroscience. The best quantitative approach to analyzing connectome data is still unclear but graph theory has been used with success. We present a graph theoretical model of the posterior lateral line sensorimotor pathway in zebrafish. The model includes 2,616 neurons and 167,114 synaptic connections. Model neurons represent known cell types in zebrafish larvae, and connections were set stochastically following rules based on biological literature. Thus, our model is a uniquely detailed computational representation of a vertebrate connectome. The connectome has low overall connection density, with 2.45% of all possible connections, a value within the physiological range. We used graph theoretical tools to compare the zebrafish connectome graph to small-world, random and structured random graphs of the same size. For each type of graph, 100 randomly generated instantiations were considered. Degree distribution (the number of connections per neuron varied more in the zebrafish graph than in same size graphs with less biological detail. There was high local clustering and a short average path length between nodes, implying a small-world structure similar to other neural connectomes and complex networks. The graph was found not to be scale-free, in agreement with some other neural connectomes. An experimental lesion was performed that targeted three model brain neurons, including the Mauthner neuron, known to control fast escape turns. The lesion decreased the number of short paths between sensory and motor neurons analogous to the behavioral effects of the same lesion in zebrafish. This model is expandable and can be used to organize and interpret a growing database of information on the zebrafish connectome.
Mizan: Optimizing Graph Mining in Large Parallel Systems
Kalnis, Panos
2012-03-01
Extracting information from graphs, from nding shortest paths to complex graph mining, is essential for many ap- plications. Due to the shear size of modern graphs (e.g., social networks), processing must be done on large paral- lel computing infrastructures (e.g., the cloud). Earlier ap- proaches relied on the MapReduce framework, which was proved inadequate for graph algorithms. More recently, the message passing model (e.g., Pregel) has emerged. Although the Pregel model has many advantages, it is agnostic to the graph properties and the architecture of the underlying com- puting infrastructure, leading to suboptimal performance. In this paper, we propose Mizan, a layer between the users\\' code and the computing infrastructure. Mizan considers the structure of the input graph and the architecture of the in- frastructure in order to: (i) decide whether it is bene cial to generate a near-optimal partitioning of the graph in a pre- processing step, and (ii) choose between typical point-to- point message passing and a novel approach that puts com- puting nodes in a virtual overlay ring. We deployed Mizan on a small local Linux cluster, on the cloud (256 virtual machines in Amazon EC2), and on an IBM Blue Gene/P supercomputer (1024 CPUs). We show that Mizan executes common algorithms on very large graphs 1-2 orders of mag- nitude faster than MapReduce-based implementations and up to one order of magnitude faster than implementations relying on Pregel-like hash-based graph partitioning.
A CLASSIFIER SYSTEM USING SMOOTH GRAPH COLORING
Directory of Open Access Journals (Sweden)
JORGE FLORES CRUZ
2017-01-01
Full Text Available Unsupervised classifiers allow clustering methods with less or no human intervention. Therefore it is desirable to group the set of items with less data processing. This paper proposes an unsupervised classifier system using the model of soft graph coloring. This method was tested with some classic instances in the literature and the results obtained were compared with classifications made with human intervention, yielding as good or better results than supervised classifiers, sometimes providing alternative classifications that considers additional information that humans did not considered.
Wagstaff, Kiri L.
2012-03-01
matrices—cases in which only pairwise information is known. The list of algorithms covered in this chapter is representative of those most commonly in use, but it is by no means comprehensive. There is an extensive collection of existing books on clustering that provide additional background and depth. Three early books that remain useful today are Anderberg’s Cluster Analysis for Applications [3], Hartigan’s Clustering Algorithms [25], and Gordon’s Classification [22]. The latter covers basics on similarity measures, partitioning and hierarchical algorithms, fuzzy clustering, overlapping clustering, conceptual clustering, validations methods, and visualization or data reduction techniques such as principal components analysis (PCA),multidimensional scaling, and self-organizing maps. More recently, Jain et al. provided a useful and informative survey [27] of a variety of different clustering algorithms, including those mentioned here as well as fuzzy, graph-theoretic, and evolutionary clustering. Everitt’s Cluster Analysis [19] provides a modern overview of algorithms, similarity measures, and evaluation methods.
PERSEUS-HUB: Interactive and Collective Exploration of Large-Scale Graphs
Directory of Open Access Journals (Sweden)
Di Jin
2017-07-01
Full Text Available Graphs emerge naturally in many domains, such as social science, neuroscience, transportation engineering, and more. In many cases, such graphs have millions or billions of nodes and edges, and their sizes increase daily at a fast pace. How can researchers from various domains explore large graphs interactively and efficiently to find out what is ‘important’? How can multiple researchers explore a new graph dataset collectively and “help” each other with their findings? In this article, we present Perseus-Hub, a large-scale graph mining tool that computes a set of graph properties in a distributed manner, performs ensemble, multi-view anomaly detection to highlight regions that are worth investigating, and provides users with uncluttered visualization and easy interaction with complex graph statistics. Perseus-Hub uses a Spark cluster to calculate various statistics of large-scale graphs efficiently, and aggregates the results in a summary on the master node to support interactive user exploration. In Perseus-Hub, the visualized distributions of graph statistics provide preliminary analysis to understand a graph. To perform a deeper analysis, users with little prior knowledge can leverage patterns (e.g., spikes in the power-law degree distribution marked by other users or experts. Moreover, Perseus-Hub guides users to regions of interest by highlighting anomalous nodes and helps users establish a more comprehensive understanding about the graph at hand. We demonstrate our system through the case study on real, large-scale networks.
Bayesian Methods for Radiation Detection and Dosimetry
International Nuclear Information System (INIS)
Peter G. Groer
2002-01-01
We performed work in three areas: radiation detection, external and internal radiation dosimetry. In radiation detection we developed Bayesian techniques to estimate the net activity of high and low activity radioactive samples. These techniques have the advantage that the remaining uncertainty about the net activity is described by probability densities. Graphs of the densities show the uncertainty in pictorial form. Figure 1 below demonstrates this point. We applied stochastic processes for a method to obtain Bayesian estimates of 222Rn-daughter products from observed counting rates. In external radiation dosimetry we studied and developed Bayesian methods to estimate radiation doses to an individual with radiation induced chromosome aberrations. We analyzed chromosome aberrations after exposure to gammas and neutrons and developed a method for dose-estimation after criticality accidents. The research in internal radiation dosimetry focused on parameter estimation for compartmental models from observed compartmental activities. From the estimated probability densities of the model parameters we were able to derive the densities for compartmental activities for a two compartment catenary model at different times. We also calculated the average activities and their standard deviation for a simple two compartment model
Lawson, Andrew B
2002-01-01
Research has generated a number of advances in methods for spatial cluster modelling in recent years, particularly in the area of Bayesian cluster modelling. Along with these advances has come an explosion of interest in the potential applications of this work, especially in epidemiology and genome research. In one integrated volume, this book reviews the state-of-the-art in spatial clustering and spatial cluster modelling, bringing together research and applications previously scattered throughout the literature. It begins with an overview of the field, then presents a series of chapters that illuminate the nature and purpose of cluster modelling within different application areas, including astrophysics, epidemiology, ecology, and imaging. The focus then shifts to methods, with discussions on point and object process modelling, perfect sampling of cluster processes, partitioning in space and space-time, spatial and spatio-temporal process modelling, nonparametric methods for clustering, and spatio-temporal ...
Label-based routing for a family of scale-free, modular, planar and unclustered graphs
International Nuclear Information System (INIS)
Comellas, Francesc; Miralles, Alicia
2011-01-01
We give an optimal labeling and routing algorithm for a family of scale-free, modular and planar graphs with zero clustering. The relevant properties of this family match those of some networks associated with technological and biological systems with a low clustering, including some electronic circuits and protein networks. The existence of an efficient routing protocol for this graph model should help when designing communication algorithms in real networks and also in the understanding of their dynamic processes.
Endomorphisms of graph algebras
DEFF Research Database (Denmark)
Conti, Roberto; Hong, Jeong Hee; Szymanski, Wojciech
2012-01-01
We initiate a systematic investigation of endomorphisms of graph C*-algebras C*(E), extending several known results on endomorphisms of the Cuntz algebras O_n. Most but not all of this study is focused on endomorphisms which permute the vertex projections and globally preserve the diagonal MASA D...... that the restriction to the diagonal MASA of an automorphism which globally preserves both D_E and the core AF-subalgebra eventually commutes with the corresponding one-sided shift. Secondly, we exhibit several properties of proper endomorphisms, investigate invertibility of localized endomorphisms both on C...
Yap, Hian-Poh
1996-01-01
This book provides an up-to-date and rapid introduction to an important and currently active topic in graph theory. The author leads the reader to the forefront of research in this area. Complete and easily readable proofs of all the main theorems, together with numerous examples, exercises and open problems are given. The book is suitable for use as a textbook or as seminar material for advanced undergraduate and graduate students. The references are comprehensive and so it will also be useful for researchers as a handbook.
Graph Algorithm Animation with Grrr
Rodgers, Peter; Vidal, Natalia
2000-01-01
We discuss geometric positioning, highlighting of visited nodes and user defined highlighting that form the algorithm animation facilities in the Grrr graph rewriting programming language. The main purpose of animation was initially for the debugging and profiling of Grrr code, but recently it has been extended for the purpose of teaching algorithms to undergraduate students. The animation is restricted to graph based algorithms such as graph drawing, list manipulation or more traditional gra...
Albert, Jim
2009-01-01
There has been a dramatic growth in the development and application of Bayesian inferential methods. Some of this growth is due to the availability of powerful simulation-based algorithms to summarize posterior distributions. There has been also a growing interest in the use of the system R for statistical analyses. R's open source nature, free availability, and large number of contributor packages have made R the software of choice for many statisticians in education and industry. Bayesian Computation with R introduces Bayesian modeling by the use of computation using the R language. The earl
Data clustering algorithms and applications
Aggarwal, Charu C
2013-01-01
Research on the problem of clustering tends to be fragmented across the pattern recognition, database, data mining, and machine learning communities. Addressing this problem in a unified way, Data Clustering: Algorithms and Applications provides complete coverage of the entire area of clustering, from basic methods to more refined and complex data clustering approaches. It pays special attention to recent issues in graphs, social networks, and other domains.The book focuses on three primary aspects of data clustering: Methods, describing key techniques commonly used for clustering, such as fea
Optimization Problems on Threshold Graphs
Directory of Open Access Journals (Sweden)
Elena Nechita
2010-06-01
Full Text Available During the last three decades, different types of decompositions have been processed in the field of graph theory. Among these we mention: decompositions based on the additivity of some characteristics of the graph, decompositions where the adjacency law between the subsets of the partition is known, decompositions where the subgraph induced by every subset of the partition must have predeterminate properties, as well as combinations of such decompositions. In this paper we characterize threshold graphs using the weakly decomposition, determine: density and stability number, Wiener index and Wiener polynomial for threshold graphs.
Eulerian Graphs and Related Topics
Fleischner, Herbert
1990-01-01
The two volumes comprising Part 1 of this work embrace the theme of Eulerian trails and covering walks. They should appeal both to researchers and students, as they contain enough material for an undergraduate or graduate graph theory course which emphasizes Eulerian graphs, and thus can be read by any mathematician not yet familiar with graph theory. But they are also of interest to researchers in graph theory because they contain many recent results, some of which are only partial solutions to more general problems. A number of conjectures have been included as well. Various problems (such a
Bayesian data analysis for newcomers.
Kruschke, John K; Liddell, Torrin M
2018-02-01
This article explains the foundational concepts of Bayesian data analysis using virtually no mathematical notation. Bayesian ideas already match your intuitions from everyday reasoning and from traditional data analysis. Simple examples of Bayesian data analysis are presented that illustrate how the information delivered by a Bayesian analysis can be directly interpreted. Bayesian approaches to null-value assessment are discussed. The article clarifies misconceptions about Bayesian methods that newcomers might have acquired elsewhere. We discuss prior distributions and explain how they are not a liability but an important asset. We discuss the relation of Bayesian data analysis to Bayesian models of mind, and we briefly discuss what methodological problems Bayesian data analysis is not meant to solve. After you have read this article, you should have a clear sense of how Bayesian data analysis works and the sort of information it delivers, and why that information is so intuitive and useful for drawing conclusions from data.
Energy Technology Data Exchange (ETDEWEB)
Maunz, Peter Lukas Wilhelm [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Sterk, Jonathan David [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Lobser, Daniel [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Parekh, Ojas D. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Ryan-Anderson, Ciaran [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2016-01-01
In recent years, advanced network analytics have become increasingly important to na- tional security with applications ranging from cyber security to detection and disruption of ter- rorist networks. While classical computing solutions have received considerable investment, the development of quantum algorithms to address problems, such as data mining of attributed relational graphs, is a largely unexplored space. Recent theoretical work has shown that quan- tum algorithms for graph analysis can be more efficient than their classical counterparts. Here, we have implemented a trapped-ion-based two-qubit quantum information proces- sor to address these goals. Building on Sandia's microfabricated silicon surface ion traps, we have designed, realized and characterized a quantum information processor using the hyperfine qubits encoded in two 171 Yb + ions. We have implemented single qubit gates using resonant microwave radiation and have employed Gate set tomography (GST) to characterize the quan- tum process. For the first time, we were able to prove that the quantum process surpasses the fault tolerance thresholds of some quantum codes by demonstrating a diamond norm distance of less than 1 . 9 x 10 [?] 4 . We used Raman transitions in order to manipulate the trapped ions' motion and realize two-qubit gates. We characterized the implemented motion sensitive and insensitive single qubit processes and achieved a maximal process infidelity of 6 . 5 x 10 [?] 5 . We implemented the two-qubit gate proposed by Molmer and Sorensen and achieved a fidelity of more than 97 . 7%.
Generating hierarchial scale-free graphs from fractals
Energy Technology Data Exchange (ETDEWEB)
Komjathy, Julia, E-mail: komyju@math.bme.hu [Department of Stochastics, Institute of Mathematics, Technical University of Budapest, H-1529 P.O. Box 91 (Hungary); Simon, Karoly, E-mail: simonk@math.bme.hu [Department of Stochastics, Institute of Mathematics, Technical University of Budapest, H-1529 P.O. Box 91 (Hungary)
2011-08-15
Highlights: > We generate deterministic scale-free networks using graph-directed self similar IFS. > Our model exhibits similar clustering, power law decay properties to real networks. > The average length of shortest path and the diameter of the graph are determined. > Using this model, we generate random graphs with prescribed power law exponent. - Abstract: Motivated by the hierarchial network model of E. Ravasz, A.-L. Barabasi, and T. Vicsek, we introduce deterministic scale-free networks derived from a graph directed self-similar fractal {Lambda}. With rigorous mathematical results we verify that our model captures some of the most important features of many real networks: the scale-free and the high clustering properties. We also prove that the diameter is the logarithm of the size of the system. We point out a connection between the power law exponent of the degree distribution and some intrinsic geometric measure theoretical properties of the underlying fractal. Using our (deterministic) fractal {Lambda} we generate random graph sequence sharing similar properties.
Bayesian methods for data analysis
Carlin, Bradley P.
2009-01-01
Approaches for statistical inference Introduction Motivating Vignettes Defining the Approaches The Bayes-Frequentist Controversy Some Basic Bayesian Models The Bayes approach Introduction Prior Distributions Bayesian Inference Hierarchical Modeling Model Assessment Nonparametric Methods Bayesian computation Introduction Asymptotic Methods Noniterative Monte Carlo Methods Markov Chain Monte Carlo Methods Model criticism and selection Bayesian Modeling Bayesian Robustness Model Assessment Bayes Factors via Marginal Density Estimation Bayes Factors
Asymptote Misconception on Graphing Functions: Does Graphing Software Resolve It?
Directory of Open Access Journals (Sweden)
Mehmet Fatih Öçal
2017-01-01
Full Text Available Graphing function is an important issue in mathematics education due to its use in various areas of mathematics and its potential roles for students to enhance learning mathematics. The use of some graphing software assists students’ learning during graphing functions. However, the display of graphs of functions that students sketched by hand may be relatively different when compared to the correct forms sketched using graphing software. The possible misleading effects of this situation brought a discussion of a misconception (asymptote misconception on graphing functions. The purpose of this study is two- fold. First of all, this study investigated whether using graphing software (GeoGebra in this case helps students to determine and resolve this misconception in calculus classrooms. Second, the reasons for this misconception are sought. The multiple case study was utilized in this study. University students in two calculus classrooms who received instructions with (35 students or without GeoGebra assisted instructions (32 students were compared according to whether they fell into this misconception on graphing basic functions (1/x, lnx, ex. In addition, students were interviewed to reveal the reasons behind this misconception. Data were analyzed by means of descriptive and content analysis methods. The findings indicated that those who received GeoGebra assisted instruction were better in resolving it. In addition, the reasons behind this misconception were found to be teacher-based, exam-based and some other factors.
Noncausal Bayesian Vector Autoregression
DEFF Research Database (Denmark)
Lanne, Markku; Luoto, Jani
We propose a Bayesian inferential procedure for the noncausal vector autoregressive (VAR) model that is capable of capturing nonlinearities and incorporating effects of missing variables. In particular, we devise a fast and reliable posterior simulator that yields the predictive distribution...
Statistics: a Bayesian perspective
National Research Council Canada - National Science Library
Berry, Donald A
1996-01-01
...: it is the only introductory textbook based on Bayesian ideas, it combines concepts and methods, it presents statistics as a means of integrating data into the significant process, it develops ideas...
Fox, Gerardus J.A.; van den Berg, Stéphanie Martine; Veldkamp, Bernard P.; Irwing, P.; Booth, T.; Hughes, D.
2015-01-01
In educational and psychological studies, psychometric methods are involved in the measurement of constructs, and in constructing and validating measurement instruments. Assessment results are typically used to measure student proficiency levels and test characteristics. Recently, Bayesian item
ON BIPOLAR SINGLE VALUED NEUTROSOPHIC GRAPHS
Said Broumi; Mohamed Talea; Assia Bakali; Florentin Smarandache
2016-01-01
In this article, we combine the concept of bipolar neutrosophic set and graph theory. We introduce the notions of bipolar single valued neutrosophic graphs, strong bipolar single valued neutrosophic graphs, complete bipolar single valued neutrosophic graphs, regular bipolar single valued neutrosophic graphs and investigate some of their related properties.
Visualizing Dataflow Graphs of Deep Learning Models in TensorFlow.
Wongsuphasawat, Kanit; Smilkov, Daniel; Wexler, James; Wilson, Jimbo; Mane, Dandelion; Fritz, Doug; Krishnan, Dilip; Viegas, Fernanda B; Wattenberg, Martin
2018-01-01
We present a design study of the TensorFlow Graph Visualizer, part of the TensorFlow machine intelligence platform. This tool helps users understand complex machine learning architectures by visualizing their underlying dataflow graphs. The tool works by applying a series of graph transformations that enable standard layout techniques to produce a legible interactive diagram. To declutter the graph, we decouple non-critical nodes from the layout. To provide an overview, we build a clustered graph using the hierarchical structure annotated in the source code. To support exploration of nested structure on demand, we perform edge bundling to enable stable and responsive cluster expansion. Finally, we detect and highlight repeated structures to emphasize a model's modular composition. To demonstrate the utility of the visualizer, we describe example usage scenarios and report user feedback. Overall, users find the visualizer useful for understanding, debugging, and sharing the structures of their models.
Belief propagation and loop series on planar graphs
International Nuclear Information System (INIS)
Chertkov, Michael; Teodorescu, Razvan; Chernyak, Vladimir Y
2008-01-01
We discuss a generic model of Bayesian inference with binary variables defined on edges of a planar graph. The Loop Calculus approach of Chertkov and Chernyak (2006 Phys. Rev. E 73 065102(R) [cond-mat/0601487]; 2006 J. Stat. Mech. P06009 [cond-mat/0603189]) is used to evaluate the resulting series expansion for the partition function. We show that, for planar graphs, truncating the series at single-connected loops reduces, via a map reminiscent of the Fisher transformation (Fisher 1961 Phys. Rev. 124 1664), to evaluating the partition function of the dimer-matching model on an auxiliary planar graph. Thus, the truncated series can be easily re-summed, using the Pfaffian formula of Kasteleyn (1961 Physics 27 1209). This allows us to identify a big class of computationally tractable planar models reducible to a dimer model via the Belief Propagation (gauge) transformation. The Pfaffian representation can also be extended to the full Loop Series, in which case the expansion becomes a sum of Pfaffian contributions, each associated with dimer matchings on an extension to a subgraph of the original graph. Algorithmic consequences of the Pfaffian representation, as well as relations to quantum and non-planar models, are discussed
Eulerian graph embeddings and trails confined to lattice tubes
International Nuclear Information System (INIS)
Soteros, C E
2006-01-01
Embeddings of graphs in sublattices of the square and simple cubic lattice known as tubes (or prisms) are considered. For such sublattices, two combinatorial bounds are obtained which each relate the number of embeddings of all closed eulerian graphs with k branch points (vertices of degree greater than two) to the number of self-avoiding polygons. From these bounds it is proved that the entropic critical exponent for the number of embeddings of closed eulerian graphs with k branch points is equal to k, and the entropic critical exponent for the number of closed trails with k branch points is equal to k + 1. One of the required combinatorial bounds is obtained via Madras' 1999 lattice cluster pattern theorem, which yields a bound on the number of ways to convert a self-avoiding polygon into a closed eulerian graph embedding with k branch points. The other combinatorial bound is established by constructing a method for sequentially removing branch points from a closed eulerian graph embedding; this yields a bound on the number of ways to convert a closed eulerian graph embedding into a self-avoiding polygon
Regular graph construction for semi-supervised learning
International Nuclear Information System (INIS)
Vega-Oliveros, Didier A; Berton, Lilian; Eberle, Andre Mantini; Lopes, Alneu de Andrade; Zhao, Liang
2014-01-01
Semi-supervised learning (SSL) stands out for using a small amount of labeled points for data clustering and classification. In this scenario graph-based methods allow the analysis of local and global characteristics of the available data by identifying classes or groups regardless data distribution and representing submanifold in Euclidean space. Most of methods used in literature for SSL classification do not worry about graph construction. However, regular graphs can obtain better classification accuracy compared to traditional methods such as k-nearest neighbor (kNN), since kNN benefits the generation of hubs and it is not appropriate for high-dimensionality data. Nevertheless, methods commonly used for generating regular graphs have high computational cost. We tackle this problem introducing an alternative method for generation of regular graphs with better runtime performance compared to methods usually find in the area. Our technique is based on the preferential selection of vertices according some topological measures, like closeness, generating at the end of the process a regular graph. Experiments using the global and local consistency method for label propagation show that our method provides better or equal classification rate in comparison with kNN
Graph Theory. 2. Vertex Descriptors and Graph Coloring
Directory of Open Access Journals (Sweden)
Lorentz JÄNTSCHI
2002-12-01
Full Text Available This original work presents the construction of a set of ten sequence matrices and their applications for ordering vertices in graphs. For every sequence matrix three ordering criteria are applied: lexicographic ordering, based on strings of numbers, corresponding to every vertex, extracted as rows from sequence matrices; ordering by the sum of path lengths from a given vertex; and ordering by the sum of paths, starting from a given vertex. We also examine a graph that has different orderings for the above criteria. We then proceed to demonstrate that every criterion induced its own partition of graph vertex. We propose the following theoretical result: both LAVS and LVDS criteria generate identical partitioning of vertices in any graph. Finally, a coloring of graph vertices according to introduced ordering criteria was proposed.
Bayesian Networks An Introduction
Koski, Timo
2009-01-01
Bayesian Networks: An Introduction provides a self-contained introduction to the theory and applications of Bayesian networks, a topic of interest and importance for statisticians, computer scientists and those involved in modelling complex data sets. The material has been extensively tested in classroom teaching and assumes a basic knowledge of probability, statistics and mathematics. All notions are carefully explained and feature exercises throughout. Features include:.: An introduction to Dirichlet Distribution, Exponential Families and their applications.; A detailed description of learni
Maeda, Shin-ichi
2014-01-01
Dropout is one of the key techniques to prevent the learning from overfitting. It is explained that dropout works as a kind of modified L2 regularization. Here, we shed light on the dropout from Bayesian standpoint. Bayesian interpretation enables us to optimize the dropout rate, which is beneficial for learning of weight parameters and prediction after learning. The experiment result also encourages the optimization of the dropout.
On an edge partition and root graphs of some classes of line graphs
Directory of Open Access Journals (Sweden)
K Pravas
2017-04-01
Full Text Available The Gallai and the anti-Gallai graphs of a graph $G$ are complementary pairs of spanning subgraphs of the line graph of $G$. In this paper we find some structural relations between these graph classes by finding a partition of the edge set of the line graph of a graph $G$ into the edge sets of the Gallai and anti-Gallai graphs of $G$. Based on this, an optimal algorithm to find the root graph of a line graph is obtained. Moreover, root graphs of diameter-maximal, distance-hereditary, Ptolemaic and chordal graphs are also discussed.
International Nuclear Information System (INIS)
Samatova, N F; Schmidt, M C; Hendrix, W; Breimyer, P; Thomas, K; Park, B-H
2008-01-01
Data-driven construction of predictive models for biological systems faces challenges from data intensity, uncertainty, and computational complexity. Data-driven model inference is often considered a combinatorial graph problem where an enumeration of all feasible models is sought. The data-intensive and the NP-hard nature of such problems, however, challenges existing methods to meet the required scale of data size and uncertainty, even on modern supercomputers. Maximal clique enumeration (MCE) in a graph derived from such biological data is often a rate-limiting step in detecting protein complexes in protein interaction data, finding clusters of co-expressed genes in microarray data, or identifying clusters of orthologous genes in protein sequence data. We report two key advances that address this challenge. We designed and implemented the first (to the best of our knowledge) parallel MCE algorithm that scales linearly on thousands of processors running MCE on real-world biological networks with thousands and hundreds of thousands of vertices. In addition, we proposed and developed the Graph Perturbation Theory (GPT) that establishes a foundation for efficiently solving the MCE problem in perturbed graphs, which model the uncertainty in the data. GPT formulates necessary and sufficient conditions for detecting the differences between the sets of maximal cliques in the original and perturbed graphs and reduces the enumeration time by more than 80% compared to complete recomputation
Effective Bayesian Transfer Learning
2010-03-01
the class of algorithms analyzed by Bartlett’s work under task R4). With emphasis on transferring one type of objects to another, (e.g., coffee cups...obstacles, so those are temporarily pruned from the graph. In addition, the start and goal locations may not be currently included in the graph. They carried...Transfer Level 3 Varying shape within class Task A: Instances of an object from the same class. ( Coffee mug) Task B: Instances of a different object
Ghosh, Sujit K
2010-01-01
Bayesian methods are rapidly becoming popular tools for making statistical inference in various fields of science including biology, engineering, finance, and genetics. One of the key aspects of Bayesian inferential method is its logical foundation that provides a coherent framework to utilize not only empirical but also scientific information available to a researcher. Prior knowledge arising from scientific background, expert judgment, or previously collected data is used to build a prior distribution which is then combined with current data via the likelihood function to characterize the current state of knowledge using the so-called posterior distribution. Bayesian methods allow the use of models of complex physical phenomena that were previously too difficult to estimate (e.g., using asymptotic approximations). Bayesian methods offer a means of more fully understanding issues that are central to many practical problems by allowing researchers to build integrated models based on hierarchical conditional distributions that can be estimated even with limited amounts of data. Furthermore, advances in numerical integration methods, particularly those based on Monte Carlo methods, have made it possible to compute the optimal Bayes estimators. However, there is a reasonably wide gap between the background of the empirically trained scientists and the full weight of Bayesian statistical inference. Hence, one of the goals of this chapter is to bridge the gap by offering elementary to advanced concepts that emphasize linkages between standard approaches and full probability modeling via Bayesian methods.
Bayesian Estimation and Inference using Stochastic Hardware
Directory of Open Access Journals (Sweden)
Chetan Singh Thakur
2016-03-01
Full Text Available In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker, demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND, we show how inference can be performed in a Directed Acyclic Graph (DAG using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
Bayesian Estimation and Inference Using Stochastic Electronics.
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
The planar cubic Cayley graphs
Georgakopoulos, Agelos
2018-01-01
The author obtains a complete description of the planar cubic Cayley graphs, providing an explicit presentation and embedding for each of them. This turns out to be a rich class, comprising several infinite families. He obtains counterexamples to conjectures of Mohar, Bonnington and Watkins. The author's analysis makes the involved graphs accessible to computation, corroborating a conjecture of Droms.
Groupies in random bipartite graphs
Yilun Shang
2010-01-01
A vertex $v$ of a graph $G$ is called a groupie if its degree is notless than the average of the degrees of its neighbors. In thispaper we study the influence of bipartition $(B_1,B_2)$ on groupiesin random bipartite graphs $G(B_1,B_2,p)$ with both fixed $p$ and$p$ tending to zero.
Nested Dynamic Condition Response Graphs
DEFF Research Database (Denmark)
Hildebrandt, Thomas; Mukkamala, Raghava Rao; Slaats, Tijs
2012-01-01
We present an extension of the recently introduced declarative process model Dynamic Condition Response Graphs ( DCR Graphs) to allow nested subgraphs and a new milestone relation between events. The extension was developed during a case study carried out jointly with our industrial partner...
Graph Sampling for Covariance Estimation
Chepuri, Sundeep Prabhakar; Leus, Geert
2017-01-01
specialize for undirected circulant graphs in that the graph nodes leading to the best compression rates are given by the so-called minimal sparse rulers. A near-optimal greedy algorithm is developed to design the subsampling scheme for the non
Bayesian network learning for natural hazard assessments
Vogel, Kristin
2016-04-01
Even though quite different in occurrence and consequences, from a modelling perspective many natural hazards share similar properties and challenges. Their complex nature as well as lacking knowledge about their driving forces and potential effects make their analysis demanding. On top of the uncertainty about the modelling framework, inaccurate or incomplete event observations and the intrinsic randomness of the natural phenomenon add up to different interacting layers of uncertainty, which require a careful handling. Thus, for reliable natural hazard assessments it is crucial not only to capture and quantify involved uncertainties, but also to express and communicate uncertainties in an intuitive way. Decision-makers, who often find it difficult to deal with uncertainties, might otherwise return to familiar (mostly deterministic) proceedings. In the scope of the DFG research training group „NatRiskChange" we apply the probabilistic framework of Bayesian networks for diverse natural hazard and vulnerability studies. The great potential of Bayesian networks was already shown in previous natural hazard assessments. Treating each model component as random variable, Bayesian networks aim at capturing the joint distribution of all considered variables. Hence, each conditional distribution of interest (e.g. the effect of precautionary measures on damage reduction) can be inferred. The (in-)dependencies between the considered variables can be learned purely data driven or be given by experts. Even a combination of both is possible. By translating the (in-)dependences into a graph structure, Bayesian networks provide direct insights into the workings of the system and allow to learn about the underlying processes. Besides numerous studies on the topic, learning Bayesian networks from real-world data remains challenging. In previous studies, e.g. on earthquake induced ground motion and flood damage assessments, we tackled the problems arising with continuous variables
Planar graphs theory and algorithms
Nishizeki, T
1988-01-01
Collected in this volume are most of the important theorems and algorithms currently known for planar graphs, together with constructive proofs for the theorems. Many of the algorithms are written in Pidgin PASCAL, and are the best-known ones; the complexities are linear or 0(nlogn). The first two chapters provide the foundations of graph theoretic notions and algorithmic techniques. The remaining chapters discuss the topics of planarity testing, embedding, drawing, vertex- or edge-coloring, maximum independence set, subgraph listing, planar separator theorem, Hamiltonian cycles, and single- or multicommodity flows. Suitable for a course on algorithms, graph theory, or planar graphs, the volume will also be useful for computer scientists and graph theorists at the research level. An extensive reference section is included.
Quantum chaos on discrete graphs
International Nuclear Information System (INIS)
Smilansky, Uzy
2007-01-01
Adapting a method developed for the study of quantum chaos on quantum (metric) graphs (Kottos and Smilansky 1997 Phys. Rev. Lett. 79 4794, Kottos and Smilansky 1999 Ann. Phys., NY 274 76), spectral ζ functions and trace formulae for discrete Laplacians on graphs are derived. This is achieved by expressing the spectral secular equation in terms of the periodic orbits of the graph and obtaining functions which belong to the class of ζ functions proposed originally by Ihara (1966 J. Mat. Soc. Japan 18 219) and expanded by subsequent authors (Stark and Terras 1996 Adv. Math. 121 124, Kotani and Sunada 2000 J. Math. Sci. Univ. Tokyo 7 7). Finally, a model of 'classical dynamics' on the discrete graph is proposed. It is analogous to the corresponding classical dynamics derived for quantum graphs (Kottos and Smilansky 1997 Phys. Rev. Lett. 79 4794, Kottos and Smilansky 1999 Ann. Phys., NY 274 76). (fast track communication)
RJSplot: Interactive Graphs with R.
Barrios, David; Prieto, Carlos
2018-03-01
Data visualization techniques provide new methods for the generation of interactive graphs. These graphs allow a better exploration and interpretation of data but their creation requires advanced knowledge of graphical libraries. Recent packages have enabled the integration of interactive graphs in R. However, R provides limited graphical packages that allow the generation of interactive graphs for computational biology applications. The present project has joined the analytical power of R with the interactive graphical features of JavaScript in a new R package (RJSplot). It enables the easy generation of interactive graphs in R, provides new visualization capabilities, and contributes to the advance of computational biology analytical methods. At present, 16 interactive graphics are available in RJSplot, such as the genome viewer, Manhattan plots, 3D plots, heatmaps, dendrograms, networks, and so on. The RJSplot package is freely available online at http://rjsplot.net. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
On characterizing terrain visibility graphs
Directory of Open Access Journals (Sweden)
William Evans
2015-06-01
Full Text Available A terrain is an $x$-monotone polygonal line in the $xy$-plane. Two vertices of a terrain are mutually visible if and only if there is no terrain vertex on or above the open line segment connecting them. A graph whose vertices represent terrain vertices and whose edges represent mutually visible pairs of terrain vertices is called a terrain visibility graph. We would like to find properties that are both necessary and sufficient for a graph to be a terrain visibility graph; that is, we would like to characterize terrain visibility graphs.Abello et al. [Discrete and Computational Geometry, 14(3:331--358, 1995] showed that all terrain visibility graphs are “persistent”. They showed that the visibility information of a terrain point set implies some ordering requirements on the slopes of the lines connecting pairs of points in any realization, and as a step towards showing sufficiency, they proved that for any persistent graph $M$ there is a total order on the slopes of the (pseudo lines in a generalized configuration of points whose visibility graph is $M$.We give a much simpler proof of this result by establishing an orientation to every triple of vertices, reflecting some slope ordering requirements that are consistent with $M$ being the visibility graph, and prove that these requirements form a partial order. We give a faster algorithm to construct a total order on the slopes. Our approach attempts to clarify the implications of the graph theoretic properties on the ordering of the slopes, and may be interpreted as defining properties on an underlying oriented matroid that we show is a restricted type of $3$-signotope.
Modelling of JET diagnostics using Bayesian Graphical Models
Energy Technology Data Exchange (ETDEWEB)
Svensson, J. [IPP Greifswald, Greifswald (Germany); Ford, O. [Imperial College, London (United Kingdom); McDonald, D.; Hole, M.; Nessi, G. von; Meakins, A.; Brix, M.; Thomsen, H.; Werner, A.; Sirinelli, A.
2011-07-01
The mapping between physics parameters (such as densities, currents, flows, temperatures etc) defining the plasma 'state' under a given model and the raw observations of each plasma diagnostic will 1) depend on the particular physics model used, 2) is inherently probabilistic, from uncertainties on both observations and instrumental aspects of the mapping, such as calibrations, instrument functions etc. A flexible and principled way of modelling such interconnected probabilistic systems is through so called Bayesian graphical models. Being an amalgam between graph theory and probability theory, Bayesian graphical models can simulate the complex interconnections between physics models and diagnostic observations from multiple heterogeneous diagnostic systems, making it relatively easy to optimally combine the observations from multiple diagnostics for joint inference on parameters of the underlying physics model, which in itself can be represented as part of the graph. At JET about 10 diagnostic systems have to date been modelled in this way, and has lead to a number of new results, including: the reconstruction of the flux surface topology and q-profiles without any specific equilibrium assumption, using information from a number of different diagnostic systems; profile inversions taking into account the uncertainties in the flux surface positions and a substantial increase in accuracy of JET electron density and temperature profiles, including improved pedestal resolution, through the joint analysis of three diagnostic systems. It is believed that the Bayesian graph approach could potentially be utilised for very large sets of diagnostics, providing a generic data analysis framework for nuclear fusion experiments, that would be able to optimally utilize the information from multiple diagnostics simultaneously, and where the explicit graph representation of the connections to underlying physics models could be used for sophisticated model testing. This
Bayesian feature weighting for unsupervised learning, with application to object recognition
Carbonetto , Peter; De Freitas , Nando; Gustafson , Paul; Thompson , Natalie
2003-01-01
International audience; We present a method for variable selection/weighting in an unsupervised learning context using Bayesian shrinkage. The basis for the model parameters and cluster assignments can be computed simultaneous using an efficient EM algorithm. Applying our Bayesian shrinkage model to a complex problem in object recognition (Duygulu, Barnard, de Freitas and Forsyth 2002), our experiments yied good results.
Stability of maximum-likelihood-based clustering methods: exploring the backbone of classifications
International Nuclear Information System (INIS)
Mungan, Muhittin; Ramasco, José J
2010-01-01
Components of complex systems are often classified according to the way they interact with each other. In graph theory such groups are known as clusters or communities. Many different techniques have been recently proposed to detect them, some of which involve inference methods using either Bayesian or maximum likelihood approaches. In this paper, we study a statistical model designed for detecting clusters based on connection similarity. The basic assumption of the model is that the graph was generated by a certain grouping of the nodes and an expectation maximization algorithm is employed to infer that grouping. We show that the method admits further development to yield a stability analysis of the groupings that quantifies the extent to which each node influences its neighbors' group membership. Our approach naturally allows for the identification of the key elements responsible for the grouping and their resilience to changes in the network. Given the generality of the assumptions underlying the statistical model, such nodes are likely to play special roles in the original system. We illustrate this point by analyzing several empirical networks for which further information about the properties of the nodes is available. The search and identification of stabilizing nodes constitutes thus a novel technique to characterize the relevance of nodes in complex networks
Hierarchy of modular graph identities
International Nuclear Information System (INIS)
D’Hoker, Eric; Kaidi, Justin
2016-01-01
The low energy expansion of Type II superstring amplitudes at genus one is organized in terms of modular graph functions associated with Feynman graphs of a conformal scalar field on the torus. In earlier work, surprising identities between two-loop graphs at all weights, and between higher-loop graphs of weights four and five were constructed. In the present paper, these results are generalized in two complementary directions. First, all identities at weight six and all dihedral identities at weight seven are obtained and proven. Whenever the Laurent polynomial at the cusp is available, the form of these identities confirms the pattern by which the vanishing of the Laurent polynomial governs the full modular identity. Second, the family of modular graph functions is extended to include all graphs with derivative couplings and worldsheet fermions. These extended families of modular graph functions are shown to obey a hierarchy of inhomogeneous Laplace eigenvalue equations. The eigenvalues are calculated analytically for the simplest infinite sub-families and obtained by Maple for successively more complicated sub-families. The spectrum is shown to consist solely of eigenvalues s(s−1) for positive integers s bounded by the weight, with multiplicities which exhibit rich representation-theoretic patterns.
Semantic graphs and associative memories
Pomi, Andrés; Mizraji, Eduardo
2004-12-01
Graphs have been increasingly utilized in the characterization of complex networks from diverse origins, including different kinds of semantic networks. Human memories are associative and are known to support complex semantic nets; these nets are represented by graphs. However, it is not known how the brain can sustain these semantic graphs. The vision of cognitive brain activities, shown by modern functional imaging techniques, assigns renewed value to classical distributed associative memory models. Here we show that these neural network models, also known as correlation matrix memories, naturally support a graph representation of the stored semantic structure. We demonstrate that the adjacency matrix of this graph of associations is just the memory coded with the standard basis of the concept vector space, and that the spectrum of the graph is a code invariant of the memory. As long as the assumptions of the model remain valid this result provides a practical method to predict and modify the evolution of the cognitive dynamics. Also, it could provide us with a way to comprehend how individual brains that map the external reality, almost surely with different particular vector representations, are nevertheless able to communicate and share a common knowledge of the world. We finish presenting adaptive association graphs, an extension of the model that makes use of the tensor product, which provides a solution to the known problem of branching in semantic nets.
Hierarchy of modular graph identities
Energy Technology Data Exchange (ETDEWEB)
D’Hoker, Eric; Kaidi, Justin [Mani L. Bhaumik Institute for Theoretical Physics, Department of Physics and Astronomy,University of California,Los Angeles, CA 90095 (United States)
2016-11-09
The low energy expansion of Type II superstring amplitudes at genus one is organized in terms of modular graph functions associated with Feynman graphs of a conformal scalar field on the torus. In earlier work, surprising identities between two-loop graphs at all weights, and between higher-loop graphs of weights four and five were constructed. In the present paper, these results are generalized in two complementary directions. First, all identities at weight six and all dihedral identities at weight seven are obtained and proven. Whenever the Laurent polynomial at the cusp is available, the form of these identities confirms the pattern by which the vanishing of the Laurent polynomial governs the full modular identity. Second, the family of modular graph functions is extended to include all graphs with derivative couplings and worldsheet fermions. These extended families of modular graph functions are shown to obey a hierarchy of inhomogeneous Laplace eigenvalue equations. The eigenvalues are calculated analytically for the simplest infinite sub-families and obtained by Maple for successively more complicated sub-families. The spectrum is shown to consist solely of eigenvalues s(s−1) for positive integers s bounded by the weight, with multiplicities which exhibit rich representation-theoretic patterns.
Bayesian networks with examples in R
Scutari, Marco
2014-01-01
Introduction. The Discrete Case: Multinomial Bayesian Networks. The Continuous Case: Gaussian Bayesian Networks. More Complex Cases. Theory and Algorithms for Bayesian Networks. Real-World Applications of Bayesian Networks. Appendices. Bibliography.
Application of dynamic uncertain causality graph in spacecraft fault diagnosis: Logic cycle
Yao, Quanying; Zhang, Qin; Liu, Peng; Yang, Ping; Zhu, Ma; Wang, Xiaochen
2017-04-01
Intelligent diagnosis system are applied to fault diagnosis in spacecraft. Dynamic Uncertain Causality Graph (DUCG) is a new probability graphic model with many advantages. In the knowledge expression of spacecraft fault diagnosis, feedback among variables is frequently encountered, which may cause directed cyclic graphs (DCGs). Probabilistic graphical models (PGMs) such as bayesian network (BN) have been widely applied in uncertain causality representation and probabilistic reasoning, but BN does not allow DCGs. In this paper, DUGG is applied to fault diagnosis in spacecraft: introducing the inference algorithm for the DUCG to deal with feedback. Now, DUCG has been tested in 16 typical faults with 100% diagnosis accuracy.
XML Graphs in Program Analysis
DEFF Research Database (Denmark)
Møller, Anders; Schwartzbach, Michael I.
2011-01-01
of XML graphs against different XML schema languages, and provide a software package that enables others to make use of these ideas. We also survey the use of XML graphs for program analysis with four very different languages: XACT (XML in Java), Java Servlets (Web application programming), XSugar......XML graphs have shown to be a simple and effective formalism for representing sets of XML documents in program analysis. It has evolved through a six year period with variants tailored for a range of applications. We present a unified definition, outline the key properties including validation...
Rabern, Landon
2007-01-01
We improve upper bounds on the chromatic number proven independently in \\cite{reedNote} and \\cite{ingo}. Our main lemma gives a sufficient condition for two paths in graph to be completely joined. Using this, we prove that if a graph has an optimal coloring with more than $\\frac{\\omega}{2}$ singleton color classes, then it satisfies $\\chi \\leq \\frac{\\omega + \\Delta + 1}{2}$. It follows that a graph satisfying $n - \\Delta < \\alpha + \\frac{\\omega - 1}{2}$ must also satisfy $\\chi \\leq \\frac{\\ome...
Graphs with Eulerian unit spheres
Knill, Oliver
2015-01-01
d-spheres in graph theory are inductively defined as graphs for which all unit spheres S(x) are (d-1)-spheres and that the removal of one vertex renders the graph contractible. Eulerian d-spheres are geometric d-spheres which are d+1 colorable. We prove here that G is an Eulerian sphere if and only if the degrees of all the (d-2)-dimensional sub-simplices in G are even. This generalizes a Kempe-Heawood result for d=2 and is work related to the conjecture that all d-spheres have chromatic numb...
Bayesian methods in reliability
Sander, P.; Badoux, R.
1991-11-01
The present proceedings from a course on Bayesian methods in reliability encompasses Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and a nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.
Szabó, György; Fáth, Gábor
2007-07-01
Game theory is one of the key paradigms behind many scientific disciplines from biology to behavioral sciences to economics. In its evolutionary form and especially when the interacting agents are linked in a specific social network the underlying solution concepts and methods are very similar to those applied in non-equilibrium statistical physics. This review gives a tutorial-type overview of the field for physicists. The first four sections introduce the necessary background in classical and evolutionary game theory from the basic definitions to the most important results. The fifth section surveys the topological complications implied by non-mean-field-type social network structures in general. The next three sections discuss in detail the dynamic behavior of three prominent classes of models: the Prisoner's Dilemma, the Rock-Scissors-Paper game, and Competing Associations. The major theme of the review is in what sense and how the graph structure of interactions can modify and enrich the picture of long term behavioral patterns emerging in evolutionary games.
CSIR Research Space (South Africa)
Rosman, Benjamin
2016-02-01
Full Text Available Keywords Policy Reuse · Reinforcement Learning · Online Learning · Online Bandits · Transfer Learning · Bayesian Optimisation · Bayesian Decision Theory. 1 Introduction As robots and software agents are becoming more ubiquitous in many applications.... The agent has access to a library of policies (pi1, pi2 and pi3), and has previously experienced a set of task instances (τ1, τ2, τ3, τ4), as well as samples of the utilities of the library policies on these instances (the black dots indicate the means...
Properly colored connectivity of graphs
Li, Xueliang; Qin, Zhongmei
2018-01-01
A comprehensive survey of proper connection of graphs is discussed in this book with real world applications in computer science and network security. Beginning with a brief introduction, comprising relevant definitions and preliminary results, this book moves on to consider a variety of properties of graphs that imply bounds on the proper connection number. Detailed proofs of significant advancements toward open problems and conjectures are presented with complete references. Researchers and graduate students with an interest in graph connectivity and colorings will find this book useful as it builds upon fundamental definitions towards modern innovations, strategies, and techniques. The detailed presentation lends to use as an introduction to proper connection of graphs for new and advanced researchers, a solid book for a graduate level topics course, or as a reference for those interested in expanding and further developing research in the area.
Graph anomalies in cyber communications
Energy Technology Data Exchange (ETDEWEB)
Vander Wiel, Scott A [Los Alamos National Laboratory; Storlie, Curtis B [Los Alamos National Laboratory; Sandine, Gary [Los Alamos National Laboratory; Hagberg, Aric A [Los Alamos National Laboratory; Fisk, Michael [Los Alamos National Laboratory
2011-01-11
Enterprises monitor cyber traffic for viruses, intruders and stolen information. Detection methods look for known signatures of malicious traffic or search for anomalies with respect to a nominal reference model. Traditional anomaly detection focuses on aggregate traffic at central nodes or on user-level monitoring. More recently, however, traffic is being viewed more holistically as a dynamic communication graph. Attention to the graph nature of the traffic has expanded the types of anomalies that are being sought. We give an overview of several cyber data streams collected at Los Alamos National Laboratory and discuss current work in modeling the graph dynamics of traffic over the network. We consider global properties and local properties within the communication graph. A method for monitoring relative entropy on multiple correlated properties is discussed in detail.
Open Graphs and Computational Reasoning
Directory of Open Access Journals (Sweden)
Lucas Dixon
2010-06-01
Full Text Available We present a form of algebraic reasoning for computational objects which are expressed as graphs. Edges describe the flow of data between primitive operations which are represented by vertices. These graphs have an interface made of half-edges (edges which are drawn with an unconnected end and enjoy rich compositional principles by connecting graphs along these half-edges. In particular, this allows equations and rewrite rules to be specified between graphs. Particular computational models can then be encoded as an axiomatic set of such rules. Further rules can be derived graphically and rewriting can be used to simulate the dynamics of a computational system, e.g. evaluating a program on an input. Examples of models which can be formalised in this way include traditional electronic circuits as well as recent categorical accounts of quantum information.
Woeginger, G.J.
1998-01-01
In this short note we argue that the toughness of split graphs can be computed in polynomial time. This solves an open problem from a recent paper by Kratsch et al. (Discrete Math. 150 (1996) 231–245).
Generating random networks and graphs
Coolen, Ton; Roberts, Ekaterina
2017-01-01
This book supports researchers who need to generate random networks, or who are interested in the theoretical study of random graphs. The coverage includes exponential random graphs (where the targeted probability of each network appearing in the ensemble is specified), growth algorithms (i.e. preferential attachment and the stub-joining configuration model), special constructions (e.g. geometric graphs and Watts Strogatz models) and graphs on structured spaces (e.g. multiplex networks). The presentation aims to be a complete starting point, including details of both theory and implementation, as well as discussions of the main strengths and weaknesses of each approach. It includes extensive references for readers wishing to go further. The material is carefully structured to be accessible to researchers from all disciplines while also containing rigorous mathematical analysis (largely based on the techniques of statistical mechanics) to support those wishing to further develop or implement the theory of rand...
Evaluating Mixture Modeling for Clustering: Recommendations and Cautions
Steinley, Douglas; Brusco, Michael J.
2011-01-01
This article provides a large-scale investigation into several of the properties of mixture-model clustering techniques (also referred to as latent class cluster analysis, latent profile analysis, model-based clustering, probabilistic clustering, Bayesian classification, unsupervised learning, and finite mixture models; see Vermunt & Magdison,…
Graph theory and its applications
Gross, Jonathan L
2006-01-01
Gross and Yellen take a comprehensive approach to graph theory that integrates careful exposition of classical developments with emerging methods, models, and practical needs. Their unparalleled treatment provides a text ideal for a two-semester course and a variety of one-semester classes, from an introductory one-semester course to courses slanted toward classical graph theory, operations research, data structures and algorithms, or algebra and topology.
A faithful functor among algebras and graphs
Falcón Ganfornina, Óscar Jesús; Falcón Ganfornina, Raúl Manuel; Núñez Valdés, Juan; Pacheco Martínez, Ana María; Villar Liñán, María Trinidad; Vigo Aguiar, Jesús (Coordinador)
2016-01-01
The problem of identifying a functor between the categories of algebras and graphs is currently open. Based on a known algorithm that identifies isomorphisms of Latin squares with isomorphism of vertex-colored graphs, we describe here a pair of graphs that enable us to find a faithful functor between finite-dimensional algebras over finite fields and these graphs.
Graphs with branchwidth at most three
Bodlaender, H.L.; Thilikos, D.M.
1997-01-01
In this paper we investigate both the structure of graphs with branchwidth at most three, as well as algorithms to recognise such graphs. We show that a graph has branchwidth at most three, if and only if it has treewidth at most three and does not contain the three-dimensional binary cube graph
A Modal-Logic Based Graph Abstraction
Bauer, J.; Boneva, I.B.; Kurban, M.E.; Rensink, Arend; Ehrig, H; Heckel, R.; Rozenberg, G.; Taentzer, G.
2008-01-01
Infinite or very large state spaces often prohibit the successful verification of graph transformation systems. Abstract graph transformation is an approach that tackles this problem by abstracting graphs to abstract graphs of bounded size and by lifting application of productions to abstract
Graphs whose complement and square are isomorphic
DEFF Research Database (Denmark)
Pedersen, Anders Sune
2014-01-01
We study square-complementary graphs, that is, graphs whose complement and square are isomorphic. We prove several necessary conditions for a graph to be square-complementary, describe ways of building new square-complementary graphs from existing ones, construct infinite families of square-compl...
Acyclicity in edge-colored graphs
DEFF Research Database (Denmark)
Gutin, Gregory; Jones, Mark; Sheng, Bin
2017-01-01
A walk W in edge-colored graphs is called properly colored (PC) if every pair of consecutive edges in W is of different color. We introduce and study five types of PC acyclicity in edge-colored graphs such that graphs of PC acyclicity of type i is a proper superset of graphs of acyclicity of type...
Building Scalable Knowledge Graphs for Earth Science
Ramachandran, Rahul; Maskey, Manil; Gatlin, Patrick; Zhang, Jia; Duan, Xiaoyi; Miller, J. J.; Bugbee, Kaylin; Christopher, Sundar; Freitag, Brian
2017-01-01
Knowledge Graphs link key entities in a specific domain with other entities via relationships. From these relationships, researchers can query knowledge graphs for probabilistic recommendations to infer new knowledge. Scientific papers are an untapped resource which knowledge graphs could leverage to accelerate research discovery. Goal: Develop an end-to-end (semi) automated methodology for constructing Knowledge Graphs for Earth Science.
Port-Hamiltonian Systems on Open Graphs
Schaft, A.J. van der; Maschke, B.M.
2010-01-01
In this talk we discuss how to define in an intrinsic manner port-Hamiltonian dynamics on open graphs. Open graphs are graphs where some of the vertices are boundary vertices (terminals), which allow interconnection with other systems. We show that a directed graph carries two natural Dirac
Constructing Dense Graphs with Unique Hamiltonian Cycles
Lynch, Mark A. M.
2012-01-01
It is not difficult to construct dense graphs containing Hamiltonian cycles, but it is difficult to generate dense graphs that are guaranteed to contain a unique Hamiltonian cycle. This article presents an algorithm for generating arbitrarily large simple graphs containing "unique" Hamiltonian cycles. These graphs can be turned into dense graphs…
Skew-adjacency matrices of graphs
Cavers, M.; Cioaba, S.M.; Fallat, S.; Gregory, D.A.; Haemers, W.H.; Kirkland, S.J.; McDonald, J.J.; Tsatsomeros, M.
2012-01-01
The spectra of the skew-adjacency matrices of a graph are considered as a possible way to distinguish adjacency cospectral graphs. This leads to the following topics: graphs whose skew-adjacency matrices are all cospectral; relations between the matchings polynomial of a graph and the characteristic
Chromatic polynomials of random graphs
International Nuclear Information System (INIS)
Van Bussel, Frank; Fliegner, Denny; Timme, Marc; Ehrlich, Christoph; Stolzenberg, Sebastian
2010-01-01
Chromatic polynomials and related graph invariants are central objects in both graph theory and statistical physics. Computational difficulties, however, have so far restricted studies of such polynomials to graphs that were either very small, very sparse or highly structured. Recent algorithmic advances (Timme et al 2009 New J. Phys. 11 023001) now make it possible to compute chromatic polynomials for moderately sized graphs of arbitrary structure and number of edges. Here we present chromatic polynomials of ensembles of random graphs with up to 30 vertices, over the entire range of edge density. We specifically focus on the locations of the zeros of the polynomial in the complex plane. The results indicate that the chromatic zeros of random graphs have a very consistent layout. In particular, the crossing point, the point at which the chromatic zeros with non-zero imaginary part approach the real axis, scales linearly with the average degree over most of the density range. While the scaling laws obtained are purely empirical, if they continue to hold in general there are significant implications: the crossing points of chromatic zeros in the thermodynamic limit separate systems with zero ground state entropy from systems with positive ground state entropy, the latter an exception to the third law of thermodynamics.
Commuting graphs of matrix algebras
International Nuclear Information System (INIS)
Akbari, S.; Bidkhori, H.; Mohammadian, A.
2006-08-01
The commuting graph of a ring R, denoted by Γ(R), is a graph whose vertices are all non- central elements of R and two distinct vertices x and y are adjacent if and only if xy = yx. The commuting graph of a group G, denoted by Γ(G), is similarly defined. In this paper we investigate some graph theoretic properties of Γ(M n (F)), where F is a field and n ≥ 2. Also we study the commuting graphs of some classical groups such as GL n (F) and SL n (F). We show that Γ(M n (F)) is a connected graph if and only if every field extension of F of degree n contains a proper intermediate field. We prove that apart from finitely many fields, a similar result is true for Γ(GL n (F)) and Γ(SL n (F)). Also we show that for two fields E and F and integers m, n ≥> 2, if Γ(M m (E)) ≅ Γ(M n (F)), then m = n and vertical bar E vertical bar = vertical bar F vertical bar. (author)
Bayesian methods for hackers probabilistic programming and Bayesian inference
Davidson-Pilon, Cameron
2016-01-01
Bayesian methods of inference are deeply natural and extremely powerful. However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice–freeing you to get results using computing power. Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples a...
Graph Quasicontinuous Functions and Densely Continuous Forms
Directory of Open Access Journals (Sweden)
Lubica Hola
2017-07-01
Full Text Available Let $X, Y$ be topological spaces. A function $f: X \\to Y$ is said to be graph quasicontinuous if there is a quasicontinuous function $g: X \\to Y$ with the graph of $g$ contained in the closure of the graph of $f$. There is a close relation between the notions of graph quasicontinuous functions and minimal usco maps as well as the notions of graph quasicontinuous functions and densely continuous forms. Every function with values in a compact Hausdorff space is graph quasicontinuous; more generally every locally compact function is graph quasicontinuous.
Bayesian logistic regression analysis
Van Erp, H.R.N.; Van Gelder, P.H.A.J.M.
2012-01-01
In this paper we present a Bayesian logistic regression analysis. It is found that if one wishes to derive the posterior distribution of the probability of some event, then, together with the traditional Bayes Theorem and the integrating out of nuissance parameters, the Jacobian transformation is an
Korattikara, A.; Rathod, V.; Murphy, K.; Welling, M.; Cortes, C.; Lawrence, N.D.; Lee, D.D.; Sugiyama, M.; Garnett, R.
2015-01-01
We consider the problem of Bayesian parameter estimation for deep neural networks, which is important in problem settings where we may have little data, and/ or where we need accurate posterior predictive densities p(y|x, D), e.g., for applications involving bandits or active learning. One simple
Bayesian Geostatistical Design
DEFF Research Database (Denmark)
Diggle, Peter; Lophaven, Søren Nymand
2006-01-01
locations to, or deletion of locations from, an existing design, and prospective design, which consists of choosing positions for a new set of sampling locations. We propose a Bayesian design criterion which focuses on the goal of efficient spatial prediction whilst allowing for the fact that model...
Bayesian statistical inference
Directory of Open Access Journals (Sweden)
Bruno De Finetti
2017-04-01
Full Text Available This work was translated into English and published in the volume: Bruno De Finetti, Induction and Probability, Biblioteca di Statistica, eds. P. Monari, D. Cocchi, Clueb, Bologna, 1993.Bayesian statistical Inference is one of the last fundamental philosophical papers in which we can find the essential De Finetti's approach to the statistical inference.
DEFF Research Database (Denmark)
Hartelius, Karsten; Carstensen, Jens Michael
2003-01-01
A method for locating distorted grid structures in images is presented. The method is based on the theories of template matching and Bayesian image restoration. The grid is modeled as a deformable template. Prior knowledge of the grid is described through a Markov random field (MRF) model which r...
Bayesian Independent Component Analysis
DEFF Research Database (Denmark)
Winther, Ole; Petersen, Kaare Brandt
2007-01-01
In this paper we present an empirical Bayesian framework for independent component analysis. The framework provides estimates of the sources, the mixing matrix and the noise parameters, and is flexible with respect to choice of source prior and the number of sources and sensors. Inside the engine...
Bayesian Exponential Smoothing.
Forbes, C.S.; Snyder, R.D.; Shami, R.S.
2000-01-01
In this paper, a Bayesian version of the exponential smoothing method of forecasting is proposed. The approach is based on a state space model containing only a single source of error for each time interval. This model allows us to improve current practices surrounding exponential smoothing by providing both point predictions and measures of the uncertainty surrounding them.
Query Optimizations over Decentralized RDF Graphs
Abdelaziz, Ibrahim
2017-05-18
Applications in life sciences, decentralized social networks, Internet of Things, and statistical linked dataspaces integrate data from multiple decentralized RDF graphs via SPARQL queries. Several approaches have been proposed to optimize query processing over a small number of heterogeneous data sources by utilizing schema information. In the case of schema similarity and interlinks among sources, these approaches cause unnecessary data retrieval and communication, leading to poor scalability and response time. This paper addresses these limitations and presents Lusail, a system for scalable and efficient SPARQL query processing over decentralized graphs. Lusail achieves scalability and low query response time through various optimizations at compile and run times. At compile time, we use a novel locality-aware query decomposition technique that maximizes the number of query triple patterns sent together to a source based on the actual location of the instances satisfying these triple patterns. At run time, we use selectivity-awareness and parallel query execution to reduce network latency and to increase parallelism by delaying the execution of subqueries expected to return large results. We evaluate Lusail using real and synthetic benchmarks, with data sizes up to billions of triples on an in-house cluster and a public cloud. We show that Lusail outperforms state-of-the-art systems by orders of magnitude in terms of scalability and response time.
Aligning Biomolecular Networks Using Modular Graph Kernels
Towfic, Fadi; Greenlee, M. Heather West; Honavar, Vasant
Comparative analysis of biomolecular networks constructed using measurements from different conditions, tissues, and organisms offer a powerful approach to understanding the structure, function, dynamics, and evolution of complex biological systems. We explore a class of algorithms for aligning large biomolecular networks by breaking down such networks into subgraphs and computing the alignment of the networks based on the alignment of their subgraphs. The resulting subnetworks are compared using graph kernels as scoring functions. We provide implementations of the resulting algorithms as part of BiNA, an open source biomolecular network alignment toolkit. Our experiments using Drosophila melanogaster, Saccharomyces cerevisiae, Mus musculus and Homo sapiens protein-protein interaction networks extracted from the DIP repository of protein-protein interaction data demonstrate that the performance of the proposed algorithms (as measured by % GO term enrichment of subnetworks identified by the alignment) is competitive with some of the state-of-the-art algorithms for pair-wise alignment of large protein-protein interaction networks. Our results also show that the inter-species similarity scores computed based on graph kernels can be used to cluster the species into a species tree that is consistent with the known phylogenetic relationships among the species.
Exact WKB analysis and cluster algebras
International Nuclear Information System (INIS)
Iwaki, Kohei; Nakanishi, Tomoki
2014-01-01
We develop the mutation theory in the exact WKB analysis using the framework of cluster algebras. Under a continuous deformation of the potential of the Schrödinger equation on a compact Riemann surface, the Stokes graph may change the topology. We call this phenomenon the mutation of Stokes graphs. Along the mutation of Stokes graphs, the Voros symbols, which are monodromy data of the equation, also mutate due to the Stokes phenomenon. We show that the Voros symbols mutate as variables of a cluster algebra with surface realization. As an application, we obtain the identities of Stokes automorphisms associated with periods of cluster algebras. The paper also includes an extensive introduction of the exact WKB analysis and the surface realization of cluster algebras for nonexperts. This article is part of a special issue of Journal of Physics A: Mathematical and Theoretical devoted to ‘Cluster algebras in mathematical physics’. (paper)
Content-Agnostic Malware Detection in Heterogeneous Malicious Distribution Graph
Alabdulmohsin, Ibrahim
2016-10-26
Malware detection has been widely studied by analysing either file dropping relationships or characteristics of the file distribution network. This paper, for the first time, studies a global heterogeneous malware delivery graph fusing file dropping relationship and the topology of the file distribution network. The integration offers a unique ability of structuring the end-to-end distribution relationship. However, it brings large heterogeneous graphs to analysis. In our study, an average daily generated graph has more than 4 million edges and 2.7 million nodes that differ in type, such as IPs, URLs, and files. We propose a novel Bayesian label propagation model to unify the multi-source information, including content-agnostic features of different node types and topological information of the heterogeneous network. Our approach does not need to examine the source codes nor inspect the dynamic behaviours of a binary. Instead, it estimates the maliciousness of a given file through a semi-supervised label propagation procedure, which has a linear time complexity w.r.t. the number of nodes and edges. The evaluation on 567 million real-world download events validates that our proposed approach efficiently detects malware with a high accuracy. © 2016 Copyright held by the owner/author(s).
Interactive Graph Layout of a Million Nodes
Peng Mi; Maoyuan Sun; Moeti Masiane; Yong Cao; Chris North
2016-01-01
Sensemaking of large graphs, specifically those with millions of nodes, is a crucial task in many fields. Automatic graph layout algorithms, augmented with real-time human-in-the-loop interaction, can potentially support sensemaking of large graphs. However, designing interactive algorithms to achieve this is challenging. In this paper, we tackle the scalability problem of interactive layout of large graphs, and contribute a new GPU-based force-directed layout algorithm that exploits graph to...
Khovanov homology of graph-links
Energy Technology Data Exchange (ETDEWEB)
Nikonov, Igor M [M. V. Lomonosov Moscow State University, Faculty of Mechanics and Mathematics, Moscow (Russian Federation)
2012-08-31
Graph-links arise as the intersection graphs of turning chord diagrams of links. Speaking informally, graph-links provide a combinatorial description of links up to mutations. Many link invariants can be reformulated in the language of graph-links. Khovanov homology, a well-known and useful knot invariant, is defined for graph-links in this paper (in the case of the ground field of characteristic two). Bibliography: 14 titles.
MadGraph/MadEvent. The new web generation
International Nuclear Information System (INIS)
Alwall, J.
2007-01-01
The new web-based version of the automatized process and event generator MadGraph/MadEvent is now available. Recent developments are: New models, notably MSSM, 2HDM and a framework for addition of user-defined models, inclusive sample generation and on-line hadronization and detector simulation. Event generation can be done on-line on any of our clusters. (author)
PRIVATE GRAPHS – ACCESS RIGHTS ON GRAPHS FOR SEAMLESS NAVIGATION
Directory of Open Access Journals (Sweden)
W. Dorner
2016-06-01
Full Text Available After the success of GNSS (Global Navigational Satellite Systems and navigation services for public streets, indoor seems to be the next big development in navigational services, relying on RTLS – Real Time Locating Services (e.g. WIFI and allowing seamless navigation. In contrast to navigation and routing services on public streets, seamless navigation will cause an additional challenge: how to make routing data accessible to defined users or restrict access rights for defined areas or only to parts of the graph to a defined user group? The paper will present case studies and data from literature, where seamless and especially indoor navigation solutions are presented (hospitals, industrial complexes, building sites, but the problem of restricted access rights was only touched from a real world, but not a technical perspective. The analysis of case studies will show, that the objective of navigation and the different target groups for navigation solutions will demand well defined access rights and require solutions, how to make only parts of a graph to a user or application available to solve a navigational task. The paper will therefore introduce the concept of private graphs, which is defined as a graph for navigational purposes covering the street, road or floor network of an area behind a public street and suggest different approaches how to make graph data for navigational purposes available considering access rights and data protection, privacy and security issues as well.
Bayesian optimization for materials science
Packwood, Daniel
2017-01-01
This book provides a short and concise introduction to Bayesian optimization specifically for experimental and computational materials scientists. After explaining the basic idea behind Bayesian optimization and some applications to materials science in Chapter 1, the mathematical theory of Bayesian optimization is outlined in Chapter 2. Finally, Chapter 3 discusses an application of Bayesian optimization to a complicated structure optimization problem in computational surface science. Bayesian optimization is a promising global optimization technique that originates in the field of machine learning and is starting to gain attention in materials science. For the purpose of materials design, Bayesian optimization can be used to predict new materials with novel properties without extensive screening of candidate materials. For the purpose of computational materials science, Bayesian optimization can be incorporated into first-principles calculations to perform efficient, global structure optimizations. While re...
International Nuclear Information System (INIS)
Khakzad, Nima; Reniers, Genserik
2015-01-01
Dealing with large quantities of flammable and explosive materials, usually at high-pressure high-temperature conditions, makes process plants very vulnerable to cascading effects compared with other infrastructures. The combination of the extremely low frequency of cascading effects and the high complexity and interdependencies of process plants makes risk assessment and vulnerability analysis of process plants very challenging in the context of such events. In the present study, cascading effects were represented as a directed graph; accordingly, the efficacy of a set of graph metrics and measurements was examined in both unit and plant-wide vulnerability analysis of process plants. We demonstrated that vertex-level closeness and betweenness can be used in the unit vulnerability analysis of process plants for the identification of critical units within a process plant. Furthermore, the graph-level closeness metric can be used in the plant-wide vulnerability analysis for the identification of the most vulnerable plant layout with respect to the escalation of cascading effects. Furthermore, the results from the application of the graph metrics have been verified using a Bayesian network methodology. - Highlights: • Graph metrics can effectively be employed to identify vulnerable units and layouts in process plants. • Units with larger vertex-level closeness result in more probable and severe cascading effects. • Units with larger vertex-level betweenness contribute more to the escalation of cascading effects. • Layouts with larger graph-level closeness are more vulnerable to the escalation of cascading effects
Directory of Open Access Journals (Sweden)
S. Venkatesh
2012-01-01
Full Text Available Partial discharge (PD is a major cause of failure of power apparatus and hence its measurement and analysis have emerged as a vital field in assessing the condition of the insulation system. Several efforts have been undertaken by researchers to classify PD pulses utilizing artificial intelligence techniques. Recently, the focus has shifted to the identification of multiple sources of PD since it is often encountered in real-time measurements. Studies have indicated that classification of multi-source PD becomes difficult with the degree of overlap and that several techniques such as mixed Weibull functions, neural networks, and wavelet transformation have been attempted with limited success. Since digital PD acquisition systems record data for a substantial period, the database becomes large, posing considerable difficulties during classification. This research work aims firstly at analyzing aspects concerning classification capability during the discrimination of multisource PD patterns. Secondly, it attempts at extending the previous work of the authors in utilizing the novel approach of probabilistic neural network versions for classifying moderate sets of PD sources to that of large sets. The third focus is on comparing the ability of partition-based algorithms, namely, the labelled (learning vector quantization and unlabelled (K-means versions, with that of a novel hypergraph-based clustering method in providing parsimonious sets of centers during classification.
Towards port sustainability through probabilistic models: Bayesian networks
Directory of Open Access Journals (Sweden)
B. Molina
2018-04-01
Full Text Available It is necessary that a manager of an infrastructure knows relations between variables. Using Bayesian networks, variables can be classified, predicted and diagnosed, being able to estimate posterior probability of the unknown ones based on known ones. The proposed methodology has generated a database with port variables, which have been classified as economic, social, environmental and institutional, as addressed in of smart ports studies made in all Spanish Port System. Network has been developed using an acyclic directed graph, which have let us know relationships in terms of parents and sons. In probabilistic terms, it can be concluded from the constructed network that the most decisive variables for port sustainability are those that are part of the institutional dimension. It has been concluded that Bayesian networks allow modeling uncertainty probabilistically even when the number of variables is high as it occurs in port planning and exploitation.
Clustering of near clusters versus cluster compactness
International Nuclear Information System (INIS)
Yu Gao; Yipeng Jing
1989-01-01
The clustering properties of near Zwicky clusters are studied by using the two-point angular correlation function. The angular correlation functions for compact and medium compact clusters, for open clusters, and for all near Zwicky clusters are estimated. The results show much stronger clustering for compact and medium compact clusters than for open clusters, and that open clusters have nearly the same clustering strength as galaxies. A detailed study of the compactness-dependence of correlation function strength is worth investigating. (author)
Eigenfunction statistics on quantum graphs
International Nuclear Information System (INIS)
Gnutzmann, S.; Keating, J.P.; Piotet, F.
2010-01-01
We investigate the spatial statistics of the energy eigenfunctions on large quantum graphs. It has previously been conjectured that these should be described by a Gaussian Random Wave Model, by analogy with quantum chaotic systems, for which such a model was proposed by Berry in 1977. The autocorrelation functions we calculate for an individual quantum graph exhibit a universal component, which completely determines a Gaussian Random Wave Model, and a system-dependent deviation. This deviation depends on the graph only through its underlying classical dynamics. Classical criteria for quantum universality to be met asymptotically in the large graph limit (i.e. for the non-universal deviation to vanish) are then extracted. We use an exact field theoretic expression in terms of a variant of a supersymmetric σ model. A saddle-point analysis of this expression leads to the estimates. In particular, intensity correlations are used to discuss the possible equidistribution of the energy eigenfunctions in the large graph limit. When equidistribution is asymptotically realized, our theory predicts a rate of convergence that is a significant refinement of previous estimates. The universal and system-dependent components of intensity correlation functions are recovered by means of an exact trace formula which we analyse in the diagonal approximation, drawing in this way a parallel between the field theory and semiclassics. Our results provide the first instance where an asymptotic Gaussian Random Wave Model has been established microscopically for eigenfunctions in a system with no disorder.
Directory of Open Access Journals (Sweden)
Andreas P. Braun
2016-04-01
Full Text Available Box graphs succinctly and comprehensively characterize singular fibers of elliptic fibrations in codimension two and three, as well as flop transitions connecting these, in terms of representation theoretic data. We develop a framework that provides a systematic map between a box graph and a crepant algebraic resolution of the singular elliptic fibration, thus allowing an explicit construction of the fibers from a singular Weierstrass or Tate model. The key tool is what we call a fiber face diagram, which shows the relevant information of a (partial toric triangulation and allows the inclusion of more general algebraic blowups. We shown that each such diagram defines a sequence of weighted algebraic blowups, thus providing a realization of the fiber defined by the box graph in terms of an explicit resolution. We show this correspondence explicitly for the case of SU(5 by providing a map between box graphs and fiber faces, and thereby a sequence of algebraic resolutions of the Tate model, which realizes each of the box graphs.
Degree-based graph construction
International Nuclear Information System (INIS)
Kim, Hyunju; Toroczkai, Zoltan; Erdos, Peter L; Miklos, Istvan; Szekely, Laszlo A
2009-01-01
Degree-based graph construction is a ubiquitous problem in network modelling (Newman et al 2006 The Structure and Dynamics of Networks (Princeton Studies in Complexity) (Princeton, NJ: Princeton University Press), Boccaletti et al 2006 Phys. Rep. 424 175), ranging from social sciences to chemical compounds and biochemical reaction networks in the cell. This problem includes existence, enumeration, exhaustive construction and sampling questions with aspects that are still open today. Here we give necessary and sufficient conditions for a sequence of nonnegative integers to be realized as a simple graph's degree sequence, such that a given (but otherwise arbitrary) set of connections from an arbitrarily given node is avoided. We then use this result to present a swap-free algorithm that builds all simple graphs realizing a given degree sequence. In a wider context, we show that our result provides a greedy construction method to build all the f-factor subgraphs (Tutte 1952 Can. J. Math. 4 314) embedded within K n setmn S k , where K n is the complete graph and S k is a star graph centred on one of the nodes. (fast track communication)
Hierarchical organisation of causal graphs
International Nuclear Information System (INIS)
Dziopa, P.
1993-01-01
This paper deals with the design of a supervision system using a hierarchy of models formed by graphs, in which the variables are the nodes and the causal relations between the variables of the arcs. To obtain a representation of the variables evolutions which contains only the relevant features of their real evolutions, the causal relations are completed with qualitative transfer functions (QTFs) which produce roughly the behaviour of the classical transfer functions. Major improvements have been made in the building of the hierarchical organization. First, the basic variables of the uppermost level and the causal relations between them are chosen. The next graph is built by adding intermediary variables to the upper graph. When the undermost graph has been built, the transfer functions parameters corresponding to its causal relations are identified. The second task consists in the upwelling of the information from the undermost graph to the uppermost one. A fusion procedure of the causal relations has been designed to compute the QFTs relevant for each level. This procedure aims to reduce the number of parameters needed to represent an evolution at a high level of abstraction. These techniques have been applied to the hierarchical modelling of nuclear process. (authors). 8 refs., 12 figs
Integer Flows and Circuit Covers of Graphs and Signed Graphs
Cheng, Jian
The work in Chapter 2 is motivated by Tutte and Jaeger's pioneering work on converting modulo flows into integer-valued flows for ordinary graphs. For a signed graphs (G, sigma), we first prove that for each k ∈ {2, 3}, if (G, sigma) is (k - 1)-edge-connected and contains an even number of negative edges when k = 2, then every modulo k-flow of (G, sigma) can be converted into an integer-valued ( k + 1)-ow with a larger or the same support. We also prove that if (G, sigma) is odd-(2p+1)-edge-connected, then (G, sigma) admits a modulo circular (2 + 1/ p)-flows if and only if it admits an integer-valued circular (2 + 1/p)-flows, which improves all previous result by Xu and Zhang (DM2005), Schubert and Steffen (EJC2015), and Zhu (JCTB2015). Shortest circuit cover conjecture is one of the major open problems in graph theory. It states that every bridgeless graph G contains a set of circuits F such that each edge is contained in at least one member of F and the length of F is at most 7/5∥E(G)∥. This concept was recently generalized to signed graphs by Macajova et al. (JGT2015). In Chapter 3, we improve their upper bound from 11∥E( G)∥ to 14/3 ∥E(G)∥, and if G is 2-edgeconnected and has even negativeness, then it can be further reduced to 11/3 ∥E(G)∥. Tutte's 3-flow conjecture has been studied by many graph theorists in the last several decades. As a new approach to this conjecture, DeVos and Thomassen considered the vectors as ow values and found that there is a close relation between vector S1-flows and integer 3-NZFs. Motivated by their observation, in Chapter 4, we prove that if a graph G admits a vector S1-flow with rank at most two, then G admits an integer 3-NZF. The concept of even factors is highly related to the famous Four Color Theorem. We conclude this dissertation in Chapter 5 with an improvement of a recent result by Chen and Fan (JCTB2016) on the upperbound of even factors. We show that if a graph G contains an even factor, then it
A comparison between fault tree analysis and reliability graph with general gates
International Nuclear Information System (INIS)
Kim, Man Cheol; Seong, Poong Hyun; Jung, Woo Sik
2004-01-01
Currently, level-1 probabilistic safety assessment (PSA) is performed on the basis of event tree analysis and fault tree analysis. Kim and Seong developed a new method for system reliability analysis named reliability graph with general gates (RGGG). The RGGG is an extension of conventional reliability graph, and it utilizes the transformation of system structures to equivalent Bayesian networks for quantitative calculation. The RGGG is considered to be intuitive and easy-to-use while as powerful as fault tree analysis. As an example, Kim and Seong already showed that the Bayesian network model for digital plant protection system (DPPS), which is transformed from the RGGG model for DPPS, can be shown in 1 page, while the fault tree model for DPPS consists of 64 pages of fault trees. Kim and Seong also insisted that Bayesian network model for DPPS is more intuitive because the one-to-one matching between each node in the Bayesian network model and an actual component of DPPS is possible. In this paper, we are going to give a comparison between fault tree analysis and the RGGG method with two example systems. The two example systems are the recirculation of in Korean standard nuclear power plants (KSNP) and the fault tree model developed by Rauzy
Bayesian Estimation of the Kumaraswamy InverseWeibull Distribution
Directory of Open Access Journals (Sweden)
Felipe R.S. de Gusmao
2017-05-01
Full Text Available The Kumaraswamy InverseWeibull distribution has the ability to model failure rates that have unimodal shapes and are quite common in reliability and biological studies. The three-parameter Kumaraswamy InverseWeibull distribution with decreasing and unimodal failure rate is introduced. We provide a comprehensive treatment of the mathematical properties of the Kumaraswany Inverse Weibull distribution and derive expressions for its moment generating function and the ligrl/ig-th generalized moment. Some properties of the model with some graphs of density and hazard function are discussed. We also discuss a Bayesian approach for this distribution and an application was made for a real data set.
Probability and Bayesian statistics
1987-01-01
This book contains selected and refereed contributions to the "Inter national Symposium on Probability and Bayesian Statistics" which was orga nized to celebrate the 80th birthday of Professor Bruno de Finetti at his birthplace Innsbruck in Austria. Since Professor de Finetti died in 1985 the symposium was dedicated to the memory of Bruno de Finetti and took place at Igls near Innsbruck from 23 to 26 September 1986. Some of the pa pers are published especially by the relationship to Bruno de Finetti's scientific work. The evolution of stochastics shows growing importance of probability as coherent assessment of numerical values as degrees of believe in certain events. This is the basis for Bayesian inference in the sense of modern statistics. The contributions in this volume cover a broad spectrum ranging from foundations of probability across psychological aspects of formulating sub jective probability statements, abstract measure theoretical considerations, contributions to theoretical statistics an...
Graph Theory and Ion and Molecular Aggregation in Aqueous Solutions
Choi, Jun-Ho; Lee, Hochan; Choi, Hyung Ran; Cho, Minhaeng
2018-04-01
In molecular and cellular biology, dissolved ions and molecules have decisive effects on chemical and biological reactions, conformational stabilities, and functions of small to large biomolecules. Despite major efforts, the current state of understanding of the effects of specific ions, osmolytes, and bioprotecting sugars on the structure and dynamics of water H-bonding networks and proteins is not yet satisfactory. Recently, to gain deeper insight into this subject, we studied various aggregation processes of ions and molecules in high-concentration salt, osmolyte, and sugar solutions with time-resolved vibrational spectroscopy and molecular dynamics simulation methods. It turns out that ions (or solute molecules) have a strong propensity to self-assemble into large and polydisperse aggregates that affect both local and long-range water H-bonding structures. In particular, we have shown that graph-theoretical approaches can be used to elucidate morphological characteristics of large aggregates in various aqueous salt, osmolyte, and sugar solutions. When ion and molecular aggregates in such aqueous solutions are treated as graphs, a variety of graph-theoretical properties, such as graph spectrum, degree distribution, clustering coefficient, minimum path length, and graph entropy, can be directly calculated by considering an ensemble of configurations taken from molecular dynamics trajectories. Here we show percolating behavior exhibited by ion and molecular aggregates upon increase in solute concentration in high solute concentrations and discuss compelling evidence of the isomorphic relation between percolation transitions of ion and molecular aggregates and water H-bonding networks. We anticipate that the combination of graph theory and molecular dynamics simulation methods will be of exceptional use in achieving a deeper understanding of the fundamental physical chemistry of dissolution and in describing the interplay between the self-aggregation of solute
Graph Theory and Ion and Molecular Aggregation in Aqueous Solutions.
Choi, Jun-Ho; Lee, Hochan; Choi, Hyung Ran; Cho, Minhaeng
2018-04-20
In molecular and cellular biology, dissolved ions and molecules have decisive effects on chemical and biological reactions, conformational stabilities, and functions of small to large biomolecules. Despite major efforts, the current state of understanding of the effects of specific ions, osmolytes, and bioprotecting sugars on the structure and dynamics of water H-bonding networks and proteins is not yet satisfactory. Recently, to gain deeper insight into this subject, we studied various aggregation processes of ions and molecules in high-concentration salt, osmolyte, and sugar solutions with time-resolved vibrational spectroscopy and molecular dynamics simulation methods. It turns out that ions (or solute molecules) have a strong propensity to self-assemble into large and polydisperse aggregates that affect both local and long-range water H-bonding structures. In particular, we have shown that graph-theoretical approaches can be used to elucidate morphological characteristics of large aggregates in various aqueous salt, osmolyte, and sugar solutions. When ion and molecular aggregates in such aqueous solutions are treated as graphs, a variety of graph-theoretical properties, such as graph spectrum, degree distribution, clustering coefficient, minimum path length, and graph entropy, can be directly calculated by considering an ensemble of configurations taken from molecular dynamics trajectories. Here we show percolating behavior exhibited by ion and molecular aggregates upon increase in solute concentration in high solute concentrations and discuss compelling evidence of the isomorphic relation between percolation transitions of ion and molecular aggregates and water H-bonding networks. We anticipate that the combination of graph theory and molecular dynamics simulation methods will be of exceptional use in achieving a deeper understanding of the fundamental physical chemistry of dissolution and in describing the interplay between the self-aggregation of solute
Approximate Bayesian recursive estimation
Czech Academy of Sciences Publication Activity Database
Kárný, Miroslav
2014-01-01
Roč. 285, č. 1 (2014), s. 100-111 ISSN 0020-0255 R&D Projects: GA ČR GA13-13502S Institutional support: RVO:67985556 Keywords : Approximate parameter estimation * Bayesian recursive estimation * Kullback–Leibler divergence * Forgetting Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 4.038, year: 2014 http://library.utia.cas.cz/separaty/2014/AS/karny-0425539.pdf
DEFF Research Database (Denmark)
Antoniou, Constantinos; Harrison, Glenn W.; Lau, Morten I.
2015-01-01
A large literature suggests that many individuals do not apply Bayes’ Rule when making decisions that depend on them correctly pooling prior information and sample data. We replicate and extend a classic experimental study of Bayesian updating from psychology, employing the methods of experimenta...... economics, with careful controls for the confounding effects of risk aversion. Our results show that risk aversion significantly alters inferences on deviations from Bayes’ Rule....
Energy Technology Data Exchange (ETDEWEB)
Andrews, Stephen A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Sigeti, David E. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2017-11-15
These are a set of slides about Bayesian hypothesis testing, where many hypotheses are tested. The conclusions are the following: The value of the Bayes factor obtained when using the median of the posterior marginal is almost the minimum value of the Bayes factor. The value of τ^{2} which minimizes the Bayes factor is a reasonable choice for this parameter. This allows a likelihood ratio to be computed with is the least favorable to H_{0}.
Introduction to Bayesian statistics
Koch, Karl-Rudolf
2007-01-01
This book presents Bayes' theorem, the estimation of unknown parameters, the determination of confidence regions and the derivation of tests of hypotheses for the unknown parameters. It does so in a simple manner that is easy to comprehend. The book compares traditional and Bayesian methods with the rules of probability presented in a logical way allowing an intuitive understanding of random variables and their probability distributions to be formed.
Graph modeling systems and methods
Neergaard, Mike
2015-10-13
An apparatus and a method for vulnerability and reliability modeling are provided. The method generally includes constructing a graph model of a physical network using a computer, the graph model including a plurality of terminating vertices to represent nodes in the physical network, a plurality of edges to represent transmission paths in the physical network, and a non-terminating vertex to represent a non-nodal vulnerability along a transmission path in the physical network. The method additionally includes evaluating the vulnerability and reliability of the physical network using the constructed graph model, wherein the vulnerability and reliability evaluation includes a determination of whether each terminating and non-terminating vertex represents a critical point of failure. The method can be utilized to evaluate wide variety of networks, including power grid infrastructures, communication network topologies, and fluid distribution systems.
Feder, Tomá s; Motwani, Rajeev
2009-01-01
Results on graph turnpike problem without distinctness, including its NP-completeness, and an O(m+n log n) algorithm, is presented. The usual turnpike problem has all pairwise distances given, but does not specify which pair of vertices w e corresponds to. There are two other problems that can be viewed as special cases of the graph turnpike problem, including the bandwidth problem and the low-distortion graph embedding problem. The aim for the turnpike problem in the NP-complete is to orient the edges with weights w i in either direction so that when the whole cycle is transversed in the real line, it returns to a chosen starting point for the cycle. An instance of the turnpike problem with or without distinctness is uniquely mappable if there exists at most one solution up to translation and choice of orientation.
Feder, Tomás
2009-06-01
Results on graph turnpike problem without distinctness, including its NP-completeness, and an O(m+n log n) algorithm, is presented. The usual turnpike problem has all pairwise distances given, but does not specify which pair of vertices w e corresponds to. There are two other problems that can be viewed as special cases of the graph turnpike problem, including the bandwidth problem and the low-distortion graph embedding problem. The aim for the turnpike problem in the NP-complete is to orient the edges with weights w i in either direction so that when the whole cycle is transversed in the real line, it returns to a chosen starting point for the cycle. An instance of the turnpike problem with or without distinctness is uniquely mappable if there exists at most one solution up to translation and choice of orientation.
Negation switching invariant signed graphs
Directory of Open Access Journals (Sweden)
Deepa Sinha
2014-04-01
Full Text Available A signed graph (or, $sigraph$ in short is a graph G in which each edge x carries a value $\\sigma(x \\in \\{-, +\\}$ called its sign. Given a sigraph S, the negation $\\eta(S$ of the sigraph S is a sigraph obtained from S by reversing the sign of every edge of S. Two sigraphs $S_{1}$ and $S_{2}$ on the same underlying graph are switching equivalent if it is possible to assign signs `+' (`plus' or `-' (`minus' to vertices of $S_{1}$ such that by reversing the sign of each of its edges that has received opposite signs at its ends, one obtains $S_{2}$. In this paper, we characterize sigraphs which are negation switching invariant and also see for what sigraphs, S and $\\eta (S$ are signed isomorphic.
International Nuclear Information System (INIS)
Rosmanis, Ansis
2011-01-01
I introduce a continuous-time quantum walk on graphs called the quantum snake walk, the basis states of which are fixed-length paths (snakes) in the underlying graph. First, I analyze the quantum snake walk on the line, and I show that, even though most states stay localized throughout the evolution, there are specific states that most likely move on the line as wave packets with momentum inversely proportional to the length of the snake. Next, I discuss how an algorithm based on the quantum snake walk might potentially be able to solve an extended version of the glued trees problem, which asks to find a path connecting both roots of the glued trees graph. To the best of my knowledge, no efficient quantum algorithm solving this problem is known yet.
Accurate phenotyping: Reconciling approaches through Bayesian model averaging.
Directory of Open Access Journals (Sweden)
Carla Chia-Ming Chen
Full Text Available Genetic research into complex diseases is frequently hindered by a lack of clear biomarkers for phenotype ascertainment. Phenotypes for such diseases are often identified on the basis of clinically defined criteria; however such criteria may not be suitable for understanding the genetic composition of the diseases. Various statistical approaches have been proposed for phenotype definition; however our previous studies have shown that differences in phenotypes estimated using different approaches have substantial impact on subsequent analyses. Instead of obtaining results based upon a single model, we propose a new method, using Bayesian model averaging to overcome problems associated with phenotype definition. Although Bayesian model averaging has been used in other fields of research, this is the first study that uses Bayesian model averaging to reconcile phenotypes obtained using multiple models. We illustrate the new method by applying it to simulated genetic and phenotypic data for Kofendred personality disorder-an imaginary disease with several sub-types. Two separate statistical methods were used to identify clusters of individuals with distinct phenotypes: latent class analysis and grade of membership. Bayesian model averaging was then used to combine the two clusterings for the purpose of subsequent linkage analyses. We found that causative genetic loci for the disease produced higher LOD scores using model averaging than under either individual model separately. We attribute this improvement to consolidation of the cores of phenotype clusters identified using each individual method.
Bayesian theory and applications
Dellaportas, Petros; Polson, Nicholas G; Stephens, David A
2013-01-01
The development of hierarchical models and Markov chain Monte Carlo (MCMC) techniques forms one of the most profound advances in Bayesian analysis since the 1970s and provides the basis for advances in virtually all areas of applied and theoretical Bayesian statistics. This volume guides the reader along a statistical journey that begins with the basic structure of Bayesian theory, and then provides details on most of the past and present advances in this field. The book has a unique format. There is an explanatory chapter devoted to each conceptual advance followed by journal-style chapters that provide applications or further advances on the concept. Thus, the volume is both a textbook and a compendium of papers covering a vast range of topics. It is appropriate for a well-informed novice interested in understanding the basic approach, methods and recent applications. Because of its advanced chapters and recent work, it is also appropriate for a more mature reader interested in recent applications and devel...
Bipartite Graphs as Models of Population Structures in Evolutionary Multiplayer Games
Peña, Jorge; Rochat, Yannick
2012-01-01
By combining evolutionary game theory and graph theory, “games on graphs” study the evolutionary dynamics of frequency-dependent selection in population structures modeled as geographical or social networks. Networks are usually represented by means of unipartite graphs, and social interactions by two-person games such as the famous prisoner’s dilemma. Unipartite graphs have also been used for modeling interactions going beyond pairwise interactions. In this paper, we argue that bipartite graphs are a better alternative to unipartite graphs for describing population structures in evolutionary multiplayer games. To illustrate this point, we make use of bipartite graphs to investigate, by means of computer simulations, the evolution of cooperation under the conventional and the distributed N-person prisoner’s dilemma. We show that several implicit assumptions arising from the standard approach based on unipartite graphs (such as the definition of replacement neighborhoods, the intertwining of individual and group diversity, and the large overlap of interaction neighborhoods) can have a large impact on the resulting evolutionary dynamics. Our work provides a clear example of the importance of construction procedures in games on graphs, of the suitability of bigraphs and hypergraphs for computational modeling, and of the importance of concepts from social network analysis such as centrality, centralization and bipartite clustering for the understanding of dynamical processes occurring on networked population structures. PMID:22970237
On Graph Rewriting, Reduction and Evaluation
DEFF Research Database (Denmark)
Zerny, Ian
2010-01-01
We inter-derive two prototypical styles of graph reduction: reduction machines à la Turner and graph rewriting systems à la Barendregt et al. To this end, we adapt Danvy et al.'s mechanical program derivations from the world of terms to the world of graphs. We also outline how to inter-derive a t......We inter-derive two prototypical styles of graph reduction: reduction machines à la Turner and graph rewriting systems à la Barendregt et al. To this end, we adapt Danvy et al.'s mechanical program derivations from the world of terms to the world of graphs. We also outline how to inter...
The fascinating world of graph theory
Benjamin, Arthur; Zhang, Ping
2015-01-01
Graph theory goes back several centuries and revolves around the study of graphs-mathematical structures showing relations between objects. With applications in biology, computer science, transportation science, and other areas, graph theory encompasses some of the most beautiful formulas in mathematics-and some of its most famous problems. The Fascinating World of Graph Theory explores the questions and puzzles that have been studied, and often solved, through graph theory. This book looks at graph theory's development and the vibrant individuals responsible for the field's growth. Introducin
Graph-based modelling in engineering
Rysiński, Jacek
2017-01-01
This book presents versatile, modern and creative applications of graph theory in mechanical engineering, robotics and computer networks. Topics related to mechanical engineering include e.g. machine and mechanism science, mechatronics, robotics, gearing and transmissions, design theory and production processes. The graphs treated are simple graphs, weighted and mixed graphs, bond graphs, Petri nets, logical trees etc. The authors represent several countries in Europe and America, and their contributions show how different, elegant, useful and fruitful the utilization of graphs in modelling of engineering systems can be. .
XML Graphs in Program Analysis
DEFF Research Database (Denmark)
Møller, Anders; Schwartzbach, Michael Ignatieff
2007-01-01
XML graphs have shown to be a simple and effective formalism for representing sets of XML documents in program analysis. It has evolved through a six year period with variants tailored for a range of applications. We present a unified definition, outline the key properties including validation...... of XML graphs against different XML schema languages, and provide a software package that enables others to make use of these ideas. We also survey four very different applications: XML in Java, Java Servlets and JSP, transformations between XML and non-XML data, and XSLT....
Graph topologies on closed multifunctions
Directory of Open Access Journals (Sweden)
Giuseppe Di Maio
2003-10-01
Full Text Available In this paper we study function space topologies on closed multifunctions, i.e. closed relations on X x Y using various hypertopologies. The hypertopologies are in essence, graph topologies i.e topologies on functions considered as graphs which are subsets of X x Y . We also study several topologies, including one that is derived from the Attouch-Wets filter on the range. We state embedding theorems which enable us to generalize and prove some recent results in the literature with the use of known results in the hyperspace of the range space and in the function space topologies of ordinary functions.
Cyclic graphs and Apery's theorem
International Nuclear Information System (INIS)
Sorokin, V N
2002-01-01
This is a survey of results about the behaviour of Hermite-Pade approximants for graphs of Markov functions, and a survey of interpolation problems leading to Apery's result about the irrationality of the value ζ(3) of the Riemann zeta function. The first example is given of a cyclic graph for which the Hermite-Pade problem leads to Apery's theorem. Explicit formulae for solutions are obtained, namely, Rodrigues' formulae and integral representations. The asymptotic behaviour of the approximants is studied, and recurrence formulae are found
Interacting particle systems on graphs
Sood, Vishal
In this dissertation, the dynamics of socially or biologically interacting populations are investigated. The individual members of the population are treated as particles that interact via links on a social or biological network represented as a graph. The effect of the structure of the graph on the properties of the interacting particle system is studied using statistical physics techniques. In the first chapter, the central concepts of graph theory and social and biological networks are presented. Next, interacting particle systems that are drawn from physics, mathematics and biology are discussed in the second chapter. In the third chapter, the random walk on a graph is studied. The mean time for a random walk to traverse between two arbitrary sites of a random graph is evaluated. Using an effective medium approximation it is found that the mean first-passage time between pairs of sites, as well as all moments of this first-passage time, are insensitive to the density of links in the graph. The inverse of the mean-first passage time varies non-monotonically with the density of links near the percolation transition of the random graph. Much of the behavior can be understood by simple heuristic arguments. Evolutionary dynamics, by which mutants overspread an otherwise uniform population on heterogeneous graphs, are studied in the fourth chapter. Such a process underlies' epidemic propagation, emergence of fads, social cooperation or invasion of an ecological niche by a new species. The first part of this chapter is devoted to neutral dynamics, in which the mutant genotype does not have a selective advantage over the resident genotype. The time to extinction of one of the two genotypes is derived. In the second part of this chapter, selective advantage or fitness is introduced such that the mutant genotype has a higher birth rate or a lower death rate. This selective advantage leads to a dynamical competition in which selection dominates for large populations
Directory of Open Access Journals (Sweden)
Moschopoulos Charalampos
2011-06-01
Full Text Available Abstract Background Recent technological advances applied to biology such as yeast-two-hybrid, phage display and mass spectrometry have enabled us to create a detailed map of protein interaction networks. These interaction networks represent a rich, yet noisy, source of data that could be used to extract meaningful information, such as protein complexes. Several interaction network weighting schemes have been proposed so far in the literature in order to eliminate the noise inherent in interactome data. In this paper, we propose a novel weighting scheme and apply it to the S. cerevisiae interactome. Complex prediction rates are improved by up to 39%, depending on the clustering algorithm applied. Results We adopt a two step procedure. During the first step, by applying both novel and well established protein-protein interaction (PPI weighting methods, weights are introduced to the original interactome graph based on the confidence level that a given interaction is a true-positive one. The second step applies clustering using established algorithms in the field of graph theory, as well as two variations of Spectral clustering. The clustered interactome networks are also cross-validated against the confirmed protein complexes present in the MIPS database. Conclusions The results of our experimental work demonstrate that interactome graph weighting methods clearly improve the clustering results of several clustering algorithms. Moreover, our proposed weighting scheme outperforms other approaches of PPI graph weighting.
A Dynamic Bayesian Network Model for the Production and Inventory Control
Shin, Ji-Sun; Takazaki, Noriyuki; Lee, Tae-Hong; Kim, Jin-Il; Lee, Hee-Hyol
In general, the production quantities and delivered goods are changed randomly and then the total stock is also changed randomly. This paper deals with the production and inventory control using the Dynamic Bayesian Network. Bayesian Network is a probabilistic model which represents the qualitative dependence between two or more random variables by the graph structure, and indicates the quantitative relations between individual variables by the conditional probability. The probabilistic distribution of the total stock is calculated through the propagation of the probability on the network. Moreover, an adjusting rule of the production quantities to maintain the probability of a lower limit and a ceiling of the total stock to certain values is shown.
Multiple graph regularized protein domain ranking
Wang, Jim Jing-Yan
2012-11-19
Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods.Results: To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods.Conclusion: The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications. 2012 Wang et al; licensee BioMed Central Ltd.
Multiple graph regularized protein domain ranking
Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin
2012-01-01
Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods.Results: To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods.Conclusion: The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications. 2012 Wang et al; licensee BioMed Central Ltd.
Multiple graph regularized protein domain ranking.
Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin
2012-11-19
Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.
Multiple graph regularized protein domain ranking
Directory of Open Access Journals (Sweden)
Wang Jim
2012-11-01
Full Text Available Abstract Background Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. Results To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. Conclusion The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.
Bayesian analysis in plant pathology.
Mila, A L; Carriquiry, A L
2004-09-01
ABSTRACT Bayesian methods are currently much discussed and applied in several disciplines from molecular biology to engineering. Bayesian inference is the process of fitting a probability model to a set of data and summarizing the results via probability distributions on the parameters of the model and unobserved quantities such as predictions for new observations. In this paper, after a short introduction of Bayesian inference, we present the basic features of Bayesian methodology using examples from sequencing genomic fragments and analyzing microarray gene-expressing levels, reconstructing disease maps, and designing experiments.
Refining intra-protein contact prediction by graph analysis
Directory of Open Access Journals (Sweden)
Eyal Eran
2007-05-01
Full Text Available Abstract Background Accurate prediction of intra-protein residue contacts from sequence information will allow the prediction of protein structures. Basic predictions of such specific contacts can be further refined by jointly analyzing predicted contacts, and by adding information on the relative positions of contacts in the protein primary sequence. Results We introduce a method for graph analysis refinement of intra-protein contacts, termed GARP. Our previously presented intra-contact prediction method by means of pair-to-pair substitution matrix (P2PConPred was used to test the GARP method. In our approach, the top contact predictions obtained by a basic prediction method were used as edges to create a weighted graph. The edges were scored by a mutual clustering coefficient that identifies highly connected graph regions, and by the density of edges between the sequence regions of the edge nodes. A test set of 57 proteins with known structures was used to determine contacts. GARP improves the accuracy of the P2PConPred basic prediction method in whole proteins from 12% to 18%. Conclusion Using a simple approach we increased the contact prediction accuracy of a basic method by 1.5 times. Our graph approach is simple to implement, can be used with various basic prediction methods, and can provide input for further downstream analyses.
Multiple Kernel Learning for adaptive graph regularized nonnegative matrix factorization
Wang, Jim Jing-Yan; AbdulJabbar, Mustafa Abdulmajeed
2012-01-01
Nonnegative Matrix Factorization (NMF) has been continuously evolving in several areas like pattern recognition and information retrieval methods. It factorizes a matrix into a product of 2 low-rank non-negative matrices that will define parts-based, and linear representation of non-negative data. Recently, Graph regularized NMF (GrNMF) is proposed to find a compact representation, which uncovers the hidden semantics and simultaneously respects the intrinsic geometric structure. In GNMF, an affinity graph is constructed from the original data space to encode the geometrical information. In this paper, we propose a novel idea which engages a Multiple Kernel Learning approach into refining the graph structure that reflects the factorization of the matrix and the new data space. The GrNMF is improved by utilizing the graph refined by the kernel learning, and then a novel kernel learning method is introduced under the GrNMF framework. Our approach shows encouraging results of the proposed algorithm in comparison to the state-of-the-art clustering algorithms like NMF, GrNMF, SVD etc.
Constructing Knowledge Graphs of Depression
Huang, Zhisheng; Yang, Jie; van Harmelen, Frank; Hu, Qing
2017-01-01
Knowledge Graphs have been shown to be useful tools for integrating multiple medical knowledge sources, and to support such tasks as medical decision making, literature retrieval, determining healthcare quality indicators, co-morbodity analysis and many others. A large number of medical knowledge
Partitioning graphs into connected parts
Hof, van 't P.; Paulusma, D.; Woeginger, G.J.; Frid, A.; Morozov, A.S.; Rybalchenko, A.; Wagner, K.W.
2009-01-01
The 2-DISJOINT CONNECTED SUBGRAPHS problem asks if a given graph has two vertex-disjoint connected subgraphs containing pre-specified sets of vertices. We show that this problem is NP-complete even if one of the sets has cardinality 2. The LONGEST PATH CONTRACTIBILITY problem asks for the largest
Isoperimetric inequalities for minimal graphs
International Nuclear Information System (INIS)
Pacelli Bessa, G.; Montenegro, J.F.
2007-09-01
Based on Markvorsen and Palmer's work on mean time exit and isoperimetric inequalities we establish slightly better isoperimetric inequalities and mean time exit estimates for minimal graphs in N x R. We also prove isoperimetric inequalities for submanifolds of Hadamard spaces with tamed second fundamental form. (author)
Ancestral Genres of Mathematical Graphs
Gerofsky, Susan
2011-01-01
Drawing from sources in gesture studies, cognitive science, the anthropology of religion and art/architecture history, this article explores cultural, bodily and cosmological resonances carried (unintentionally) by mathematical graphs on Cartesian coordinates. Concepts of asymmetric bodily spaces, grids, orthogonality, mapping and sacred spaces…
Humidity Graphs for All Seasons.
Esmael, F.
1982-01-01
In a previous article in this journal (Vol. 17, p358, 1979), a wet-bulb depression table was recommended for two simple experiments to determine relative humidity. However, the use of a graph is suggested because it gives the relative humidity directly from the wet and dry bulb readings. (JN)
Contracting a planar graph efficiently
DEFF Research Database (Denmark)
Holm, Jacob; Italiano, Giuseppe F.; Karczmarz, Adam
2017-01-01
the data structure, we can achieve optimal running times for decremental bridge detection, 2-edge connectivity, maximal 3-edge connected components, and the problem of finding a unique perfect matching for a static planar graph. Furthermore, we improve the running times of algorithms for several planar...
Graph Model Based Indoor Tracking
DEFF Research Database (Denmark)
Jensen, Christian Søndergaard; Lu, Hua; Yang, Bin
2009-01-01
The tracking of the locations of moving objects in large indoor spaces is important, as it enables a range of applications related to, e.g., security and indoor navigation and guidance. This paper presents a graph model based approach to indoor tracking that offers a uniform data management...
A graph with fractional revival
Bernard, Pierre-Antoine; Chan, Ada; Loranger, Érika; Tamon, Christino; Vinet, Luc
2018-02-01
An example of a graph that admits balanced fractional revival between antipodes is presented. It is obtained by establishing the correspondence between the quantum walk on a hypercube where the opposite vertices across the diagonals of each face are connected and, the coherent transport of single excitations in the extension of the Krawtchouk spin chain with next-to-nearest neighbour interactions.
Fixation Time for Evolutionary Graphs
Nie, Pu-Yan; Zhang, Pei-Ai
Evolutionary graph theory (EGT) is recently proposed by Lieberman et al. in 2005. EGT is successful for explaining biological evolution and some social phenomena. It is extremely important to consider the time of fixation for EGT in many practical problems, including evolutionary theory and the evolution of cooperation. This study characterizes the time to asymptotically reach fixation.
SpectralNET – an application for spectral graph analysis and visualization
Directory of Open Access Journals (Sweden)
Schreiber Stuart L
2005-10-01
Full Text Available Abstract Background Graph theory provides a computational framework for modeling a variety of datasets including those emerging from genomics, proteomics, and chemical genetics. Networks of genes, proteins, small molecules, or other objects of study can be represented as graphs of nodes (vertices and interactions (edges that can carry different weights. SpectralNET is a flexible application for analyzing and visualizing these biological and chemical networks. Results Available both as a standalone .NET executable and as an ASP.NET web application, SpectralNET was designed specifically with the analysis of graph-theoretic metrics in mind, a computational task not easily accessible using currently available applications. Users can choose either to upload a network for analysis using a variety of input formats, or to have SpectralNET generate an idealized random network for comparison to a real-world dataset. Whichever graph-generation method is used, SpectralNET displays detailed information about each connected component of the graph, including graphs of degree distribution, clustering coefficient by degree, and average distance by degree. In addition, extensive information about the selected vertex is shown, including degree, clustering coefficient, various distance metrics, and the corresponding components of the adjacency, Laplacian, and normalized Laplacian eigenvectors. SpectralNET also displays several graph visualizations, including a linear dimensionality reduction for uploaded datasets (Principal Components Analysis and a non-linear dimensionality reduction that provides an elegant view of global graph structure (Laplacian eigenvectors. Conclusion SpectralNET provides an easily accessible means of analyzing graph-theoretic metrics for data modeling and dimensionality reduction. SpectralNET is publicly available as both a .NET application and an ASP.NET web application from http://chembank.broad.harvard.edu/resources/. Source code is
Coloring sums of extensions of certain graphs
Directory of Open Access Journals (Sweden)
Johan Kok
2017-12-01
Full Text Available We recall that the minimum number of colors that allow a proper coloring of graph $G$ is called the chromatic number of $G$ and denoted $\\chi(G$. Motivated by the introduction of the concept of the $b$-chromatic sum of a graph the concept of $\\chi'$-chromatic sum and $\\chi^+$-chromatic sum are introduced in this paper. The extended graph $G^x$ of a graph $G$ was recently introduced for certain regular graphs. This paper furthers the concepts of $\\chi'$-chromatic sum and $\\chi^+$-chromatic sum to extended paths and cycles. Bipartite graphs also receive some attention. The paper concludes with patterned structured graphs. These last said graphs are typically found in chemical and biological structures.
Mathematical Minute: Rotating a Function Graph
Bravo, Daniel; Fera, Joseph
2013-01-01
Using calculus only, we find the angles you can rotate the graph of a differentiable function about the origin and still obtain a function graph. We then apply the solution to odd and even degree polynomials.
Towards a theory of geometric graphs
Pach, Janos
2004-01-01
The early development of graph theory was heavily motivated and influenced by topological and geometric themes, such as the Konigsberg Bridge Problem, Euler's Polyhedral Formula, or Kuratowski's characterization of planar graphs. In 1936, when Denes Konig published his classical Theory of Finite and Infinite Graphs, the first book ever written on the subject, he stressed this connection by adding the subtitle Combinatorial Topology of Systems of Segments. He wanted to emphasize that the subject of his investigations was very concrete: planar figures consisting of points connected by straight-line segments. However, in the second half of the twentieth century, graph theoretical research took an interesting turn. In the most popular and most rapidly growing areas (the theory of random graphs, Ramsey theory, extremal graph theory, algebraic graph theory, etc.), graphs were considered as abstract binary relations rather than geometric objects. Many of the powerful techniques developed in these fields have been su...
Bounds on Gromov hyperbolicity constant in graphs
Indian Academy of Sciences (India)
Infinite graphs; Cartesian product graphs; independence number; domin- ation number; geodesics ... the secure transmission of information through the internet (see [15, 16]). In particular, ..... In particular, δ(G) is an integer multiple of 1/4.
Summary: beyond fault trees to fault graphs
International Nuclear Information System (INIS)
Alesso, H.P.; Prassinos, P.; Smith, C.F.
1984-09-01
Fault Graphs are the natural evolutionary step over a traditional fault-tree model. A Fault Graph is a failure-oriented directed graph with logic connectives that allows cycles. We intentionally construct the Fault Graph to trace the piping and instrumentation drawing (P and ID) of the system, but with logical AND and OR conditions added. Then we evaluate the Fault Graph with computer codes based on graph-theoretic methods. Fault Graph computer codes are based on graph concepts, such as path set (a set of nodes traveled on a path from one node to another) and reachability (the complete set of all possible paths between any two nodes). These codes are used to find the cut-sets (any minimal set of component failures that will fail the system) and to evaluate the system reliability
Torsional rigidity, isospectrality and quantum graphs
International Nuclear Information System (INIS)
Colladay, Don; McDonald, Patrick; Kaganovskiy, Leon
2017-01-01
We study torsional rigidity for graph and quantum graph analogs of well-known pairs of isospectral non-isometric planar domains. We prove that such isospectral pairs are distinguished by torsional rigidity. (paper)
Bond graph modeling of centrifugal compression systems
Uddin, Nur; Gravdahl, Jan Tommy
2015-01-01
A novel approach to model unsteady fluid dynamics in a compressor network by using a bond graph is presented. The model is intended in particular for compressor control system development. First, we develop a bond graph model of a single compression system. Bond graph modeling offers a different perspective to previous work by modeling the compression system based on energy flow instead of fluid dynamics. Analyzing the bond graph model explains the energy flow during compressor surge. Two pri...
A Graph Calculus for Predicate Logic
Directory of Open Access Journals (Sweden)
Paulo A. S. Veloso
2013-03-01
Full Text Available We introduce a refutation graph calculus for classical first-order predicate logic, which is an extension of previous ones for binary relations. One reduces logical consequence to establishing that a constructed graph has empty extension, i. e. it represents bottom. Our calculus establishes that a graph has empty extension by converting it to a normal form, which is expanded to other graphs until we can recognize conflicting situations (equivalent to a formula and its negation.
Sphere and dot product representations of graphs
R.J. Kang (Ross); T. Müller (Tobias)
2012-01-01
textabstractA graph $G$ is a $k$-sphere graph if there are $k$-dimensional real vectors $v_1,\\dots,v_n$ such that $ij\\in E(G)$ if and only if the distance between $v_i$ and $v_j$ is at most $1$. A graph $G$ is a $k$-dot product graph if there are $k$-dimensional real vectors $v_1,\\dots,v_n$ such
Deep Learning with Dynamic Computation Graphs
Looks, Moshe; Herreshoff, Marcello; Hutchins, DeLesley; Norvig, Peter
2017-01-01
Neural networks that compute over graph structures are a natural fit for problems in a variety of domains, including natural language (parse trees) and cheminformatics (molecular graphs). However, since the computation graph has a different shape and size for every input, such networks do not directly support batched training or inference. They are also difficult to implement in popular deep learning libraries, which are based on static data-flow graphs. We introduce a technique called dynami...
Constructs for Programming with Graph Rewrites
Rodgers, Peter
2000-01-01
Graph rewriting is becoming increasingly popular as a method for programming with graph based data structures. We present several modifications to a basic serial graph rewriting paradigm and discuss how they improve coding programs in the Grrr graph rewriting programming language. The constructs we present are once only nodes, attractor nodes and single match rewrites. We illustrate the operation of the constructs by example. The advantages of adding these new rewrite modifiers is to reduce t...
Graph Regularized Auto-Encoders for Image Representation.
Yiyi Liao; Yue Wang; Yong Liu
2017-06-01
Image representation has been intensively explored in the domain of computer vision for its significant influence on the relative tasks such as image clustering and classification. It is valuable to learn a low-dimensional representation of an image which preserves its inherent information from the original image space. At the perspective of manifold learning, this is implemented with the local invariant idea to capture the intrinsic low-dimensional manifold embedded in the high-dimensional input space. Inspired by the recent successes of deep architectures, we propose a local invariant deep nonlinear mapping algorithm, called graph regularized auto-encoder (GAE). With the graph regularization, the proposed method preserves the local connectivity from the original image space to the representation space, while the stacked auto-encoders provide explicit encoding model for fast inference and powerful expressive capacity for complex modeling. Theoretical analysis shows that the graph regularizer penalizes the weighted Frobenius norm of the Jacobian matrix of the encoder mapping, where the weight matrix captures the local property in the input space. Furthermore, the underlying effects on the hidden representation space are revealed, providing insightful explanation to the advantage of the proposed method. Finally, the experimental results on both clustering and classification tasks demonstrate the effectiveness of our GAE as well as the correctness of the proposed theoretical analysis, and it also suggests that GAE is a superior solution to the current deep representation learning techniques comparing with variant auto-encoders and existing local invariant methods.
On the sizes of expander graphs and minimum distances of graph codes
DEFF Research Database (Denmark)
Høholdt, Tom; Justesen, Jørn
2014-01-01
We give lower bounds for the minimum distances of graph codes based on expander graphs. The bounds depend only on the second eigenvalue of the graph and the parameters of the component codes. We also give an upper bound on the size of a degree regular graph with given second eigenvalue....
McMillen, Sue; McMillen, Beth
2010-01-01
Connecting stories to qualitative coordinate graphs has been suggested as an effective instructional strategy. Even students who are able to "create" bar graphs may struggle to correctly "interpret" them. Giving children opportunities to work with qualitative graphs can help them develop the skills to interpret, describe, and compare information…
The groupies of random multipartite graphs
Portmann, Marius; Wang, Hongyun
2012-01-01
If a vertex $v$ in a graph $G$ has degree larger than the average of the degrees of its neighbors, we call it a groupie in $G$. In the current work, we study the behavior of groupie in random multipartite graphs with the link probability between sets of nodes fixed. Our results extend the previous ones on random (bipartite) graphs.
Modeling Software Evolution using Algebraic Graph Rewriting
Ciraci, Selim; van den Broek, Pim
We show how evolution requests can be formalized using algebraic graph rewriting. In particular, we present a way to convert the UML class diagrams to colored graphs. Since changes in software may effect the relation between the methods of classes, our colored graph representation also employs the
A Type Graph Model for Java Programs
Rensink, Arend; Zambon, Eduardo
2009-01-01
In this report we present a type graph that models all executable constructs of the Java programming language. Such a model is useful for any graph-based technique that relies on a representation of Java programs as graphs. The model can be regarded as a common representation to which all Java
An intersection graph of straight lines
DEFF Research Database (Denmark)
Thomassen, Carsten
2002-01-01
G. Ehrlich, S. Even, and R.E. Tarjan conjectured that the graph obtained from a complete 3 partite graph K4,4,4 by deleting the edges of four disjoint triangles is not the intersection graph of straight line segments in the plane. We show that it is....
Girth 5 graphs from relative difference sets
DEFF Research Database (Denmark)
Jørgensen, Leif Kjær
2005-01-01
We consider the problem of construction of graphs with given degree $k$ and girth 5 and as few vertices as possible. We give a construction of a family of girth 5 graphs based on relative difference sets. This family contains the smallest known graph of degree 8 and girth 5 which was constructed ...
Cycles in weighted graphs and related topics
Zhang, Shenggui
2002-01-01
This thesis contains results on paths andcycles in graphs andon a more or less relatedtopic, the vulnerability of graphs. In the first part of the thesis, Chapters 2 through 5, we concentrate on paths andcycles in weightedgraphs. A number of sufficient conditions are presentedfor graphs to contain
Graph Transformation Semantics for a QVT Language
Rensink, Arend; Nederpel, Ronald; Bruni, Roberto; Varró, Dániel
It has been claimed by many in the graph transformation community that model transformation, as understood in the context of Model Driven Architecture, can be seen as an application of graph transformation. In this paper we substantiate this claim by giving a graph transformation-based semantics to
Girth 5 graphs from relative difference sets
DEFF Research Database (Denmark)
Jørgensen, Leif Kjær
We consider the problem of construction of graphs with given degree and girth 5 and as few vertices as possible. We give a construction of a family of girth 5 graphs based on relative difference sets. This family contains the smallest known graph of degree 8 and girth 5 which was constructed by G...
Improper colouring of (random) unit disk graphs
Kang, R.J.; Müller, T.; Sereni, J.S.
2008-01-01
For any graph G, the k-improper chromatic number ¿k(G) is the smallest number of colours used in a colouring of G such that each colour class induces a subgraph of maximum degree k. We investigate ¿k for unit disk graphs and random unit disk graphs to generalise results of McDiarmid and Reed
Alliances and Bisection Width for Planar Graphs
DEFF Research Database (Denmark)
Olsen, Martin; Revsbæk, Morten
2013-01-01
An alliance in a graph is a set of vertices (allies) such that each vertex in the alliance has at least as many allies (counting the vertex itself) as non-allies in its neighborhood of the graph. We show that any planar graph with minimum degree at least 4 can be split into two alliances in polyn...
A Type Graph Model for Java Programs
Rensink, Arend; Zambon, Eduardo; Lee, D.; Lopes, A.; Poetzsch-Heffter, A.
2009-01-01
In this work we present a type graph that models all executable constructs of the Java programming language. Such a model is useful for any graph-based technique that relies on a representation of Java programs as graphs. The model can be regarded as a common representation to which all Java syntax
RATGRAPH: Computer Graphing of Rational Functions.
Minch, Bradley A.
1987-01-01
Presents an easy-to-use Applesoft BASIC program that graphs rational functions and any asymptotes that the functions might have. Discusses the nature of rational functions, graphing them manually, employing a computer to graph rational functions, and describes how the program works. (TW)
Well-covered graphs and factors
DEFF Research Database (Denmark)
Randerath, Bert; Vestergaard, Preben D.
2006-01-01
A maximum independent set of vertices in a graph is a set of pairwise nonadjacent vertices of largest cardinality α. Plummer defined a graph to be well-covered, if every independent set is contained in a maximum independent set of G. Every well-covered graph G without isolated vertices has a perf...
A new characterization of trivially perfect graphs
Directory of Open Access Journals (Sweden)
Christian Rubio Montiel
2015-03-01
Full Text Available A graph $G$ is \\emph{trivially perfect} if for every induced subgraph the cardinality of the largest set of pairwise nonadjacent vertices (the stability number $\\alpha(G$ equals the number of (maximal cliques $m(G$. We characterize the trivially perfect graphs in terms of vertex-coloring and we extend some definitions to infinite graphs.
47 CFR 80.761 - Conversion graphs.
2010-10-01
... MARITIME SERVICES Standards for Computing Public Coast Station VHF Coverage § 80.761 Conversion graphs. The following graphs must be employed where conversion from one to the other of the indicated types of units is... 47 Telecommunication 5 2010-10-01 2010-10-01 false Conversion graphs. 80.761 Section 80.761...
Bayesian nonparametric data analysis
Müller, Peter; Jara, Alejandro; Hanson, Tim
2015-01-01
This book reviews nonparametric Bayesian methods and models that have proven useful in the context of data analysis. Rather than providing an encyclopedic review of probability models, the book’s structure follows a data analysis perspective. As such, the chapters are organized by traditional data analysis problems. In selecting specific nonparametric models, simpler and more traditional models are favored over specialized ones. The discussed methods are illustrated with a wealth of examples, including applications ranging from stylized examples to case studies from recent literature. The book also includes an extensive discussion of computational methods and details on their implementation. R code for many examples is included in on-line software pages.
Congdon, Peter
2014-01-01
This book provides an accessible approach to Bayesian computing and data analysis, with an emphasis on the interpretation of real data sets. Following in the tradition of the successful first edition, this book aims to make a wide range of statistical modeling applications accessible using tested code that can be readily adapted to the reader's own applications. The second edition has been thoroughly reworked and updated to take account of advances in the field. A new set of worked examples is included. The novel aspect of the first edition was the coverage of statistical modeling using WinBU
Dirichlet Process Parsimonious Mixtures for clustering
Chamroukhi, Faicel; Bartcus, Marius; Glotin, Hervé
2015-01-01
The parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have shown their success in particular in cluster analysis. Their estimation is in general performed by maximum likelihood estimation and has also been considered from a parametric Bayesian prospective. We propose new Dirichlet Process Parsimonious mixtures (DPPM) which represent a Bayesian nonparametric formulation of these parsimonious Gaussian mixtur...
Inference in hybrid Bayesian networks
DEFF Research Database (Denmark)
Lanseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael
2009-01-01
Since the 1980s, Bayesian Networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability-techniques (like fault trees...... decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability....
Searching Algorithm Using Bayesian Updates
Caudle, Kyle
2010-01-01
In late October 1967, the USS Scorpion was lost at sea, somewhere between the Azores and Norfolk Virginia. Dr. Craven of the U.S. Navy's Special Projects Division is credited with using Bayesian Search Theory to locate the submarine. Bayesian Search Theory is a straightforward and interesting application of Bayes' theorem which involves searching…
Bayesian Data Analysis (lecture 2)
CERN. Geneva
2018-01-01
framework but we will also go into more detail and discuss for example the role of the prior. The second part of the lecture will cover further examples and applications that heavily rely on the bayesian approach, as well as some computational tools needed to perform a bayesian analysis.
Bayesian Data Analysis (lecture 1)
CERN. Geneva
2018-01-01
framework but we will also go into more detail and discuss for example the role of the prior. The second part of the lecture will cover further examples and applications that heavily rely on the bayesian approach, as well as some computational tools needed to perform a bayesian analysis.
On a conjecture concerning helly circle graphs
Directory of Open Access Journals (Sweden)
Durán Guillermo
2003-01-01
Full Text Available We say that G is an e-circle graph if there is a bijection between its vertices and straight lines on the cartesian plane such that two vertices are adjacent in G if and only if the corresponding lines intersect inside the circle of radius one. This definition suggests a method for deciding whether a given graph G is an e-circle graph, by constructing a convenient system S of equations and inequations which represents the structure of G, in such a way that G is an e-circle graph if and only if S has a solution. In fact, e-circle graphs are exactly the circle graphs (intersection graphs of chords in a circle, and thus this method provides an analytic way for recognizing circle graphs. A graph G is a Helly circle graph if G is a circle graph and there exists a model of G by chords such that every three pairwise intersecting chords intersect at the same point. A conjecture by Durán (2000 states that G is a Helly circle graph if and only if G is a circle graph and contains no induced diamonds (a diamond is a graph formed by four vertices and five edges. Many unsuccessful efforts - mainly based on combinatorial and geometrical approaches - have been done in order to validate this conjecture. In this work, we utilize the ideas behind the definition of e-circle graphs and restate this conjecture in terms of an equivalence between two systems of equations and inequations, providing a new, analytic tool to deal with it.
Centrosymmetric Graphs And A Lower Bound For Graph Energy Of Fullerenes
Directory of Open Access Journals (Sweden)
Katona Gyula Y.
2014-11-01
Full Text Available The energy of a molecular graph G is defined as the summation of the absolute values of the eigenvalues of adjacency matrix of a graph G. In this paper, an infinite class of fullerene graphs with 10n vertices, n ≥ 2, is considered. By proving centrosymmetricity of the adjacency matrix of these fullerene graphs, a lower bound for its energy is given. Our method is general and can be extended to other class of fullerene graphs.
The Bayesian Covariance Lasso.
Khondker, Zakaria S; Zhu, Hongtu; Chu, Haitao; Lin, Weili; Ibrahim, Joseph G
2013-04-01
Estimation of sparse covariance matrices and their inverse subject to positive definiteness constraints has drawn a lot of attention in recent years. The abundance of high-dimensional data, where the sample size ( n ) is less than the dimension ( d ), requires shrinkage estimation methods since the maximum likelihood estimator is not positive definite in this case. Furthermore, when n is larger than d but not sufficiently larger, shrinkage estimation is more stable than maximum likelihood as it reduces the condition number of the precision matrix. Frequentist methods have utilized penalized likelihood methods, whereas Bayesian approaches rely on matrix decompositions or Wishart priors for shrinkage. In this paper we propose a new method, called the Bayesian Covariance Lasso (BCLASSO), for the shrinkage estimation of a precision (covariance) matrix. We consider a class of priors for the precision matrix that leads to the popular frequentist penalties as special cases, develop a Bayes estimator for the precision matrix, and propose an efficient sampling scheme that does not precalculate boundaries for positive definiteness. The proposed method is permutation invariant and performs shrinkage and estimation simultaneously for non-full rank data. Simulations show that the proposed BCLASSO performs similarly as frequentist methods for non-full rank data.
Bayesian dynamic mediation analysis.
Huang, Jing; Yuan, Ying
2017-12-01
Most existing methods for mediation analysis assume that mediation is a stationary, time-invariant process, which overlooks the inherently dynamic nature of many human psychological processes and behavioral activities. In this article, we consider mediation as a dynamic process that continuously changes over time. We propose Bayesian multilevel time-varying coefficient models to describe and estimate such dynamic mediation effects. By taking the nonparametric penalized spline approach, the proposed method is flexible and able to accommodate any shape of the relationship between time and mediation effects. Simulation studies show that the proposed method works well and faithfully reflects the true nature of the mediation process. By modeling mediation effect nonparametrically as a continuous function of time, our method provides a valuable tool to help researchers obtain a more complete understanding of the dynamic nature of the mediation process underlying psychological and behavioral phenomena. We also briefly discuss an alternative approach of using dynamic autoregressive mediation model to estimate the dynamic mediation effect. The computer code is provided to implement the proposed Bayesian dynamic mediation analysis. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Approximate Bayesian computation.
Directory of Open Access Journals (Sweden)
Mikael Sunnåker
Full Text Available Approximate Bayesian computation (ABC constitutes a class of computational methods rooted in Bayesian statistics. In all model-based statistical inference, the likelihood function is of central importance, since it expresses the probability of the observed data under a particular statistical model, and thus quantifies the support data lend to particular values of parameters and to choices among different models. For simple models, an analytical formula for the likelihood function can typically be derived. However, for more complex models, an analytical formula might be elusive or the likelihood function might be computationally very costly to evaluate. ABC methods bypass the evaluation of the likelihood function. In this way, ABC methods widen the realm of models for which statistical inference can be considered. ABC methods are mathematically well-founded, but they inevitably make assumptions and approximations whose impact needs to be carefully assessed. Furthermore, the wider application domain of ABC exacerbates the challenges of parameter estimation and model selection. ABC has rapidly gained popularity over the last years and in particular for the analysis of complex problems arising in biological sciences (e.g., in population genetics, ecology, epidemiology, and systems biology.
Clustering User Behavior in Scientific Collections
Blixhavn, Øystein Hoel
2014-01-01
This master thesis looks at how clustering techniques can be appliedto a collection of scientific documents. Approximately one year of serverlogs from the CERN Document Server (CDS) are analyzed and preprocessed.Based on the findings of this analysis, and a review of thecurrent state of the art, three different clustering methods are selectedfor further work: Simple k-Means, Hierarchical Agglomerative Clustering(HAC) and Graph Partitioning. In addition, a custom, agglomerativeclustering algor...
Bayesian inference with ecological applications
Link, William A
2009-01-01
This text is written to provide a mathematically sound but accessible and engaging introduction to Bayesian inference specifically for environmental scientists, ecologists and wildlife biologists. It emphasizes the power and usefulness of Bayesian methods in an ecological context. The advent of fast personal computers and easily available software has simplified the use of Bayesian and hierarchical models . One obstacle remains for ecologists and wildlife biologists, namely the near absence of Bayesian texts written specifically for them. The book includes many relevant examples, is supported by software and examples on a companion website and will become an essential grounding in this approach for students and research ecologists. Engagingly written text specifically designed to demystify a complex subject Examples drawn from ecology and wildlife research An essential grounding for graduate and research ecologists in the increasingly prevalent Bayesian approach to inference Companion website with analyt...
Bayesian Inference on Gravitational Waves
Directory of Open Access Journals (Sweden)
Asad Ali
2015-12-01
Full Text Available The Bayesian approach is increasingly becoming popular among the astrophysics data analysis communities. However, the Pakistan statistics communities are unaware of this fertile interaction between the two disciplines. Bayesian methods have been in use to address astronomical problems since the very birth of the Bayes probability in eighteenth century. Today the Bayesian methods for the detection and parameter estimation of gravitational waves have solid theoretical grounds with a strong promise for the realistic applications. This article aims to introduce the Pakistan statistics communities to the applications of Bayesian Monte Carlo methods in the analysis of gravitational wave data with an overview of the Bayesian signal detection and estimation methods and demonstration by a couple of simplified examples.
Subsampling for graph power spectrum estimation
Chepuri, Sundeep Prabhakar; Leus, Geert
2016-01-01
In this paper we focus on subsampling stationary random signals that reside on the vertices of undirected graphs. Second-order stationary graph signals are obtained by filtering white noise and they admit a well-defined power spectrum. Estimating the graph power spectrum forms a central component of stationary graph signal processing and related inference tasks. We show that by sampling a significantly smaller subset of vertices and using simple least squares, we can reconstruct the power spectrum of the graph signal from the subsampled observations, without any spectral priors. In addition, a near-optimal greedy algorithm is developed to design the subsampling scheme.
On 4-critical t-perfect graphs
Benchetrit, Yohann
2016-01-01
It is an open question whether the chromatic number of $t$-perfect graphs is bounded by a constant. The largest known value for this parameter is 4, and the only example of a 4-critical $t$-perfect graph, due to Laurent and Seymour, is the complement of the line graph of the prism $\\Pi$ (a graph is 4-critical if it has chromatic number 4 and all its proper induced subgraphs are 3-colorable). In this paper, we show a new example of a 4-critical $t$-perfect graph: the complement of the line gra...
Subsampling for graph power spectrum estimation
Chepuri, Sundeep Prabhakar
2016-10-06
In this paper we focus on subsampling stationary random signals that reside on the vertices of undirected graphs. Second-order stationary graph signals are obtained by filtering white noise and they admit a well-defined power spectrum. Estimating the graph power spectrum forms a central component of stationary graph signal processing and related inference tasks. We show that by sampling a significantly smaller subset of vertices and using simple least squares, we can reconstruct the power spectrum of the graph signal from the subsampled observations, without any spectral priors. In addition, a near-optimal greedy algorithm is developed to design the subsampling scheme.
Proving relations between modular graph functions
International Nuclear Information System (INIS)
Basu, Anirban
2016-01-01
We consider modular graph functions that arise in the low energy expansion of the four graviton amplitude in type II string theory. The vertices of these graphs are the positions of insertions of vertex operators on the toroidal worldsheet, while the links are the scalar Green functions connecting the vertices. Graphs with four and five links satisfy several non-trivial relations, which have been proved recently. We prove these relations by using elementary properties of Green functions and the details of the graphs. We also prove a relation between modular graph functions with six links. (paper)
Reproducibility of graph metrics of human brain functional networks.
Deuker, Lorena; Bullmore, Edward T; Smith, Marie; Christensen, Soren; Nathan, Pradeep J; Rockstroh, Brigitte; Bassett, Danielle S
2009-10-01
Graph theory provides many metrics of complex network organization that can be applied to analysis of brain networks derived from neuroimaging data. Here we investigated the test-retest reliability of graph metrics of functional networks derived from magnetoencephalography (MEG) data recorded in two sessions from 16 healthy volunteers who were studied at rest and during performance of the n-back working memory task in each session. For each subject's data at each session, we used a wavelet filter to estimate the mutual information (MI) between each pair of MEG sensors in each of the classical frequency intervals from gamma to low delta in the overall range 1-60 Hz. Undirected binary graphs were generated by thresholding the MI matrix and 8 global network metrics were estimated: the clustering coefficient, path length, small-worldness, efficiency, cost-efficiency, assortativity, hierarchy, and synchronizability. Reliability of each graph metric was assessed using the intraclass correlation (ICC). Good reliability was demonstrated for most metrics applied to the n-back data (mean ICC=0.62). Reliability was greater for metrics in lower frequency networks. Higher frequency gamma- and beta-band networks were less reliable at a global level but demonstrated high reliability of nodal metrics in frontal and parietal regions. Performance of the n-back task was associated with greater reliability than measurements on resting state data. Task practice was also associated with greater reliability. Collectively these results suggest that graph metrics are sufficiently reliable to be considered for future longitudinal studies of functional brain network changes.
Xiong, B.; Oude Elberink, S.; Vosselman, G.
2014-07-01
In the task of 3D building model reconstruction from point clouds we face the problem of recovering a roof topology graph in the presence of noise, small roof faces and low point densities. Errors in roof topology graphs will seriously affect the final modelling results. The aim of this research is to automatically correct these errors. We define the graph correction as a graph-to-graph problem, similar to the spelling correction problem (also called the string-to-string problem). The graph correction is more complex than string correction, as the graphs are 2D while strings are only 1D. We design a strategy based on a dictionary of graph edit operations to automatically identify and correct the errors in the input graph. For each type of error the graph edit dictionary stores a representative erroneous subgraph as well as the corrected version. As an erroneous roof topology graph may contain several errors, a heuristic search is applied to find the optimum sequence of graph edits to correct the errors one by one. The graph edit dictionary can be expanded to include entries needed to cope with errors that were previously not encountered. Experiments show that the dictionary with only fifteen entries already properly corrects one quarter of erroneous graphs in about 4500 buildings, and even half of the erroneous graphs in one test area, achieving as high as a 95% acceptance rate of the reconstructed models.
Significance evaluation in factor graphs
DEFF Research Database (Denmark)
Madsen, Tobias; Hobolth, Asger; Jensen, Jens Ledet
2017-01-01
in genomics and the multiple-testing issues accompanying them, accurate significance evaluation is of great importance. We here address the problem of evaluating statistical significance of observations from factor graph models. Results Two novel numerical approximations for evaluation of statistical...... significance are presented. First a method using importance sampling. Second a saddlepoint approximation based method. We develop algorithms to efficiently compute the approximations and compare them to naive sampling and the normal approximation. The individual merits of the methods are analysed both from....... Conclusions The applicability of saddlepoint approximation and importance sampling is demonstrated on known models in the factor graph framework. Using the two methods we can substantially improve computational cost without compromising accuracy. This contribution allows analyses of large datasets...
Groups, graphs and random walks
Salvatori, Maura; Sava-Huss, Ecaterina
2017-01-01
An accessible and panoramic account of the theory of random walks on groups and graphs, stressing the strong connections of the theory with other branches of mathematics, including geometric and combinatorial group theory, potential analysis, and theoretical computer science. This volume brings together original surveys and research-expository papers from renowned and leading experts, many of whom spoke at the workshop 'Groups, Graphs and Random Walks' celebrating the sixtieth birthday of Wolfgang Woess in Cortona, Italy. Topics include: growth and amenability of groups; Schrödinger operators and symbolic dynamics; ergodic theorems; Thompson's group F; Poisson boundaries; probability theory on buildings and groups of Lie type; structure trees for edge cuts in networks; and mathematical crystallography. In what is currently a fast-growing area of mathematics, this book provides an up-to-date and valuable reference for both researchers and graduate students, from which future research activities will undoubted...
Flux networks in metabolic graphs
International Nuclear Information System (INIS)
Warren, P B; Queiros, S M Duarte; Jones, J L
2009-01-01
A metabolic model can be represented as a bipartite graph comprising linked reaction and metabolite nodes. Here it is shown how a network of conserved fluxes can be assigned to the edges of such a graph by combining the reaction fluxes with a conserved metabolite property such as molecular weight. A similar flux network can be constructed by combining the primal and dual solutions to the linear programming problem that typically arises in constraint-based modelling. Such constructions may help with the visualization of flux distributions in complex metabolic networks. The analysis also explains the strong correlation observed between metabolite shadow prices (the dual linear programming variables) and conserved metabolite properties. The methods were applied to recent metabolic models for Escherichia coli, Saccharomyces cerevisiae and Methanosarcina barkeri. Detailed results are reported for E. coli; similar results were found for other organisms
3-biplacement of bipartite graphs
Directory of Open Access Journals (Sweden)
Lech Adamus
2008-01-01
Full Text Available Let \\(G=(L,R;E\\ be a bipartite graph with color classes \\(L\\ and \\(R\\ and edge set \\(E\\. A set of two bijections \\(\\{\\varphi_1 , \\varphi_2\\}\\, \\(\\varphi_1 , \\varphi_2 :L \\cup R \\to L \\cup R\\, is said to be a \\(3\\-biplacement of \\(G\\ if \\(\\varphi_1(L= \\varphi_2(L = L\\ and \\(E \\cap \\varphi_1^*(E=\\emptyset\\, \\(E \\cap \\varphi_2^*(E=\\emptyset\\, \\(\\varphi_1^*(E \\cap \\varphi_2^*(E=\\emptyset\\, where \\(\\varphi_1^*\\, \\(\\varphi_2^*\\ are the maps defined on \\(E\\, induced by \\(\\varphi_1\\, \\(\\varphi_2\\, respectively. We prove that if \\(|L| = p\\, \\(|R| = q\\, \\(3 \\leq p \\leq q\\, then every graph \\(G=(L,R;E\\ of size at most \\(p\\ has a \\(3\\-biplacement.
On the centrality of some graphs
Directory of Open Access Journals (Sweden)
Vecdi Aytac
2017-10-01
Full Text Available A central issue in the analysis of complex networks is the assessment of their stability and vulnerability. A variety of measures have been proposed in the literature to quantify the stability of networks and a number of graph-theoretic parameters have been used to derive formulas for calculating network reliability. Different measures for graph vulnerability have been introduced so far to study different aspects of the graph behavior after removal of vertices or links such as connectivity, toughness, scattering number, binding number, residual closeness and integrity. In this paper, we consider betweenness centrality of a graph. Betweenness centrality of a vertex of a graph is portion of the shortest paths all pairs of vertices passing through a given vertex. In this paper, we obtain exact values for betweenness centrality for some wheel related graphs namely gear, helm, sunflower and friendship graphs.
Quantum walk on a chimera graph
Xu, Shu; Sun, Xiangxiang; Wu, Jizhou; Zhang, Wei-Wei; Arshed, Nigum; Sanders, Barry C.
2018-05-01
We analyse a continuous-time quantum walk on a chimera graph, which is a graph of choice for designing quantum annealers, and we discover beautiful quantum walk features such as localization that starkly distinguishes classical from quantum behaviour. Motivated by technological thrusts, we study continuous-time quantum walk on enhanced variants of the chimera graph and on diminished chimera graph with a random removal of vertices. We explain the quantum walk by constructing a generating set for a suitable subgroup of graph isomorphisms and corresponding symmetry operators that commute with the quantum walk Hamiltonian; the Hamiltonian and these symmetry operators provide a complete set of labels for the spectrum and the stationary states. Our quantum walk characterization of the chimera graph and its variants yields valuable insights into graphs used for designing quantum-annealers.
Fibonacci number of the tadpole graph
Directory of Open Access Journals (Sweden)
Joe DeMaio
2014-10-01
Full Text Available In 1982, Prodinger and Tichy defined the Fibonacci number of a graph G to be the number of independent sets of the graph G. They did so since the Fibonacci number of the path graph Pn is the Fibonacci number F(n+2 and the Fibonacci number of the cycle graph Cn is the Lucas number Ln. The tadpole graph Tn,k is the graph created by concatenating Cn and Pk with an edge from any vertex of Cn to a pendant of Pk for integers n=3 and k=0. This paper establishes formulae and identities for the Fibonacci number of the tadpole graph via algebraic and combinatorial methods.
Software for Graph Analysis and Visualization
Directory of Open Access Journals (Sweden)
M. I. Kolomeychenko
2014-01-01
Full Text Available This paper describes the software for graph storage, analysis and visualization. The article presents a comparative analysis of existing software for analysis and visualization of graphs, describes the overall architecture of application and basic principles of construction and operation of the main modules. Furthermore, a description of the developed graph storage oriented to storage and processing of large-scale graphs is presented. The developed algorithm for finding communities and implemented algorithms of autolayouts of graphs are the main functionality of the product. The main advantage of the developed software is high speed processing of large size networks (up to millions of nodes and links. Moreover, the proposed graph storage architecture is unique and has no analogues. The developed approaches and algorithms are optimized for operating with big graphs and have high productivity.
A Parallel Approach for Frequent Subgraph Mining in a Single Large Graph Using Spark
Directory of Open Access Journals (Sweden)
Fengcai Qiao
2018-02-01
Full Text Available Frequent subgraph mining (FSM plays an important role in graph mining, attracting a great deal of attention in many areas, such as bioinformatics, web data mining and social networks. In this paper, we propose SSiGraM (Spark based Single Graph Mining, a Spark based parallel frequent subgraph mining algorithm in a single large graph. Aiming to approach the two computational challenges of FSM, we conduct the subgraph extension and support evaluation parallel across all the distributed cluster worker nodes. In addition, we also employ a heuristic search strategy and three novel optimizations: load balancing, pre-search pruning and top-down pruning in the support evaluation process, which significantly improve the performance. Extensive experiments with four different real-world datasets demonstrate that the proposed algorithm outperforms the existing GraMi (Graph Mining algorithm by an order of magnitude for all datasets and can work with a lower support threshold.
Parallel External Memory Graph Algorithms
DEFF Research Database (Denmark)
Arge, Lars Allan; Goodrich, Michael T.; Sitchinava, Nodari
2010-01-01
In this paper, we study parallel I/O efficient graph algorithms in the Parallel External Memory (PEM) model, one o f the private-cache chip multiprocessor (CMP) models. We study the fundamental problem of list ranking which leads to efficient solutions to problems on trees, such as computing lowest...... an optimal speedup of Â¿(P) in parallel I/O complexity and parallel computation time, compared to the single-processor external memory counterparts....
Submanifolds weakly associated with graphs
Indian Academy of Sciences (India)
A CARRIAZO, L M FERN ´ANDEZ and A RODRÍGUEZ-HIDALGO. Department of Geometry and Topology, ..... by means of trees (connected graphs without cycles) and forests (disjoint unions of trees, see [6]) given in [3], by extending it to weak ... CR-submanifold. In this case, every tree is a K2. Finally, Theorem 3.8 of [3] can ...
Borsboom, D.; Haig, B.D.
2013-01-01
Unlike most other statistical frameworks, Bayesian statistical inference is wedded to a particular approach in the philosophy of science (see Howson & Urbach, 2006); this approach is called Bayesianism. Rather than being concerned with model fitting, this position in the philosophy of science
Quantum information processing with graph states
International Nuclear Information System (INIS)
Schlingemann, Dirk-Michael
2005-04-01
Graph states are multiparticle states which are associated with graphs. Each vertex of the graph corresponds to a single system or particle. The links describe quantum correlations (entanglement) between pairs of connected particles. Graph states were initiated independently by two research groups: On the one hand, graph states were introduced by Briegel and Raussendorf as a resource for a new model of one-way quantum computing, where algorithms are implemented by a sequence of measurements at single particles. On the other hand, graph states were developed by the author of this thesis and ReinhardWerner in Braunschweig, as a tool to build quantum error correcting codes, called graph codes. The connection between the two approaches was fully realized in close cooperation of both research groups. This habilitation thesis provides a survey of the theory of graph codes, focussing mainly, but not exclusively on the author's own research work. We present the theoretical and mathematical background for the analysis of graph codes. The concept of one-way quantum computing for general graph states is discussed. We explicitly show how to realize the encoding and decoding device of a graph code on a one-way quantum computer. This kind of implementation is to be seen as a mathematical description of a quantum memory device. In addition to that, we investigate interaction processes, which enable the creation of graph states on very large systems. Particular graph states can be created, for instance, by an Ising type interaction between next neighbor particles which sits at the points of an infinitely extended cubic lattice. Based on the theory of quantum cellular automata, we give a constructive characterization of general interactions which create a translationally invariant graph state. (orig.)
Graph-based Geospatial Prediction and Clustering for Situation Recognition
Tang, Mengfan
2017-01-01
Big data continues to grow and diversify at an increasing pace. To understand constantly evolving situations, data is collected from various location-based sensors as well as people using effective participatory sensing. Static sensors are placed at particular locations, monitoring and measuring important variables from the environment. Additionally, people contribute data in the form of mobile streams through participatory sensing. To process such disparate data for situation recognition, we...
Clustering Single-Cell Expression Data Using Random Forest Graphs.
Pouyan, Maziyar Baran; Nourani, Mehrdad
2017-07-01
Complex tissues such as brain and bone marrow are made up of multiple cell types. As the study of biological tissue structure progresses, the role of cell-type-specific research becomes increasingly important. Novel sequencing technology such as single-cell cytometry provides researchers access to valuable biological data. Applying machine-learning techniques to these high-throughput datasets provides deep insights into the cellular landscape of the tissue where those cells are a part of. In this paper, we propose the use of random-forest-based single-cell profiling, a new machine-learning-based technique, to profile different cell types of intricate tissues using single-cell cytometry data. Our technique utilizes random forests to capture cell marker dependences and model the cellular populations using the cell network concept. This cellular network helps us discover what cell types are in the tissue. Our experimental results on public-domain datasets indicate promising performance and accuracy of our technique in extracting cell populations of complex tissues.
A graph-clustering approach to search important molecular markers ...
African Journals Online (AJOL)
Here, we performed a comprehensive gene level assessment of Parkinson's disease using 16 colorectal cancer samples and nine normal samples. The results show that SLC6A3, SLC18A2, and EN1, etc., are related to Parkinson's disease. Besides, we further mined the underlying molecular mechanism within these ...
A graph-clustering approach to search important molecular markers ...
African Journals Online (AJOL)
use
2011-11-07
Nov 7, 2011 ... inclusions and a movement disorder (Chen and Feany,. 2005). LRRK2 ... The gene ontology (Ashburner et al., 2000) project is a major bioinformatics initiative ... et al. 15657 particular organisms (http://www.genome.jp/kegg/).
A graph-Laplacian-based feature extraction algorithm for neural spike sorting.
Ghanbari, Yasser; Spence, Larry; Papamichalis, Panos
2009-01-01
Analysis of extracellular neural spike recordings is highly dependent upon the accuracy of neural waveform classification, commonly referred to as spike sorting. Feature extraction is an important stage of this process because it can limit the quality of clustering which is performed in the feature space. This paper proposes a new feature extraction method (which we call Graph Laplacian Features, GLF) based on minimizing the graph Laplacian and maximizing the weighted variance. The algorithm is compared with Principal Components Analysis (PCA, the most commonly-used feature extraction method) using simulated neural data. The results show that the proposed algorithm produces more compact and well-separated clusters compared to PCA. As an added benefit, tentative cluster centers are output which can be used to initialize a subsequent clustering stage.
Degree Associated Edge Reconstruction Number of Graphs with Regular Pruned Graph
Directory of Open Access Journals (Sweden)
P. Anusha Devi
2015-10-01
Full Text Available An ecard of a graph $G$ is a subgraph formed by deleting an edge. A da-ecard specifies the degree of the deleted edge along with the ecard. The degree associated edge reconstruction number of a graph $G,~dern(G,$ is the minimum number of da-ecards that uniquely determines $G.$ The adversary degree associated edge reconstruction number of a graph $G, adern(G,$ is the minimum number $k$ such that every collection of $k$ da-ecards of $G$ uniquely determines $G.$ The maximal subgraph without end vertices of a graph $G$ which is not a tree is the pruned graph of $G.$ It is shown that $dern$ of complete multipartite graphs and some connected graphs with regular pruned graph is $1$ or $2.$ We also determine $dern$ and $adern$ of corona product of standard graphs.
International Nuclear Information System (INIS)
Rajabalinejad, M.
2010-01-01
To reduce cost of Monte Carlo (MC) simulations for time-consuming processes, Bayesian Monte Carlo (BMC) is introduced in this paper. The BMC method reduces number of realizations in MC according to the desired accuracy level. BMC also provides a possibility of considering more priors. In other words, different priors can be integrated into one model by using BMC to further reduce cost of simulations. This study suggests speeding up the simulation process by considering the logical dependence of neighboring points as prior information. This information is used in the BMC method to produce a predictive tool through the simulation process. The general methodology and algorithm of BMC method are presented in this paper. The BMC method is applied to the simplified break water model as well as the finite element model of 17th Street Canal in New Orleans, and the results are compared with the MC and Dynamic Bounds methods.
Bayesian nonparametric hierarchical modeling.
Dunson, David B
2009-04-01
In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods.
Book review: Bayesian analysis for population ecology
Link, William A.
2011-01-01
Brian Dennis described the field of ecology as “fertile, uncolonized ground for Bayesian ideas.” He continued: “The Bayesian propagule has arrived at the shore. Ecologists need to think long and hard about the consequences of a Bayesian ecology. The Bayesian outlook is a successful competitor, but is it a weed? I think so.” (Dennis 2004)
Scaling Techniques for Massive Scale-Free Graphs in Distributed (External) Memory
Pearce, Roger; Gokhale, Maya; Amato, Nancy M.
2013-01-01
We present techniques to process large scale-free graphs in distributed memory. Our aim is to scale to trillions of edges, and our research is targeted at leadership class supercomputers and clusters with local non-volatile memory, e.g., NAND Flash
Robustifying Bayesian nonparametric mixtures for count data.
Canale, Antonio; Prünster, Igor
2017-03-01
Our motivating application stems from surveys of natural populations and is characterized by large spatial heterogeneity in the counts, which makes parametric approaches to modeling local animal abundance too restrictive. We adopt a Bayesian nonparametric approach based on mixture models and innovate with respect to popular Dirichlet process mixture of Poisson kernels by increasing the model flexibility at the level both of the kernel and the nonparametric mixing measure. This allows to derive accurate and robust estimates of the distribution of local animal abundance and of the corresponding clusters. The application and a simulation study for different scenarios yield also some general methodological implications. Adding flexibility solely at the level of the mixing measure does not improve inferences, since its impact is severely limited by the rigidity of the Poisson kernel with considerable consequences in terms of bias. However, once a kernel more flexible than the Poisson is chosen, inferences can be robustified by choosing a prior more general than the Dirichlet process. Therefore, to improve the performance of Bayesian nonparametric mixtures for count data one has to enrich the model simultaneously at both levels, the kernel and the mixing measure. © 2016, The International Biometric Society.
Replica methods for loopy sparse random graphs
International Nuclear Information System (INIS)
Coolen, ACC
2016-01-01
I report on the development of a novel statistical mechanical formalism for the analysis of random graphs with many short loops, and processes on such graphs. The graphs are defined via maximum entropy ensembles, in which both the degrees (via hard constraints) and the adjacency matrix spectrum (via a soft constraint) are prescribed. The sum over graphs can be done analytically, using a replica formalism with complex replica dimensions. All known results for tree-like graphs are recovered in a suitable limit. For loopy graphs, the emerging theory has an appealing and intuitive structure, suggests how message passing algorithms should be adapted, and what is the structure of theories describing spin systems on loopy architectures. However, the formalism is still largely untested, and may require further adjustment and refinement. (paper)
Chemical Graph Transformation with Stereo-Information
DEFF Research Database (Denmark)
Andersen, Jakob Lykke; Flamm, Christoph; Merkle, Daniel
2017-01-01
Double Pushout graph transformation naturally facilitates the modelling of chemical reactions: labelled undirected graphs model molecules and direct derivations model chemical reactions. However, the most straightforward modelling approach ignores the relative placement of atoms and their neighbo......Double Pushout graph transformation naturally facilitates the modelling of chemical reactions: labelled undirected graphs model molecules and direct derivations model chemical reactions. However, the most straightforward modelling approach ignores the relative placement of atoms...... and their neighbours in space. Stereoisomers of chemical compounds thus cannot be distinguished, even though their chemical activity may differ substantially. In this contribution we propose an extended chemical graph transformation system with attributes that encode information about local geometry. The modelling...... of graph transformation, but we here propose a framework that also allows for partially specified stereoinformation. While there are several stereochemical configurations to be considered, we focus here on the tetrahedral molecular shape, and suggest general principles for how to treat all other chemically...
Reconstructing Topological Graphs and Continua
Gartside, Paul; Pitz, Max F.; Suabedissen, Rolf
2015-01-01
The deck of a topological space $X$ is the set $\\mathcal{D}(X)=\\{[X \\setminus \\{x\\}] \\colon x \\in X\\}$, where $[Z]$ denotes the homeomorphism class of $Z$. A space $X$ is topologically reconstructible if whenever $\\mathcal{D}(X)=\\mathcal{D}(Y)$ then $X$ is homeomorphic to $Y$. It is shown that all metrizable compact connected spaces are reconstructible. It follows that all finite graphs, when viewed as a 1-dimensional cell-complex, are reconstructible in the topological sense, and more genera...
Decomposing a graph into bistars
DEFF Research Database (Denmark)
Thomassen, Carsten
2013-01-01
Bárat and the present author conjectured that, for each tree T, there exists a natural number kT such that the following holds: If G is a kT-edge-connected graph such that |E(T)| divides |E(G)|, then G has a T-decomposition, that is, a decomposition of the edge set into trees each of which...... is isomorphic to T. The conjecture has been verified for infinitely many paths and for each star. In this paper we verify the conjecture for an infinite family of trees that are neither paths nor stars, namely all the bistars S(k,k+1)....
Bayesian image restoration, using configurations
Thorarinsdottir, Thordis
2006-01-01
In this paper, we develop a Bayesian procedure for removing noise from images that can be viewed as noisy realisations of random sets in the plane. The procedure utilises recent advances in configuration theory for noise free random sets, where the probabilities of observing the different boundary configurations are expressed in terms of the mean normal measure of the random set. These probabilities are used as prior probabilities in a Bayesian image restoration approach. Estimation of the re...
Bayesian Networks and Influence Diagrams
DEFF Research Database (Denmark)
Kjærulff, Uffe Bro; Madsen, Anders Læsø
Probabilistic networks, also known as Bayesian networks and influence diagrams, have become one of the most promising technologies in the area of applied artificial intelligence, offering intuitive, efficient, and reliable methods for diagnosis, prediction, decision making, classification......, troubleshooting, and data mining under uncertainty. Bayesian Networks and Influence Diagrams: A Guide to Construction and Analysis provides a comprehensive guide for practitioners who wish to understand, construct, and analyze intelligent systems for decision support based on probabilistic networks. Intended...
On path hypercompositions in graphs and automata
Directory of Open Access Journals (Sweden)
Massouros Christos G.
2016-01-01
Full Text Available The paths in graphs define hypercompositions in the set of their vertices and therefore it is feasible to associate hypercompositional structures to each graph. Similarly, the strings of letters from their alphabet, define hypercompositions in the automata, which in turn define the associated hypergroups to the automata. The study of the associated hypercompositional structures gives results in both, graphs and automata theory.
Attack Graph Construction for Security Events Analysis
Directory of Open Access Journals (Sweden)
Andrey Alexeevich Chechulin
2014-09-01
Full Text Available The paper is devoted to investigation of the attack graphs construction and analysis task for a network security evaluation and real-time security event processing. Main object of this research is the attack modeling process. The paper contains the description of attack graphs building, modifying and analysis technique as well as overview of implemented prototype for network security analysis based on attack graph approach.
Steiner Distance in Graphs--A Survey
Mao, Yaping
2017-01-01
For a connected graph $G$ of order at least $2$ and $S\\subseteq V(G)$, the \\emph{Steiner distance} $d_G(S)$ among the vertices of $S$ is the minimum size among all connected subgraphs whose vertex sets contain $S$. In this paper, we summarize the known results on the Steiner distance parameters, including Steiner distance, Steiner diameter, Steiner center, Steiner median, Steiner interval, Steiner distance hereditary graph, Steiner distance stable graph, average Steiner distance, and Steiner ...
Density conditions for triangles in multipartite graphs
DEFF Research Database (Denmark)
Bondy, Adrian; Shen, Jin; Thomassé, Stephan
2006-01-01
subgraphs in G. We investigate in particular the case where G is a complete multipartite graph. We prove that a finite tripartite graph with all edge densities greater than the golden ratio has a triangle and that this bound is best possible. Also we show that an infinite-partite graph with finite parts has...... a triangle, provided that the edge density between any two parts is greater than 1/2....
Efficient Algorithmic Frameworks via Structural Graph Theory
2016-10-28
constant. For example, they measured that, on large samples of the entire network, the Amazon graph has average degree 17.7, the Facebook graph has average...department heads’ opinions of departments, and generally lack transparency and well-defined measures . On the other hand, the National Research Council (the...Efficient and practical resource block allocation for LTE -based D2D network via graph coloring. Wireless Networks 20(4): 611-624 (2014) 50. Hossein
Overlapping clusters for distributed computation.
Energy Technology Data Exchange (ETDEWEB)
Mirrokni, Vahab (Google Research, New York, NY); Andersen, Reid (Microsoft Corporation, Redmond, WA); Gleich, David F.
2010-11-01
Scalable, distributed algorithms must address communication problems. We investigate overlapping clusters, or vertex partitions that intersect, for graph computations. This setup stores more of the graph than required but then affords the ease of implementation of vertex partitioned algorithms. Our hope is that this technique allows us to reduce communication in a computation on a distributed graph. The motivation above draws on recent work in communication avoiding algorithms. Mohiyuddin et al. (SC09) design a matrix-powers kernel that gives rise to an overlapping partition. Fritzsche et al. (CSC2009) develop an overlapping clustering for a Schwarz method. Both techniques extend an initial partitioning with overlap. Our procedure generates overlap directly. Indeed, Schwarz methods are commonly used to capitalize on overlap. Elsewhere, overlapping communities (Ahn et al, Nature 2009; Mishra et al. WAW2007) are now a popular model of structure in social networks. These have long been studied in statistics (Cole and Wishart, CompJ 1970). We present two types of results: (i) an estimated swapping probability {rho}{infinity}; and (ii) the communication volume of a parallel PageRank solution (link-following {alpha} = 0.85) using an additive Schwarz method. The volume ratio is the amount of extra storage for the overlap (2 means we store the graph twice). Below, as the ratio increases, the swapping probability and PageRank communication volume decreases.
Decomposing a planar graph into an independent set and a 3-degenerate graph
DEFF Research Database (Denmark)
Thomassen, Carsten
2001-01-01
We prove the conjecture made by O. V. Borodin in 1976 that the vertex set of every planar graph can be decomposed into an independent set and a set inducing a 3-degenerate graph. (C) 2001 Academic Press....
Graph algorithms in the titan toolkit.
Energy Technology Data Exchange (ETDEWEB)
McLendon, William Clarence, III; Wylie, Brian Neil
2009-10-01
Graph algorithms are a key component in a wide variety of intelligence analysis activities. The Graph-Based Informatics for Non-Proliferation and Counter-Terrorism project addresses the critical need of making these graph algorithms accessible to Sandia analysts in a manner that is both intuitive and effective. Specifically we describe the design and implementation of an open source toolkit for doing graph analysis, informatics, and visualization that provides Sandia with novel analysis capability for non-proliferation and counter-terrorism.
Xu, Kexiang; Trinajstić, Nenad
2015-01-01
This is the first book to focus on the topological index, the Harary index, of a graph, including its mathematical properties, chemical applications and some related and attractive open problems. This book is dedicated to Professor Frank Harary (1921—2005), the grandmaster of graph theory and its applications. It has be written by experts in the field of graph theory and its applications. For a connected graph G, as an important distance-based topological index, the Harary index H(G) is defined as the sum of the reciprocals of the distance between any two unordered vertices of the graph G. In this book, the authors report on the newest results on the Harary index of a graph. These results mainly concern external graphs with respect to the Harary index; the relations to other topological indices; its properties and applications to pure graph theory and chemical graph theory; and two significant variants, i.e., additively and multiplicatively weighted Harary indices. In the last chapter, we present a number o...
Mechatronic modeling and simulation using bond graphs
Das, Shuvra
2009-01-01
Introduction to Mechatronics and System ModelingWhat Is Mechatronics?What Is a System and Why Model Systems?Mathematical Modeling Techniques Used in PracticeSoftwareBond Graphs: What Are They?Engineering SystemsPortsGeneralized VariablesBond GraphsBasic Components in SystemsA Brief Note about Bond Graph Power DirectionsSummary of Bond Direction RulesDrawing Bond Graphs for Simple Systems: Electrical and MechanicalSimplification Rules for Junction StructureDrawing Bond Graphs for Electrical SystemsDrawing Bond Graphs for Mechanical SystemsCausalityDrawing Bond Graphs for Hydraulic and Electronic Components and SystemsSome Basic Properties and Concepts for FluidsBond Graph Model of Hydraulic SystemsElectronic SystemsDeriving System Equations from Bond GraphsSystem VariablesDeriving System EquationsTackling Differential CausalityAlgebraic LoopsSolution of Model Equations and Their InterpretationZeroth Order SystemsFirst Order SystemsSecond Order SystemTransfer Functions and Frequency ResponsesNumerical Solution ...
An algebraic approach to graph codes
DEFF Research Database (Denmark)
Pinero, Fernando
This thesis consists of six chapters. The first chapter, contains a short introduction to coding theory in which we explain the coding theory concepts we use. In the second chapter, we present the required theory for evaluation codes and also give an example of some fundamental codes in coding...... theory as evaluation codes. Chapter three consists of the introduction to graph based codes, such as Tanner codes and graph codes. In Chapter four, we compute the dimension of some graph based codes with a result combining graph based codes and subfield subcodes. Moreover, some codes in chapter four...
DEFF Research Database (Denmark)
Jensen, T.R.; Thomassen, Carsten
2000-01-01
If k is a prime power, and G is a graph with n vertices, then a k-coloring of G may be considered as a vector in GF(k)(n). We prove that the subspace of GF(3)(n) spanned by all 3-colorings of a planar triangle-free graph with n vertices has dimension n. In particular, any such graph has at least n...... - 1 nonequivalent 3-colorings, and the addition of any edge or any vertex of degree 3 results in a 3-colorable graph. (C) 2000 John Wiley & Sons, Inc....
Text-Filled Stacked Area Graphs
DEFF Research Database (Denmark)
Kraus, Martin
2011-01-01
-filled stacked area graphs; i.e., graphs that feature stacked areas that are filled with small-typed text. Since these graphs allow for computing the text layout automatically, it is possible to include large amounts of textual detail with very little effort. We discuss the most important challenges and some...... solutions for the design of text-filled stacked area graphs with the help of an exemplary visualization of the genres, publication years, and titles of a database of several thousand PC games....
Reconstructing Nearly Simple Polytopes from their Graph
Doolittle, Joseph
2017-01-01
We present a partial description of which polytopes are reconstructible from their graphs. This is an extension of work by Blind and Mani (1987) and Kalai (1988), which showed that simple polytopes can be reconstructed from their graphs. In particular, we introduce a notion of $h$-nearly simple and prove that 1-nearly simple and 2-nearly simple polytopes are reconstructible from their graphs. We also give an example of a 3-nearly simple polytope which is not reconstructible from its graph. Fu...
A Reduction of the Graph Reconstruction Conjecture
Directory of Open Access Journals (Sweden)
Monikandan S.
2014-08-01
Full Text Available A graph is said to be reconstructible if it is determined up to isomor- phism from the collection of all its one-vertex deleted unlabeled subgraphs. Reconstruction Conjecture (RC asserts that all graphs on at least three vertices are reconstructible. In this paper, we prove that interval-regular graphs and some new classes of graphs are reconstructible and show that RC is true if and only if all non-geodetic and non-interval-regular blocks G with diam(G = 2 or diam(Ḡ = diam(G = 3 are reconstructible
Total dominator chromatic number of a graph
Directory of Open Access Journals (Sweden)
Adel P. Kazemi
2015-06-01
Full Text Available Given a graph $G$, the total dominator coloring problem seeks a proper coloring of $G$ with the additional property that every vertex in the graph is adjacent to all vertices of a color class. We seek to minimize the number of color classes. We initiate to study this problem on several classes of graphs, as well as finding general bounds and characterizations. We also compare the total dominator chromatic number of a graph with the chromatic number and the total domination number of it.
Equitable Colorings Of Corona Multiproducts Of Graphs
Directory of Open Access Journals (Sweden)
Furmánczyk Hanna
2017-11-01
Full Text Available A graph is equitably k-colorable if its vertices can be partitioned into k independent sets in such a way that the numbers of vertices in any two sets differ by at most one. The smallest k for which such a coloring exists is known as the equitable chromatic number of G and denoted by =(G. It is known that the problem of computation of =(G is NP-hard in general and remains so for corona graphs. In this paper we consider the same model of coloring in the case of corona multiproducts of graphs. In particular, we obtain some results regarding the equitable chromatic number for the l-corona product G ◦l H, where G is an equitably 3- or 4-colorable graph and H is an r-partite graph, a cycle or a complete graph. Our proofs are mostly constructive in that they lead to polynomial algorithms for equitable coloring of such graph products provided that there is given an equitable coloring of G. Moreover, we confirm the Equitable Coloring Conjecture for corona products of such graphs. This paper extends the results from [H. Furmánczyk, K. Kaliraj, M. Kubale and V.J. Vivin, Equitable coloring of corona products of graphs, Adv. Appl. Discrete Math. 11 (2013 103–120].
VT Digital Line Graph Miscellaneous Transmission Lines
Vermont Center for Geographic Information — (Link to Metadata) This datalayer is comprised of Miscellaineous Transmission Lines. Digital line graph (DLG) data are digital representations of cartographic...
Voting-based consensus clustering for combining multiple clusterings of chemical structures
Directory of Open Access Journals (Sweden)
Saeed Faisal
2012-12-01
Full Text Available Abstract Background Although many consensus clustering methods have been successfully used for combining multiple classifiers in many areas such as machine learning, applied statistics, pattern recognition and bioinformatics, few consensus clustering methods have been applied for combining multiple clusterings of chemical structures. It is known that any individual clustering method will not always give the best results for all types of applications. So, in this paper, three voting and graph-based consensus clusterings were used for combining multiple clusterings of chemical structures to enhance the ability of separating biologically active molecules from inactive ones in each cluster. Results The cumulative voting-based aggregation algorithm (CVAA, cluster-based similarity partitioning algorithm (CSPA and hyper-graph partitioning algorithm (HGPA were examined. The F-measure and Quality Partition Index method (QPI were used to evaluate the clusterings and the results were compared to the Ward’s clustering method. The MDL Drug Data Report (MDDR dataset was used for experiments and was represented by two 2D fingerprints, ALOGP and ECFP_4. The performance of voting-based consensus clustering method outperformed the Ward’s method using F-measure and QPI method for both ALOGP and ECFP_4 fingerprints, while the graph-based consensus clustering methods outperformed the Ward’s method only for ALOGP using QPI. The Jaccard and Euclidean distance measures were the methods of choice to generate the ensembles, which give the highest values for both criteria. Conclusions The results of the experiments show that consensus clustering methods can improve the effectiveness of chemical structures clusterings. The cumulative voting-based aggregation algorithm (CVAA was the method of choice among consensus clustering methods.
Scaling Techniques for Massive Scale-Free Graphs in Distributed (External) Memory
Pearce, Roger
2013-05-01
We present techniques to process large scale-free graphs in distributed memory. Our aim is to scale to trillions of edges, and our research is targeted at leadership class supercomputers and clusters with local non-volatile memory, e.g., NAND Flash. We apply an edge list partitioning technique, designed to accommodate high-degree vertices (hubs) that create scaling challenges when processing scale-free graphs. In addition to partitioning hubs, we use ghost vertices to represent the hubs to reduce communication hotspots. We present a scaling study with three important graph algorithms: Breadth-First Search (BFS), K-Core decomposition, and Triangle Counting. We also demonstrate scalability on BG/P Intrepid by comparing to best known Graph500 results. We show results on two clusters with local NVRAM storage that are capable of traversing trillion-edge scale-free graphs. By leveraging node-local NAND Flash, our approach can process thirty-two times larger datasets with only a 39% performance degradation in Traversed Edges Per Second (TEPS). © 2013 IEEE.